Telfor Journal
2018, vol. 10, no. 2, pp. 123-128
language: English
article type: unclassified
doi:10.5937/telfor1802123A


Automatic complaint classification system using classifier ensembles
Brawijaya University, Faculty of Computer Science, Malang, Indonesia

e-mail: moch.ali.fauzi@ub.ac.id

Abstract

Sambat Online is an online complaint system run by the city government of Malang, Indonesia. Because most citizens do not know which work unit (Satuan Kerja Pemerintah Daerah, SKPD) should receive their complaint, the system administrator must manually sort the incoming complaints and assign each one to the appropriate SKPD. This study empirically evaluated an automated system intended to replace that manual classification process. The experiments, which used Sambat Online data, involved five individual classification algorithms (Naïve Bayes, Maximum Entropy, K-Nearest Neighbors, Random Forest, and Support Vector Machines) and two ensemble strategies (hard voting and soft voting). Of the five individual classifiers, Multinomial Naïve Bayes performed best, with an accuracy of 80.7%. The ensemble methods generally outperformed the individual classifiers, and almost all of them reached the same accuracy of 81.2%. When all five classifiers were used, soft voting was slightly more accurate than hard voting. However, for the three best classifier combinations, both strategies had the same accuracy.

Keywords

Ensemble Learning; E-Government; Machine Learning; Hard Voting; Soft Voting; Complaint Classification
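
The hard and soft voting strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the two class labels, and the probability values are hypothetical, chosen only to show how the two strategies can disagree on the same input. Hard voting counts the label each classifier predicts, while soft voting averages the class probabilities before choosing a label.

```python
from collections import Counter


def hard_vote(labels):
    """Majority vote over the class labels predicted by each classifier."""
    counts = Counter(labels)
    best = max(counts.values())
    # Break ties by the order in which labels first appear.
    for label in labels:
        if counts[label] == best:
            return label


def soft_vote(probabilities):
    """Average per-class probabilities across classifiers, then take the argmax."""
    classes = list(probabilities[0].keys())
    n = len(probabilities)
    averaged = {c: sum(p[c] for p in probabilities) / n for c in classes}
    return max(averaged, key=averaged.get), averaged


# Hypothetical outputs of three classifiers for a single complaint.
# The labels stand in for work units; they are not real SKPD names.
clf_probs = [
    {"transport": 0.55, "health": 0.45},
    {"transport": 0.60, "health": 0.40},
    {"transport": 0.05, "health": 0.95},
]

# Each classifier's hard prediction is its most probable class.
clf_labels = [max(p, key=p.get) for p in clf_probs]

hard_label = hard_vote(clf_labels)
soft_label, avg = soft_vote(clf_probs)

print("hard voting:", hard_label)
print("soft voting:", soft_label)
```

In this made-up case two weakly confident classifiers outvote one highly confident classifier under hard voting, whereas soft voting lets the confident prediction dominate the average. Differences of this kind are one plausible reason the two strategies can yield different accuracies.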
