Application of the K-Nearest Neighbor Algorithm for Customer Review Sentiment Analysis at Ngeboel Vapestore Shop
Abstract
This study applies the K-Nearest Neighbor (K-NN) algorithm to classify customer sentiment in online reviews of Ngeboel Vapestore, a local micro, small, and medium enterprise (MSME) in the vape industry. A total of 175 reviews from Google Review and Instagram were processed using standard NLP techniques, with TF-IDF used for feature extraction. The best K-NN model (k=3) achieved 85.4% accuracy. Logistic Regression achieved higher accuracy (92.6%) but failed to detect negative sentiment. The findings highlight both the potential and the limitations of K-NN for sentiment analysis in underexplored MSME contexts such as vape retail. The study recommends further model improvements and application to a broader range of MSMEs.
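The pipeline described in the abstract (TF-IDF features followed by K-NN with k=3) can be illustrated with a minimal, standard-library-only sketch. This is not the study's implementation: the six toy reviews are hypothetical, and the smoothed IDF formula, whitespace tokenizer, cosine similarity, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for the study's NLP preprocessing.
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}

def tfidf(text, idf):
    # Term frequency times IDF, L2-normalized; unseen terms are ignored.
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def knn_predict(text, train_vecs, train_labels, idf, k=3):
    # Rank training reviews by similarity and take a majority vote among the top k.
    query = tfidf(text, idf)
    ranked = sorted(range(len(train_vecs)),
                    key=lambda i: cosine(query, train_vecs[i]),
                    reverse=True)
    votes = Counter(train_labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy reviews, not taken from the study's dataset.
train = [
    ("great service and friendly staff", "positive"),
    ("friendly staff and fast service", "positive"),
    ("great products and great prices", "positive"),
    ("slow service and rude staff", "negative"),
    ("rude staff and bad products", "negative"),
    ("bad prices and slow delivery", "negative"),
]
texts = [text for text, _ in train]
labels = [label for _, label in train]
idf = fit_idf(texts)
train_vecs = [tfidf(text, idf) for text in texts]

print(knn_predict("friendly staff great service", train_vecs, labels, idf, k=3))
print(knn_predict("rude staff slow delivery", train_vecs, labels, idf, k=3))
```

With an odd k such as 3 and two classes, the majority vote cannot tie, which is one common reason for preferring odd values of k in binary sentiment tasks.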