Analysis Of Features That Affect The Number Of Youtube Subscribers Using The Naive Bayes Classifier Algorithm


Meliyana Rahayu Yoanita, Hendry Setiawan, Paulus Lucky Tirma Irawan


One application of communication technology is the social media platform. YouTube is among the most widely used social media platforms in Indonesia. Content Creators, or YouTubers, earn income through AdSense. YouTube provides several features, such as likes, dislikes, views, and comments (which may carry positive or negative sentiment). An automatic classification system for YouTube comment sentiment is needed to separate positive comments from negative ones, together with an analysis of which features affect the number of subscribers, so that Content Creators can identify those features. In this research, a comment sentiment classification system is built using the Naive Bayes (NB) algorithm so that positive and negative comments can be classified easily. The data analyzed come from 53 YouTube channels that publish vlog videos. The classification training data consist of 4166 positive-sentiment and 4166 negative-sentiment samples. The features affecting the number of subscribers were then analyzed using the chi-square test. The analysis found four features that influence the number of subscribers: the number of views (chi-square value 23.105), the number of dislikes (13.745), the number of positive-sentiment comments (18.123), and the number of likes (13.745). The automatic classification system using Naive Bayes (NB) achieves an accuracy of 81%.
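The two techniques named in the abstract can be illustrated with a minimal sketch. The paper does not describe its implementation, so everything below is an assumption made for illustration: whitespace tokenization, a multinomial Naive Bayes model with add-one smoothing, the toy comments, and the hypothetical 2x2 contingency table (channels split into high and low groups for one feature and for subscriber count). The function names `train_nb`, `predict_nb`, and `chi_square` are likewise hypothetical.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model (word counts per class)."""
    class_docs = Counter(labels)                      # documents per class
    word_counts = {c: Counter() for c in class_docs}  # word frequencies per class
    for text, label in zip(docs, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_docs, word_counts, vocab

def predict_nb(model, text):
    """Return the class with the highest log posterior for the given text."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for c in class_docs:
        score = math.log(class_docs[c] / total_docs)         # log prior
        denom = sum(word_counts[c].values()) + len(vocab)    # add-one smoothing
        for w in text.lower().split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][w] + 1) / denom)
        if score > best_score:
            best_label, best_score = c, score
    return best_label

def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as rows."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Toy training comments (invented for this example, not the paper's data).
docs = ["great video love it", "love this vlog",
        "boring video", "bad boring content"]
labels = ["positive", "positive", "negative", "negative"]
model = train_nb(docs, labels)
print(predict_nb(model, "love this video"))   # positive
print(predict_nb(model, "boring content"))    # negative

# Hypothetical table: rows = feature high/low, columns = subscribers high/low.
print(round(chi_square([[10, 20], [20, 10]]), 4))  # 6.6667
```

A larger chi-square value indicates a stronger association between the feature and the subscriber grouping; comparing the statistic against a critical value for the table's degrees of freedom decides whether the association is treated as significant.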

