Classification of Fake and Real Images Using Vision Transformer and EfficientNet-B0 on Real and AI-Generated Images


M. Syahrul Anwar Aria, Cepy Slamet, Muhammad Deden Firdaus

Abstract

Advances in artificial intelligence (AI) have enabled the creation of synthetic images that closely resemble real photographs, making such images increasingly difficult to detect and classify. This study develops classification models based on EfficientNet-B0 and the Vision Transformer (ViT) to distinguish real images from images produced by generative AI. The data consist of 30,401 real images from the MSCOCO 2017 dataset and 30,401 AI-generated images from the SyntheticEye AI-Generated Images Dataset on Kaggle. The ViT model achieved 98% classification accuracy and EfficientNet-B0 achieved 96%. Both models therefore show strong potential for detecting digital media manipulation, with ViT delivering the superior performance. The practical implication of this research is the development of more capable detectors of generative images for real-world applications such as digital security and media verification.
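The abstract does not give implementation details, so the following is only a minimal sketch of how the two backbones could be fine-tuned for this binary real-vs-fake task. It assumes PyTorch with the timm library, 224x224 inputs, and an ImageFolder directory layout with "real/" and "fake/" subfolders; none of these choices are specified in the article.

```python
# Illustrative sketch (not the authors' code): fine-tuning ViT and EfficientNet-B0
# for binary real-vs-AI-generated image classification using timm.
import timm
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# 224x224 inputs match the default configuration of both backbones.
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Hypothetical path: one subfolder per class, e.g. data/train/real and data/train/fake.
train_set = datasets.ImageFolder("data/train", transform=transform)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)

def build_model(name: str) -> nn.Module:
    # num_classes=2 replaces the ImageNet head with a two-class (real/fake) head.
    return timm.create_model(name, pretrained=True, num_classes=2).to(device)

def train(model: nn.Module, epochs: int = 5) -> None:
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    model.train()
    for epoch in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch + 1}: loss {loss.item():.4f}")

# The two backbones compared in the paper.
vit = build_model("vit_base_patch16_224")
effnet = build_model("efficientnet_b0")
train(vit)
train(effnet)
```

Evaluation on a held-out split (accuracy and a confusion matrix, as reported in the study) would follow the same loading pattern with `model.eval()` and `torch.no_grad()`.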

