Viseme Identification For Clustering-Based Madurese Phonemes Based on Facial Landmark Points


Pyepit Rinekso Andriyanto, Joan San, Endang Setyati

Abstract

The most effective form of language in communication is spoken language. When speaking, humans move their mouth and lips to pronounce particular words. This pattern of mouth and lip movement describes a viseme (visual phoneme): a group of phonemes that share the same, or nearly the same, visual appearance. Madurese is a distinctive language with particular characteristics. In addition to having speech levels, Madurese has aspirated phonemes, pronounced with an exhalation of breath, such as /bh/, /dh/, /Dh/, /gh/, and /jh/, which do not exist in other languages. This research discusses the identification of viseme classes for Madurese phonemes through clustering based on facial landmark points. From the 47 Madurese phonemes, 9 Madurese visemes were obtained through the K-Means clustering process. The clustering uses feature extraction based on facial landmark points, from which the distance for each feature is computed; the features used are geometric features. The Madurese viseme model is then used to build 2D mouth animations that utter Madurese words or sentences from text input. This research benefits the learning of Madurese word and sentence pronunciation, because Madurese words are written and pronounced differently.
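The clustering step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes geometric features (distances between facial landmark points, e.g. mouth width and mouth opening) have already been extracted per phoneme, and the phoneme list, feature values, and k=3 here are invented for demonstration (the paper clusters 47 phonemes into 9 visemes).

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Minimal K-Means over feature vectors X (n_samples x n_features).

    Repeatedly assigns each sample to its nearest centroid (Euclidean
    distance) and recomputes centroids until assignments stabilise.
    """
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Distance from every sample to every centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels

# Illustrative geometric features per phoneme: (mouth width, mouth opening),
# i.e. normalised distances between landmark points. Values are invented.
phonemes = ["/a/", "/i/", "/u/", "/b/", "/bh/", "/m/"]
X = np.array([
    [0.9, 0.80],  # /a/  mouth wide open
    [1.0, 0.20],  # /i/  lips spread, nearly closed
    [0.4, 0.30],  # /u/  lips rounded
    [0.5, 0.05],  # /b/  lips closed
    [0.5, 0.05],  # /bh/ lips closed (aspirated)
    [0.5, 0.05],  # /m/  lips closed
])

labels = kmeans(X, k=3)
viseme_classes = {}
for phoneme, label in zip(phonemes, labels):
    viseme_classes.setdefault(int(label), []).append(phoneme)
print(viseme_classes)
```

Phonemes whose feature vectors coincide, such as the bilabials /b/, /bh/, and /m/ above, necessarily land in the same cluster, which is exactly how visually indistinguishable phonemes collapse into one viseme class.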

