Comparison of 80-20 Split and Walk-Forward Validation Techniques in Predicting COVID-19 Cases in Indonesia using the ARIMA Model.


Divanda Arya Inasta Asrul Arief Andy Soebroto

Abstract

This study compares the 80-20 split and walk-forward validation techniques for forecasting daily COVID-19 cases in Indonesia with the ARIMA model. Previous research has shown ARIMA to be effective in various epidemiological contexts; this study highlights the critical importance of selecting an appropriate validation technique. Using data from January 3, 2020, to October 18, 2023, the study develops a predictive model and evaluates it with Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). Walk-forward validation outperforms the 80-20 split, with an MAE of 137.32 and an RMSE of 198.23, compared with an MAE of 4190.92 and an RMSE of 4479.15 for the 80-20 split. These results suggest that walk-forward validation provides more accurate and reliable predictions, particularly for dynamic, non-stationary data. The study underscores how strongly the choice of validation technique affects ARIMA model performance and contributes new insights into forecasting methodologies in epidemiology.
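The difference between the two validation schemes can be illustrated with a minimal sketch. It is not the study's implementation: the abstract does not state the ARIMA order or the software configuration, so the code below assumes a simple ARIMA(1,1,0) without a constant, fitted by least squares in pure Python, and runs on synthetic data rather than the Indonesian case series. All function names (`fit_ar1_on_diffs`, `split_80_20`, `walk_forward`, `synthetic_cases`) are hypothetical. Walk-forward validation is taken here in its common form: refit on all data seen so far, forecast one step ahead, then add the observed value to the history.

```python
import math
import random


def fit_ar1_on_diffs(series):
    """Least-squares AR(1) coefficient on first differences.

    This corresponds to an ARIMA(1,1,0) model without a constant,
    an assumed stand-in for whichever ARIMA order the study used.
    """
    d = [b - a for a, b in zip(series, series[1:])]
    num = sum(x * y for x, y in zip(d[:-1], d[1:]))
    den = sum(x * x for x in d[:-1])
    return num / den if den else 0.0


def forecast(series, phi, steps):
    """Iterate the ARIMA(1,1,0) recursion `steps` periods past the end."""
    last = series[-1]
    diff = series[-1] - series[-2]
    out = []
    for _ in range(steps):
        diff = phi * diff      # predicted next difference
        last = last + diff     # integrate back to the level
        out.append(last)
    return out


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def split_80_20(series):
    """Fit once on the first 80%, forecast the whole remaining 20% at once."""
    cut = len(series) * 8 // 10
    train, test = series[:cut], series[cut:]
    phi = fit_ar1_on_diffs(train)
    return test, forecast(train, phi, len(test))


def walk_forward(series):
    """Refit before every test point and forecast only one step ahead."""
    cut = len(series) * 8 // 10
    history, test = list(series[:cut]), series[cut:]
    preds = []
    for actual in test:
        phi = fit_ar1_on_diffs(history)
        preds.append(forecast(history, phi, 1)[0])
        history.append(actual)  # the observed value becomes available
    return test, preds


def synthetic_cases(n=300, seed=0):
    """Synthetic wave-shaped daily counts; NOT the study's data."""
    rng = random.Random(seed)
    return [
        max(0.0, 1000 + 800 * math.sin(t / 25) + rng.gauss(0, 40))
        for t in range(n)
    ]


if __name__ == "__main__":
    data = synthetic_cases()
    for name, scheme in (("80-20 split", split_80_20),
                         ("walk-forward", walk_forward)):
        test, preds = scheme(data)
        print(f"{name}: MAE={mae(test, preds):.2f} RMSE={rmse(test, preds):.2f}")
```

In the 80-20 split, forecast errors compound over the entire test horizon because the model never sees new observations. In walk-forward validation each forecast is only one step ahead of the latest data, which is one plausible reason for the much lower errors reported above. The two schemes therefore measure different things (long-horizon versus one-step-ahead accuracy), and that should be kept in mind when comparing their MAE and RMSE values.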
