Comparison of 80-20 Split and Walk-Forward Validation Techniques in Predicting COVID-19 Cases in Indonesia using the ARIMA Model.
This study presents a comparative analysis of the 80-20 split and walk-forward validation techniques for forecasting daily COVID-19 cases in Indonesia using the ARIMA model. Previous research has shown the ARIMA model to be effective in various epidemiological contexts; this study, however, highlights the critical importance of selecting an appropriate validation technique. The study uses data from January 3, 2020, to October 18, 2023, to develop a predictive model, which is evaluated using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE). The findings indicate that walk-forward validation outperforms the 80-20 split, with an MAE of 137.32 and an RMSE of 198.23, compared with an MAE of 4190.92 and an RMSE of 4479.15 for the 80-20 split. These results suggest that walk-forward validation provides more accurate and reliable predictions, particularly for dynamic, non-stationary data. This study underscores the significant impact of validation technique selection on ARIMA model performance, contributing new insights into forecasting methodologies in epidemiology.
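The difference between the two validation schemes can be illustrated with a minimal sketch. It is not the study's implementation: a simple drift forecaster (the hypothetical `drift_forecast`) stands in for ARIMA so that the example runs with the standard library alone, the series is synthetic rather than the Indonesian case data, and walk-forward validation is assumed here to mean one-step-ahead forecasting with the history extended by each observed value. The function names `split_80_20` and `walk_forward` are likewise illustrative.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def drift_forecast(history, steps):
    """Stand-in for ARIMA (assumption): extend the last value by the
    average step observed over the whole history."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + slope * (h + 1) for h in range(steps)]


def split_80_20(series, forecast=drift_forecast):
    """Fit once on the first 80% and forecast the entire last 20%."""
    cut = len(series) * 4 // 5
    train, test = series[:cut], series[cut:]
    return test, forecast(train, len(test))


def walk_forward(series, forecast=drift_forecast):
    """Forecast one step at a time, adding each observed value to the
    history before the next forecast."""
    cut = len(series) * 4 // 5
    history = list(series[:cut])
    test = series[cut:]
    predictions = []
    for observed in test:
        predictions.append(forecast(history, 1)[0])
        history.append(observed)  # the model sees the true value next round
    return test, predictions


# Synthetic wave-plus-trend series, for illustration only (not real case data).
series = [100 + 50 * math.sin(t / 8) + 2 * t for t in range(200)]

test, split_preds = split_80_20(series)
_, wf_preds = walk_forward(series)

print("80-20 split   MAE:", mae(test, split_preds), "RMSE:", rmse(test, split_preds))
print("walk-forward  MAE:", mae(test, wf_preds), "RMSE:", rmse(test, wf_preds))
```

In the 80-20 split the forecaster must project the whole test horizon from a single fit, so its errors accumulate as the horizon grows. In the walk-forward scheme each forecast is only one step ahead of the latest observation. This may help explain the large gap in error that the abstract reports, although the two schemes answer somewhat different forecasting questions.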