Performance Comparison of Lossless Data Compression Algorithms Using Compression Ratios and Space Savings

Aswar Hanif, Endang Wahyudi, Harna Adianto, Lilik Martanto

Abstract

Data growth is a sizeable challenge. Data compression aims to reduce the amount of data needed to represent the same useful information, and it can make data storage, transmission, and protection more efficient. Lossless algorithms reconstruct the original data exactly from the compressed data, so they are typically used for data that must be stored or transmitted accurately. Lossless compression methods include the Lempel–Ziv–Markov chain algorithm (LZMA), prediction by partial matching (PPM), BZip2 (Burrows–Wheeler block-sorting compression combined with Huffman coding), and Deflate. Although all compression systems rest on the same principles, their performance can still differ, so a general guide is needed to help select the most appropriate algorithm. This study aims to determine which data compression algorithm performs best, based on a comparison of compression ratio and space saving values. The research proceeds in stages: selecting the compression algorithms, preparing the data, testing performance, and then discussing the results and drawing conclusions. The results show that the achievable compression ratio and space saving depend on the data used. Although the average performance values span a narrow range, LZMA2 generally shows the best results, with a compression ratio of 1.457 and a space saving of 15.00%. These results are intended as an overview to help in choosing a lossless data compression algorithm.
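
To make the two metrics concrete, the following is a minimal sketch, not the authors' test procedure. It assumes the commonly used definitions (compression ratio = original size / compressed size; space saving = 1 − compressed size / original size), which the abstract does not state explicitly. The function names `compression_ratio`, `space_saving`, and `compare` are hypothetical, and the sample input is illustrative only. It uses Python's standard-library codecs for LZMA2 (via the xz container), BZip2, and Deflate (via zlib); PPM is omitted because the standard library has no PPM implementation.

```python
import bz2
import lzma
import zlib


def compression_ratio(original_size, compressed_size):
    """Assumed definition: original size divided by compressed size."""
    return original_size / compressed_size


def space_saving(original_size, compressed_size):
    """Assumed definition: fraction of the original size that was removed."""
    return 1.0 - compressed_size / original_size


def compare(data):
    """Compress `data` with each codec and return {name: (ratio, saving)}."""
    codecs = {
        "LZMA2 (xz container)": lzma.compress,    # default xz format uses LZMA2
        "BZip2": bz2.compress,
        "Deflate (zlib container)": zlib.compress,
    }
    results = {}
    for name, compress in codecs.items():
        compressed_size = len(compress(data))
        results[name] = (
            compression_ratio(len(data), compressed_size),
            space_saving(len(data), compressed_size),
        )
    return results


if __name__ == "__main__":
    # Illustrative, highly repetitive input; real files will behave differently.
    sample = b"lossless compression example " * 2000
    for name, (ratio, saving) in compare(sample).items():
        print(f"{name}: ratio={ratio:.3f}, space saving={saving:.2%}")
```

The figures such a script prints depend entirely on the input, which mirrors the study's finding that achievable compression depends on the data used.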
