How to compress data using Python
A ZIP file is a compressed archive type that lets us bundle and compress several files or directories into one file with a .zip extension. To simplify transferring or storing various files, the main goal of producing a ZIP file is to lower the overall file size. Here are a few benefits of making ZIP files:
File compression: ZIP file uses compression algorithms to reduce the size of files included in that bundle. This compression makes transferring the files over the internet more convenient, as smaller files will take less bandwidth.
File organization: A ZIP file will create a single bundle of all the files, making it easy to maintain a tidy but organized file structure.
Backup: Creating ZIP files is a convenient way to create backups of multiple files. It simplifies the backup process by consolidating multiple files into a single archive, making it easier to manage and store.
Reducing storage space: Compressing files into ZIP archives can save disk space, especially when dealing with many files. This is particularly relevant for archival purposes.
Compressing files with the zipfile module
It would be best if we first built a ZipFile object (notice the capital letters Z and F) to read the contents of a ZIP file. Similar in concept to the File objects returned by the open() function, ZipFile objects are values that allow the program to communicate with the file. The zipfile module in Python is a standard library designed for working with ZIP files. This file format is commonly used in industry regarding digital data compression and archiving. It can be used to group multiple relevant files. To compress the file, use the following process:
Call the
ZipFile()function from thezipfilemodule to create an emptyZipFileObject. This object will accept the following parameters:file: This parameter accepts the file name as a string.mode: This parameter accepts the following character tags:r: This tag is used to read the existing file.w: This tag is used to truncate and write a new file.a: This tag is used to append an existing file.x: This tag is used to exclusively create and write a new file.
Use the
write()method fromZipFilethe object to add and compress the files.Use the
getinfo()method to get the information of a compressed file. This method will accept the file name as a parameter.
Code example
So, let’s start by creating a .zip file using the Python programming language. For this code, we will create a .zip file containing a text file and then check the file size before and after compression.
Code explanation
Let’s see the code in detail:
Line 1: This line is used to import the modules.
Line 3: This line uses the
ZipFile()method to create an empty ZIP file in our current directory.Line 5: This line uses the
write()method to compress thefile.txtfile.Line 7: This line uses the
getinfo()method to get the information of the file.Line 13: This line uses the
close()method to finally close the opened ZIP file.
Note: To extract the data from the ZIP file using python, you might need to create the
ZipFileobject using the existing file and call theextractall()method to get all the files available in the ZIP file.
Free Resources