Compressing a PDF File

Introduction

PDF compression reduces a document's size by optimizing its internal structure and shrinking its embedded images, graphics, and other objects, while keeping quality at an acceptable level.

The main reason to compress a PDF file is to reduce its size, which saves storage space and makes the file easier to send by email or through other online channels. This matters most when a size limit applies, such as the attachment limit of an email service or the file-size limit of a messaging application.

Types of PDF compression

PDF supports two types of compression:

  • Lossy: This type of compression applies to the images and graphics in a PDF file. It reduces their size by discarding some of their information, for example by lowering their resolution. The discarded information cannot be recovered, so the output is of lower quality.

  • Lossless: This type of compression finds repeated patterns in the data and replaces them with shorter references. The original data can be reconstructed exactly from the compressed data, without any loss of quality. It best fits PDF documents that contain mostly text.
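The difference between the two types can be illustrated with a minimal sketch that uses only the Python standard library. It does not operate on a real PDF file: the sample text, the toy 8x8 grayscale "image", and the `downsample` helper are assumptions made for illustration. The lossless half uses `zlib`, which implements the Deflate algorithm that PDF's FlateDecode filter is based on; the lossy half simply drops pixels to lower the resolution.

```python
import zlib

# Lossless: compress repetitive text with Deflate and restore it exactly.
text = b"PDF compression reduces file size. " * 50
packed = zlib.compress(text, 9)      # 9 = highest compression level
restored = zlib.decompress(packed)   # byte-for-byte identical to `text`


# Lossy: lower the resolution of a toy grayscale image (hypothetical helper).
def downsample(pixels, factor=2):
    """Keep every `factor`-th pixel in each row and every `factor`-th row."""
    return [row[::factor] for row in pixels[::factor]]


# An 8x8 grid of brightness values (0-255), standing in for an embedded image.
image = [[(x * y) % 256 for x in range(8)] for y in range(8)]
small = downsample(image)            # 4x4 grid; the dropped pixels are gone

print(len(text), "->", len(packed), "bytes, restored exactly:", restored == text)
print(len(image), "x", len(image[0]), "->", len(small), "x", len(small[0]))
```

The text round-trips without any change, while the downsampled image has no way to recover the pixels that were discarded.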

Scope

This lesson shows how to build a custom PDF compression tool as a lightweight command-line utility written in Python.
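As a rough idea of the shape such a utility might take, here is a minimal sketch of its command-line front end. The argument names (`input_file`, `--output_file`) and the `size_reduction` helper are hypothetical and chosen only for illustration; the compression step itself is deliberately left out, since it depends on the PDF library used.

```python
import argparse


def build_parser():
    """Build the command-line interface (argument names are assumptions)."""
    parser = argparse.ArgumentParser(description="Compress a PDF file.")
    parser.add_argument("input_file", help="path of the PDF file to compress")
    parser.add_argument(
        "-o", "--output_file", default=None,
        help="path of the compressed file (defaults to the input path)",
    )
    return parser


def size_reduction(original_size, compressed_size):
    """Return the size reduction as a fraction of the original size."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 1 - compressed_size / original_size


# Example: parse a sample command line and report a hypothetical result.
args = build_parser().parse_args(["report.pdf", "-o", "report_small.pdf"])
print(args.input_file, "->", args.output_file)
print(f"Size reduction: {size_reduction(1000, 250):.0%}")
```

Passing an explicit argument list to `parse_args` keeps the sketch easy to try in an interactive session; a real tool would call `parse_args()` with no arguments so that it reads the command line.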
