Image compression using fast Fourier transform in Python

Image compression saves storage space and speeds up data transfer. It helps websites load faster and reduces the cost of storing and sharing images. In image processing, the Fourier transform is a standard tool for this task because it decomposes a signal into its frequency components.

Fourier transform

The Fourier transform expresses a signal, such as an image, as a weighted sum of sinusoidal functions of different frequencies. In an image, low frequencies describe smooth regions and gradual changes in brightness, while high frequencies describe edges and fine texture. The fast Fourier transform (FFT) is an efficient algorithm for computing the discrete form of this transform. In many images, a small share of the coefficients carries most of the energy, which is what makes this representation useful for compression.
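To make the idea of frequency components concrete, here is a minimal sketch that builds a synthetic 64×64 image containing a single horizontal cosine pattern and inspects its 2-D FFT. The image size and the choice of 4 cycles are assumptions made purely for illustration.

```python
import numpy as np

N = 64
cols = np.arange(N)
# Every row holds the same cosine with 4 full cycles across the image width.
image = np.tile(np.cos(2 * np.pi * 4 * cols / N), (N, 1))

spectrum = np.fft.fft2(image)
magnitude = np.abs(spectrum)

# (row frequency, column frequency) indices of the non-negligible coefficients
peaks = np.argwhere(magnitude > 1e-6 * magnitude.max())
print(peaks.tolist())  # [[0, 4], [0, 60]]
```

The whole image is described by just two coefficients: column frequency 4 and its mirror at 64 - 4 = 60. Real photographs are not this sparse, but the same principle applies: much of the content is concentrated in relatively few coefficients.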

Image compression using Fourier transform

Here’s a walkthrough on the process of image compression using FFT:

Begin by importing necessary libraries like NumPy, Matplotlib, and PIL.
Load the original image into memory.
Compute the FFT of the original image using NumPy’s fft2 function to convert it into the frequency domain.
Choose a retention percentage and determine the magnitude threshold that keeps only that percentage of the largest-magnitude FFT coefficients. This threshold separates significant frequency components from insignificant ones.
Apply the threshold to the FFT coefficients, setting the insignificant coefficients to zero. The discarded coefficients are those that contribute least to the image, which are often high-frequency detail and noise, while the retained ones carry the essential image information.
To obtain the compressed image in the spatial domain, perform an inverse FFT (ifft2) on the thresholded FFT coefficients.
Calculate the sizes of both the original and compressed images to evaluate the compression ratio.
Experiment with different retention percentages and compression techniques to find the optimal balance between compression ratio and image quality.

By following these steps, we can use the FFT to reduce the amount of data needed to represent an image while maintaining acceptable image quality. How much reduction is achievable depends on the image content and on the chosen retention percentage.
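The thresholding steps and the advice to experiment with retention percentages can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: compress_fft and keep_fraction are hypothetical names, and the test image is synthetic (a smooth pattern plus noise) so that the snippet runs without any image file.

```python
import numpy as np

def compress_fft(image, keep_fraction):
    """Zero all but the largest-magnitude `keep_fraction` of the FFT coefficients."""
    spectrum = np.fft.fft2(image)
    magnitudes = np.abs(spectrum)
    n_keep = max(1, int(np.ceil(keep_fraction * magnitudes.size)))
    # Magnitude of the n_keep-th largest coefficient
    threshold = np.partition(magnitudes.ravel(), -n_keep)[-n_keep]
    mask = magnitudes >= threshold
    # Back to the spatial domain; keep the real part of the result
    reconstructed = np.fft.ifft2(spectrum * mask).real
    return reconstructed, int(mask.sum())

# Synthetic grayscale test image (an assumption for this sketch)
rng = np.random.default_rng(0)
y, x = np.mgrid[0:128, 0:128]
image = 128 + 100 * np.sin(x / 9.0) * np.cos(y / 13.0) + rng.normal(0, 5, size=(128, 128))

results = []
for keep in (0.5, 0.2, 0.05, 0.01):
    reconstructed, kept = compress_fft(image, keep)
    rmse = float(np.sqrt(np.mean((image - reconstructed) ** 2)))
    results.append((keep, kept, rmse))
    print(f"keep={keep:.2f}  coefficients kept={kept}  RMSE={rmse:.3f}")
```

The root-mean-square error grows as fewer coefficients are kept, which is the trade-off between compression and quality that the last step asks us to explore.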

Code example: Image compression using FFT

Let’s implement these steps for image compression using FFT in the following code:

import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from io import BytesIO

# Load the image; for PNG files, plt.imread returns floats in the range [0, 1]
original_image = plt.imread("/camera-man.png")
# If the file has color channels, average RGB so that fft2 receives a 2-D array
if original_image.ndim == 3:
    original_image = original_image[..., :3].mean(axis=2)

# Compute the 2-D FFT and sort the coefficient magnitudes in ascending order
fft_original_image = np.fft.fft2(original_image)
sorted_magnitudes = np.sort(np.abs(fft_original_image.reshape(-1)))

# Fraction of coefficients to retain (0.8 keeps the largest 80%)
compression_ratio = 0.8
threshold_index = int(np.floor((1 - compression_ratio) * len(sorted_magnitudes)))
threshold_value = sorted_magnitudes[threshold_index]

# Keep only the coefficients whose magnitude exceeds the threshold
threshold_mask = np.abs(fft_original_image) > threshold_value
compressed_fft = fft_original_image * threshold_mask
# Return to the spatial domain and keep the real part of the result
compressed_image = np.fft.ifft2(compressed_fft).real

# Calculating the size of the original image
# (pixel values are scaled from [0, 1] to [0, 255] before the uint8 cast)
buffer1 = BytesIO()
original_image_pil = Image.fromarray(np.clip(np.rint(original_image * 255), 0, 255).astype(np.uint8))
original_image_pil.save(buffer1, format='PNG')
original_image_size_bytes = len(buffer1.getvalue())
original_image_size_kilobytes = original_image_size_bytes / 1024

# Calculating the size of the compressed image
buffer2 = BytesIO()
compressed_image_pil = Image.fromarray(np.clip(np.rint(compressed_image * 255), 0, 255).astype(np.uint8))
compressed_image_pil.save(buffer2, format='PNG')
compressed_image_size_bytes = len(buffer2.getvalue())
compressed_image_size_kilobytes = compressed_image_size_bytes / 1024

# Plotting both images side by side on the same intensity scale
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
# Plotting the original image in the first subplot
axes[0].imshow(original_image, cmap="gray", vmin=0, vmax=1)
axes[0].set_title(f'Original Image, Size: {original_image_size_kilobytes:.2f} KB')
# Plotting the compressed image in the second subplot
axes[1].imshow(compressed_image, cmap="gray", vmin=0, vmax=1)
axes[1].set_title(f'Compressed Image, Size: {compressed_image_size_kilobytes:.2f} KB')
# Turn off axes for better visualization
for ax in axes:
    ax.axis('off')
# Adjusting layout to prevent overlap
plt.tight_layout()
# Saving the plot as an image file
plt.savefig('output/fig.png')
# Showing the plot
plt.show()

Code explanation

Imports: The code begins by importing the libraries needed for plotting (Matplotlib), numerical computation (NumPy), image processing (PIL), and in-memory I/O (BytesIO).
original_image: The original image, camera-man.png, is loaded into memory using Matplotlib’s imread function.
fft_original_image: The two-dimensional fast Fourier transform (FFT) of the original image is computed using NumPy’s fft2 function. The result is the frequency-domain representation of the image.
sorted_magnitudes: The magnitudes of the FFT coefficients are flattened and sorted in ascending order to facilitate thresholding.
compression_ratio, threshold_index, threshold_value: Despite its name, compression_ratio is the fraction of coefficients to retain (here 0.8, or 80%). threshold_index is the position in the sorted magnitudes below which the smallest 20% of coefficients lie, and threshold_value is the magnitude at that position.
threshold_mask: A Boolean mask marks the coefficients whose magnitude exceeds threshold_value, that is, the significant components to keep.
compressed_fft: The insignificant frequency components are set to zero by element-wise multiplication of the FFT coefficients with threshold_mask, resulting in the compressed frequency-domain representation.
compressed_image: An inverse FFT (ifft2) is performed on compressed_fft to obtain the compressed image. The .real attribute extracts the real part of the result; because the input image is real-valued, the imaginary part is expected to be negligible.
Size of the original image: The image array is converted to a PIL Image object and saved to an in-memory buffer in PNG format. The length of the buffer gives the size in bytes, which is then converted to kilobytes.
Size of the compressed image: The size of the compressed image is determined using the same method.
Plotting: The remaining code displays the original and compressed images side by side with their sizes in the titles, saves the figure, and shows it.
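One detail of the size measurement is worth illustrating. For PNG files, Matplotlib’s imread returns floating-point pixel values between 0 and 1, so a direct cast to uint8 truncates almost every pixel to 0. The reconstructed image can also fall slightly outside that range after the inverse FFT. A minimal sketch of a safer conversion, assuming pixel values nominally in [0, 1], is shown below; to_uint8 is a hypothetical helper name.

```python
import numpy as np

def to_uint8(image):
    """Map float pixel values in [0, 1] to integers in [0, 255], clipping overshoot."""
    scaled = np.rint(np.asarray(image, dtype=np.float64) * 255.0)
    return np.clip(scaled, 0, 255).astype(np.uint8)

pixels = np.array([0.0, 0.25, 0.5, 1.0])

# A direct cast drops the fractional part, so nearly everything becomes 0
naive = pixels.astype(np.uint8)
print(naive.tolist())             # [0, 0, 0, 1]

# Scaling first preserves the intensity information
print(to_uint8(pixels).tolist())  # [0, 64, 128, 255]

# Values slightly outside [0, 1] are clipped to the valid range
print(to_uint8(np.array([-0.02, 1.03])).tolist())  # [0, 255]
```

Whether scaling is needed depends on how the image was loaded: arrays that already hold integers in the range 0 to 255 should be passed to PIL unchanged.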

In summary, image compression with FFT reduces file size by discarding the frequency coefficients that contribute least to the image. The retention percentage controls the trade-off: keeping fewer coefficients produces a smaller file but sacrifices more detail, so it should be tuned until the result retains the details that matter.
