How to extract text from an image in Python

Extracting text from an image in Python requires optical character recognition (OCR). OCR converts scanned or photographed images of text into machine-readable text. In Python, you can call OCR libraries that examine the image, identify individual characters, and return the text those characters represent.

You can automate the extraction of text from photos using OCR techniques in Python. This makes it possible to perform tasks like digitizing printed documents, extracting information from photographs, and enabling text search inside image-based material.

Importance of OCR

OCR technology’s ability to extract text from photos makes it essential in many industries. Here are some of the main reasons it matters.

  • Digitization and archiving: OCR makes it possible to convert physical documents into digital ones. This reduces manual data entry and physical document handling, and makes information easier to store, retrieve, and share.

  • Processing and automation of documents: OCR extracts the relevant data from document images so that documents can be processed automatically. This can speed up procedures, automate data entry, and make document analysis more efficient.

  • Accessibility: OCR converts printed or handwritten text into machine-readable formats that screen readers can read aloud or convert into braille, making text accessible to people with visual impairments.

  • Data extraction and analysis: OCR can pull text from images of bills, receipts, or forms, so the data can be used for further analysis, automated data entry, or integration with other systems.

  • Content searchability: OCR converts image-based text into searchable, indexable digital text, which makes it possible to search, index, and retrieve information from image collections or scanned documents.

Installing dependencies

The dependencies required include the following.

  • pytesseract: The pytesseract library is used for OCR. You can install it using the following command:

pip install pytesseract
Installing pytesseract
  • Pillow: Pillow, the maintained fork of PIL (Python Imaging Library), is used for image handling. It is still imported under the name PIL. You can install it using the following command:

pip install pillow
Installing pillow
  • Tesseract OCR Engine: pytesseract requires a separate installation of the Tesseract OCR Engine. The installation steps depend on your operating system:

  • Windows: Download the Tesseract installer from the official GitHub source and run it.

  • Linux: Use your distribution's package manager. For instance, on Ubuntu or Debian, you can run the following command:

sudo apt-get install tesseract-ocr
Installing tesseract-ocr on Linux
  • macOS: Run the following command to install Tesseract using Homebrew:

brew install tesseract
Installing tesseract-ocr on macOS

Before running your Python code to use OCR to extract text from an image, make sure these prerequisites are installed.
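
To confirm that all three prerequisites are visible to Python, a quick check like the following can help. This is a minimal sketch rather than part of the original walkthrough: the function name check_ocr_dependencies is hypothetical, and it assumes the Tesseract executable is named tesseract and is on your PATH, which may not be the case on Windows even after running the installer.

```python
import importlib.util
import shutil


def check_ocr_dependencies():
    """Report which OCR prerequisites can be found on this machine."""
    return {
        # find_spec returns None when a Python package is not installed
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        "pillow": importlib.util.find_spec("PIL") is not None,
        # shutil.which looks for the `tesseract` executable on PATH
        "tesseract": shutil.which("tesseract") is not None,
    }


if __name__ == "__main__":
    for name, found in check_ocr_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If tesseract is reported missing even though it is installed, pytesseract lets you point at the executable directly by setting pytesseract.pytesseract.tesseract_cmd to its full path.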

Extracting text from an image

Once the dependencies have been installed, you can write the code to extract text from an image. The example below reads a file named image.png.

import pytesseract
from PIL import Image

# Open the image file
image = Image.open('image.png')

# Perform OCR using PyTesseract
text = pytesseract.image_to_string(image)

# Print the extracted text
print(text)
Code to extract text from image

Code explanation

The provided code demonstrates how to extract text from an image using the pytesseract library in Python. Here's an explanation of each step.

  • Lines 1–2 (importing libraries): import pytesseract imports the pytesseract library, a Python wrapper for the Tesseract OCR engine. from PIL import Image imports the Image module from Pillow (installed under the package name PIL), which is used to open and manipulate images.

  • Line 5 (opening the image file): image = Image.open('image.png') opens the image file named "image.png" using the Image.open() function. A relative path like this one is resolved against the directory you run the script from, so keep the image there or provide the full path to the image file.

You can use any image format that Pillow can open, such as PNG or JPEG. The rest of the code stays the same.

  • Line 8 (performing OCR): text = pytesseract.image_to_string(image) passes the image to the image_to_string() function from pytesseract, which runs OCR on it and returns the recognized text as a string. The result is assigned to the variable text.

  • Line 11 (printing the extracted text): print(text) outputs the extracted text to the console.

In short, the code opens an image file, performs OCR on it using pytesseract, and prints the extracted text. Ensure that you have installed the required dependencies, including pytesseract, Pillow, and the Tesseract OCR engine, before running the code.
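
The steps explained above can also be wrapped in a small reusable function. The following is one possible sketch, not the only way to structure it: the name extract_text and its lang parameter are hypothetical, and the default of "eng" assumes the English language data that a standard Tesseract install includes. The OCR libraries are imported inside the function so that a wrong file path is reported before they are needed.

```python
from pathlib import Path


def extract_text(image_path, lang="eng"):
    """Return the text Tesseract recognizes in the image at image_path."""
    path = Path(image_path)
    if not path.is_file():
        # Fail early with a clear message instead of an error from deeper in the stack
        raise FileNotFoundError(f"No image found at {path}")

    # Imported here so the path check above works even before the OCR setup is complete
    import pytesseract
    from PIL import Image

    # The context manager closes the image file once OCR has finished
    with Image.open(path) as image:
        return pytesseract.image_to_string(image, lang=lang)


if __name__ == "__main__":
    try:
        print(extract_text("image.png"))
    except FileNotFoundError as error:
        print(error)
```

Calling extract_text("image.png") then behaves like the original script, while a missing file produces a readable error message.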


Copyright ©2025 Educative, Inc. All rights reserved