Building an OCR script for Documents using Read API

Learn to extract text from PDF documents using Computer Vision's Read API.

Introduction

In the previous lesson, we saw how to extract text from an image. Now, we’ll look at how to extract text from a PDF document.

You can download the sample PDF used in this lesson below:

printed_handwritten.pdf

Implementation for Documents

Now that you have the sample PDF, we can move on to implementing this functionality.

Provide values for the following before running the code: computer_vision_key and computer_vision_endpoint (the key and endpoint of your Computer Vision resource).
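Outside the interactive environment, these two values have to come from somewhere. One possible approach, sketched below, is to read them from environment variables instead of hard-coding them. The helper name load_credentials and the variable names COMPUTER_VISION_KEY and COMPUTER_VISION_ENDPOINT are assumptions made for this illustration; they are not part of the lesson or the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Return (key, endpoint), failing early if either value is missing.

    The environment variable names below are hypothetical; use whatever
    names fit your own setup.
    """
    try:
        key = env["COMPUTER_VISION_KEY"]
        endpoint = env["COMPUTER_VISION_ENDPOINT"]
    except KeyError as missing:
        # Report which variable is absent instead of failing later in the SDK call.
        raise RuntimeError(f"Missing environment variable: {missing}") from None
    return key, endpoint
```

With this in place, the two names used by the lesson's code could be set with computer_vision_key, computer_vision_endpoint = load_credentials().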
import time
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
client = ComputerVisionClient(
    computer_vision_endpoint,
    CognitiveServicesCredentials(computer_vision_key)
)
def pdf_to_text():
    filepath = open('CourseAssets/printed_handwritten.pdf', 'rb')
    response = client.read_in_stream(filepath, raw=True)
    filepath.close()
    operation_location = response.headers["Operation-Location"]
    operation_id = operation_location.split("/")[-1]
    while True:
        result = client.get_read_result(operation_id)
        if result.status.lower() not in ['notstarted', 'running']:
            break
        time.sleep(10)
    return result
result = pdf_to_text()
if result.status == OperationStatusCodes.succeeded:
    for readResult in result.analyze_result.read_results:
        for line in readResult.lines:
            print(line.text)
            print(line.bounding_box)
  • From lines 1 to 4, we import the required packages. ...
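The polling loop in pdf_to_text() keeps checking the operation until its status is no longer notstarted or running, with no upper bound on how long it waits. A minimal sketch of the same loop with a timeout is shown below. The function wait_for_read_result and its parameters are hypothetical names introduced for illustration; get_result stands in for client.get_read_result, and the status check mirrors the one used in the lesson's code.

```python
import time


def wait_for_read_result(get_result, operation_id, poll_interval=1.0,
                         timeout=60.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_result(operation_id) until the operation finishes or the timeout expires.

    get_result is any callable returning an object with a string-like
    `status` attribute, for example client.get_read_result.
    """
    deadline = clock() + timeout
    while True:
        result = get_result(operation_id)
        # Same terminal-state check as the lesson: stop once the status
        # is anything other than "notstarted" or "running".
        if result.status.lower() not in ("notstarted", "running"):
            return result
        if clock() >= deadline:
            raise TimeoutError(
                f"Read operation {operation_id} did not finish within {timeout} seconds"
            )
        sleep(poll_interval)
```

Inside pdf_to_text(), the while loop could then be replaced by result = wait_for_read_result(client.get_read_result, operation_id). The sleep and clock parameters exist only so the loop can be exercised without real waiting.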