Demo Code Deep Dive

This section walks through the demo code to show how it calls the API and reads the results.

Imports

# [START speech_quickstart_beta]
from google.cloud import speech_v1p1beta1
from google.cloud.speech_v1p1beta1 import enums

The first line gives access to the Speech API client. The second line brings in enums that are used later in the code, such as enums.RecognitionConfig.AudioEncoding.MP3.

Client

The function sample_recognize starts by creating a client, the object that makes every call to the API.

Any function that uses the API follows this same pattern.

def sample_recognize(storage_uri):
    ...
    client = speech_v1p1beta1.SpeechClient()
    ...

Variables

The code defaults to transcribing one particular audio file. The attributes of this file, such as its encoding and language, are known in advance.
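As a rough sketch of how such attributes might be captured as variables, the block below gathers them into the plain dictionaries that the 1.x-style client imported above accepts for its config and audio arguments. The helper names build_request and transcribe, the default sample rate of 44100, the default language code en-US, and the bucket path in the usage note are assumptions made for illustration, not values taken from the demo. The encoding is left as a parameter so that enums.RecognitionConfig.AudioEncoding.MP3 can be passed in when the library is installed.

```python
def build_request(storage_uri, encoding, sample_rate_hertz=44100, language_code="en-US"):
    """Return (config, audio) dictionaries describing one audio file.

    The default sample rate and language code are illustrative assumptions;
    use the real attributes of the file being transcribed.
    """
    config = {
        "encoding": encoding,
        "sample_rate_hertz": sample_rate_hertz,
        "language_code": language_code,
    }
    # The audio is referenced by its Cloud Storage URI rather than uploaded inline.
    audio = {"uri": storage_uri}
    return config, audio


def transcribe(client, storage_uri, encoding):
    """Send one synchronous request and return the top transcript of each result."""
    config, audio = build_request(storage_uri, encoding)
    # In the 1.x client, recognize() takes config and audio as positional arguments.
    response = client.recognize(config, audio)
    return [result.alternatives[0].transcript for result in response.results]
```

With the library installed and credentials configured, a call might look like transcribe(speech_v1p1beta1.SpeechClient(), "gs://your-bucket/your-file.mp3", enums.RecognitionConfig.AudioEncoding.MP3), where the bucket path is hypothetical. Passing the client in as an argument also lets the function be exercised with a stand-in object that has a recognize method.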

  • Synchronous Speech Recognition Requests require:

    Field              Description
    encoding           Specifies the encoding scheme of the supplied audio (of type AudioEncoding). The encoding field is optional for FLAC and WAV files.
    sample_rate_hertz  Specifies the sample rate (in Hertz) of the supplied audio. The sample_rate_hertz field is optional for FLAC and WAV files.
    language_code      Contains the language + region/locale to use for speech recognition of the supplied audio. The language code
...