Demo Code Deep Dive
Here, we pull apart the demo code to understand how to access the API and the results.
Imports
# [START speech_quickstart_beta]
from google.cloud import speech_v1p1beta1
from google.cloud.speech_v1p1beta1 import enums
The first line allows access to the Speech API client. The second line brings in enums that are useful later in the code, such as enums.RecognitionConfig.AudioEncoding.MP3.
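To make the dotted path enums.RecognitionConfig.AudioEncoding.MP3 less opaque, here is a minimal stand-in built only from the standard library. It is not the library's own source: the class names mirror the path quoted above, while the members listed and their numeric values are illustrative assumptions that should be checked against the installed package.

```python
import enum


class RecognitionConfig:
    """Stand-in namespace mirroring the name used in the demo's enum path."""

    class AudioEncoding(enum.IntEnum):
        # Illustrative subset; the values are assumptions, not taken from the demo.
        ENCODING_UNSPECIFIED = 0
        LINEAR16 = 1
        FLAC = 2
        MP3 = 8


# The nested attribute lookup reads the same way as in the demo code.
encoding = RecognitionConfig.AudioEncoding.MP3

# IntEnum members carry a readable name and also behave like plain integers.
print(encoding.name)
print(isinstance(encoding, int))
```

Grouping the constants under the message they belong to keeps names such as MP3 from colliding with unrelated constants elsewhere in the module.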
Client
The function sample_recognize starts by creating a client, which makes the calls to the API.
This is a common pattern for any function that uses the API.
def sample_recognize(storage_uri):
    ...
    client = speech_v1p1beta1.SpeechClient()
    ...
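Since every API-facing function starts by building a client, one possible variation is to let the caller supply the client and construct a real one only when none is given. This is a sketch, not the demo's code: the optional client parameter, the resolve_client helper, the FakeClient class, and the example URI are hypothetical names introduced here so the pattern can be exercised without credentials.

```python
def resolve_client(client=None):
    """Return the supplied client, or create a real SpeechClient if none is given."""
    if client is None:
        # Imported lazily so the sketch can be loaded without the library installed.
        from google.cloud import speech_v1p1beta1

        client = speech_v1p1beta1.SpeechClient()
    return client


def sample_recognize(storage_uri, client=None):
    """Hypothetical variant of the demo function that accepts an injected client."""
    client = resolve_client(client)
    # The demo would go on to build its request and call the API through `client`.
    # Returning the client here only makes the sketch easy to inspect.
    return client


class FakeClient:
    """Hypothetical placeholder standing in for SpeechClient in offline runs."""


# Passing a client explicitly skips the import and any credential lookup.
fake = FakeClient()
used = sample_recognize("gs://example-bucket/example.mp3", client=fake)
```

Calling sample_recognize without the client argument falls back to the behavior described above, where the function creates its own SpeechClient.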
Variables
The code defaults to transcribing a specific audio file. This file is known to have specific attributes, such as the encoding and language.
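One way to keep those known attributes together is a small helper that gathers them into a single mapping, using the field names encoding, sample_rate_hertz, and language_code. This is only a sketch: the build_config helper, its validation rules, and the example values (44100 Hz, "en-US", and the string "MP3" standing in for an enum member) are assumptions for illustration, not values taken from the demo.

```python
def build_config(encoding, sample_rate_hertz, language_code):
    """Collect the audio attributes into one dict keyed by the config field names."""
    if not isinstance(sample_rate_hertz, int) or sample_rate_hertz <= 0:
        raise ValueError("sample_rate_hertz must be a positive integer")
    if not language_code:
        raise ValueError("language_code must be a non-empty string")
    return {
        "encoding": encoding,
        "sample_rate_hertz": sample_rate_hertz,
        "language_code": language_code,
    }


# Placeholder values; in the demo the encoding would be an enums member such as
# enums.RecognitionConfig.AudioEncoding.MP3 rather than a plain string.
config = build_config("MP3", 44100, "en-US")
print(config)
```

Checking the values up front means a mistyped sample rate or an empty language code fails locally, before any request is sent.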
- Synchronous Speech Recognition Requests require:
Fields | Description |
---|---|
encoding | Specifies the encoding scheme of the supplied audio (of type AudioEncoding). The encoding field is optional for FLAC and WAV files. |
sample_rate_hertz | Specifies the sample rate (in Hertz) of the supplied audio. The sample_rate_hertz field is optional for FLAC and WAV files. |
language_code | Contains the language + region/locale to use for speech recognition of the supplied audio. The language code |