API General Overview

The Google Cloud Speech-to-Text (STT) API offers several ways to interact with the service and four AI models to choose from. It can process many types of audio files and supports more than 100 languages and language variants. This lesson provides an overview of the Google STT API.

Versions

This course covers the v1 and v1p1beta1 versions of the Google Speech-to-Text API.

Choices for interacting with STT

There are three main ways to interact with the STT API:

  • Google Cloud command-line utility (gcloud)
  • REST requests sent from the command line using curl
  • Client libraries for Python, Java, Node.js, and more
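
To make the REST option more concrete, the following Python sketch assembles the URL and JSON body for a synchronous recognize request, using only the standard library. It sends nothing over the network. The helper name build_recognize_request and its default values (LINEAR16 encoding, 16000 Hz, en-US) are assumptions chosen for illustration and are not part of any Google library; the api_version parameter selects between the two versions covered in this course.

```python
import base64
import json

# Base URL of the Speech-to-Text REST API.
BASE_URL = "https://speech.googleapis.com"


def build_recognize_request(audio_bytes, language_code="en-US",
                            sample_rate_hertz=16000, api_version="v1"):
    """Return (url, json_body) for a synchronous recognize call.

    Hypothetical helper for illustration only: it assembles the
    request but does not send it or handle authentication.
    """
    url = f"{BASE_URL}/{api_version}/speech:recognize"
    body = {
        "config": {
            # Assumed audio format: 16-bit linear PCM.
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {
            # Inline audio is base64-encoded inside the JSON payload.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder bytes standing in for real PCM audio.
    url, payload = build_recognize_request(b"\x00\x01" * 4)
    print(url)
    print(payload)
```

The resulting body could then be posted to the URL with curl or any HTTP client, together with an access token in the Authorization header. The client libraries wrap the same request in language-specific classes.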

Types of audio processing

Three types of audio recognition are offered, depending on the use case: synchronous, asynchronous, and streaming.

  • Synchronous (Recognize) - Sends audio data to the Speech-to-Text API, performs recognition on that data, ...