What is Speech-to-Text (STT)?

Because Speech-to-Text is powered by Google’s own advanced deep learning models, you can expect state-of-the-art accuracy. You can also customize speech recognition to transcribe domain-specific terms and rare words by providing hints that boost the recognition accuracy of specific words or phrases.
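
As a rough illustration of how such hints might be supplied, the sketch below builds a JSON request body for the v1 `speech:recognize` REST method, assuming phrase hints go in the `speechContexts` field of the recognition config. The helper name `build_recognize_request`, the bucket path, the example phrases, and the boost value are all hypothetical; check the field names against the current API reference before relying on them.

```python
import json


def build_recognize_request(gcs_uri, phrases, boost=10.0, language_code="en-US"):
    """Build a request body with phrase hints (hypothetical helper).

    Field names follow the v1 REST naming as understood here; treat them
    as an assumption and verify them against the official reference.
    """
    return {
        "config": {
            "languageCode": language_code,
            # Phrase hints: terms the recognizer should favor.
            "speechContexts": [{"phrases": list(phrases), "boost": boost}],
        },
        # Audio stored in Cloud Storage; the bucket name is made up.
        "audio": {"uri": gcs_uri},
    }


body = build_recognize_request(
    "gs://example-bucket/meeting.flac",
    ["Kubernetes", "gRPC"],
)
print(json.dumps(body, indent=2))
```

The body would still need to be sent with valid credentials; only the shape of the request is shown here.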

Speech-to-Text can use one of several machine learning models to transcribe your audio file. The API currently offers voice recognition that supports more than 125 languages and variants.
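
A minimal sketch of how a model and language might be chosen, assuming the recognition config accepts a `model` name and a BCP-47 `languageCode`. The helper `build_config` is hypothetical, and the model names listed are only a small illustrative subset; not every model is necessarily available for every language.

```python
# Illustrative subset of model names; the real list is longer and may change.
KNOWN_MODELS = {"default", "command_and_search", "phone_call", "video"}


def build_config(language_code, model="default"):
    """Return a recognition config dict for the given language and model."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"unrecognized model name: {model!r}")
    return {"languageCode": language_code, "model": model}


# British English audio from a phone line.
config = build_config("en-GB", model="phone_call")
print(config)
```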

Other than the above-mentioned features, the STT API allows you to:

Transcribe your content in real-time or from stored files
Deliver a better user experience in products through voice commands
Gain insights from customer interactions to improve your service

Speech-to-Text is priced based on the amount of audio that the service successfully processes each month, measured in increments rounded up to 15 seconds. However, you can use this service for free if your audio duration does not exceed 60 minutes per month.
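
The rounding rule can be sketched as a small calculation. This assumes the rounding to 15-second increments is applied to each request separately and that the 60 free minutes are subtracted from the monthly total; both are assumptions about how the rule is applied, and no per-minute rate is included because the text above does not state one. The function names are hypothetical.

```python
import math

INCREMENT_SECONDS = 15             # billing granularity from the text above
FREE_SECONDS_PER_MONTH = 60 * 60   # 60 free minutes per month


def billed_seconds(duration_seconds):
    """Round one request's audio duration up to the next 15-second increment."""
    if duration_seconds <= 0:
        return 0
    return math.ceil(duration_seconds / INCREMENT_SECONDS) * INCREMENT_SECONDS


def chargeable_seconds(request_durations):
    """Total billed seconds for the month beyond the free allowance."""
    total = sum(billed_seconds(d) for d in request_durations)
    return max(0, total - FREE_SECONDS_PER_MONTH)


# A 7 s clip bills as 15 s, a 16 s clip as 30 s, and a 1-hour file as 3600 s.
durations = [7, 16, 3600]
print(chargeable_seconds(durations))  # 45
```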
