How to use open source Whisper ASR in Python

Whisper is an automatic speech recognition (ASR) system developed by OpenAI. It transcribes spoken language into written text, which is useful for anything from transcription services to voice-controlled assistants. This Answer shows how to use the open-source version of Whisper in Python.

Setting up the environment

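Before running any code, make sure the environment is ready. You need Python installed on your system, along with the Whisper Python package. The package is published on PyPI as openai-whisper and can be installed with pip (pip install -U openai-whisper). Whisper also relies on the ffmpeg command-line tool to decode audio, so ffmpeg must be installed and available on your PATH.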
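As a quick sanity check after installation, the following sketch reports whether the whisper module and the ffmpeg binary are visible to Python. The function name check_whisper_environment is a hypothetical helper introduced here for illustration. It only checks that the two dependencies can be found, not that they work.

```python
import importlib.util
import shutil


def check_whisper_environment():
    """Report whether the whisper module and the ffmpeg binary can be found."""
    return {
        # find_spec returns None when the module is not importable.
        "whisper": importlib.util.find_spec("whisper") is not None,
        # shutil.which returns None when the executable is not on PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


status = check_whisper_environment()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If either entry is reported as missing, install it before moving on to the transcription code.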

Code explanation

To transcribe a file, we first import the whisper package and load a model with whisper.load_model. We then call the model's transcribe method, passing in the path of the audio file we want to transcribe. Whisper decodes audio through ffmpeg, so common formats such as WAV, FLAC, and MP3 are supported.
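The steps just described can be sketched as follows. This is a minimal example rather than the original snippet: the helper name transcribe_file, the default base model, and the file name audio.mp3 are assumptions made for illustration. The whisper import is placed inside the function so that a missing file is reported before any model is loaded.

```python
from pathlib import Path


def transcribe_file(audio_path, model_name="base"):
    """Return the transcribed text of one audio file (hypothetical helper)."""
    path = Path(audio_path)
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")

    # Imported here so the file check above runs before any heavy imports.
    import whisper

    # The first call downloads the model weights and caches them locally.
    model = whisper.load_model(model_name)

    # transcribe() returns a dictionary; the full transcript is under "text".
    result = model.transcribe(str(path))
    return result["text"].strip()


# Example usage, assuming a file named audio.mp3 exists next to the script:
# print(transcribe_file("audio.mp3"))
```

Larger models (for example small or medium) are generally more accurate but slower and need more memory, so the model name is left as a parameter.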

The transcribe method returns a dictionary. The transcription of the audio file is stored under its "text" key, which is then printed.
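Besides the full text, the returned dictionary also holds a "segments" list, where each segment carries start and end times in seconds along with its text. The sketch below formats those segments as timestamped lines. The helper format_segments is hypothetical, and sample_result is a hand-written stand-in shaped like a real result so that the example runs without a model or an audio file.

```python
def format_segments(result):
    """Turn a Whisper-style result dictionary into timestamped lines."""
    lines = []
    for segment in result.get("segments", []):
        start = segment["start"]
        end = segment["end"]
        text = segment["text"].strip()
        lines.append(f"[{start:.2f}s -> {end:.2f}s] {text}")
    return lines


# Hand-written sample for illustration only, not real Whisper output.
sample_result = {
    "text": " Hello there. How are you?",
    "language": "en",
    "segments": [
        {"start": 0.0, "end": 1.5, "text": " Hello there."},
        {"start": 1.5, "end": 3.0, "text": " How are you?"},
    ],
}

for line in format_segments(sample_result):
    print(line)
```

The same function can be applied to the dictionary returned by model.transcribe, which is handy for subtitles or for locating a phrase in a long recording.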

Translating audio into English text

Whisper can also translate speech in any of its other supported languages into English text. This uses the same transcribe method with a different task.

Code explanation

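In this instance, we load the tiny model and call the transcribe method with the task parameter set to translate. This instructs Whisper to output an English translation of the audio instead of a transcript in the spoken language. Translation requires a multilingual model, so use tiny rather than the English-only tiny.en.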
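A minimal sketch of the translation call, under the same assumptions as before: the helper name translate_to_english and the file name speech.mp3 are hypothetical. The check for English-only model names is an added safeguard, not something the whisper package requires you to write.

```python
from pathlib import Path


def translate_to_english(audio_path, model_name="tiny"):
    """Return an English translation of the speech in one audio file."""
    # Model names ending in ".en" are English-only and cannot translate.
    if model_name.endswith(".en"):
        raise ValueError(f"{model_name} is English-only; use a multilingual model")

    path = Path(audio_path)
    if not path.is_file():
        raise FileNotFoundError(f"Audio file not found: {path}")

    # Imported here so the checks above run before any heavy imports.
    import whisper

    model = whisper.load_model(model_name)

    # task="translate" asks for English output instead of a same-language transcript.
    result = model.transcribe(str(path), task="translate")
    return result["text"].strip()


# Example usage, assuming speech.mp3 contains non-English speech:
# print(translate_to_english("speech.mp3"))
```

The tiny model is fast but the least accurate option, so a larger multilingual model may give noticeably better translations.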

Conclusion

Whisper ASR converts speech into text, and its open-source Python package makes it easy to integrate into your applications. Whether you're building a transcription service, a voice-activated assistant, or any other application that needs speech recognition, Whisper can be a valuable resource.
