

Identifying the Right Boost Value

Learn how to tune boost values to produce the lowest word error rate (WER) possible.


Recall that speech adaptation lets us supply phrases and class tokens that help improve speech recognition, and that boost values can be supplied to tune their effect.
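As a reminder of what is being tuned, here is a minimal sketch of a speech adaptation entry. It assumes a request shape like the one used by Google Cloud Speech-to-Text, where a speech context holds a list of `phrases` and a single `boost`. The helper name `build_speech_context`, the example phrases, the `$MONTH` class token, and the boost value are all illustrative assumptions, not recommended settings.

```python
def build_speech_context(phrases, boost):
    """Return one speech adaptation entry as a plain dict.

    The keys "phrases" and "boost" are assumed to mirror the speech
    context fields of the recognition request; adjust them if your
    client library uses different names.
    """
    if not phrases:
        raise ValueError("at least one phrase or class token is required")
    return {"phrases": list(phrases), "boost": float(boost)}


# Hypothetical example: two domain phrases plus one class token.
context = build_speech_context(
    ["word error rate", "speech adaptation", "$MONTH"],
    boost=10.0,  # placeholder value; the right value is what this lesson tunes
)
print(context)
```

Keeping the boost as an explicit parameter makes it easy to rerun the same phrases with different values later.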

Determine a baseline WER

Recall from a previous lesson that measuring the accuracy of a single transcription, or even of two different transcriptions, is prone to bias.

The first step in finding the best boost values is to measure the WER over at least one hour of audio, with no boost applied. This is the baseline that later results are compared against.
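A baseline over many files is usually computed as one corpus-level WER: the total number of word errors divided by the total number of reference words. The sketch below shows one way to do that in plain Python with a word-level edit distance. The function names and the simple lowercase-and-split normalization are assumptions made for illustration; a real evaluation may need more careful text normalization.

```python
def edit_distance(ref_words, hyp_words):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def corpus_wer(pairs):
    """Aggregate WER over (reference, hypothesis) transcript pairs."""
    total_errors = 0
    total_ref_words = 0
    for reference, hypothesis in pairs:
        ref_words = reference.lower().split()
        hyp_words = hypothesis.lower().split()
        total_errors += edit_distance(ref_words, hyp_words)
        total_ref_words += len(ref_words)
    if total_ref_words == 0:
        raise ValueError("reference transcripts contain no words")
    return total_errors / total_ref_words


# Toy usage: one deleted word across ten reference words.
pairs = [
    ("the cat sat on the mat", "the cat sat on mat"),
    ("call me at noon", "call me at noon"),
]
print(corpus_wer(pairs))
```

Summing errors and words before dividing weights long recordings more heavily than short ones, which is generally what you want for a baseline built from an hour of mixed audio.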

Which model?

Finding the truly optimal output would mean running the baseline with every available model. A reasonable shortcut is to choose the model that best fits the audio.

The same applies to enhanced transcription. If your business is willing to pay for it, run the baseline WER tests with enhanced mode enabled.
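If you do decide to compare several models, with and without enhanced mode, the candidate configurations can be enumerated up front and the one with the lowest baseline WER selected afterwards. The sketch below assumes configuration fields named `model` and `use_enhanced`, as in Google Cloud Speech-to-Text, and uses a few model names as examples. Both helper functions are hypothetical, and not every model necessarily offers an enhanced variant, so check what your service supports.

```python
from itertools import product

# Example values only; replace with the models your service actually offers.
MODELS = ["default", "phone_call", "video"]
ENHANCED_OPTIONS = [False, True]


def baseline_configs(models=MODELS, enhanced_options=ENHANCED_OPTIONS):
    """List every (model, enhanced) combination as a plain config dict."""
    return [
        {"model": model, "use_enhanced": enhanced}
        for model, enhanced in product(models, enhanced_options)
    ]


def best_config(wer_by_config):
    """Return the (model, use_enhanced) key with the lowest measured WER.

    `wer_by_config` maps (model, use_enhanced) tuples to WER values that
    you measured yourself over the same baseline audio.
    """
    if not wer_by_config:
        raise ValueError("no WER measurements supplied")
    return min(wer_by_config, key=wer_by_config.get)


for config in baseline_configs():
    print(config)
```

Every configuration should be scored on the same audio and reference transcripts; otherwise the WER values are not comparable.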

