How to convert text to speech in JavaScript

The Web Speech API supports two kinds of conversion:

  • Text-to-Speech using SpeechSynthesis
  • Speech-to-Text using SpeechRecognition

We will be using the following two interfaces to convert text to speech:

  • SpeechSynthesisUtterance holds the text to be spoken and how it should be spoken (rate, pitch, volume, language, and voice).
  • SpeechSynthesis can be used to retrieve information about voices available on the device, start and pause speech, etc.

Properties in SpeechSynthesisUtterance

  • lang: the language of the utterance, given as a BCP 47 tag such as en-US.
  • pitch: the pitch (the relative highness or lowness of a tone) of the spoken words. It ranges from 0 to 2; the default is 1.
  • rate: the speed at which the text is spoken. It ranges from 0.1 to 10; the default is 1.
  • text: the text to be spoken.
  • voice: the voice used to speak the text.
  • volume: the output volume. It ranges from 0 to 1; the default is 1.
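
Because each numeric property has a fixed range, it can help to clamp user-supplied values before assigning them. The following is a minimal sketch of that idea; the helper names clamp and normalizeSpeechOptions are illustrative assumptions and not part of the Web Speech API.

```javascript
// Hypothetical helper: keep a number inside [min, max].
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: apply defaults and clamp to the ranges listed above.
function normalizeSpeechOptions(options = {}) {
  const { volume = 1, rate = 1, pitch = 1 } = options;
  return {
    volume: clamp(volume, 0, 1),
    rate: clamp(rate, 0.1, 10),
    pitch: clamp(pitch, 0, 2),
  };
}

// Browser-only usage; skipped where the API is unavailable (e.g. Node.js).
if (typeof SpeechSynthesisUtterance !== 'undefined') {
  const utterance = new SpeechSynthesisUtterance('Hello');
  // rate 20 is clamped to 10 and pitch -1 is clamped to 0 before assignment.
  Object.assign(utterance, normalizeSpeechOptions({ rate: 20, pitch: -1 }));
}
```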

Methods in SpeechSynthesis

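  • speak: adds an utterance to the queue; it is spoken once any earlier utterances have finished.
  • getVoices: returns the voices available on the system.
  • pause: pauses the speech.
  • resume: resumes paused speech.
  • cancel: removes all utterances from the queue and stops the speech.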
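
As a rough illustration of how pause, resume, and cancel fit together, the sketch below toggles playback based on the speaking and paused flags of the SpeechSynthesis object. The helper names togglePause and stopSpeaking are assumptions made for this example; in a page, synth would be window.speechSynthesis.

```javascript
// Hypothetical helper: resume if paused, pause if speaking, otherwise do nothing.
// `synth` is expected to look like window.speechSynthesis.
function togglePause(synth) {
  // Check `paused` first: `speaking` stays true while speech is paused.
  if (synth.paused) {
    synth.resume();
    return 'resumed';
  }
  if (synth.speaking) {
    synth.pause();
    return 'paused';
  }
  return 'idle'; // nothing is being spoken
}

// Hypothetical helper: drop the current utterance and anything queued behind it.
function stopSpeaking(synth) {
  synth.cancel();
}
```

In a browser, a button handler could call togglePause(window.speechSynthesis) and use the returned string to update its label.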

The following steps are used to convert text to speech.

1. Check if the browser supports Text-to-Speech

if ('speechSynthesis' in window) {
  // Speech Synthesis is supported 🎉
} else {
  // Speech Synthesis is not supported 😞
}

2. Simple Text to Speech

let utterance = new SpeechSynthesisUtterance("Educative.io");
speechSynthesis.speak(utterance);
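
Since speak only queues the utterance and returns immediately, it is often useful to know when speaking has finished. One possible approach, sketched below, wraps the utterance's onend and onerror handlers in a Promise. The function name speakAsync and its injectable synth and Utterance parameters are assumptions for illustration.

```javascript
// Hypothetical helper: resolves when the utterance ends, rejects on error.
// `synth` and `Utterance` default to the browser globals when they exist;
// passing them in allows the function to be used with stand-in objects.
function speakAsync(
  text,
  synth = globalThis.speechSynthesis,
  Utterance = globalThis.SpeechSynthesisUtterance
) {
  return new Promise((resolve, reject) => {
    if (!synth || !Utterance) {
      reject(new Error('Speech synthesis is not supported'));
      return;
    }
    const utterance = new Utterance(text);
    utterance.onend = () => resolve();
    // The error event carries a short error code string in `event.error`.
    utterance.onerror = (event) => reject(new Error(event.error));
    synth.speak(utterance);
  });
}
```

In a page, this could be used as speakAsync("Educative.io").then(() => console.log("done")).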

3. List voices available in the system

From the SpeechSynthesis interface of the Web Speech API, we can get the list of voices available on the machine.

speechSynthesis.getVoices();

Initially, getVoices() may return an empty array because some browsers load the voice list asynchronously. The following is a small workaround that tries to trigger loading; it is not guaranteed to work in every browser:

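function getVoices() {
  let voices = speechSynthesis.getVoices();
  if (!voices.length) {
    // Speaking an empty utterance nudges the browser to initialize
    // speech synthesis; the list may still be empty right afterwards.
    let utterance = new SpeechSynthesisUtterance("");
    speechSynthesis.speak(utterance);
    voices = speechSynthesis.getVoices();
  }
  return voices;
}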
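
An alternative to the workaround above is to wait for the voiceschanged event, which the browser fires on the SpeechSynthesis object when the voice list changes. The sketch below is one way to do that; the function name loadVoices and the timeout fallback are assumptions for illustration, and it assumes the synth object supports addEventListener, which may not hold in older browsers.

```javascript
// Hypothetical helper: resolve with the voice list once it is non-empty,
// or with whatever is available after `timeoutMs` milliseconds.
function loadVoices(synth, timeoutMs = 2000) {
  return new Promise((resolve) => {
    const voices = synth.getVoices();
    if (voices.length) {
      resolve(voices); // already loaded, nothing to wait for
      return;
    }
    const onChange = () => {
      clearTimeout(timer);
      resolve(synth.getVoices());
    };
    // Fallback in case the event never fires.
    const timer = setTimeout(() => {
      synth.removeEventListener('voiceschanged', onChange);
      resolve(synth.getVoices());
    }, timeoutMs);
    synth.addEventListener('voiceschanged', onChange, { once: true });
  });
}
```

In a page, this could be used as loadVoices(window.speechSynthesis).then((voices) => console.log(voices.length)).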

With the voices returned by this function, we can build a select box and try out the different voices. Each option can use the voice's index as its value, so the selected voice can be looked up in the voice list by that index.
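
The select box described above could be built roughly as follows. This is a sketch under assumptions: the helper names voicesToOptions and fillVoiceSelect are hypothetical, and the label format (name plus language) is only one reasonable choice.

```javascript
// Hypothetical helper: turn a voice list into { value, label } pairs,
// using the array index as the option value.
function voicesToOptions(voices) {
  return voices.map((voice, index) => ({
    value: String(index),
    label: `${voice.name} (${voice.lang})`,
  }));
}

// Hypothetical helper: fill a <select> element with one <option> per voice.
function fillVoiceSelect(select, voices) {
  select.innerHTML = ''; // remove any options from a previous call
  for (const { value, label } of voicesToOptions(voices)) {
    const option = select.ownerDocument.createElement('option');
    option.value = value;
    option.textContent = label;
    select.appendChild(option);
  }
}
```

The selected voice can then be read back with voices[Number(select.value)].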

4. Passing options

let textToSpeak = "I Love Educative.io";

let speakData = new SpeechSynthesisUtterance();
speakData.volume = 1; // From 0 to 1
speakData.rate = 1; // From 0.1 to 10
speakData.pitch = 2; // From 0 to 2
speakData.text = textToSpeak;
speakData.lang = 'en';
speakData.voice = getVoices()[0];

speechSynthesis.speak(speakData);
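
Taking the first voice in the list, as above, may pick a voice for a different language than the text. A possible refinement, sketched below, chooses a voice by language tag. The helper name pickVoice and its matching rules are assumptions for illustration; it also tolerates tags written with an underscore, such as en_US.

```javascript
// Hypothetical helper: prefer an exact language match (e.g. "en-US"),
// then any voice sharing the primary language (e.g. "en"), else null.
function pickVoice(voices, lang) {
  const normalize = (tag) => tag.toLowerCase().replace('_', '-');
  const wanted = normalize(lang);
  const exact = voices.find((v) => normalize(v.lang) === wanted);
  if (exact) return exact;
  const primary = wanted.split('-')[0];
  return voices.find((v) => normalize(v.lang).split('-')[0] === primary) || null;
}
```

With this helper, the voice could be set as speakData.voice = pickVoice(getVoices(), 'en-US'); a null result leaves the choice to the browser's default voice.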

Code

We can convert text to speech using the SpeechSynthesisUtterance object. We can also configure some settings like:

  • The voice in which the text is spoken.
  • The speed at which the utterance (a spoken word, statement, or vocal sound) is spoken.
  • The pitch (the relative highness or lowness of a tone as perceived by the ear) at which the utterance is spoken.

Take a look at the complete code below:
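
A minimal sketch that puts the steps above together is shown here. It assumes a page containing a text input, a select box, two range inputs, and a button with the hypothetical ids text, voices, rate, pitch, and speak, and it assumes the script runs after those elements exist. The function names buildUtterance and setupPage are likewise illustrative.

```javascript
// Build an utterance from plain options. `Utterance` is passed in so the
// function can also be exercised with a stand-in class outside a browser.
function buildUtterance(Utterance, text, options = {}) {
  const { voice = null, rate = 1, pitch = 1, volume = 1 } = options;
  const utterance = new Utterance(text);
  utterance.voice = voice;   // null lets the browser pick its default voice
  utterance.rate = rate;     // 0.1 to 10
  utterance.pitch = pitch;   // 0 to 2
  utterance.volume = volume; // 0 to 1
  return utterance;
}

// Wire up the page. The element ids used here are hypothetical.
function setupPage(doc, synth, Utterance) {
  const textInput = doc.getElementById('text');
  const voiceSelect = doc.getElementById('voices');
  const rateInput = doc.getElementById('rate');
  const pitchInput = doc.getElementById('pitch');
  const speakButton = doc.getElementById('speak');

  // Rebuild the select box from the current voice list.
  const refreshVoices = () => {
    voiceSelect.innerHTML = '';
    synth.getVoices().forEach((voice, index) => {
      const option = doc.createElement('option');
      option.value = String(index);
      option.textContent = `${voice.name} (${voice.lang})`;
      voiceSelect.appendChild(option);
    });
  };
  refreshVoices();
  // Voices may arrive later, so refresh again when the list changes.
  synth.addEventListener('voiceschanged', refreshVoices);

  speakButton.addEventListener('click', () => {
    const voice = synth.getVoices()[Number(voiceSelect.value)] || null;
    synth.cancel(); // stop anything still being spoken
    synth.speak(buildUtterance(Utterance, textInput.value, {
      voice,
      rate: Number(rateInput.value),
      pitch: Number(pitchInput.value),
    }));
  });
}

// Only run where speech synthesis exists (step 1 above).
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  setupPage(document, window.speechSynthesis, window.SpeechSynthesisUtterance);
}
```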
