Decoding Strategies

Learn about the types of text decoding strategies and how to implement them to generate text.

Now that we have a trained model, the next step is to feed it some context words and generate the next word as output. This output generation step is formally known as the decoding step. It is termed “decoding” because the model outputs a vector that has to be processed to obtain the actual word. There are several decoding techniques; let’s briefly discuss the popular ones: greedy decoding, beam search, and sampling.
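As a minimal sketch of that processing step, the snippet below uses a made-up five-word vocabulary and hypothetical logit values to show how a raw output vector can be turned into a probability distribution and then into a word:

```python
import numpy as np

# A toy vocabulary and a hypothetical logit vector produced by a trained
# model for some input context (values are made up for illustration).
vocab = ["the", "cat", "sat", "on", "mat"]
logits = np.array([2.0, 0.5, 1.0, 0.1, 1.5])

# Convert raw logits into a probability distribution with softmax.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# A decoding strategy then maps this distribution to an actual word;
# the simplest choice is to take the most probable one.
next_word = vocab[int(np.argmax(probs))]
print(next_word)  # "the"
```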

Greedy decoding

This is the simplest and fastest decoding strategy. As the name suggests, greedy decoding picks the highest-probability term at every prediction step.

While this is fast and efficient, being greedy creates a few issues when generating text. By focusing only on the highest-probability output at each step, the model may generate inconsistent or incoherent text. In the case of character-level language models, this may even result in outputs that are not dictionary words. Greedy decoding also limits the variance of the outputs, which can lead to repetitive content.
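As a rough sketch, the function below implements greedy decoding; it assumes a hypothetical `model(tokens)` callable that returns a dictionary mapping each candidate next token to its probability given the sequence so far:

```python
def greedy_decode(model, context, max_len=20, eos_token="<eos>"):
    """Repeatedly append the single most probable next token."""
    tokens = list(context)
    for _ in range(max_len):
        probs = model(tokens)                   # distribution over the vocabulary
        next_token = max(probs, key=probs.get)  # pick the argmax at every step
        if next_token == eos_token:
            break
        tokens.append(next_token)
    return tokens
```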

Beam search

Beam search is a widely used alternative to greedy decoding. Instead of picking only the highest-probability term, this strategy keeps track of the n most probable candidate outputs at every timestep. Visualized, beam search shows multiple beams forming from step 0, creating a tree-like structure of candidate sequences.
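As a rough sketch of the idea, the function below keeps the `beam_width` highest-scoring partial sequences at every step. It assumes the same hypothetical `model(tokens)` interface as above, returning a dictionary of next-token probabilities:

```python
import math

def beam_search_decode(model, context, beam_width=3, max_len=20, eos_token="<eos>"):
    """Keep the `beam_width` highest-scoring partial sequences at every step."""
    # Each beam is a (tokens, cumulative log-probability) pair.
    beams = [(list(context), 0.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == eos_token:
                candidates.append((tokens, score))   # finished beam carries over
                continue
            probs = model(tokens)
            for token, p in probs.items():
                candidates.append((tokens + [token], score + math.log(p)))
        # Prune to the best `beam_width` sequences by total log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    # Return the highest-scoring sequence overall.
    return max(beams, key=lambda b: b[1])[0]
```

Note that this sketch scores beams by raw cumulative log-probability, which tends to favor shorter sequences; practical implementations often add length normalization.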
