Inference Decoding

Understand the difference between training and inference decoding.

Chapter Goals:

  • Learn how to perform decoding for inference
  • Learn about variable scopes for declaring and retrieving variables

A. Decoding without ground truth

During training, we have access to ground truth tokens, which are used as input for the decoder. However, for inference (i.e. generating output in real time), we only have access to an input sequence. In this case, the input for the decoder at each time step is just the decoder's output token from the previous time step.
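The feedback loop described above can be sketched in plain Python. This is only an illustration under stated assumptions: the lookup table stands in for a trained decoder, the token strings and the EOS stopping rule are made up, and a real decoder would also carry a hidden state from step to step.

```python
# Toy stand-in for a trained decoder: maps the previous token to the next one.
# The table and token strings are assumptions made for this illustration.
NEXT_TOKEN = {"<SOS>": "how", "how": "are", "are": "you", "you": "<EOS>"}


def decoder_step(prev_token):
    """Return the decoder's output token for a given input token."""
    return NEXT_TOKEN.get(prev_token, "<EOS>")


def decode_inference(max_steps=10):
    """Generate tokens by feeding each output back in as the next input."""
    outputs = []
    token = "<SOS>"  # the first decoder input is always the SOS token
    for _ in range(max_steps):
        # The output at this time step becomes the input at the next one.
        token = decoder_step(token)
        if token == "<EOS>":  # assumed stopping rule for this sketch
            break
        outputs.append(token)
    return outputs


print(decode_inference())  # ['how', 'are', 'you']
```

Note that no ground truth tokens appear anywhere in the loop. The only inputs are the SOS token and the decoder's own previous outputs.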

Using a decoder during inference. The SOS token is always the initial decoder input token. For simplicity, the attention mechanism is not shown in this diagram.

In TensorFlow, we can switch from training to inference mode simply by replacing the TrainingSampler object with an inference sampler. The most commonly used inference sampler is GreedyEmbeddingSampler.
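To see why swapping one object is enough, here is a rough sketch in plain Python. It does not use the TensorFlow API: ToyTrainingSampler, ToyGreedySampler, run_decoder, and the lookup-table decoder are all hypothetical stand-ins, chosen only to show that the decoding loop stays the same while the sampler decides where each input comes from.

```python
class ToyTrainingSampler:
    """Hypothetical stand-in: feeds ground truth tokens (training mode)."""

    def __init__(self, ground_truth):
        self.ground_truth = ground_truth

    def next_input(self, step, prev_output):
        # Ignore the decoder's previous output and use the ground truth.
        return self.ground_truth[step]


class ToyGreedySampler:
    """Hypothetical stand-in: feeds back the previous output (inference mode)."""

    def __init__(self, start_token):
        self.start_token = start_token

    def next_input(self, step, prev_output):
        # Start with SOS, then reuse whatever the decoder produced last.
        return self.start_token if step == 0 else prev_output


def run_decoder(sampler, step_fn, num_steps):
    """Run the same decoding loop with whichever sampler is passed in."""
    outputs, prev_output = [], None
    for step in range(num_steps):
        prev_output = step_fn(sampler.next_input(step, prev_output))
        outputs.append(prev_output)
    return outputs


# Made-up, imperfect decoder: it maps "a" to "x" instead of "b".
step_fn = {"<SOS>": "a", "a": "x", "b": "c", "x": "y"}.get

train_out = run_decoder(ToyTrainingSampler(["<SOS>", "a", "b"]), step_fn, 3)
infer_out = run_decoder(ToyGreedySampler("<SOS>"), step_fn, 3)
print(train_out)  # ['a', 'x', 'c']
print(infer_out)  # ['a', 'x', 'y']
```

In the training run, the ground truth "b" is fed in at the third step even though the decoder produced "x", so the mistake does not affect later inputs. In the inference run, the decoder has to continue from its own output "x".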

B. Greedy decoding

The name of ...