Summary: Matching Tokenizers and Datasets
Get a quick recap of what we covered in this chapter.
In this chapter, we measured the impact of tokenization and the subsequent data-encoding process on transformer models. A transformer model can only attend to the tokens that reach it through the embedding and positional encoding sublayers of its stack. This holds whether the model is an encoder-decoder, encoder-only, or decoder-only model, and regardless of whether the dataset seems good enough to train on.
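As a quick illustration of this point, the sketch below (not from the chapter; it assumes the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint) shows that the embedding layer never sees raw words, only the subword tokens and IDs the tokenizer produces:

```python
# A minimal sketch: the model can only attend to the token IDs
# that the tokenizer emits, not to the original words.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

for word in ["payment", "amortization"]:
    tokens = tokenizer.tokenize(word)
    ids = tokenizer.convert_tokens_to_ids(tokens)
    # A word missing from the pretrained vocabulary is broken into
    # subword pieces; the embedding sublayer receives those pieces,
    # not the word as a single unit.
    print(f"{word!r} -> {tokens} -> {ids}")
```

If the tokenizer splits a domain-specific word into several pieces, the model's embeddings represent those pieces rather than the word itself, which is exactly the kind of mismatch between tokenizer and dataset that this chapter's measurements exposed.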