Embedding Words
Let's see how we can transform words into word vectors. Embedding words into vectors happens via an embedding table, which is essentially a lookup table: each row holds the word vector of one word, and the rows are indexed by word-IDs. The flow for obtaining a word's word vector is therefore as follows:
1. word -> word-ID: Previously, we obtained a word-ID for each word with Keras' `Tokenizer`. `Tokenizer` holds the entire vocabulary and maps each vocabulary word to an ID, which is an integer.
2. word-ID -> word vector: Since a word-ID is an integer, it can be used as an index into the embedding table's rows. Each word-ID corresponds to one row, so to get a word's word vector, we first obtain its word-ID and then look up the corresponding row in the embedding table, as shown in the sketch after this list.
The following diagram shows how embedding words into word vectors works: