NER with Character and Token Embeddings
Learn to implement NER with character and token embeddings.
Nowadays, recurrent models used to solve the NER task are much more sophisticated than a single embedding layer followed by an RNN. They rely on more advanced recurrent models such as long short-term memory (LSTM) networks and gated recurrent units (GRUs). We'll set these advanced models aside for now. Here, we'll focus on a technique that provides the model with embeddings at multiple scales, enabling it to understand language better: instead of relying only on token embeddings, we also use character embeddings. A token embedding is then generated from the character embeddings by shifting a convolutional window over the characters in the token.
Using convolution to generate token embeddings
A combination of character embeddings and a convolutional kernel can be used to generate token embeddings. The method is as follows (a code sketch appears after the list):
Pad each token (e.g., word) to a predefined length.
Look up the character embeddings for the characters in the token from an embedding layer.
Shift a convolutional kernel over the sequence of character embeddings to generate a token embedding.
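The sketch below illustrates these three steps with TensorFlow/Keras as an assumed framework; the layer names and hyperparameters (`max_token_length`, `char_vocab_size`, `char_embed_dim`, `token_embed_dim`, kernel size) are illustrative choices, not the lesson's exact implementation.

```python
# Minimal sketch: character embeddings + 1D convolution -> one embedding per token.
# All hyperparameter values below are assumptions for illustration only.
import tensorflow as tf

max_token_length = 12   # pad every token to this many characters (assumption)
char_vocab_size = 100   # number of distinct characters (assumption)
char_embed_dim = 16     # size of each character embedding (assumption)
token_embed_dim = 32    # size of the resulting token embedding (assumption)

# Input: a batch of tokens, each represented as a padded sequence of character IDs (step 1).
char_ids = tf.keras.Input(shape=(max_token_length,), dtype="int32")

# Step 2: look up a character embedding for every character in the token.
char_embeddings = tf.keras.layers.Embedding(
    input_dim=char_vocab_size, output_dim=char_embed_dim
)(char_ids)  # shape: (batch, max_token_length, char_embed_dim)

# Step 3: shift a convolutional kernel over the sequence of character embeddings.
conv_out = tf.keras.layers.Conv1D(
    filters=token_embed_dim, kernel_size=3, padding="same", activation="relu"
)(char_embeddings)  # shape: (batch, max_token_length, token_embed_dim)

# Collapse the character dimension so each token ends up with a single embedding vector.
token_embedding = tf.keras.layers.GlobalMaxPooling1D()(conv_out)  # (batch, token_embed_dim)

char_to_token_model = tf.keras.Model(inputs=char_ids, outputs=token_embedding)
char_to_token_model.summary()
```

In a full NER model, an embedding produced this way is typically concatenated with the usual token (word) embedding before being fed to the recurrent layer, giving the model both subword-level and word-level views of each token.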