...
Named Entity Recognition with RNNs: Defining the Model
Learn how to define hyperparameters in a text corpus.
Defining hyperparameters
Now let’s define several hyperparameters needed for our RNN, as shown here:
max_seq_length: Denotes the maximum length of a sequence. We infer this from our training data during data exploration. It's important to keep sequences to a reasonable length because otherwise, memory can explode due to the unrolling of the RNN.
embedding_size: The dimensionality of the token embeddings. Since we have a small corpus, a value below 100 will suffice.
rnn_hidden_size: The dimensionality of the hidden layer in the RNN. Increasing the dimensionality of the hidden layer usually leads to better performance. However, note that increasing the size of the hidden layer causes all three sets of internal weights (that is, U, W, and V) to grow as well, resulting in a higher computational footprint.
n_classes: The number of unique output classes.
batch_size: The batch size for the training, validation, and test data. A higher batch size often leads to better results because we see more data during each optimization step, but, as with unrolling, this also raises the memory requirement.
epochs: The number of epochs to train the model for.
These are defined below:
# The maximum length of sequences
max_seq_length = 40
# Size of token embeddings
embedding_size = 64
# Number of hidden units in the RNN layer
rnn_hidden_size = 64
# Number of output nodes in the last layer
n_classes = 9
# Number of samples in a batch
batch_size = 64
# Number of epochs to train
epochs = 3
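To make the point about the U, W, and V weights concrete, here is a quick back-of-the-envelope calculation (plain Python, using the sizes defined above) of how the parameter count depends on the hidden size:

```python
embedding_size = 64   # input dimensionality fed to the RNN
rnn_hidden_size = 64  # hidden-state dimensionality
n_classes = 9

# U: input-to-hidden weights, W: hidden-to-hidden weights,
# plus one bias per hidden unit
rnn_params = (embedding_size * rnn_hidden_size
              + rnn_hidden_size * rnn_hidden_size
              + rnn_hidden_size)

# V: hidden-to-output weights, plus output biases
output_params = rnn_hidden_size * n_classes + n_classes

print(rnn_params)     # 8256
print(output_params)  # 585
```

Note that the W term grows quadratically with the hidden size: doubling rnn_hidden_size to 128 roughly quadruples the hidden-to-hidden weight count (128 × 128 versus 64 × 64), which is why a larger hidden layer raises the computational footprint so quickly.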
Defining the model
We'll define the model here. Our model will have an embedding layer, followed by a simple RNN layer, and finally, a dense prediction layer. One thing to note in the work we have done so far is that we haven't yet defined a Tokenizer object. Although the Tokenizer has been an important part of our NLP pipeline to ...
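The architecture described above (embedding layer, then a simple RNN, then a dense prediction layer) can be sketched as follows. This is a minimal sketch assuming TensorFlow/Keras; vocab_size is a placeholder value, since the Tokenizer that would determine the real vocabulary size hasn't been defined yet:

```python
import tensorflow as tf

# Hyperparameters from the lesson
max_seq_length = 40
embedding_size = 64
rnn_hidden_size = 64
n_classes = 9

# Placeholder vocabulary size (assumption: the real value comes from the Tokenizer)
vocab_size = 10000

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(max_seq_length,)),
    # Map each token ID to a dense embedding vector
    tf.keras.layers.Embedding(vocab_size, embedding_size),
    # return_sequences=True emits one hidden state per token,
    # since NER needs a label for every token in the sequence
    tf.keras.layers.SimpleRNN(rnn_hidden_size, return_sequences=True),
    # Per-token probabilities over the output classes
    tf.keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The model maps a batch of token-ID sequences of shape (batch_size, max_seq_length) to per-token class probabilities of shape (batch_size, max_seq_length, n_classes).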