What are Convolutional neural networks for NLP?

Overview

Convolutional neural networks (CNN) are incredibly powerful models making immense waves in the realm of machine learning by taking on complex tasks such as image/video recognition and analysis.

Inspired by the visual cortexThe visual cortex is a region of the brain that receives, integrates, and processes visual information received from the retinas., the architecture of CNNs is such that it can assign importance to objects in the input by figuring their features out and then differentiating those objects from each other. To make this possible, the CNN contains various layers—the Convolution, ReLU, and Pooling layers.

Figures 1 and 2 illustrate a CNN's pipeline, from the input to the output. While Figure 1 details the half responsible for Feature Extraction, Figure 2 details the Classification half of the CNN for an application to be fed images and then determine the presence of koala bears in them.

Figure 1: Feature extraction
Figure 2: Classification

Natural language processing (NLP)

Now that we're well-versed with CNNs, let's briefly look at the domain we wish to apply them to.

Natural language processing (NLP), at its absolute minimum, is what happens when a computer is tasked with listening to words and sentences in a natural language. This is referred to as NLUNatural Language Understanding. It then forms some sort of comprehension from them. This is referred to as NLGNatural Language Generation.

Figure 3: NLP pipeline

Using both NLU and NLG, there are a host of applications for NLP. Some of these include the following:

  • Machine Translation
  • Virtual Assistants
  • Sentiment Analysis

Note: For more information on NLP, refer to this Answer

CNNs for NLP

The CNN illustrated in Figures 1 and 2 for our koala detection application works incredibly well, but currently, inputs are provided in the form of images.

How do we go from CNNs that work for images to CNNs that work for NLP?

For starters, the inputs to the CNN for NLP will be in the form of sentences, not images. However, we can notice from Figure 1 that the input type does not affect the illustrated methodology as long as we can convert it into an array of numbers for feature extraction and classification.

Fortunately, sentences can be represented as an array of vectors, wherein each word of the sentence—known as a token—is a vector in a vector space comprising the entire training vocabulary. This is known as Word Embeddings.

Total vocabulary from all the sentences
1 of 2

Given this array of vectors, we can duplicate the pipeline in Figures 1 and 2, while making changes to the filters and pooling operations appropriate to the NLP application we'd like to perform.

A diagram illustrating the pipeline for sentence classificationAssigning a set of labels (or classes) to a sentence. using CNNs can be seen below:

Sentence classification using a CNN

Copyright ©2024 Educative, Inc. All rights reserved