The aim of artificial intelligence (A.I.) is to enable machines to receive information and interpret it as intelligently as humans do. Natural language processing (NLP) is a popular subfield of A.I.
NLP is the study of how machines analyze natural language and extract meaningful information from text.
In other words, NLP is used to teach a machine how to read and understand human languages. Trained machines can extract the relationships between words, identify the entities in a sentence (i.e., named-entity recognition), and much more!
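For a concrete taste of this, the spaCy library ships with pretrained pipelines that can tag the entities in a sentence out of the box. The snippet below is a minimal sketch, not part of the original discussion; it assumes spaCy and its small English model (`en_core_web_sm`) are installed.

```python
import spacy

# Load a small pretrained English pipeline (assumes the model was installed,
# e.g. via `python -m spacy download en_core_web_sm`).
nlp = spacy.load("en_core_web_sm")

doc = nlp("Apple was founded by Steve Jobs in Cupertino in 1976.")

# Print every named entity the pipeline finds, along with its predicted label
# (e.g. ORG for organizations, PERSON for people, GPE for places, DATE for dates).
for ent in doc.ents:
    print(ent.text, ent.label_)
```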
For a machine to perform an NLP task, you must first train it on a dataset relevant to that particular task. This dataset will usually be a large corpus of text, like Wikipedia.
A corpus of Wikipedia and newspaper articles could be useful for named-entity recognition tasks because such articles contain information about many different entities.
Choosing an appropriate dataset is very important because low-quality training data will result in an inaccurate model.
Once the dataset is obtained and refined, it is time to use it to train the model.
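As a rough sketch of what "refining" a raw corpus can involve, the plain-Python snippet below lowercases the text, strips punctuation, and splits it into word tokens. Real preprocessing pipelines do considerably more (handling stop words, rare words, encoding issues, and so on), but the basic idea is the same.

```python
import re

def clean(text):
    """Lowercase, strip non-letter characters, and split into word tokens."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    return text.split()

raw = "NLP lets machines read text -- even MESSY text like this!!"
print(clean(raw))
# ['nlp', 'lets', 'machines', 'read', 'text', 'even', 'messy', 'text', 'like', 'this']
```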
There are various techniques for training a model on a given corpus, and data scientists are continuously researching new ways to build more accurate models.
One popular approach is word vectorization, in which each word is converted into a vector of numbers. The vectors of two similar words will be closer together than the vectors of two dissimilar words.
The context in which a word is used also carries information about its meaning, so many vectorization techniques learn a word's vector from the words that appear around it.
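One well-known technique of this kind (used here purely as an illustration, not necessarily the method the article has in mind) is Word2Vec, which learns a vector for each word from its surrounding context. The sketch below trains a tiny model with the gensim library; the toy corpus, vector size, and other parameters are arbitrary assumptions.

```python
from gensim.models import Word2Vec

# A tiny toy corpus: each sentence is a list of tokens.
sentences = [
    ["king", "rules", "the", "kingdom"],
    ["queen", "rules", "the", "kingdom"],
    ["dog", "chases", "the", "ball"],
    ["puppy", "chases", "the", "ball"],
]

# Train a small Word2Vec model; each word becomes a 50-dimensional vector
# learned from the contexts in which it appears.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, seed=42)

# Words used in similar contexts ("king"/"queen") should tend to have a higher
# cosine similarity than words used in different contexts ("king"/"ball").
# On a corpus this small the numbers are noisy; the effect is clear on real data.
print(model.wv.similarity("king", "queen"))
print(model.wv.similarity("king", "ball"))
```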
NLP has numerous applications in the modern world, including machine translation, sentiment analysis, chatbots, and text summarization.