Training the spaCy Text Classifier
Let's learn about the details of spaCy's text classifier component.
In this section, we will look at spaCy's text classifier component, TextCategorizer, in detail. Previously, we saw that the spaCy NLP pipeline consists of components, and we learned about its essential ones: the tokenizer, POS tagger, dependency parser, and named entity recognizer (NER).
TextCategorizer is an optional, trainable pipeline component. To train it, we need to provide examples together with their class labels. We first add TextCategorizer to the NLP pipeline and then run the training procedure. The illustration below shows where the TextCategorizer component sits in the NLP pipeline: it comes after the essential components. In the diagram, textcat refers to the TextCategorizer component.
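To make the two steps described above concrete (add the component, then train it on labeled examples), here is a minimal sketch. It assumes the spaCy v3 API and starts from a blank English pipeline so that no pretrained model has to be downloaded. The label names POSITIVE and NEGATIVE, the four toy sentences, and the epoch count are made up for illustration only; a real classifier needs far more data.

```python
import random

import spacy
from spacy.training import Example

# Assumption: spaCy v3. A blank pipeline contains only the tokenizer.
nlp = spacy.blank("en")

# Step 1: add TextCategorizer to the pipeline under its registered name "textcat".
textcat = nlp.add_pipe("textcat")
textcat.add_label("POSITIVE")  # hypothetical label
textcat.add_label("NEGATIVE")  # hypothetical label

# Toy training data: each text is paired with a score for every label.
train_data = [
    ("I loved this movie", {"cats": {"POSITIVE": 1.0, "NEGATIVE": 0.0}}),
    ("A wonderful, touching story", {"cats": {"POSITIVE": 1.0, "NEGATIVE": 0.0}}),
    ("I hated every minute", {"cats": {"POSITIVE": 0.0, "NEGATIVE": 1.0}}),
    ("A dull and boring film", {"cats": {"POSITIVE": 0.0, "NEGATIVE": 1.0}}),
]
examples = [
    Example.from_dict(nlp.make_doc(text), annotations)
    for text, annotations in train_data
]

# Step 2: initialize the model weights, then run a few training passes.
optimizer = nlp.initialize(lambda: examples)
for epoch in range(10):
    random.shuffle(examples)
    losses = {}
    nlp.update(examples, sgd=optimizer, losses=losses)

# The trained component writes its predictions to doc.cats.
doc = nlp("What a great film")
print(nlp.pipe_names)
print(doc.cats)
```

After training, doc.cats holds one score per label. With only four examples the scores themselves are not meaningful; the point is the order of operations: add textcat, register the labels, then update the pipeline on labeled examples.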