What are n-grams?

An n-gram is a contiguous sequence of n words. For instance, “edpresso shots” is a 2-gram.

The concept of n-grams is commonly found in Natural Language Processing (NLP), a branch of artificial intelligence that helps computers understand and manipulate human language, and in data science.

An n-gram of size 1 is also referred to as a “unigram,” size 2 is a “bigram,” and size 3 is a “trigram.”

  1. “Educative” is a unigram (1-gram)
  2. “Edpresso shots” is a bigram (2-gram)
  3. “Here is another example” is a 4-gram
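
The examples above can be reproduced with a short sliding-window function. The following is a minimal Python sketch: the function name `ngrams` and the whitespace-based tokenization are assumptions made for illustration, not part of any particular library.

```python
def ngrams(text, n):
    """Return all contiguous n-word sequences in text as tuples."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Assumption: words are separated by whitespace. Real tokenizers
    # also deal with punctuation and casing.
    words = text.split()
    # Slide a window of size n over the word list, one word at a time.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


print(ngrams("Here is another example", 2))
# [('Here', 'is'), ('is', 'another'), ('another', 'example')]
print(ngrams("Here is another example", 4))
# [('Here', 'is', 'another', 'example')]
```

A text with fewer than n words yields an empty list, since no window of that size fits.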

N-gram models

N-grams may be used to create probabilistic language models called n-gram models. An n-gram model predicts the occurrence of a word based on the n – 1 words that precede it. A bigram model, for example, predicts each word from the single word before it.
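
To make the idea concrete, here is a rough sketch of a bigram (n = 2) model in Python. It estimates the probability of a word given the previous word by simple counting. The three-sentence corpus, the function names, and the lowercase whitespace tokenization are all hypothetical choices for illustration. A practical model would be trained on far more text and would usually apply smoothing for unseen word pairs.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for each word, how often each following word occurs."""
    following = defaultdict(Counter)
    for sentence in sentences:
        # Assumption: lowercase, whitespace-separated tokens.
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            following[prev][nxt] += 1
    return following


def next_word_probability(model, prev, word):
    """Estimate P(word | prev) as count(prev, word) / count(prev, any)."""
    counts = model.get(prev, Counter())
    total = sum(counts.values())
    return counts[word] / total if total else 0.0


def predict_next(model, prev):
    """Return the most frequent word seen after prev, or None if unseen."""
    counts = model.get(prev)
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical toy corpus, for illustration only.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigram_model(corpus)

print(next_word_probability(model, "cat", "sat"))  # 0.5
print(predict_next(model, "the"))                  # cat
```

In this corpus "cat" is followed once by "sat" and once by "ate", so the estimate for "sat" after "cat" is 1/2. Larger values of n condition on more context but need much more data to produce reliable counts.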

Applications

N-grams have a wide variety of uses. Some applications of n-grams in NLP include sentence auto-completion, automatic spell-checking, and semantic analysis. They are also used in other computational linguistics applications and, beyond text, in DNA sequencing.


Copyright ©2024 Educative, Inc. All rights reserved