Searching for Matching Documents with tf-idf

Learn about tf-idf and how it is calculated and used in information retrieval, search engines, and other NLP applications.

Playing a game with documents

There is a common children’s game called “I Spy.” A group sits in a circle, and the leader says, “I spy, with my little eye, something blue.” Everyone else then tries to guess what the leader is looking at. Is it the blue telephone? Or perhaps the blue couch?

Natural language processing is often similar to this game. Given a word or a document as a clue, we have to determine the best-matching document from a list of documents. An internet search works this way, and spam filtering relies on a similar kind of matching.

There are many strategies for this type of search. One of the most common is term frequency-inverse document frequency, or tf-idf.

Note: TF–IDF, TF*IDF, ...
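
To make the matching idea concrete, here is a minimal sketch in Python that ranks a few documents against a query. It rests on several assumptions that do not come from the lesson itself: the term frequency is a term's count divided by the document's length, the inverse document frequency is the natural log of the number of documents divided by the number of documents containing the term, and a document's score is the sum of tf times idf over the query terms. Other tf-idf variants exist and give different numbers. The function names and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace; real systems do more.
    return text.lower().split()


def tf_idf_scores(query, documents):
    """Score each document against the query with one simple tf-idf variant."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in counts:
                continue  # a term missing from this document contributes nothing
            tf = counts[term] / len(tokens)     # assumed tf: relative frequency
            idf = math.log(n_docs / df[term])   # assumed idf: unsmoothed log ratio
            score += tf * idf
        scores.append(score)
    return scores


# Hypothetical documents, echoing the "I Spy" example.
documents = [
    "the blue telephone is ringing",
    "the blue couch is in the living room",
    "the red ball is under the couch",
]

scores = tf_idf_scores("blue couch", documents)
best = max(range(len(documents)), key=scores.__getitem__)
print(documents[best])
```

Under these assumptions, a word that appears in every document (such as "the" here) gets an idf of zero, so it does not help any document stand out, while rarer words carry more weight.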
