Hidden Markov Models

Learn what a hidden Markov model is and how it can be represented with transition and emission matrices.

A hidden Markov model (HMM) is a probabilistic model for sequential data, where the underlying system is assumed to be a Markov process with hidden states that generate observable outputs. The "hidden" in HMM refers to the fact that the states that generate the observations are not directly observable. Instead, we observe a sequence of outputs and aim to infer the most likely sequence of hidden states that produced them. In the context of POS tagging, the observations are the words of the sentence, and the model's task is to infer the hidden part-of-speech tags that generated them.

An HMM consists of the following components:

  • States: A set of hidden states that generate the observations. Each state represents a particular condition, category, or context. In NLP, these states might represent parts of speech or syntactic structures.

  • Observations: A sequence of observations or outputs that are emitted from the hidden states. These observations are what we can observe directly. In NLP, observations could be words in a sentence.

  • Transition probabilities: A transition matrix that encodes the probabilities of moving from one hidden state to another. This matrix captures the dynamics of how the hidden states change over time. Each entry (i, j) in the transition matrix represents the probability of transitioning from state i to state j.
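The components above can be sketched as a small NumPy example. This is a minimal toy HMM for POS tagging with two hidden states and two words; the states, vocabulary, and all probabilities are invented for illustration, not estimated from any corpus.

```python
import numpy as np

# Toy HMM for POS tagging. All numbers here are illustrative assumptions.
states = ["NOUN", "VERB"]        # hidden states
vocabulary = ["dogs", "run"]     # observable outputs (words)

# Transition matrix A: entry (i, j) = P(next state j | current state i).
# Each row is a probability distribution, so rows sum to 1.
A = np.array([
    [0.3, 0.7],   # NOUN -> NOUN, NOUN -> VERB
    [0.6, 0.4],   # VERB -> NOUN, VERB -> VERB
])

# Emission matrix B: entry (i, k) = P(word k | state i). Rows sum to 1.
B = np.array([
    [0.9, 0.1],   # NOUN emits "dogs", "run"
    [0.2, 0.8],   # VERB emits "dogs", "run"
])

# Initial state distribution: P(first hidden state).
pi = np.array([0.8, 0.2])

# Joint probability of one particular hidden path for the sentence
# "dogs run": start in NOUN, emit "dogs", transition to VERB, emit "run".
p = pi[0] * B[0, 0] * A[0, 1] * B[1, 1]
print(p)  # 0.8 * 0.9 * 0.7 * 0.8
```

Scoring a single hand-picked path like this is only a starting point; inference over all possible hidden paths is what algorithms such as Viterbi handle efficiently.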