
Example of HMM training with hmmlearn
For this example, we are going to use hmmlearn, which is a package for HMM computations (see the information box at the end of this section for further details). For simplicity, let's consider the airport example discussed in the section about Bayesian networks, and let's suppose we have a single hidden variable that represents the weather (of course, this is not a truly hidden variable!), modeled as a multinomial distribution with two components (good and rough).
We observe the arrival time of our London-Rome flight (which partially depends on the weather conditions), and we want to train an HMM to infer future states and to compute the posterior probability of hidden states corresponding to a given sequence of observations.
The schema for our example is shown in the following diagram:

Let's start by defining our observation vector. As we have two states, its values will be 0 and 1. Let's assume that 0 means On-time and 1 means Delay:
import numpy as np
observations = np.array([[0], [1], [1], [0], [1], [1], [1], [0], [1],
                         [0], [0], [0], [1], [0], [1], [1], [0], [1],
                         [0], [0], [1], [0], [1], [0], [0], [0], [1],
                         [0], [1], [0], [1], [0], [0], [0], [0], [0]],
                        dtype=np.int32)
We have 35 consecutive observations whose values are either 0 or 1.
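Before building the model, it's worth double-checking that the array is two-dimensional, with one row per time step and a single column (because we observe only one variable); the following is just an optional sanity check, not part of the training procedure itself:
# The array must be two-dimensional: 35 time steps, 1 observed variable
print(observations.shape)

# Number of on-time (0) and delayed (1) observations in the sequence
print(np.bincount(observations.ravel()))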
To build the HMM, we are going to use the MultinomialHMM class with n_components=2, n_iter=100, and random_state=1000 (it's important to always use the same seed in order to obtain reproducible results). The number of iterations is sometimes hard to determine; for this reason, hmmlearn provides a utility class, ConvergenceMonitor, which can be checked to make sure that the algorithm has successfully converged.
Now we can train our model using the fit() method, passing the array of observations as an argument (the array must always be two-dimensional, with shape Sequence Length × Number of observed variables; in our case, 35 × 1):
from hmmlearn import hmm
hmm_model = hmm.MultinomialHMM(n_components=2, n_iter=100, random_state=1000)
hmm_model.fit(observations)
print(hmm_model.monitor_.converged)
True
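The boolean flag only tells us whether the tolerance was reached. The ConvergenceMonitor instance also exposes iter (the number of EM iterations actually performed) and history (the most recent log-likelihood values), which can help when choosing n_iter; the following is a minimal, optional sketch of this kind of inspection:
# Optional: inspect the convergence monitor in more detail
print(hmm_model.monitor_.iter)
print(list(hmm_model.monitor_.history))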
The process is very fast, and the monitor (available as the instance variable monitor_) has confirmed the convergence. If the model is very large and needs to be retrained, these checks also make it possible to find out whether a smaller value of n_iter would be sufficient. Once the model is trained, we can immediately visualize the transition probability matrix, which is available as the instance variable transmat_:
print(hmm_model.transmat_)
[[ 0.0025384   0.9974616 ]
 [ 0.69191905  0.30808095]]
We can interpret these values as follows: the probability of transitioning from state 0 (good weather) to state 1 (rough weather) is much higher than the opposite (p01 is close to 1), and it's more likely to remain in state 1 than in state 0 (p00 is almost null). We could deduce that the observations were collected during the winter period! After explaining the Viterbi algorithm in the next section, we will also be able to check, given a sequence of observations, what the most likely hidden state sequence is.
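In the meantime, as a preview of the posterior computation mentioned at the beginning of this section, we can already use the predict_proba() method (based on the forward-backward procedure) to obtain the posterior probability of each hidden state for every observation; the following is a minimal sketch using the model trained above (the variable name posteriors is arbitrary):
# Posterior probabilities of the hidden states, one row per observation
posteriors = hmm_model.predict_proba(observations)

# Shape is (35, 2): 35 time steps, 2 hidden states
print(posteriors.shape)

# Posterior probabilities for the first five observations
print(posteriors[0:5])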