Hidden Markov Model (HMM)

Hidden Markov model (HMM) is a process model in which the process is considered Markov, and it is not known what state si the system is in (states are hidden), but each state si can produce an event oj with some probability bioj, which can be observed.

The model is a Markov chain for which we know the initial probability and the transition probability matrix. It is called hidden because we have no information about its current state. We get information based on some observation, in the algorithm discussed below we will use just a natural number from 1 to N as the index of the observed event. For each state of the hidden Markov model, an emission probability vector is given, which characterizes the probability of observing each event when the model is in this state. The set of such vectors forms the emission matrix.

Markov model λ is defined as λ = {S, Ω, Π, A, B}, where S = {s1… sn} are states, Ω = {ω1… ωm} are possible events, Π = {π1… πn} are initial probabilities, A = {aij} is the transition matrix, and B = {biωk} is the probability of observing the event ωk after the transition to the state si.

Examples

Let’s consider an example of a hidden Markov model. Santa Claus has three gift bags in multi-colored packaging: red, blue, green and purple. At night, Santa Claus sneaks into the apartment and secretly lays out gifts under the tree in a row, taking out one gift from the bag. The next morning we find an ordered sequence of five gifts and want to make the best guess about the sequence of bags from which he took these gifts. Santa Claus with bags is a hidden Markov model. In this case, 4 colors are a space of M possible events, 3 bags are the number of states N, 5 gifts are our K observations, each of which is represented by a number – a color number – from 1 to 5. We know what are the probabilities that Santa Claus will start get gifts from the bag with number i – vector π [i]. We also know the transition matrix A, what is the probability that Santa Claus goes from the bag with the number i to the bag with the number j. Santa Claus’s bags are endless, but we know exactly what the ratio of the colors of gifts in each bag was loaded to him at the factory in Lapland. This is the probability matrix of the B emission.

HMM algorithms

Cluster analysis is a multivariate statistical procedure that collects data containing information about a sample of objects, and then arranges the objects into relatively homogeneous groups;
Regression analysis is a statistical method for studying the influence of one or more independent variables X1, X2 … Xp on the dependent variable Y;
Statistical classification is a formalized problem in which there are many objects (situations), divided in some way into classes.

Learning

The parameter of the HMM learning problem is finding the best result. Let there be given an output sequence or a set of such sequences, the best set of transition states and emission probabilities. The task, as a rule, is to obtain the maximum probabilistic estimate of the HMM parameters, given the set of output sequences. There is no general solution to this problem, but the Baum-Welsh Algorithm can be effectively used to find the local maximum likelihood result. If the HMM is used to predict time series, then a more sophisticated method such as the Markov chain of Monte Carlo can be used, which has proven to be favorable for finding a single probability model, both in terms of accuracy and stability. Since there are significant computational loads in MCMC, in cases where computational scalability is also of interest, one can also resort to the variational approximation of Bayesian inference. Indeed, approximate variational inference offers computational efficiency comparable to expectation maximization, because so far the generating accuracy profile is only slightly inferior to the exact MCMC-type Bayesian inference.