A Semi-Supervised Classification Algorithm using Markov Chain and Random Walk in R

In this article, a semi-supervised classification algorithm implementation will be described using Markov Chains and Random Walks. We have the following 2D circles dataset (with 1000 points) with only 2 points labeled (as shown in the figure, colored red and blue respectively, for all others the labels are unknown, indicated by the color black).

Now the task is to predict the labels of the other (unlabeled) points. From each of the unlabeled points (Markov states) a random walk with Markov transition matrix (computed from the row-stochastic kernelized distance matrix) will be started that will end in one labeled state, which will be an absorbing state in the Markov Chain.


This problem was discussed as an application of Markov Chain in a lecture from the edX course ColumbiaX: CSMM.102x Machine Learning. The following theory is taken straightway from the slides of the same course:





Since here the Markov chain does not satisfy the connectedness condition of the Perron-Frobenius theorem, it will not have a single stationary distribution, instead for each of the unlabeled points the random walk will end in exactly one of the terminal (absorbing) state and we can directly compute (using the above formula, or use power-iteration method for convergence, which will be used in this implementation) the probability on which absorbing state it’s more likely to terminate starting from any unlabeled point and accordingly label the point with the class of the most probableabsorbing state in the Markov Chain.

The following animation shows how the power-iteration algorithm (which is run upto 100 iterations) converges to an absorbing state for a given unlabeled point and it gets labeled correctly. The bandwidth b for the Gaussian Kernel used is taken to be 0.01.

As can be seen, with increasing iterations, the probability that the state ends in that particular red absorbing state with state index 323 increases, the length of a bar in the second barplot represents the probability after an iteration and the color represents two absorbing and the other unlabeled states, where the w vector shown contains 1000 states, since the number of datapoints=1000.

Now, for each of the unlabeled datapoints, 500 iterations of the power-iteration algorithm will be used to obtain the convergence to one of the absorbing states, as shown. Each time a new unlabeled (black) point is selected, a random walk is started with the underlying Markov transition matrix and the power-iteration is continued until it terminates to one of the absorbing states with high probability. Since only two absorbing states are there, finally the point will be labeled with the label (red or blue) of the absorbing state where the random walk is likely to terminate with higher probability. The bandwidth b for the Gaussian Kernel used is taken to be 0.01. Off course the result varies with kernel bandwidth. The next animation shows how the unlabeled points are sequentially labeled with the algorithm.


Views: 11922

Tags: Chain, Classification, Learning, Machine, Markov, Semisupervised


You need to be a member of Data Science Central to add comments!

Join Data Science Central

Comment by Kapil Vishwakarma on June 9, 2020 at 7:27pm

Since it says, “...using Markov Chain and Random Walk in R,” where is the R implementation of this?

Comment by Sandipan Dey on March 18, 2018 at 10:41am

@Roman this implementation was from scratch though, just by following the algorithm random walk classifier, which is pretty straight-forward. The challenge is choosing the bandwidth of the Gaussian Kernel though, to achieve the desired result.

Comment by Roman Nekrasov on March 18, 2018 at 4:51am

Examples, function sslMarkovRandomWalks: https://cran.r-project.org/web/packages/SSL/SSL.pdf

Comment by Timothy Lau on July 3, 2017 at 5:41am

Is there code to show how this was done?

© 2021   TechTarget, Inc.   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service