__Bayesian Machine Learning (part 6)__

__Probabilistic Clustering – Gaussian Mixture Model__

We continue our discussion of probabilistic clustering, picking up where we left off in part 4 of our Bayesian inference series. Having covered the modelling theory of the Expectation-Maximization algorithm in part 5, it's time to implement it.

So, let’s start!!!

For a better revision, please follow the link :

__Remembering our problem__

Our observed data is shown in the picture below.

We assumed there are 3 clusters in total, and for each cluster we defined a Gaussian distribution as follows:
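In standard Gaussian mixture notation, the per-cluster distribution can be written as below. The symbols μ_c (cluster mean) and Σ_c (cluster covariance) are assumptions of this sketch, with θ collecting them for all 3 clusters.

```latex
P(X_i \mid t_i = c, \theta) = \mathcal{N}(X_i \mid \mu_c, \Sigma_c),
\qquad c \in \{1, 2, 3\}
```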

We assumed that the data came from a **latent variable t**, which knows which data point belongs to which cluster and with what probability. The Bayesian model looks like:

Now the probability of the observed data given the parameters looks like :
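As a sketch in standard mixture-model notation, marginalizing over the latent variable gives the likelihood below. Here π_c = P(t_i = c) is the prior of cluster c, μ_c and Σ_c are its assumed mean and covariance, and N is the number of data points, assumed independent.

```latex
P(X \mid \theta)
  = \prod_{i=1}^{N} \sum_{c=1}^{3} P(t_i = c)\, P(X_i \mid t_i = c, \theta)
  = \prod_{i=1}^{N} \sum_{c=1}^{3} \pi_c \, \mathcal{N}(X_i \mid \mu_c, \Sigma_c)
```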

The point to note here is that the equation :

is already in its lower-bound form, so there is no need to apply Jensen’s inequality.

__E-Step__

As we know, the E-step solution is:

**q(t=c) = P(t=c | X_{i}, θ)**

Since there are 3 clusters in our case, we need to compute the posterior distribution of our latent variable **t** for all 3 clusters. Let’s see how to do it.
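Applying Bayes' rule to the E-step solution gives the expression below. This is a sketch in standard notation, where π_c is the prior of cluster c and μ_c, Σ_c are its assumed mean and covariance; the index i on t marks the latent variable of the i-th data point.

```latex
q(t_i = c) = P(t_i = c \mid X_i, \theta)
  = \frac{\pi_c \, \mathcal{N}(X_i \mid \mu_c, \Sigma_c)}
         {\sum_{k=1}^{3} \pi_k \, \mathcal{N}(X_i \mid \mu_k, \Sigma_k)}
```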

We evaluate the above equation for every observed data point and every cluster. For example, with 100 observed data points and 3 clusters, the matrix of posteriors on the latent variable **t** has dimension 100 x 3.
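To make the 100 x 3 shape concrete, here is a minimal NumPy sketch of the E-step. The function names `gaussian_pdf` and `e_step`, the random data, and the hand-picked parameters are all illustrative assumptions, not the data set used in this series.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Density of N(mean, cov) evaluated at each row of X (shape n x d)."""
    d = X.shape[1]
    diff = X - mean
    # Solve cov @ z = diff^T rather than forming the inverse explicitly.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def e_step(X, pis, means, covs):
    """Return the n x C matrix q[i, c] = P(t_i = c | X_i, theta)."""
    weighted = np.column_stack([
        pi * gaussian_pdf(X, mu, cov)
        for pi, mu, cov in zip(pis, means, covs)
    ])
    # Normalize each row so the posteriors over clusters sum to 1.
    return weighted / weighted.sum(axis=1, keepdims=True)

# Illustrative inputs only: 100 random 2-D points and 3 assumed clusters.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
pis = np.array([0.3, 0.3, 0.4])
means = np.array([[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]])
covs = np.array([np.eye(2)] * 3)

q = e_step(X, pis, means, covs)
print(q.shape)  # (100, 3)
```

Each row of `q` is a probability distribution over the 3 clusters for one data point.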

__M-Step__

In the M-step, we maximize the variational lower bound w.r.t. **θ**, treating the posterior distribution of our latent variable **t** computed above as a constant. So let’s start.

The equation we need to maximize w.r.t. **θ** and **π_{1}, π_{2}, π_{3}** is:
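In standard EM notation, the objective can be sketched as below. It is the lower bound with the terms that do not depend on θ or π dropped; μ_c and Σ_c are the assumed mean and covariance of cluster c, and q(t_i = c) is the E-step posterior held fixed.

```latex
\max_{\theta,\; \pi_1, \pi_2, \pi_3} \;
\sum_{i=1}^{N} \sum_{c=1}^{3} q(t_i = c)
\left[ \log \pi_c + \log \mathcal{N}(X_i \mid \mu_c, \Sigma_c) \right]
```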

The priors are computed under the constraint that every prior is >= 0 and the priors sum to 1.
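For a Gaussian mixture, this maximization has well-known closed-form solutions: responsibility-weighted means and covariances, and priors equal to the average responsibility per cluster, which satisfies the constraint automatically. Below is a minimal NumPy sketch; the function name `m_step` and the toy data are illustrative assumptions.

```python
import numpy as np

def m_step(X, q):
    """Closed-form updates given posteriors q (n x C), held constant."""
    n, d = X.shape
    n_clusters = q.shape[1]
    Nc = q.sum(axis=0)               # effective number of points per cluster
    pis = Nc / n                     # priors: non-negative and sum to 1
    means = (q.T @ X) / Nc[:, None]  # posterior-weighted means
    covs = np.empty((n_clusters, d, d))
    for c in range(n_clusters):
        diff = X - means[c]
        # Posterior-weighted covariance of cluster c.
        covs[c] = (q[:, c, None] * diff).T @ diff / Nc[c]
    return pis, means, covs

# Toy check with hard (0/1) posteriors: two points per cluster.
X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [12.0, 10.0]])
q = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

pis, means, covs = m_step(X, q)
print(pis)    # [0.5 0.5]
print(means)  # cluster means are (1, 0) and (11, 10)
```

With hard posteriors the updates reduce to the plain per-cluster averages, which makes the result easy to verify by hand.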

The E-step and M-step are iterated multiple times, one after the other, in such a fashion that the results of the E-step are used in the M-step and the results of the M-step are used in the next E-step. We stop iterating when the loss function stops improving.

The loss function is the same function that we differentiate in the M-step.
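Putting the two steps together, the iteration can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the names `weighted_densities` and `fit_gmm`, the initialization scheme, the small `1e-6` covariance regularizer, the tolerance, and the synthetic data are all hypothetical choices. The monitored quantity is the log-likelihood, which coincides with the lower bound right after each E-step.

```python
import numpy as np

def weighted_densities(X, pis, means, covs):
    """n x C matrix with entries pi_c * N(X_i | mu_c, Sigma_c)."""
    n, d = X.shape
    out = np.empty((n, len(pis)))
    for c, (pi, mu, cov) in enumerate(zip(pis, means, covs)):
        diff = X - mu
        maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
        norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
        out[:, c] = pi * np.exp(-0.5 * maha) / norm
    return out

def fit_gmm(X, n_clusters=3, max_iter=200, tol=1e-6, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumed initialization: uniform priors, random data points as means,
    # and the overall data covariance for every cluster.
    pis = np.full(n_clusters, 1.0 / n_clusters)
    means = X[rng.choice(n, size=n_clusters, replace=False)]
    covs = np.array([np.cov(X.T) + 1e-6 * np.eye(d)] * n_clusters)
    history = []
    for _ in range(max_iter):
        # E-step: posterior over the latent variable t for every point.
        w = weighted_densities(X, pis, means, covs)
        q = w / w.sum(axis=1, keepdims=True)
        # Log-likelihood of the data under the current parameters.
        history.append(np.log(w.sum(axis=1)).sum())
        # Stop once the objective no longer improves by more than tol.
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        # M-step: closed-form updates with q held constant.
        Nc = q.sum(axis=0)
        pis = Nc / n
        means = (q.T @ X) / Nc[:, None]
        for c in range(n_clusters):
            diff = X - means[c]
            covs[c] = (q[:, c, None] * diff).T @ diff / Nc[c]
            covs[c] += 1e-6 * np.eye(d)  # keeps the covariance invertible
    return pis, means, covs, q, history

# Synthetic stand-in data: three 2-D blobs of 50 points each.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([rng.normal(loc=c, scale=0.7, size=(50, 2)) for c in centers])

pis, means, covs, q, history = fit_gmm(X)
print(len(history), history[-1])
```

For data with points far from every cluster, computing the densities in the log domain would be more robust against numerical underflow than this direct version.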

The result of applying the EM algorithm on the given data set is as below:

So, in this post we saw how we can implement the EM algorithm for probabilistic clustering.

Thanks For Reading !!!

