t-SNE algorithm in R and Python, run on the same dataset

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited to visualizing high-dimensional datasets. The technique can be implemented via Barnes-Hut approximations, allowing it to be applied to large real-world datasets.

According to Wikipedia, t-SNE is a nonlinear dimensionality reduction technique developed by Geoffrey Hinton and Laurens van der Maaten. It is particularly well-suited for embedding high-dimensional data into a space of two or three dimensions, which can then be visualized in a scatter plot. Specifically, it models each high-dimensional object by a two- or three-dimensional point in such a way that similar objects are modeled by nearby points and dissimilar objects are modeled by distant points.

The information below was tweeted by Rubens Zimbres:

t-SNE algo in R and Python, made with same dataset (digits from Python). It seems the discriminant power is the same, have to check. BUT time spent in computation is more than double for R.
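
For readers who want to try the Python side of such an experiment, here is a minimal sketch using scikit-learn's `TSNE` on its bundled digits dataset. It is not the code behind the tweet: the function name `embed_digits`, the sample size, the perplexity, the PCA initialization and the random seed are all assumptions chosen for illustration, and the measured time will depend on hardware and library version.

```python
import time

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE


def embed_digits(n_samples=500, random_state=0):
    """Embed a subset of the 64-dimensional digits data into 2D with t-SNE.

    n_samples and random_state are illustrative choices, not values
    taken from the original experiment.
    """
    digits = load_digits()
    X = digits.data[:n_samples]    # each row is an 8x8 image flattened to 64 features
    y = digits.target[:n_samples]  # digit labels 0-9, useful for coloring a plot

    tsne = TSNE(
        n_components=2,        # target dimensionality for a scatter plot
        perplexity=30.0,       # assumed value; must be smaller than n_samples
        method="barnes_hut",   # the Barnes-Hut approximation mentioned above
        init="pca",
        random_state=random_state,
    )

    # Time only the embedding step, not the data loading.
    start = time.perf_counter()
    embedding = tsne.fit_transform(X)
    elapsed = time.perf_counter() - start
    return embedding, y, elapsed


if __name__ == "__main__":
    embedding, labels, elapsed = embed_digits()
    print(f"embedded {embedding.shape[0]} points into "
          f"{embedding.shape[1]}D in {elapsed:.2f} s")
```

The returned `embedding` has one two-dimensional point per input image, so its two columns can be passed to any scatter-plot routine and colored by `labels` to judge how well the digit classes separate. The `elapsed` value gives a rough timing for the Python run only; comparing it with R would require running an equivalent embedding there on the same data.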

