t-SNE algo in R and Python, made with same dataset

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a prize-winning technique for dimensionality reduction that is particularly well suited to visualizing high-dimensional datasets. It can be implemented with Barnes-Hut approximations, which allows it to be applied to large real-world datasets.

According to Wikipedia, t-SNE is a nonlinear dimensionality reduction technique developed by Geoffrey Hinton and Laurens van der Maaten. It is particularly well-suited for embedding high-dimensional data into a space of two or three dimensions, which can then be visualized in a scatter plot. Specifically, it models each high-dimensional object by a two- or three-dimensional point in such a way that similar objects are modeled by nearby points and dissimilar objects are modeled by distant points.

The information below was tweeted by Rubens Zimbres:

t-SNE algo in R and Python, made with same dataset (digits from Python). It seems the discriminant power is the same, have to check. BUT time spent in computation is more than double for R.

Code in Python in repo 2017 (on GitHub)
Code in R in repo 2016 (on GitHub)
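
For readers who want to try something similar, here is a minimal sketch of what the Python side of such an experiment might look like. It is not the code from the repo mentioned above. It assumes that "digits from Python" means the digits dataset bundled with scikit-learn, uses scikit-learn's TSNE with the Barnes-Hut method, and takes only the first 500 samples to keep the run short. The k-nearest-neighbor score at the end is just one rough, assumed proxy for "discriminant power": it checks whether points of the same digit end up close together in the 2-D embedding.

```python
import time

from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Assumption: the tweet's "digits from Python" is scikit-learn's digits set.
digits = load_digits()
X, y = digits.data[:500], digits.target[:500]  # subset chosen only for speed

# Barnes-Hut t-SNE down to 2 dimensions; parameter values are illustrative.
tsne = TSNE(
    n_components=2,
    method="barnes_hut",
    perplexity=30.0,
    init="pca",
    random_state=0,
)

# Time only the embedding step, which is what the tweet compares.
start = time.perf_counter()
embedding = tsne.fit_transform(X)
elapsed = time.perf_counter() - start

# Rough proxy for discriminant power: how well a 5-nearest-neighbor
# classifier recovers the digit labels from the 2-D coordinates alone.
knn = KNeighborsClassifier(n_neighbors=5)
score = cross_val_score(knn, embedding, y, cv=5).mean()

print(f"embedding shape: {embedding.shape}")
print(f"t-SNE time: {elapsed:.2f} s")
print(f"5-NN accuracy on the embedding: {score:.3f}")
```

Running the equivalent steps in R on the same samples, and comparing both the elapsed time and the score, would be one way to check the claim in the tweet. Timings depend heavily on hardware, library versions, and parameters, so any single run should be read with caution.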
