Machine Learning and Data Science Cheat Sheet

You can download the new machine learning cheat sheet here (PDF format, 14 pages).

Originally published in 2014 and viewed more than 200,000 times, this is the oldest data science cheat sheet - the mother of all the numerous cheat sheets that are so popular nowadays. I decided to update it in June 2019. While the first half, which deals with installing components on your laptop and learning UNIX, regular expressions, and file management, hasn't changed much, the second half, which deals with machine learning, was rewritten entirely from scratch. It is amazing how things have changed in just five years!

Source for picture: see here (original) or here (PDF)

Written for people who have never seen a computer in their life, it starts at the very beginning: buying a laptop! You can skip the first half and jump to sections 5 and 6 if you are already familiar with UNIX. This new cheat sheet will be included in my upcoming book Machine Learning: Foundations, Toolbox, and Recipes, to be published in September 2019 and available (for free) exclusively to Data Science Central members.


1. Hardware

2. Linux environment on Windows laptop

3. Basic UNIX commands

4. Scripting languages

5. Python, R, Hadoop, SQL, DataViz

6. Machine Learning

  • Algorithms
  • Getting started
  • Applications
  • Data sets and sample projects

To avoid missing this type of content in the future, subscribe to our newsletter. For related articles from the same author, click here or visit www.VincentGranville.com. Follow me on LinkedIn, or visit my old web page here.




Comment by Chintan Donda on November 10, 2015 at 12:52am

Excellent !!

Covers almost all the MUST have things for Data Science.

Thanks for sharing.

Comment by Yi He on September 11, 2015 at 8:37pm

Many thanks.

It seems like there is a little error: Basic web crawler * could not be accessed, which gives the message that "Our apologies – this page was not found". Would you please help check it? Thanks.

Comment by Mukesh N. Shende on July 23, 2015 at 3:50am

Excellent collation!!! As a beginner to the site and the DS journey, I find this well-organized stuff. Looking forward to using this as I march ahead in learning more about data science.

Comment by Hisham Sliman on May 31, 2015 at 4:41pm

'This Article' is now "https://github.com/gumption/Python_for_Data_Science"

Comment by Stephen O'Connell on February 19, 2015 at 9:01am

Very nice summary, and some very useful links.  I am in the process of putting together a dashboard and found your 10 Features all Dashboards Should Have post a helpful checklist.

Do you find github/bitbucket/etc. a useful data science tool at this point?

Comment by Antonio Marcos Moraes Ribeiro on January 18, 2015 at 5:07pm

Wonderful tutorial!
Things you do not learn in the classroom...
Congratulations Vincent!!!

Comment by Besim Ismaili on December 5, 2014 at 5:17am

AMAZING! Thank you a lot!

Comment by Anand Surampudi on November 24, 2014 at 2:31am

Vincent, thanks a lot!

Comment by Maloy Manna on October 15, 2014 at 5:13am

Excellent compilation - just one niggle though - avoid using Cygwin on Windows or Homebrew on MacOSX - it may mess up some big data frameworks. It's probably best to use Linux with a Virtualbox - it also allows you to use nice IDEs like Dataiku.

Great work though - thanks Vincent!

Comment by Milton Labanda on October 10, 2014 at 6:24pm

@Vincent, what do you think about using only open-source office software (Writer, Calc) instead of Excel and Word? Have you had any negative experiences with these?
