You can download the new machine learning cheat sheet here (PDF format, 14 pages).
Originally published in 2014 and viewed more than 200,000 times, this is the oldest data science cheat sheet, the mother of all the numerous cheat sheets that are so popular nowadays. I decided to update it in June 2019. While the first half, dealing with installing components on your laptop and learning UNIX, regular expressions, and file management, hasn't changed much, the second half, dealing with machine learning, was rewritten entirely from scratch. It is amazing how things have changed in just five years!
Source for picture: see here (original) or here (PDF)
Written for people who have never seen a computer in their life, it starts with the very beginning: buying a laptop! You can skip the first half and jump to sections 5 and 6 if you are already familiar with UNIX. This new cheat sheet will be included in my upcoming book Machine Learning: Foundations, Toolbox, and Recipes, to be published in September 2019 and available (for free) exclusively to Data Science Central members.
Content
1. Hardware
2. Linux environment on Windows laptop
3. Basic UNIX commands
4. Scripting languages
5. Python, R, Hadoop, SQL, DataViz
6. Machine Learning
To not miss this type of content in the future, subscribe to our newsletter. For related articles from the same author, click here or visit www.VincentGranville.com. Follow me on LinkedIn, or visit my old web page here.
Comment
Thanks Venkatesh, I fixed the link.
Vincent, section 8 Machine Learning, the reference link is not working.
Thanks for sharing. Very informative and useful
Here's further perspective a year and a half after Vincent's article. His article is still awesome, very helpful.
If you're new to the Data Science space and trying to figure out what platform to adopt, go Linux/Unix. The Data Science, Big Data, Hadoop, etc., space is Linux stack. There is no indication of any OS that will replace it.
Make items 1 and 2 irrelevant in your career. Spend the time building skills in the areas identified in items 3 and 4. Yes, the layer on Windows does work, and you can do the analysis you need to as described. But if you have a choice and think about the future, why use it?
If you're in an organization that is Microsoft stack and uses Azure, or aspire to be in such an organization, then go with Windows with the layer. Otherwise, go with Linux/Unix. Thrive and good luck!
Extremely informative list! Thanks
Super Vincent
Great list!!
Vincent
A journey of a thousand miles begins with a simple step, or mouse click, or page view,..., you get the idea. Thanks for a comprehensive overview and roadmap as I begin my journey.
I love this post! :)
© 2021 TechTarget, Inc.