
20 short tutorials all data scientists should read (and practice)

The new, completed version of this Data Science Cheat Sheet can be found here.

The list now has 20 tutorials, up from 17. I hope to find the time to write a one-page survival guide for UNIX, Python and Perl. Here's one for R. The links to core data science concepts are below; I still need to add links to web crawling, attribution modeling and API design. Relevancy engines are discussed in some of the tutorials listed below. Once those links are added, my 10-page cheat sheet for data science will be complete.

Here's the list:

  1. Tutorial: How to detect spurious correlations, and how to find the ...
  2. Practical illustration of Map-Reduce (Hadoop-style), on real data
  3. Jackknife logistic and linear regression for clustering and predict...
  4. From the trenches: 360-degrees data science
  5. A synthetic variance designed for Hadoop and big data
  6. Fast Combinatorial Feature Selection with New Definition of Predict...
  7. A little known component that should be part of most data science a...
  8. 11 Features any database, SQL or NoSQL, should have
  9. Clustering idea for very large datasets
  10. Hidden decision trees revisited
  11. Correlation and R-Squared for Big Data
  12. Marrying computer science, statistics and domain expertise
  13. New pattern to predict stock prices, multiplies return by factor 5
  14. What Map Reduce can't do
  15. Excel for Big Data
  16. Fast clustering algorithms for massive datasets
  17. Source code for our Big Data keyword correlation API
  18. The curse of big data
  19. How to detect a pattern? Problem and solution
  20. Interesting Data Science Application: Steganography

Other Cheat Sheets

Vincent's Cheat Sheets for Perl, R, Excel (includes Linest, Vlookup), Linux, cron jobs, gzip, ftp, putty, regular expressions, Cygwin, pipe operators, file management, dashboard design, etc. (coming soon)

Cheat Sheets for Python 

Cheat Sheets for R 

Cross Reference between R, Python (and Matlab) 

Cheat Sheets for SQL 

Additional 

Related link: The Data Science Toolkit

Other interesting links



Comment by Linda B. on May 29, 2014 at 5:23am
I appreciate you taking time to post these cheat sheets. They are a great resource for newbies like myself.
Comment by Arthur on May 28, 2014 at 4:16am
Many cheat sheets are already available at DZone; it is perhaps worth adding a link to them.
Comment by fatih hamurcu on May 22, 2014 at 3:48am

Thanks for these things.

Also, have you considered adding cheat sheets for popular NoSQL databases like MongoDB, HBase, and Redis?

Comment by Amy on May 12, 2014 at 9:47am

A few useful cheat sheets have been added to the above posting.

Comment by JJ Persaud on February 21, 2014 at 9:14am

Thanks for taking the time, Vincent.

© 2014   Data Science Central
