Important Note: Our Data Science Cheat Sheet is now available. Please read it (and follow the instructions as needed) if you are not familiar with UNIX, R, and scripting languages. It covers the minimum you need to know to get started if you are starting from scratch. Most candidates in our DSA are already familiar with the concepts explained in the cheat sheet.
Below is the updated list of projects available to participants in our data science apprenticeship (DSA) program. It includes four business / applied data science projects and two data science research projects. In addition to these projects, we strongly encourage you to participate in our data science challenges.
Project #8: Detecting fake reviews on Amazon
Business and Applied Data Science
Data Science Research
For the spurious correlations project, you could create a single variable X with arbitrary but fixed values, and check how it correlates with thousands of simulated variables. This works because, under the uniform distribution assumption, any set of observations for X is equally likely to occur. It considerably reduces the amount of computation: instead of comparing every pair among n variables, which is O(n^2) for large n, you compare X against each of the n simulated variables, which is O(n).
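As a rough illustration of this shortcut, here is a minimal Python sketch using NumPy. It is not the project's official solution. The function name `correlate_with_fixed`, the sample size of 10 observations, the 5,000 simulated variables, and the choice of equally spaced values for X are all assumptions made for the example. It fixes X once, simulates the other variables from a uniform distribution, and computes one Pearson correlation per simulated variable, so the work grows linearly with the number of variables.

```python
import numpy as np


def correlate_with_fixed(x, sims):
    """Pearson correlation between a fixed vector x and each row of sims.

    x    : 1-D array of length n_obs (the arbitrary but fixed variable X)
    sims : 2-D array of shape (n_vars, n_obs), one simulated variable per row
    Returns a 1-D array of n_vars correlations (one pass, O(n_vars) work).
    """
    xc = x - x.mean()                              # center X
    sc = sims - sims.mean(axis=1, keepdims=True)   # center each simulated variable
    numerator = sc @ xc                            # one dot product per variable
    denominator = np.linalg.norm(sc, axis=1) * np.linalg.norm(xc)
    return numerator / denominator


if __name__ == "__main__":
    n_obs, n_vars = 10, 5000            # assumed sizes, chosen for illustration only
    rng = np.random.default_rng(0)      # seeded for reproducibility

    x = np.linspace(0.0, 1.0, n_obs)    # arbitrary but fixed values for X
    sims = rng.uniform(0.0, 1.0, size=(n_vars, n_obs))

    corr = correlate_with_fixed(x, sims)
    best = int(np.argmax(np.abs(corr)))
    print(f"largest |correlation| among {n_vars} simulated variables: "
          f"{abs(corr[best]):.3f} (variable index {best})")
```

With few observations and many simulated variables, the largest absolute correlation found this way tends to be high even though every variable is independent of X by construction, which is exactly the spurious-correlation effect the project is about.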