For motivated students who can learn on their own, here's an option that I would like to offer: the possibility to become an expert data scientist in less than six months, for a cost well below $10,000, and with guaranteed job opportunities.
The program would be open to everyone without screening, but the degree and the guaranteed jobs would be offered only to students who successfully complete selected projects. If you don't succeed, you don't pay.
The program would contain three parts:
Part I: Online training
A 20-page booklet containing all the info you need to jump-start your data science career, written in simple English:
- how to download Python, Perl, Java, R, get sample programs, get started with writing an efficient web crawler, get started with Linux, Cygwin, Excel (including logistic regression)
- Hadoop, MapReduce, NoSQL: their limitations, and more modern technologies
- how to find data sets or download very large data sets for free on the web
- how to analyze data: from understanding business requirements to maintaining an automated (machine talking to machine) web / database application in production mode - a 12-step process
- how to develop your first "Analytics as a Service" application and scale it
- big data algorithms, and how to make them more efficient and more robust (application in computational optimization: how to efficiently test trillions of trillions of multivariate vectors to design good scores)
- basics of statistics, Monte Carlo methods, cross-validation, robustness, sampling, design of experiments
- tons of startup ideas for analytic people
- reference data science book available for free (2nd edition)
- basics of Perl, Python, real-time analytics, distributed architecture and general programming practices
- data visualization, dashboards and how to communicate like a management consultant
- tips for future consultants
- tips for future entrepreneurs
- rules of thumb, best practices, craftsmanship secrets, and why data science is an art
- additional online resources
- lift and other metrics to measure success, metrics selection, use of external data, making data silos communicate via fuzzy merging and statistical techniques
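To give a flavor of the "lift" metric mentioned in the list above, here is a minimal sketch in Python. It assumes the common definition of lift: the success rate among the top-scored fraction of cases, divided by the overall success rate. The function name `lift_at` and its parameters are hypothetical, chosen for illustration only; the booklet may present the metric differently.

```python
def lift_at(scores, labels, fraction=0.1):
    """Lift of the top `fraction` of cases when ranked by score.

    scores: model scores, higher meaning "more likely a success"
    labels: 1 for a success, 0 otherwise (same order as scores)
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no successes")
    # Number of cases in the top segment (at least one).
    k = max(1, int(len(scores) * fraction))
    # Rank cases from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:k]) / k
    return top_rate / overall_rate


if __name__ == "__main__":
    # Made-up scores and outcomes, for illustration only.
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(lift_at(scores, labels, fraction=0.2))
```

A lift above 1 means the score concentrates successes in the top segment better than random selection would.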
Part II: Potential projects to be completed:
- hacking and reverse-engineering projects, for instance a captcha attack
- web crawling projects: how many Facebook accounts are duplicates or dead? Or categorizing tweets
- taxonomy creation or improving an existing taxonomy
- optimal pricing for bid keywords on Google
- create a web app that provides (in real time) better-than-average trading signals
- find low-frequency and botnet fraud cases in a sea of data
- internship in computational marketing with a data science start-up
- automated plagiarism detection
- estimating the number of entries (articles) on Wikipedia
- using web crawlers, assess whether Google Search (1) favors its own products over competitors' [is this an unfair business practice?], (2) favors local over non-local results, and (3) returns different results to web robots and humans. Identify other biases and patterns in Google search results.
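As a possible starting point for the last project above, here is a minimal sketch in Python of one way to quantify whether two sets of search results differ, for instance results collected by a robot versus by a human for the same query. It uses Jaccard similarity, which is my choice for illustration and not a method prescribed by the program. The function name `jaccard_overlap` and the example URLs are hypothetical, and the sketch does not crawl anything: it assumes the result lists have already been collected.

```python
def jaccard_overlap(results_a, results_b):
    """Jaccard similarity between two lists of result URLs.

    Returns 1.0 for identical sets and 0.0 for disjoint sets.
    Ranking order is ignored; only membership counts.
    """
    set_a, set_b = set(results_a), set(results_b)
    union = set_a | set_b
    if not union:
        # Two empty result lists are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    # Placeholder URLs, for illustration only.
    seen_by_robot = ["example.com/a", "example.com/b", "example.com/c"]
    seen_by_human = ["example.com/b", "example.com/c", "example.com/d"]
    print(jaccard_overlap(seen_by_robot, seen_by_human))  # 0.5
```

Averaging this overlap across many queries, and comparing it with the overlap between two runs from the same type of client, would help separate systematic differences from ordinary result fluctuation.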
Part III: Students successfully completing two projects
- would be featured in the largest data science community
- would receive help finding a job or advice about jump-starting their own company
- would get an endorsement from a leading data scientist
- may be hired by sponsor companies funding this project
How to enroll?
If interested, join our Data Science Apprenticeship group to receive updates about our program and schedule, and to receive an invitation to participate as well as free training material, when the program is open.