Proposal for an Apprenticeship in Data Science

For motivated students who can learn on their own, here is an option that I would like to offer: the chance to become an expert data scientist in less than six months, at a cost well below $10,000, and with guaranteed job opportunities.

The program would be open to everyone without screening, but the degree and the guaranteed jobs would be offered only to students who successfully complete selected projects. If you don't succeed, you don't pay.

The program would contain three parts:

Part I: Online training:

A 20-page booklet containing all the information you need to jump-start your data science career, written in simple English:

  • how to download Python, Perl, Java, and R; get sample programs; get started with writing an efficient web crawler; and get started with Linux, Cygwin, and Excel (including logistic regression)
  • Hadoop, MapReduce, and NoSQL, their limitations, and more modern technologies
  • how to find data sets or download very large data sets for free on the web
  • how to analyze data: from understanding business requirements to maintaining an automated (machine talking to machine) web / database application in production mode (a 12-step process)
  • how to develop your first "Analytics as a Service" application and scale it
  • big data algorithms, and how to make them more efficient and more robust (application in computational optimization: how to efficiently test trillions of trillions of multivariate vectors to design good scores)
  • basics of statistics, Monte Carlo methods, cross-validation, robustness, sampling, design of experiments
  • tons of startup ideas for analytic people
  • reference data science book available for free (2nd edition)
  • basics of Perl, Python, real time analytics, distributed architecture and general programming practices
  • data visualization, dashboards and how to communicate like a management consultant
  • tips for future consultants
  • tips for future entrepreneurs
  • rules of thumb, best practices, craftsmanship secrets, and why data science is an art
  • additional online resources
  • lift and other metrics to measure success, metrics selection, use of external data, making data silos communicate via fuzzy merging and statistical techniques
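
To give a flavor of the "lift" metric listed above, here is a minimal sketch in Python. It uses one common definition of lift (the response rate among the top-scored records divided by the overall response rate); the booklet may define it differently. The function name `lift_at` and the toy scores and labels are made up for illustration only.

```python
def lift_at(scores, labels, fraction=0.1):
    """Lift in the top `fraction` of records, ranked by descending score.

    scores: model scores, higher means "more likely to respond"
    labels: 1 for a response (success), 0 otherwise
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal in length")

    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("lift is undefined when there are no responses")

    # Rank records from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n

    return top_rate / overall_rate


# Toy data (hypothetical): 10 scored records, 3 of which responded.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

# About 3.33: the top 20% responds at over three times the base rate.
print(lift_at(scores, labels, fraction=0.2))
```

A lift of 1 means the score does no better than picking records at random; the further above 1, the more useful the score is for targeting.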

Part II: Potential projects to be completed:

  • hacking and reverse-engineering projects, for instance a captcha attack
  • web crawling projects: estimate how many Facebook accounts are duplicates or dead, or categorize tweets
  • taxonomy creation or improving an existing taxonomy
  • optimal pricing for bid keywords on Google
  • create a web app that provides (in real time) better-than-average trading signals
  • find low-frequency and botnet fraud cases in a sea of data
  • internship in computational marketing with a data science start-up
  • automated plagiarism detection
  • estimating the number of entries (articles) on Wikipedia
  • using web crawlers, assess whether Google Search (1) favors its own products over competitors' [is this an unfair business practice?], (2) favors local over non-local results, and (3) returns different results to web robots and humans. Identify other biases and patterns in Google search results.
  • creation of an RSS feed exchange

Part III: Students successfully completing two projects

  • would be featured in the largest data science community 
  • would receive help finding a job or advice about jump-starting their own company
  • would get an endorsement from a leading data scientist
  • may be hired by sponsor companies funding this project

How to enroll?

If interested, join our Data Science Apprenticeship group to receive updates about the program and schedule, as well as an invitation to participate and free training material when the program opens.


Comment by Vincent Granville on April 18, 2014 at 10:18pm

Please check out our most recent update to access additional material. Our 20-page booklet has been called the "data science cheat sheet", and a preliminary version is now available. I'm in the process of adding more material. I will post an update by Monday April 21.

Comment by Ananth Ananthapuram on April 16, 2014 at 7:27pm

I bought the book and completed 3 chapters. It's Interesting!!!

Where can I download the 20-page booklet?

Comment by ANDRIANAVALONA Tantely Hoby H. on March 4, 2014 at 12:54am

I'm really interested in this project. I hope to hear back from you soon about when it will start.

Comment by Venkat C on July 21, 2013 at 3:05pm

I'm interested. Please let me know how to enroll and start the training in data science. It is the right fit for enhancing my skills in this area: I have over a decade of experience and am based in NJ, US.

Comment by jgorricho on May 1, 2013 at 12:17pm

Is the booklet mentioned in part 1 above available?

Comment by Steven Paul Sanderson II on March 15, 2013 at 1:38pm

I have an interesting project that I have been trying to move forward for some time. I have been collecting messages from Twitter that match ticker symbols, and now have over 1 million messages collected over a long time frame. I want to see whether there is a correlation between them and securities price movements.

Comment by Vincent Granville on March 11, 2013 at 5:44pm

@Fahad: The maths/stats needed are different from, and more applied than, those required for a traditional stats curriculum. The emphasis is on algorithms, much less on math modeling: there won't be any advanced mathematics such as eigenvalues, singular value decomposition, generalized inverse matrices, differential equations, orthogonal polynomials, generating functions or integrals.

Comment by Fahad Khan on March 11, 2013 at 4:14pm

Very quick question, Dr. Vincent.

Is there, or will there be in the future, a requirement for a good dose of calculus-based statistics and mathematics beyond the three semesters of calculus?

I am asking because I do not know whether that much math is needed to become a data scientist.

Thank you very much for posting this and introducing this program!


Fahad

Comment by Conrad Montlouis on March 7, 2013 at 8:57pm

I would be very interested! How do I enroll?

Comment by Steven Paul Sanderson II on March 4, 2013 at 4:29pm

Very Interested in this Dr. G

© 2014   Data Science Central

Badges  |  Report an Issue  |  Terms of Service