Note: for the most recent updates, click here.

At the request of many prospective participants, here's an update about our DSA (Data Science Apprenticeship). Also, I have added a few large data sets, new projects and more material. Click here for details.

If you have already earned a data science certificate or diploma, but were never asked to develop and use your own API in batch mode, and to harvest and work on a data set with at least 50 million observations in a distributed environment, then it's time to learn the real stuff that will land you a real job!

[Image: Klein bottle, a surface with only one side]

Click here for a general overview of our apprenticeship. We have published the data and source code for our big data keyword correlation API. Read the material and download the three files (and post a comment if you have questions; I'll reply ASAP): it will teach you how APIs work, and how to write your first API from scratch!

Our next API example will come with the source code of a web crawler, and will illustrate how to detect copyright infringement or how to detect the original, first version of an article published in multiple news outlets (doing a better job than Google).
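As a rough illustration of the idea (a minimal sketch, not the crawler or the method announced above, which is not described in this post), one common way to find the first version of an article is to flag near-duplicates by word-shingle overlap and then keep the earliest publication date. The function names, the `source`/`published`/`text` fields, and the 0.5 threshold are all assumptions made for this example.

```python
from datetime import datetime

def shingles(text, k=3):
    """Return the set of k-word shingles (as tuples) in lowercased text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def find_original(reference, candidates, threshold=0.5):
    """Among the reference article and every candidate whose shingle
    overlap with it reaches the (assumed) threshold, return the one
    with the earliest 'published' timestamp."""
    ref_shingles = shingles(reference["text"])
    matches = [reference]
    for article in candidates:
        if jaccard(ref_shingles, shingles(article["text"])) >= threshold:
            matches.append(article)
    return min(matches, key=lambda a: a["published"])

if __name__ == "__main__":
    # Hypothetical articles, invented for this example only.
    ref = {"source": "outlet_a", "published": datetime(2013, 5, 2),
           "text": "the quick brown fox jumps over the lazy dog near the river"}
    others = [
        {"source": "outlet_b", "published": datetime(2013, 5, 1),
         "text": "the quick brown fox jumps over the lazy dog near the river today"},
        {"source": "outlet_c", "published": datetime(2013, 4, 1),
         "text": "stock markets fell sharply on monday amid rate fears"},
    ]
    print(find_original(ref, others)["source"])
```

Comparing every pair of articles this way does not scale to a real crawl; at that size, techniques such as MinHash are typically used to approximate the same similarity.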

All the training material will be offered for free to everyone. We have not yet put everything into a nice booklet, but some of the content is already available:

A few data sets available for download, from the following articles:

The following articles will be included in our curriculum, so you can start reading them now:

List of potential projects for students:

Starred items (*) are recent additions.

We are still in the process of writing our small booklet to teach you all the fundamentals (computer science, statistics, business analytics, Python, MapReduce, big data, etc.) in 20 pages. Also, feel free to check our Data Science eBook, 2nd Edition. A much more comprehensive, curated and easy-to-read version is published by Wiley (April 2014) and costs less than $30.


Replies to This Discussion

Fantastic, thank you!


Thank you so much!!

I can't wait.... this is going to be great!!


Has this program started? I would be very interested in joining. Thanks.

Thank you so much, this is great.

Nice one.  Thank you, Vincent

When will the program start again?  I am very interested in completing it.

Any update on when the program will start again?

Dear Vincent, we have never had the chance to interact. I just got notice of this new event and wonder how to make it more profitable for all of us. Question: does information finding = web search (with aspects relating to social, multimedia, device input, and geospatial data, hence Big Data included) rely on a data dictionary? What does the search platform look like? Vertical? Horizontal?

See http://sigir2013.ie/industry_track.html#FindandbeFound

We should start the program!  Organize ourselves into groups based on sectors.  I'm sure he needs the assistance.


