Starting down the path of Data Science

I've always been interested in data, how it's interpreted and the different ways it can be sliced. However, I've always considered statistics itself to be a math that I didn't like. As "data science" and "big data" became more popular, however, I started to look into ways to learn more about it and possibly use it as an entry into another act of my career.

The advent of MOOCs has opened up the possibility of learning new subjects and subject-focused websites like Data Science Central here make diving into new subjects more accessible than in the past. I wanted to detail the start of my journey here - in a couple of months, I'll update where I am on my path and what else I've learned along the way.

The hardest part of a learning journey is knowing where to start. If you don't know anything, there's so much to learn that it's difficult to know what order to take things in. Google sent me in a few directions: a solid Quora page, prolific data scientists to follow on Twitter that I got from a Hilary Mason post (Hilary herself being a prolific user of Twitter), and just general hunting around.

However, again, that gave me lots of places to start but not any notion as to what the right start was. As I was reading, though, it became more and more apparent that no data science language, technique or tool was going to be useful unless I had an understanding of basic statistics. This led me back to Coursera, where I had dabbled in classes but never dove in. Picking through the catalog led me to Dr. Mine Çetinkaya-Rundel's Duke course on Data Analysis and Statistical Inference. Dr. Çetinkaya-Rundel's approach was comprehensive yet approachable. Sometimes in college courses, the professors bury you in mathematical theory and you get so busy memorizing Greek letters that you get frustrated. The approach here was to keep the theory light by explaining it through application. I took the class with no issues and am no longer intimidated by statistics.

In the meantime, Udacity was pushing its Data Science specialization and Coursera had published theirs through Johns Hopkins. Both had a free and paid component but the course list looked a bit tighter and better integrated on Coursera, even though I chose the free option. I also reviewed Dr. Granville's Data Science Apprenticeship here. I eventually ruled out the apprenticeship, for now, because I felt I had more learning to do.

Right now, I'm in the second of three sections in the Johns Hopkins track. Those will wrap up by July. I'm also signed up for the Machine Learning course that may have started all this, at least in the MOOC space, from Stanford. The Johns Hopkins track has a Machine Learning course but I expect it to be a lighter treatment than the Stanford one and a good grounding in Machine Learning is necessary to play in that space as well.

I don't know where this journey will take me yet but it's been a great learning experience so far. I look forward to continuing and then maybe jumping in on the apprenticeship at some later date!


Comment by Jerry Smith on September 9, 2015 at 4:42am

I too am working the Johns Hopkins track. How did that go for you?

Comment by Sarah Hoessler on May 22, 2014 at 10:44pm

Hi Don, 

Thank you for sharing the beginning of your journey in data science. I wish you success on this path! I'm a fervent fan of MOOCs too, and even though I majored in statistics only 5 years ago, I feel that my academic path was not adapted to the current problems that need to be solved in the industry. In a way, I define myself as a newbie too! I am acquainted with all the classes you mention, I am in the middle of the Johns Hopkins data science track for a refresher (and also hunting for tricks in R), and have just finished Andrew Ng's machine learning class. I think that in this field you cannot (and should not!) stop learning.

Though I did not get a chance to go beyond week 1 content of the Data Analysis and Statistical Inference class, I was very much impressed at how illuminatingly the content was explained, and I trust this MOOC to be one of the very best introductory classes to statistical reasoning.

I don't have a lot of professional experience yet in data analytics, and even if experience is important of course, I do believe that scientific reasoning and tenacity are the only crucial skills to make a good data scientist. Just as when you change jobs, nobody tells you that you actually start again from scratch (only bringing your soft skills with you); when you are confronted with a new data question, you need to be able to learn new techniques and approaches, but it all resolves back to the same reasoning. Don't be too impressed by all those tools and cool techniques that have been developed in recent years, and don't try too hard to master them right now; they will impose themselves on you when you are faced with a problem that calls for them. The more I look into those tools, the more I understand that they are no magic wands. I really fear that those statistical tools that are more and more user-friendly (and user-friendly is not a bad thing in itself, don't get me wrong!) raise the risk of being applied mechanically to any poorly defined problem, invariably yielding some raw and possibly misleading output.

So if you are serious about developing your data science skills, I advise you to go deeper into statistical theory. Don't stifle your curiosity about data science tools and techniques; I recognize that many of them are really exciting. But don't let them make you lose sight of the core of data science: science!

We are at a very exciting moment in the development of our field, and I hope that our international community will find efficient ways to train the masses of young data scientists that are urgently needed. MOOCs can be the right place to start; if you stay curious, and never stop asking "why?", wherever you start, I trust you can't go wrong! Please do keep us posted on how you are doing!

