This article was written by Monica Rogati. Monica is an independent data science executive and advisor. She built key data products and teams at Jawbone and LinkedIn; she now helps companies make the most out of their data.
Do a project you care about. Make it good and share it.
There’s a lot of interest in becoming a data scientist, and for good reasons: high impact, high job satisfaction, high salaries, high demand. A quick search yields a plethora of possible resources that could help -- MOOCs, blogs, Quora answers to this exact question, books, Master’s programs, bootcamps, self-directed curricula, articles, forums and podcasts. Their quality is highly variable; some are excellent resources and programs, some are click-bait laundry lists. Since this is a relatively new role and there’s no universal agreement on what a data scientist does, it’s difficult for a beginner to know where to start, and it’s easy to get overwhelmed.
Many of these resources follow a common pattern: 1) here are the skills you need and 2) here is where you learn each of these. Learn Python from this link, R from this one; take a machine learning class and “brush up” on your linear algebra. Download the iris data set and train a classifier (“learn by doing!”). Install Spark and Hadoop. Don’t forget about deep learning -- work your way through the TensorFlow tutorial (the one for ML beginners, so you can feel even worse about not understanding it). Buy that old orange Pattern Classification book to display on your desk after you give up two chapters in.
This makes sense; our educational institutions trained us to think that’s how you learn things. It might eventually work, too -- but it’s an unnecessarily inefficient process. Some programs have capstone projects (often using curated, clean data sets with a clear purpose, which sounds good but isn’t). Many recognize there’s no substitute for ‘learning on the job’ -- but how do you get that data science job in the first place?
Instead, I recommend building up a public portfolio of simple but interesting projects. You will learn everything you need in the process, perhaps even using all the resources above. However, you will be highly motivated to do so and will retain most of that knowledge, instead of passively glossing over complex formulas and forgetting everything in a month. If getting a job as a data scientist is a priority, this portfolio will open many doors, and if your topic, findings or product are interesting to a broader audience, you’ll have more incoming recruiting calls than you can handle.
Here are the steps I recommend. They are designed to maximize your learning and your chances of getting a data job.
To check out all this information, click here.