1) You can picture yourself being introduced as a data scientist without blushing.
Data science is more than a career; it is a lifestyle that requires committing to be a lifelong learner, with all the trade-offs that entails. If the idea of “data science” seems overhyped and hokey, then you most likely will not have the drive to persevere through the hard work to the point where you are comfortable being called a data scientist.
2) You are willing to spend less time playing computer games, watching Netflix reruns and “hanging out”.
Regardless of whether you pursue formal education or your own path to learning, it is a challenging field that takes a lot of dedication. All the approaches that promise you a fast path to success are bunk. What fields can you become accomplished in after 6 months? Is there any computer science or engineering field of study that produces significant results in less than a few years at a minimum? Then you need practical experience after that, at least a couple of years’ worth. It is best to recognize those realities early on.
3) The thought of learning something new about math or statistics doesn’t cause sweating, fainting or heart palpitations.
All the voices on the internet telling you that today’s data science tools are so good you don’t need to study math and statistics are not being honest. Look at any reputable institution offering certificate programs in data science and they will either have pre-requisites or will teach the background material you need. If you are going to learn on your own, don’t try to skip these fundamentals.
4) You once wrote some formulas in Excel and liked it.
You need to enjoy writing programs in one or more computer languages. You don’t need to be a professional software developer capable of building the next cool feature for LinkedIn, Twitter or Facebook, but you do need to ingest data, transform data, and glue together packages written by other people that do useful things. You may, at some point, decide to write a library or package that others will find useful, but many data scientists are happy and successful without ever doing that.
5) You’re not afraid to fall and skin your knee trying to get better at something.
Data science is experimental and there are many problems for which there are no “good” solutions that can be developed in a reasonable amount of time. Sometimes it takes a lot of effort just to conclude that you can’t say anything interesting. Maybe someone else comes along next, uses an approach that you didn’t know about or didn’t consider, and finds an interesting result. Ouch. No matter how hard you work and how much you study, sometimes things don’t pan out.
6) More than once, you stayed up past midnight trying to figure out why something on your computer wasn’t working.
Data science is sometimes described as the intersection of math/statistics and computer science. You will need to be proficient in configuring and using computers. Sometimes the tools don’t cooperate, and without them you can’t get much done. Doing web searches for software and hardware error messages and understanding the results is not taught in any data science course that I know about. You will need those skills.
7) You know at least one person that you could probably talk to about a math problem and you like them.
In most careers, having a network of other practitioners is critical to success. Since you are likely to need to brush up on or start with some math study, ask yourself: who can I bounce ideas off? If you don’t have any idea, start building that network.
8) In the last 6 months, you’ve clicked on some link about learning data science to see what was on the other end.
I have never seen so much discussion in the popular media about machine learning, artificial intelligence and data science. If you aren’t sharing this experience, you should start looking at Twitter, LinkedIn and/or Reddit for more background on why the shortage of data science talent is not going away anytime soon. Many people are starting study programs but relatively few are completing them. Before you start any type of training or formal education, do your homework to see how many of the people who start actually complete the program.
9) You like to be employed and making good money.
Two years ago, there were dire predictions that there was no way to fill the job needs for data scientists and that qualified people were commanding outrageous salaries. Last year some pundits were predicting that the shortage was not going to persist because “citizen data scientists” were going to fill the need with new tooling that didn’t require long periods of study or any experience. We are back to the reality that the industry needs people with both extensive training and experience. Those people are still commanding top salaries. The need is going to persist, but you won’t be successful in the long run if you are just chasing the salary. You need to love the work as much as the money to persist through the obstacles that you will encounter.
10) Most of what you have read so far makes sense and you kept reading.
You finished reading to this point and didn’t give up. I’m not saying my perspective is the only one or that there is only one way to reinvent yourself into a career in data science. I do feel like it is a challenging career and there are a lot of voices telling people the opposite – that there is a quick and easy path to becoming an accomplished data science professional. I hope you now realize that that is probably not the case and start to look for better strategies.
My suggestion is a take on the advice often given to people that want to become writers – “write what you know”. For data science it might be something like – “explore what you love”. Kaggle currently has information on over 12,000 datasets. Just looking at the first couple of pages I see:
- Data Science for Good: Center for Policing Equity
- NHL Game Data
- Family Households with Married Couples Data
- Volcanoes on Venus, and
- Women’s Ecommerce Clothing Reviews
Out of 12,000 datasets there must be something that looks interesting enough to want to go exploring. Once you see something interesting, look at the metrics on the right. The top right-hand column shows the number of kernels (projects) that have been started for working with that dataset. Study what other data scientists and students are doing with the data. Then tackle working with that dataset on your own.
Figure out how to ingest, clean and visualize the data. This is much more valuable for deciding whether this is a good career for you than getting samples of clean data and “running models”. Most data scientists spend 4-5 hours working with data for every 1 hour spent working with models. You must love that aspect of the job to be successful, and it is often the part that many people don’t get exposed to early enough.
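To give a feel for what “ingest, clean and visualize” can look like in practice, here is a minimal sketch in Python using only the standard library. The data is made up for illustration (it is not taken from any Kaggle dataset), and the function names `ingest`, `clean`, `goals_by_team` and `text_bar_chart` are my own hypothetical helpers, not part of any package. A real project would more likely load a downloaded file and use a plotting library, but the shape of the work is similar.

```python
import csv
import io
from collections import Counter

# Made-up raw data standing in for a downloaded dataset. Note the stray
# whitespace, inconsistent casing, and a missing rating value.
RAW_CSV = """team,goals,rating
Bruins, 3,4.5
bruins,2,
Rangers,1,3.0
 rangers ,4,5.0
"""


def ingest(text):
    """Parse CSV text into a list of dict rows (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Normalize team names, convert numbers, drop rows with no rating."""
    cleaned = []
    for row in rows:
        rating = row["rating"].strip()
        if not rating:
            continue  # skip incomplete rows rather than guessing a value
        cleaned.append({
            "team": row["team"].strip().title(),
            "goals": int(row["goals"].strip()),
            "rating": float(rating),
        })
    return cleaned


def goals_by_team(rows):
    """Sum goals per team."""
    totals = Counter()
    for row in rows:
        totals[row["team"]] += row["goals"]
    return totals


def text_bar_chart(totals):
    """Render a crude bar chart, one '#' per goal, sorted by team name."""
    return "\n".join(
        f"{team:<10} {'#' * goals}" for team, goals in sorted(totals.items())
    )


rows = clean(ingest(RAW_CSV))
totals = goals_by_team(rows)
print(text_bar_chart(totals))
```

Even in this toy case, most of the code deals with messy input rather than analysis, which is roughly the ratio described above.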
Try working with Excel if that is in your skill set and the data doesn’t exceed the tool’s limits. Quickly move on to working with coding tools like Python and R. Try both and see what the relative advantages and disadvantages are. If that seems like too much work, maybe that is an indication that this is not a good field for you.
I hear a lot of folks discussing whether it is better to start learning data science by learning coding or by studying math and statistics. I’m suggesting that you start with a dataset that is fascinating to you and learn whatever coding, math and/or statistics you need to start telling a story about the data. As long as you have a passion for exploring data, you will overcome the limits of your math and coding skills and keep peeling back the onion to uncover deeper insights and more interesting stories.