- Brush up on your math and statistics skills. A good data scientist must be able to understand what the data is saying, and that takes a solid grounding in basic linear algebra, algorithms and statistics. More advanced mathematics may be required for certain positions, but this is a good place to start.
- Understand the concept of machine learning. Machine learning may be the current buzzword, but it is closely linked to big data. As a branch of artificial intelligence, it uses algorithms that learn patterns from data rather than following explicitly programmed rules, and that is how it turns data into value.
- Learn to code. Data scientists must know how to write code that tells the computer how to analyse the data. Start with an open source language like Python and go from there.
- Understand databases, data lakes and distributed storage. Data is stored in databases, data lakes or across distributed networks, and how those data repositories are built can often dictate how you can access, use, and analyse that data. Failing to see the big picture or think ahead when you construct your data storage can have far-reaching consequences.
- Learn data munging and data cleaning techniques. Data munging is the process of converting “raw” data to another format that is easier to access and analyse. Data cleaning helps eliminate duplication and “bad” data. Both are essential tools in a data scientist’s toolbox.
- Understand the basics of good data visualisation and reporting. You don’t have to become a graphic designer, but you do need to be well versed in how to create data reports that a lay person — like your manager or CEO — can understand.
- Add more tools to your toolbox. Once you’ve mastered the above skills, it’s time to expand your data science toolbox to include tools such as Hadoop, R and Spark. Knowledge of and experience with these tools will set you apart from a great many data science job applicants.
- Practice. How do you practice data science before you have a job in the field? Develop your own pet project from open source data, enter competitions, network with working data scientists, join a bootcamp, volunteer or intern. The best data scientists will have experience and intuition in the field and be able to show their work to a recruiter.
- Become a part of the community. Follow thought leaders in the industry, read industry blogs and websites, engage, ask questions, and stay abreast of current news and theory.
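
The data munging and data cleaning steps mentioned in the list above can be sketched in Python, the language the list suggests starting with. This is only a minimal illustration using the standard library: the sample records, the field names and the `clean_records` helper are all hypothetical, and real projects would more likely reach for a dedicated data library.

```python
import csv
import io

# Hypothetical raw export: inconsistent whitespace and casing, a duplicate row,
# and a "bad" row whose age is not a number.
RAW_CSV = """name,age,city
 Alice ,34,london
Bob,not available,Paris
alice,34,London
Carol,29, berlin
"""


def clean_records(raw_text):
    """Munge raw CSV text into tidy dicts, dropping bad and duplicate rows."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        # Munging: normalise whitespace and casing, and convert types.
        name = row["name"].strip().title()
        city = row["city"].strip().title()
        try:
            age = int(row["age"].strip())
        except ValueError:
            continue  # Cleaning: skip rows whose age is not a whole number.

        key = (name, age, city)
        if key in seen:
            continue  # Cleaning: skip rows that duplicate an earlier one.
        seen.add(key)
        cleaned.append({"name": name, "age": age, "city": city})
    return cleaned


if __name__ == "__main__":
    for record in clean_records(RAW_CSV):
        print(record)
```

Normalising before checking for duplicates is a deliberate choice in this sketch: " Alice " and "alice" only count as the same person once whitespace and casing have been tidied up.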