CrowdFlower Report Reveals: Data Scientists Say Messy Data and Lack of Time for Analysis Are Their Top Career Obstacles
2015 Survey of Data Scientists Uncovers What's Holding Them Back in Their Jobs and How Organizations Can Empower Them to Deliver Greater Strategic Value
CrowdFlower today released its 2015 Data Scientist Report. Findings revealed that data scientists saw messy, disorganized data as a major hurdle preventing them from doing what they find most interesting in their jobs: predictive analysis and data mining for behavioral patterns and future trends. The majority of data scientists surveyed also acknowledged the skills shortage within their field.
Salient findings of the report uncover what is and isn't working in the data science field. These findings include:
- "Data science" is a new term for something that's been around for a while. Although the term is relatively new, 16 percent of data scientists reported that they have worked in this field for 10 years or more, which suggests that it describes work IT professionals have been doing for many years.
- Messy, disorganized data is the number one obstacle holding data scientists back. Two-thirds of respondents said cleaning and organizing data was the least interesting and most time-consuming task, taking time away from preferred tasks such as predictive analysis and data mining.
- There are not enough data scientists. Nearly 80 percent of respondents indicate there is a shortage of data scientists, suggesting that an increase in qualified data scientists would enable companies to balance workload and improve overall breadth and depth of their data science capabilities.
- Data scientists want more support from their companies. Nearly 79 percent of respondents are satisfied in their jobs, with almost one-third finding their position "totally awesome," but they noted that their organizations can still do more to equip them. Data scientists said that organizations can empower data science teams by providing the proper tools to do their jobs better (cited as a solution by 54.3 percent of survey respondents) and by setting clearer goals and objectives on projects (cited by 52.3 percent of respondents).
- Data scientists use a diverse toolkit dominated by open source. The survey found that although Excel is still the most commonly used tool (by 55.6 percent of respondents), data scientists also use at least 47 other tools and languages to do their jobs. Nearly all data scientists (98 percent) use open source software, and tried-and-true open source languages such as R remain major parts of data scientists' toolbox.
- The most in-demand data science skill set is programming and coding. In addition to conducting the survey, CrowdFlower used its own data enrichment platform to collect and analyze 1,024 LinkedIn data scientist job postings and found that the top two skills companies are looking for are programming and coding (seen in 55.3 percent of job postings) and statistical tools (seen in 52.1 percent of job postings).
"We know that data scientists are valuable for their companies, but there's still a disconnect between what they actually do and what they want to do," said Lukas Biewald, co-founder and CEO of CrowdFlower. "At the end of the day, the time they invest in cleaning data is time that could be better spent doing strategic, creative work like predictive analysis or data mining. If companies can give data scientists some of that data cleaning time back, they'll have happier teams that can focus on really exciting things."
Download the CrowdFlower 2015 Data Scientist Report and "Data Behind Today's Data Scientists" Infographic
- The complete survey findings are available in the CrowdFlower 2015 Data Scientist Report, which can be downloaded here.
- A high-level overview of the results is visually illustrated in CrowdFlower's "Data Behind Today's Data Scientists" infographic, which can be downloaded here.