These are the findings from a CrowdFlower survey. Data preparation accounts for about 80% of the work of data scientists. According to the survey, cleaning data is both the least enjoyable and the most time-consuming data science task. Interestingly, when we put the same question to our data scientist, his answer was:
Automating the task of cleaning data is the most time-consuming aspect of data science, though once done, it applies to most data sets; it is also the most enjoyable because, as you automate more and more, it frees a lot of time to focus on other things.
Below are the three charts published in the Forbes article about the survey. The one at the bottom lists the skills that appear most frequently in data scientist job ads.