I'm a techie, but not in the data science world (yet). I've read in several sources that data scientists spend an inordinate amount of time cleaning data. I'm curious whether there are positions or contracts for "para-data scientists" who clean data. I imagine that cleaning data is a skill different enough from analysis and the other data science skills that it could be farmed out to people who lack the core data science skills. Data cleaning might appeal to lower-paid techies who are willing to pay their dues to become data scientists, or who want to learn enough about the field to decide whether to join the ranks. It may also appeal to some in its own right, especially those with enough programming and general techie skills to figure out how to efficiently clean large amounts of potentially unstructured data.

Thoughts?

Sol

Tags: cleaning, data

Replies to This Discussion

Data prep was my specialty for many years, and I still do it occasionally. It is almost always necessary: databases actually designed for analysis are rare, there are almost always inconsistencies and typos, and the requisite inspection and exploration are needed anyway for effective modeling (it's really the first stage of analysis and should be treated accordingly). But it's something that a properly trained programmer can probably do better than a statistician can, and most analysts hate doing it anyway, so why not delegate it? And as time goes on, your data prep/cleaning specialist can't help but pick up some modeling skills, so he might well be able to help you there as well.

© 2019 Data Science Central ®
