Interesting PowerPoint presentation by Carlos Somohano. Here's my favorite slide (a slightly modified version):
Brief History of Data Science
- 1974 – Peter Naur @UoC Datalogy & Data Science
- 2001 – William S. Cleveland @CSU Data Science: An Action Plan
- 2003 – Journal of Data Science
- 2009 – Jeff Hammerbacher @Facebook What does a Data Scientist Do?
- 2010 – Drew Conway @NYU The Data Science Venn Diagram
- 2010 – Mike Loukides @O’Reilly “What is Data Science?”
- 2011 – DJ Patil @LinkedIn data scientist vs. data analyst
- 2013 – Vincent Granville @DSC “Horizontal vs. Vertical Data Scientist”
Read the PowerPoint presentation.
Another interesting comment about data scientists:
In my case, as a data scientist, I generate leads for marketers. A good-quality lead is worth $40. The cost of producing one lead is $10. It takes data science to efficiently generate a large number of highly relevant leads: purchasing the right traffic, optimizing organic growth, etc. If I can't generate at least 10,000 leads a year, nobody will buy due to low volume. If my leads don't convert into actual revenue and produce ROI for the client, nobody will buy.
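The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function name `lead_economics` is made up for this example, and it assumes every lead sells at its full $40 value, which is an upper bound rather than the actual selling price.

```python
def lead_economics(price_per_lead, cost_per_lead, leads_per_year):
    """Return simple unit economics for a lead-generation business.

    All inputs are assumptions supplied by the caller; nothing here
    models conversion rates or client ROI.
    """
    margin = price_per_lead - cost_per_lead
    return {
        "margin_per_lead": margin,
        "annual_revenue": price_per_lead * leads_per_year,
        "annual_cost": cost_per_lead * leads_per_year,
        "annual_gross_profit": margin * leads_per_year,
    }


# Figures quoted in the text: $40 value, $10 cost, 10,000 leads a year.
# Selling at the full $40 is an assumption made for this sketch.
result = lead_economics(price_per_lead=40, cost_per_lead=10, leads_per_year=10_000)
for name, value in result.items():
    print(f"{name}: ${value:,}")
```

At the minimum volume of 10,000 leads, this gives a ceiling of $30 margin per lead; any discount on the $40 price comes straight out of that margin.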
Also, thanks to data science, I can sell leads for a lower price than competitors, much less than $40. For instance, our newsletter open rate went from 8% to 24%, significantly boosting revenue and lowering costs. We also reduced churn to a point where we actually grow, all of this thanks to data science. Among the techniques used: improving user, client and content segmentation; outsourcing and efficiently recruiting abroad; automation (we have zero employees); multiple vendor testing in parallel (A/B testing); competitive intelligence; true computational marketing; optimizing delivery rate from an engineering point of view; eliminating inactive members; detecting and killing spammers; and optimizing an extremely diversified mix of newsletter metrics (keywords in subject line, HTML code, content blend, ratio of commercial vs. organic content, keyword variance to avoid burnout, first sentence in each message, levers associated with re-tweets, word-of-mouth and going viral, etc.) to increase total clicks, leads and conversions delivered to clients. Also, we need to predict sales and revenues, another data science exercise.
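As an illustration of the A/B testing mentioned above, here is a minimal sketch of a two-proportion z-test comparing two open rates. The comment does not say which statistical test was actually used, so this is one common choice, not the author's method. The 8% and 24% rates come from the text; the sample size of 1,000 sends per variant and the function name `two_proportion_z_test` are assumptions made for the example.

```python
import math


def two_proportion_z_test(opens_a, sends_a, opens_b, sends_b):
    """Two-sided z-test for the difference between two open rates.

    Uses the pooled-proportion standard error and a normal approximation,
    which is reasonable only when both samples are fairly large.
    """
    rate_a = opens_a / sends_a
    rate_b = opens_b / sends_b
    pooled = (opens_a + opens_b) / (sends_a + sends_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sends_a + 1 / sends_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical sample sizes: 1,000 sends per variant.
# 80 opens = 8% (old newsletter), 240 opens = 24% (new newsletter).
z, p_value = two_proportion_z_test(opens_a=80, sends_a=1000, opens_b=240, sends_b=1000)
print(f"z = {z:.2f}, p-value = {p_value:.3g}")
```

With a gap this wide, even modest sample sizes give a very small p-value; the test matters more when comparing variants whose rates differ by a point or two.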