There are two types of data scientists:

  • Vertical data scientists have very deep knowledge in some narrow field. They might be computer scientists very familiar with the computational complexity of all sorting algorithms. Or statisticians who know everything about eigenvalues, singular value decomposition and its numerical stability, and asymptotic convergence of maximum pseudo-likelihood estimators. Or software engineers with years of experience writing Python code (including graphic libraries) applied to API development and web crawling technology. Or database specialists with strong data modeling, data warehousing, graph databases, Hadoop and NoSQL expertise. Or predictive modelers expert in Bayesian networks, SAS and SVM.
  • Horizontal data scientists are a blend of business analysts, statisticians, computer scientists and domain experts. They combine vision with technical knowledge. They might not be expert in eigenvalues, generalized linear models and other semi-obsolete statistical techniques, but they know about more modern, data-driven techniques applicable to unstructured, streaming, and big data, such as (for example) the very simple and applied Analyticbridge theorem to build confidence intervals. They can design robust, efficient, simple, replicable and scalable code and algorithms.

DJ Patil, a Horizontal Data Scientist

Horizontal data scientists also come with the following features:

  • They have some familiarity with Six Sigma concepts, even if they don't know the term. In essence, for these analytic practitioners, speed is more important than perfection.
  • They have experience in producing success stories out of large, complicated, messy data sets - including measuring that success.
  • They have experience in identifying the real problem to be solved, the data sets (external and internal) they need, the database structures they need, and the metrics they need, rather than being passive consumers of data sets produced or gathered by third parties lacking the skills to collect / create the right data.
  • They know rules of thumb and pitfalls to avoid, more than theoretical concepts. However, they have a bit more than just basic knowledge of computational complexity, good sampling and design of experiments, robust statistics and cross-validation, modern database design and programming languages (R, scripting languages, MapReduce concepts, SQL).
  • They have advanced Excel and visualization skills.
  • They can help produce useful dashboards (the ones that people really use on a daily basis to make decisions) or alternate tools to communicate insights found in data (orally, by email or automatically - and sometimes in real time machine-to-machine mode).
  • They think outside the box. For instance, when they create a recommendation engine, they know that it will be gamed by spammers and competing users, thus they put an efficient mechanism in place to detect fake reviews. 
  • They are innovators who create truly useful stuff. Ironically, this can scare away potential employers, who, despite claims to the contrary and for obvious reasons, prefer the good soldier to the disruptive creator.

In my opinion, vertical data scientists are fake data scientists. They are the by-product of our rigid university system, which trains people to become either a computer scientist, a statistician, an operations researcher or an MBA - but not all four at the same time. This is one of the reasons why we have created our data science program. This is also one of the reasons why recruiters can't find data scientists: they find and recruit mostly vertical data scientists. Companies are not yet used to identifying horizontal data scientists - the true money makers and ROI generators among analytic professionals. The reasons are two-fold:

  • Untrained recruiters quickly notice that horizontal data scientists lack some of the traditional knowledge that a true computer scientist, statistician, or MBA must have - eliminating horizontal data scientists from the pool of applicants. To detect these pure gems, you need a recruiter who is familiar with software engineering, business analysis, statistics and computer science, who can identify qualities not summarized by typical resume keywords, and who can tell which missing skills are critical and which can be overlooked.
  • Horizontal data scientists, faced with the prospect of few job opportunities, and having the real knowledge to generate significant ROI, end up creating their own start-up or working independently, sometimes competing directly against the very companies that are in need of real (supposedly rare) data scientists. After having failed more than once to get a job interview with Microsoft, eBay, Amazon or Google, they never apply again, further reducing the pool of qualified talent.

Hopefully, our data science program will help with this - in particular educating recruiters and hiring managers as well.

Question: Can you name a few horizontal data scientists? Vertical data scientists are a dime a dozen.

Views: 28701


Comment by Dmitriy Kruglyak on March 18, 2013 at 5:39pm

This is a bogus dichotomy.

The only thing that matters is whether or not a "data scientist" is able to meaningfully contribute to the bottom line of their employer.

The so-called "vertical data scientists" usually have very specific, defined and useful skills that can make a difference right away - within certain scope / limits.

The so-called "horizontal data scientists" often fail to go much further than producing brainstorms of doubtful value. To accomplish anything useful they have to come down from their high horse, roll up their sleeves and become domain experts in some specific problem - essentially turning into a "vertical data scientist".

When you know the problem you want to solve, the choice between a mile-wide / inch-deep and an inch-wide / mile-deep practitioner is pretty obvious.

Comment by Vincent Granville on March 18, 2013 at 9:01am

@Stephen: Good point. In my case, as a horizontal data scientist, my salary is zero. But in terms of revenue from running a profitable data science company, I'm far above the upper limits posted in my article on Facebook data scientist salaries (although I admit that these numbers look a bit low). In addition, I live in a state with no income tax and have optimized my finances quite well: from a tax point of view, by buying / selling real estate at the right time / right place, by optimizing my health, retirement and education expenditures, by optimizing with respect to the stock market, by selling patents, by purchasing assets (car, house) far less expensive than I can afford (often when they are cheap) and making them last longer than most people do, etc. All of this thanks to having a strong analytic mindset that I leverage as much as I can in many aspects of life.

Comment by Stephen B. Richter on March 18, 2013 at 8:32am

Sterling analysis! However, I would like to see salary data of both groups, in the public and private sectors. Does anyone fall into the top 1% as a Data Scientist? The average annual income of the top 1 percent of the population is $717,000. Source: http://www.forbes.com/sites/moneywisewomen/2012/03/21/average-ameri...

Comment by Stephen Penn, DM, PMP on March 18, 2013 at 8:19am

With data science so new, many employers want to bucket horizontal data scientists into existing categories. "You have coding skills, but you aren't a coder? You know statistics but aren't a statistician? You know SQL, but you aren't a data architect? Then what are you?"

I really appreciate this article. It gives me terminology to use when talking with potential employers.

Also, the last sentence in the last bullet in the list of features describing horizontal data scientists is truly insightful. In many of the articles on analytics I read today, the message is "Hire a data scientist and magical things will happen." Once the hiring decision is made, however, the work for the hiring manager is just beginning. The disruption you wrote about can be very scary to an existing organization with an existing culture and hierarchy. Both the data scientist and the hiring manager have a lot of work to do in fitting this positive disruption into the company's existing strategy.

Your data science program will go a long way toward solving this problem. I also believe research in trying to integrate existing organizational strategy and data science is very important. Some work has been done here, but we, as an industry, have a way to go.

Comment by Patrick K Stroh on March 18, 2013 at 7:38am

Interesting analysis. I would add one more attribute ... curiosity. I think it serves as the foundation for / intertwines with many of the other items you mention. Nice post.

Comment by Sayara Beg on March 18, 2013 at 6:01am

This is great. I can finally explain who I am. I am a horizontal data scientist!

Comment by Tim Negris on March 18, 2013 at 5:44am

This is a perceptive analysis of an important distinction and I agree with your opinion and conclusions.  But, if you want recruiters and business people to embrace these ideas as well, then I strongly suggest that you find a different scheme than "vertical/horizontal" to characterize these groups.

If you say, "vertical data scientist" to someone in HR or, say, Marketing, they will take that term to mean almost the opposite of how you mean it.  To them, a vertical data scientist would be one who understands the application of analytics to the problems of a particular industry or business function, someone who can help a telecom marketer predict customer churn, for instance - a "horizontal" in your scheme.

And, if you say, "horizontal data scientist" to those same people, they will probably think of someone who does not have all that specific business understanding but rather just knows about "all that analytics stuff", somebody who knows all the algorithms, but not about their particular business applications - your "vertical" data scientists.

Many of the recruiters who do not specialize in placing data scientists, and many of their business sponsors - the "buyers" - find the recruiting process intimidating or confusing because they don't really understand what data scientists do, from either a theoretical or applied standpoint. Statistics PhDs might as well be Knights Templar to them.

Those people could be more easily educated about the different kinds of data scientists if they first had a basic, high-level understanding of things like machine learning, regression, classification, and how what we mean by "prediction" has nothing to do with Nostradamus or market timing.

Throw open the temple doors, reveal the rites, and translate the incantations for them.  Only then can they bring you the new First Degree or Master members that you seek.

Comment by Bob Angell on March 18, 2013 at 5:23am

Great post. With the breadth of technology today, there are many who fit the horizontal definition, and I even see hybrids of vertical and horizontal in my world (Informatics).

Comment by Joshua Burkhow on March 17, 2013 at 8:31pm

I think this is the first decent explanation I have heard that easily separates the "verticals" and the "horizontals". The data science community has often referred to Drew Conway's Venn Diagram as the definition of the modern-day Data Scientist but hasn't gone much further in getting people to understand, more simply, the breadth vs depth argument. Well done sir, my hat's off to you.

Comment by Philip Best on March 17, 2013 at 3:40pm

So glad to see a practitioner, especially one with a PhD, embrace the benefits of breadth with a bit of depth.
