Unicorn Data Scientist Shares his Secrets with You

The case study presented here - including root cause analysis and solution - was performed for a digital publisher. It offers a different perspective on what data scientists are capable of. The expert involved here is not a coder, certainly not a production guy, yet is able to leverage his business acumen and domain expertise to:

  • Imagine dozens of scenarios and rank them by chance of occurring
  • Get siloed data from various departments (finance, sales, marketing, product, IT)
  • Analyze the data in connection with the scenarios (including checking data validity)
  • Get external data (competitive intelligence) as needed
  • Find the causes (not just correlations)
  • Find the remedies
  • Detect issues well before anyone else can see them, by looking in summary data
  • Complete the analysis with a 48-hour turnaround

Such a data scientist, who can save a company billions, is usually not hired, for the following reasons:

  • Companies are looking for coders, not business solvers, when they hire a data guru, despite claiming the contrary
  • A data scientist without Python on his resume is unlikely to ever get hired
  • Hard work gets rewarded, smart work does not.

This is why companies erroneously think that data scientists are unicorns: they won't even interview such a professional when a position becomes vacant. Yet this guy does not consider himself a unicorn. And it's not the story of a smart data scientist, highly capable, misunderstood and poor, unable to find a job. It's the opposite: a data scientist (granted, not on any payroll, not considering himself a worker) who will compete aggressively against whoever (small or big) is in his path, and win time and again, including financially.

This story also illustrates that data science is not necessarily about big data, big statistical models, or big Python code. It might indeed involve none of them. Anyway, this "unicorn" shares his secret with you in this article. Bookmark it, it might become very handy one day! And you won't need to find that elusive unicorn: just follow his recipe below.

Unicorn Data Scientist Finds Root Causes of Sales Drop, Fixes it

Helping a digital publisher, this data scientist (let's call him the unicorn), among other things, monitors the company finances using a dashboard - Freshbooks - that advertises itself as "bookkeeping in the cloud". And it's true that Freshbooks is fantastic, making many high-level KPIs easily accessible. One set of metrics is in the "invoicing tab", where you can see all invoices submitted, sorted by date. Gaps between consecutive invoices range from 1 to 7 days. So by looking at the distribution of these gaps (no need to develop a stats model), it is obvious after a 30-second visual inspection that something happened if the most recent invoice is two weeks old.
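The visual check described above can also be written down in a few lines. What follows is only a minimal sketch, not the unicorn's actual procedure (he did it by eye in the dashboard). The function names, the `slack` parameter, and the invoice dates are all made up for illustration; the dates are assumed to have been typed in or exported by hand.

```python
from datetime import date

def invoice_gaps(invoice_dates):
    """Return the gaps, in days, between consecutive invoice dates."""
    ordered = sorted(invoice_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]

def invoicing_looks_stalled(invoice_dates, today, slack=2.0):
    """Flag when the time since the last invoice exceeds `slack` times
    the largest gap seen so far. `slack` is an arbitrary, hypothetical choice."""
    gaps = invoice_gaps(invoice_dates)
    if not gaps:
        return False  # not enough history to judge
    days_since_last = (today - max(invoice_dates)).days
    return days_since_last > slack * max(gaps)

# Made-up invoice dates, for illustration only
dates = [
    date(2014, 9, 1), date(2014, 9, 3), date(2014, 9, 8),
    date(2014, 9, 10), date(2014, 9, 16), date(2014, 9, 19),
]

print(invoice_gaps(dates))
print(invoicing_looks_stalled(dates, today=date(2014, 10, 3)))
```

With these invented dates, the last invoice is 14 days old while the largest historical gap is 6 days, so the check raises a flag. The point is the same as in the text: no statistical model is needed, only a comparison against the usual spacing.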

This was the starting point for this investigation: the discovery of an issue with invoicing. Note that the discovery, and the choice of the data tracking platform (Freshbooks), was the job of the unicorn in this company. In most other companies, it takes much more time to react, as data scientists look at sales, rarely at invoicing. In short, detection took place one month earlier than in most companies. Interestingly, the alarm was raised at a time when revenue was higher than ever before - but there's a 40-day lag between invoicing and revenue.
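To see why invoicing is a leading indicator, here is a small sketch that shifts each invoice forward by the 40-day lag mentioned above and totals the result by month. It assumes, for simplicity, that every invoice turns into revenue exactly 40 days later; the function name and the amounts are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

LAG_DAYS = 40  # lag between invoicing and revenue, as stated in the article

def projected_revenue_by_month(invoices, lag_days=LAG_DAYS):
    """Total invoice amounts by the (year, month) in which they are
    expected to show up as revenue, assuming a fixed lag."""
    totals = defaultdict(float)
    for invoice_date, amount in invoices:
        paid_on = invoice_date + timedelta(days=lag_days)
        totals[(paid_on.year, paid_on.month)] += amount
    return dict(totals)

# Made-up invoices: (invoice date, amount)
invoices = [
    (date(2014, 9, 5), 1000.0),
    (date(2014, 9, 25), 2000.0),
    (date(2014, 10, 10), 500.0),
]

print(projected_revenue_by_month(invoices))
```

In this toy example, the thin invoicing in October only shows up as weak revenue more than a month later, which is why revenue can look excellent on the very day the invoicing gap appears.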

The unicorn then imagined scenarios, and the kind of data needed to rule out most of them and detect the real culprits:

  • Is one of the sales reps invoicing less, or is it generalized across sales reps? Is one of the sales reps on vacation?
  • Are specific clients responsible for the drop, or is it generalized across clients? Impacting old or new clients? No pattern found. Also there was no client loss or cancellations.
  • Did some clients move from monthly to quarterly invoicing? The answer was yes, in this case.
  • Were the previous month's numbers unusually high? Yes. It also depends on when the bookkeeper issues the invoices, and she switched from twice a week to less than once a week - creating a bigger gap than usual.
  • Has time-to-close (a deal) increased? No indication of this, though targeting new clients always increases the metric in question.
  • Could a price increase be the cause? Yes, for a few small clients.
  • Did competitors decrease their prices? Are they offering more competitive products? Gaining market share? Or is the industry experiencing a sudden downturn? No evidence of this. But this happened just before the elections - and this could have played a role, with clients waiting until after the elections.
  • Do the products no longer work as they used to? No, traffic is at an all-time high, and traffic quality is high. But some advertisers keep repeating the same announcement over and over. Need to educate advertisers. Data to counter this hypothesis comes from newsletter total clicks by client (vendor data), Google Analytics (after filtering out noise), and client data (leads generated).
  • Has the web traffic decreased, or is it not as relevant as before? More from outside the US? Different demographics? No. And anyway, it would have a progressive effect, not a sudden effect.
  • Is it a general market trend? No, but in the process, the unicorn discovered new markets to target. He also looked at the kind of ads displayed on competitor websites, as well as traffic growth from competitors, making sure to keep a top position. 
  • Is there a reputation issue? No. Check tweets and other postings about the client. And the effect would be progressive, not sudden - unless the reputation issue is huge and sudden.
  • Was advertising budget reduced, with negative consequences? No.
  • Is there inventory left to sell? It turned out that this was the main factor. All ad inventory was sold in September/October for the Christmas season.
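Several of the questions above ("is it one sales rep or all of them?", "specific clients or all clients?") come down to the same summary: the change in invoiced amount between two periods, broken down by some dimension. A rough sketch follows, assuming the invoices were exported as simple records. The field names (`period`, `rep`, `client`, `amount`) and the numbers are hypothetical, not taken from the case study.

```python
from collections import defaultdict

def totals_by(records, key, period):
    """Sum invoiced amounts per value of `key` (e.g. rep or client) for one period."""
    totals = defaultdict(float)
    for record in records:
        if record["period"] == period:
            totals[record[key]] += record["amount"]
    return totals

def change_by(records, key, before, after):
    """Change in invoiced amount per group between two periods."""
    prev = totals_by(records, key, before)
    curr = totals_by(records, key, after)
    groups = sorted(set(prev) | set(curr))
    return {g: curr.get(g, 0.0) - prev.get(g, 0.0) for g in groups}

# Made-up records, for illustration only
records = [
    {"period": "Oct", "rep": "A", "client": "X", "amount": 100.0},
    {"period": "Oct", "rep": "B", "client": "Y", "amount": 200.0},
    {"period": "Nov", "rep": "A", "client": "X", "amount": 90.0},
    {"period": "Nov", "rep": "B", "client": "Y", "amount": 80.0},
]

print(change_by(records, "rep", "Oct", "Nov"))
print(change_by(records, "client", "Oct", "Nov"))
```

If the drop is concentrated in one group, that scenario moves up the list; if it is spread evenly, as the unicorn found for clients, the scenario can be ruled out and attention shifts elsewhere. A breakdown like this only narrows down where to look; establishing the cause still takes the conversations with each team described in the text.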


Since lack of inventory was the issue, growing traffic by adding a relevant channel, and offering new advertising products, was the proposed solution. Also, blocking some inventory in advance for potential large clients won't solve the invoicing issue, but can boost total revenue in the long term. Again, the unicorn came up with the solutions.

Note that this investigation required collaborating with many teams: sales, inventory management, finance, marketing, competitive intelligence. As always, multiple factors were involved, with two dominating factors (inventory sold out, and the bookkeeper working with the client less frequently because she has acquired more clients).

Note that this is a causal analysis, not just a search for correlations. It is easier to perform quickly if your company has fewer than (say) 400 employees.

Comment by Ihe Onwuka on January 24, 2015 at 1:53pm

It is really quite silly to insist on Python on a data science resume.

Firstly the advocates of the language claim it is easy to learn.

Secondly there are certain programming languages knowledge of which tends to delineate superior programming ability (e.g. Prolog, Erlang, Scheme/Lisp, Haskell, Scala, Clojure, OCaml). Python is not one of them.

Thirdly it's not a DSL and therefore, unlike say OWL or SPARQL, does not delineate any specialist domain knowledge.

I agree with the general premise that there are data related problems within the realm of what should be "data science" (e.g. ontological as well as business) that fall outside of the ambit of the usual skills that seem to be axiomatically stamped on data science job specs (R/Python/Hadoop).

Comment by Richard Ordowich on November 17, 2014 at 10:54am

I fail to see why this role is referred to as a data scientist. Couldn't an "accounting scientist" or "order processing scientist" or "sales scientist" have discovered the root cause?

A "unicorn" is a systemic critical thinker and problem solver. They can appear from anywhere.

If solving a business problem makes you a data scientist then I claim I was the first ever data scientist. I've been solving business problems for over 35 years. When data was stored on drums and there was no such thing as a personal computer or spreadsheet but there was real business intelligence in people's heads.

Comment by Rebecca Barber, PhD on November 17, 2014 at 10:29am

I feel like this point is far too often forgotten. Data Scientist is rapidly becoming a shorthand for a certain set of technical skills, when what it started out meaning and needs to continue to mean is a person with a strong combination of both technical AND business knowledge.

I am another one who would get passed over in a search due to having no Python. Yet I am extremely effective at my job and have proven time and again that I can bring my statistical, data mining and predictive modeling skills to bear on a problem, combine them with exceptional problem solving and business knowledge, and have a real impact on the bottom line.

As a profession we need to be on guard against this shifting of perception of data scientists toward a code jockey with a specific set of technical skills and some stats knowledge.  That isn't enough to get the job done. It is our communication and business skills that bring our worth above that of a programmer, and we need to hold on to that part of our identity.  

Comment by Marco Lunardi on November 17, 2014 at 7:28am

Great post, I utterly agree!
