The case study presented here - including root cause analysis and solution - was performed for a digital publisher. It offers a different perspective on what data scientists are capable of. The expert involved here is not a coder, certainly not a production guy, yet is able to leverage his business acumen and domain expertise to
Such a data scientist, who can save a company billions, is usually not hired, for the following reasons
This is why companies erroneously think that data scientists are unicorns: they won't even interview such a professional when a position becomes vacant. Yet this guy does not consider himself a unicorn. And it's not the story of a smart data scientist, highly capable, misunderstood and poor, unable to find a job. It's the opposite: a data scientist (granted, not on any payroll, not considering himself a worker) who will compete aggressively against whoever (small or big) is in his path, and win time and again, including financially.
This story also illustrates that data science is not necessarily about big data, big statistical models, or big Python code. It might indeed involve none of them. Anyway, this "unicorn" shares his secret with you in this article. Bookmark it, it might become very handy one day! And you won't need to find that elusive unicorn: just follow his recipe below.
Unicorn Data Scientist Finds Root Causes of Sales Drop, Fixes it
Helping a digital publisher, this data scientist (let's call him the unicorn), among other things, monitors the company finances, using a dashboard - Freshbooks - that advertises itself as "bookkeeping in the cloud". And it's true that Freshbooks is fantastic, making many high-level KPIs easily accessible. One set of metrics is in the "invoicing tab", where you can see all invoices submitted, sorted by date. Gaps between consecutive invoices range from 1 to 7 days. So by looking at the distribution of these gaps (no need to develop a stats model), it is obvious after a 30-second visual inspection that something happened if the most recent invoice is two weeks old.
This was the starting point for this investigation: the discovery of an issue with invoicing. Note that both the discovery and the choice of the data tracking platform (Freshbooks) were the job of the unicorn in this company. In all other companies, it takes much more time to react, as data scientists look at sales, rarely at invoicing. In short, detection took place one month earlier than in most companies. Interestingly, the alarm was raised at a time when revenue was higher than ever before - but there's a 40-day lag between invoicing and revenue.
The unicorn then imagined scenarios, and the kind of data needed to rule out most of them, and detect the real culprits:
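For readers who would rather automate the visual check than eyeball it, the gap inspection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the invoice dates are made up, the function names `invoice_gaps` and `invoicing_alert` are hypothetical, and the rule "alert when the time since the last invoice exceeds the largest gap seen so far" is one simple way to express the idea, not the method actually used in the case study. It does not use any Freshbooks API; it assumes the invoice dates have already been exported.

```python
from datetime import date


def invoice_gaps(invoice_dates):
    """Return the gaps, in days, between consecutive invoices."""
    ordered = sorted(invoice_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def invoicing_alert(invoice_dates, today, slack_days=0):
    """Return True when the time since the most recent invoice exceeds the
    largest historical gap (plus an optional slack, in days)."""
    gaps = invoice_gaps(invoice_dates)
    if not gaps:
        return False  # fewer than two invoices: no history to compare against
    days_since_last = (today - max(invoice_dates)).days
    return days_since_last > max(gaps) + slack_days


# Hypothetical invoice dates, for illustration only.
history = [
    date(2015, 3, 2),
    date(2015, 3, 5),
    date(2015, 3, 6),
    date(2015, 3, 12),
    date(2015, 3, 16),
]

print(invoice_gaps(history))
print(invoicing_alert(history, today=date(2015, 3, 30)))
```

With this made-up history, a most recent invoice that is two weeks old triggers the alert, while one that is three days old does not. The `slack_days` parameter is there only to make the rule less jumpy when gaps are naturally irregular.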
Solution
Since lack of inventory was the issue, the proposed solution was to grow traffic by adding a relevant channel and to offer new advertising products. Also, blocking some inventory in advance for potential large clients won't solve the invoicing issue, but can boost total revenue in the long term. Again, the unicorn came up with the solutions.
Note that this investigation required collaborating with many teams: sales, inventory management, finance, marketing, competitive intelligence. As always, multiple factors were involved, with two dominating factors (inventory sold out, bookkeeper working with client less frequently because she has acquired more clients).
Note that this is a causal analysis, not just a search for correlations. It is easier to perform quickly if your company has fewer than (say) 400 employees.
Comment
It is really quite silly to insist on Python on a data science resume.
Firstly the advocates of the language claim it is easy to learn.
Secondly, there are certain programming languages, knowledge of which tends to delineate superior programming ability (e.g. Prolog, Erlang, Scheme/Lisp, Haskell, Scala, Clojure, OCaml). Python is not one of them.
Thirdly, it's not a DSL and therefore, unlike say OWL or SPARQL, does not delineate any specialist domain knowledge.
I agree with the general premise that there are data-related problems within the realm of what should be "data science" (e.g. ontological as well as business) that fall outside the ambit of the usual skills that seem to be axiomatically stamped on data science job specs (R/Python/Hadoop).
I fail to see why this role is referred to as a data scientist. Couldn't an "accounting scientist" or "order processing scientist" or "sales scientist" have discovered the root cause?
A "unicorn" is a systemic critical thinker and problem solver. They can appear from anywhere.
If solving a business problem makes you a data scientist then I claim I was the first ever data scientist. I've been solving business problems for over 35 years, since the days when data was stored on drums and there was no such thing as a personal computer or spreadsheet, but there was real business intelligence in people's heads.
I feel like this point is far too often forgotten. Data Scientist is rapidly becoming a shorthand for a certain set of technical skills, when what it started out meaning and needs to continue to mean is a person with a strong combination of both technical AND business knowledge.
I am another one who would get passed over in a search due to having no Python. Yet I am extremely effective at my job and have proven time and again that I can bring my statistical, data mining and predictive modeling skills to bear on a problem, combine them with exceptional problem solving and business knowledge, and have a real impact on the bottom line.
As a profession we need to be on guard against this shifting of perception of data scientists toward a code jockey with a specific set of technical skills and some stats knowledge. That isn't enough to get the job done. It is our communication and business skills that bring our worth above that of a programmer, and we need to hold on to that part of our identity.
Great post, I utterly agree!
© 2021 TechTarget, Inc.