Why Data Scientists Create Poor Data Products (And 5 Things That Can Be Done to Change This)

As the consumer and industrial worlds become massively digitized, data products are being baked into critical processes at a very high rate. These data products distill signals from a massive torrent of human-generated and machine-generated data to drive front-line action. At this point we want to distinguish between two types of data products that we have seen in the marketplace:

  1. Consumer Data Products: data products created to harness intelligence from human-generated data, such as sentiment analyzers, recommender engines, social graph analyzers and digital purchase intent detectors.
  2. Industrial Data Products: data products created to harness intelligence from machine- and sensor-generated data in the Industrial IoT world, such as asset recommenders, mean time between failure calculators, etc.

In the consumer world, data scientists have done an amazing job of curating game-changing data products, primarily because they were able to relate to the consumer context, be it decoding digital intent from a sales funnel or suggesting the next best action to a digitally engaged user.

In the industrial world, we have seen that it is relatively difficult for pure-play data scientists to relate to the machine world. As a result, a lot of data products created with the best of intentions have failed to make it to the operational side, primarily because of the dissonance in mental models between an industrial engineer and a data scientist. So what can one do to increase the chances of industrial data products being adopted in the engineering world? Based on Flutura's experience in curating Industrial Data Products, here are our 5 mantras.

Learning-1 : Be Engineering Backward, Instead of Data Forward

Data scientists tend to get seduced by the algorithms and by platforms that process billions of events. In the process they lose sight of the problem to solve. For example, consider making an electrical or mechanical engineer the product manager. He or she would stay focused on the engineering problem to solve. It's easier for an engineer to learn data science than for a data scientist to learn engineering nuances.

Learning-2 : "Walk a mile in the Engineer's shoes"

Learning-3 : Industrial Engineer's quality threshold > Data Scientist's quality threshold

Learning-4 : Analysis is not a job to be done

Learning-5 : Aim for both heart and mind



Tags: IOT, analytics, big, data, science



Comment by Sione Palu on November 13, 2014 at 9:04am

First, I see no difference. An engineer dabbling in data analytics and a data scientist doing the same thing are indistinguishable, except perhaps for their different job titles.

Second, sometimes one cannot solve a problem without focusing on algorithms or trying different algorithms. Most of the time, the problem is known or right in front of us. How to solve the problem is the BIG question, and this is where algorithms come in.

Speaking from my own experience: time-series forecasting is a difficult problem, especially when forecasting over a longer horizon. The problem is how to make a longer-horizon forecast (such as 6 months ahead) more reasonable or accurate, because shorter-horizon forecasts can already be reasonable. This is where algorithms come in, because if one doesn't try a few, then he or she is dead in the water. We've tried AR (different variants), ANN (different types), SVR, ANFIS, Kalman filter, Grey system, particle filter, Monte Carlo (probability tree or path-dependent Brownian motion) and a few others. We prefer using trinomial probability tree path-dependent Monte Carlo, but we're still looking to see whether something is much better. Recently we've been exploring agent-based model simulation and statistical mechanics/physics game theory simulation (Ising model, minority games, SOC - self-organized criticality), among others.

My superior often says we should focus on the problem. I said that's true, but what if we have no solution that improves on what we have, or should we just stick with the existing solutions even though they are still not good enough? I asked why we spend time exploring other forecasting algorithms when our problem can be handled with ANN or ARIMA (they've been around for decades). He said we need to improve the accuracy. I said we've tried many; some are slightly better than others, but none of them is outstanding, since our time series are highly non-stationary and chaotic. He said, OK, keep searching. This is where focusing on algorithms is important.

Comment by Kumaran Ponnambalam on November 9, 2014 at 7:20pm

Well said!! It is important to focus on the problem and the solution and not on the algorithms.
