
Summary:  Is predictive modeling dead?  Have we wasted our time practicing this?  I don’t think so, but here are some interesting thoughts to the contrary.

I just read a thought-provoking article by Smita Adhikary of Big Data Analytic Hires under the title "Is Logistic Regression Dead?".  I might well have passed over this as a narrow observation about a single ML technique, except that on taking another look, what she’s really proposing is that predictive modeling as a whole is dead.  In the article she proposes:

Wherever we deal with a customer through digital media, the focus of predictive analytics has shifted from prediction to classification.

"All of a sudden the whole world of analytics is now talking about Support Vector Machines, Random Forests, Bagged Regressions et al. – everything is about classification; everything is about adaptively learning and self-evolving algorithms that augment the understanding of the customer with every successive digital footprint."

Now there is a clear line of demarcation between 'predictive modelers' and 'data scientists'.

"The former have been kind of relegated to traditional banking, insurance, telecom companies where static scores and optimization based solutions are still pursued. The latter define the sought after (just like we were eons ago) whiz-kids ruling the roost at the cool tech companies, and presumably changing the world."  (Ouch!)

This paradigm shift has drastically changed the ‘skills’ requirement in job descriptions:

"When screening candidates employers are now specifically looking for ‘Python, R and machine learning’, as against ‘SAS, regression, optimization’ in the days of yore."

Ms. Adhikary might as well have said that predictive modeling is a dead end not worth studying.  But her credentials are good.  She started her career as a quant and is now a Managing Consultant with Big Data Analytic Hires, which focuses on sourcing for data science.  She's right there in the hiring flow, so her point of view should have some validity.

Personally, I still see the world divided between the Big-Web-User community where interacting with the client is always digital, and the Core Data Science community where digital interaction may occur but is secondary to in-store or in-person sales events.  If you are recruiting just for Big-Web-Users then your point of view may naturally become skewed.

If you’re living entirely in the digital world with your customers then I think she’s right.

“Cut to the digital era, the customers and the (ecommerce) merchant have now started interacting in a dynamic setting where nothing is frozen anymore. … The customers are free to navigate the merchant website any which way they fancy. … the customer’s site navigation on the day of the purchase is mere execution of a decision that has been made even before the customer lands on the site…the pages visited on the day of the purchase are often not causal to the purchase, just simply correlated.”

Kind of makes you feel sorry for the DS practitioners who focus on UX and web interaction.  In the completely dynamic interactions of the digital world, if correlation is all you’ve got, then classification may be the best you can do.

But that doesn’t mean that half your education in the predictive arts has just become irrelevant.  We all need to keep up on the latest ML techniques, and if you're a practitioner, chances are you are elbow-deep in SVMs, random forests, and the other latest classification tools.  But here's an interesting fact from Forrester: despite the attention-grabbing growth of ecommerce, it still represents only about 9% of total US retail sales.

So here’s my take on this.  No, predictive modeling isn’t dead, and yes, there’s more to data science than correlation.  I will grant you that data scientists with the deep web skills to tease out these correlations with Big-Web-User customers are rare.  Well, all data scientists are rare, but these are perhaps rarer, and therefore if you’re looking for one you’re more likely to use a recruiting firm like Ms. Adhikary’s.  But if you’re focused on the roughly 90% of the market not (yet) taking place online, then you need a fully balanced set of skills.

 

August 24, 2015

Bill Vorhies, President & Chief Data Scientist – Data-Magnum - © 2015, all rights reserved.

 

About the author:  Bill Vorhies is President & Chief Data Scientist at Data-Magnum and has practiced as a data scientist and commercial predictive modeler since 2001.  Bill is also Editorial Director for Data Science Central.  He can be reached at:

[email protected] or [email protected]

 


Comment by Marta Seoane on August 31, 2015 at 1:22pm

There is no clear agreement about what constitutes data science; in some circles predictive modeling is seen as an aspect of data science. In this article, predictive modeling is seen as a separate field. Predictive modeling deals with some aspects of reality, while data scientists deal with others. For example, the health care industry deals with classification through its websites and individual medical records, and with predictive analytics through patients' medical records and massive data from surveys. Why does it have to be an either-or proposition? In reality I see classification as a preliminary step to prediction.

Great posting and discussion. Thank you.

Marta

Comment by Sione Palu on August 28, 2015 at 4:52pm

Regression comes under machine learning. You can't make contradictory statements, saying "in the past regression analysis was prevailed" and then following with "but after everyone is exploring other machine learning techniques". Regression analysis is one of those techniques, and it yields good results.  Logistic regression evolves as well.  Check out a recent variant, "Variational Bayesian Logistic Regression", which dramatically improves performance.

Comment by Rupesh Kumar Perugu on August 28, 2015 at 5:42am

I believe what she meant there is that there are newer, evolving techniques that are better than "logistic regression".

Specifically in the digital marketing domain, "regression analysis" was in the past the most prevalent type of modeling, but now everyone is exploring other machine learning techniques, which are yielding good results.

Comment by Sione Palu on August 27, 2015 at 4:31pm

Ms. Adhikary is completely wrong. Way off the mark.

I'm not gonna touch on the other misleading & uninformed points she made, but I just went on Google Scholar to search whether "Logistic Regression" is dead. Well, it's alive, judging from recent publications this year on the topic.

"Logistic Regression"

https://scholar.google.co.nz/scholar?hl=en&as_sdt=0,5&q=%22...
