DSC Webinar Series: Building Data Pipelines that Drive Highly Predictive, Resilient Models

Predictive performance and resilience are equally important requirements for operationalizing models. Users would rather have resilient models with “good” performance than models that show high predictivity in training and testing but underperform when deployed and require frequent tuning. Of the many factors that affect performance and resilience, training data is well known to have an outsize impact on predictivity, and data drift is a major influence on resilience. In today’s Data Science Central webinar, we will discuss two strategies to improve the accuracy and resilience of models:

– Enriching first-party data to create effective training data sets that possess adequate breadth, depth, and scale.

– Using the concept of data stability to reduce the impact of data drift and produce resilient models from scratch.

In discussing these strategies, we will address the following questions:

– Why machine learning models often underperform in production
– How data enrichment can improve model performance
– The concept of data stability, and how it can be used proactively
– Why feature transformation and selection are critical to model resilience

Speaker:
Dr. Anindya Datta, Founder, CEO, and Chairman – Mobilewalla

Moderator:
Rafael Knuth, Contributing Editor – DSC