Much has been written about *customer* churn - predicting who, when, and why *customers* will stop buying, and how (or whether) to intervene. Employee churn is similar - we want to predict who, when, and why employees will terminate. In many ways, it is smarter to focus inward on employees. For one thing, it is far easier for a company to change its operations, or even the behavior of an employee, than to change the behavior of a customer. As will be seen, employee churn can be massively expensive, and incremental improvements will give big results.

The most important difference between employee churn and customer churn is that a business *chooses to hire* someone. Unfortunately, you usually don’t get to choose your customers. There is also more at stake - this person will literally be the face of your company, and collectively, the employees produce everything your company does.

Employee churn has unique dynamics compared to other problems. To jump-start the “business understanding” phase of analytics efforts, we are writing a series of articles to translate employment processes into tractable data mining problems.

A new hire ideally ramps up to full productivity over months, going through on-boarding, training, and certification. In one client engagement, a call center employee had to train for months to pass a Series 7 exam before even being legally allowed on the phone. During all of that time, the employee delivered no value… they were just preparing to start working.

Figure 1

**Figure 1** shows a stylized cost/benefit plot for one employee across three years of tenure. At time zero, costs are very high - an expensive recruitment process, administration, training, and supplies all push costs above the normal flow. In this model, after about a year, the main monthly expense is salary and overhead. In this hypothetical job, an employee takes a year to ramp up to full productivity. Different jobs will have different curves, but this sigmoid curve is common.
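The shape of these curves can be sketched in a few lines. The Python below is a minimal illustration, not the R code behind the figure: the function names, the exponential decay of upfront cost, and every dollar amount are hypothetical values chosen only to give plausible shapes.

```python
import math

# All parameter values are hypothetical, chosen only to give plausible shapes.
def monthly_cost(month, salary=4000.0, upfront=10000.0, decay=0.5):
    """Steady salary/overhead plus a front-loaded hiring cost that fades out."""
    return salary + upfront * math.exp(-decay * month)

def monthly_benefit(month, full_value=6000.0, midpoint=6.0, steepness=0.8):
    """Logistic (sigmoid) ramp from near zero up to full productivity."""
    return full_value / (1.0 + math.exp(-steepness * (month - midpoint)))

months = list(range(37))  # month 0 through the end of year three
costs = [monthly_cost(m) for m in months]
benefits = [monthly_benefit(m) for m in months]

# Print a few points along each curve.
for m in (0, 6, 12, 36):
    print(f"month {m:2d}: cost {costs[m]:9.0f}  benefit {benefits[m]:9.0f}")
```

With these assumed values, cost settles to roughly the salary level and benefit reaches roughly full value by about month 12, which mirrors the one-year ramp described above. A different job would need different parameters.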

To decrease the overall costs due to employee churn, *something* has to budge on these curves:

- Decrease hiring/onboarding costs
- Decrease time to full productivity
- Decrease salary/productivity ratio
- Increase overall productivity (which is at odds with all of the points above)
- Decrease employee turnover prior to the full productivity phase
- Hire to increase the proportion of employees who are likely to “survive” to the full productivity phase

Like quantitative scissors, there are no other options in this model.
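To see how the per-employee levers move the result, here is a rough Python sketch that totals (benefit - cost) over three years and changes one parameter at a time. The curve shapes, the function `three_year_net`, and all numbers are hypothetical assumptions, not figures from any engagement. The two turnover-related levers are not covered here, since they concern how long people stay rather than the shape of one employee's curves.

```python
import math

def three_year_net(salary=4000.0, upfront=10000.0, full_value=6000.0,
                   midpoint=6.0, decay=0.5, steepness=0.8):
    """Sum of monthly (benefit - cost) over months 0..36, hypothetical curves."""
    total = 0.0
    for month in range(37):
        cost = salary + upfront * math.exp(-decay * month)
        benefit = full_value / (1.0 + math.exp(-steepness * (month - midpoint)))
        total += benefit - cost
    return total

baseline = three_year_net()

# Each scenario nudges exactly one lever; the sizes of the nudges are arbitrary.
scenarios = {
    "lower hiring/onboarding cost": three_year_net(upfront=8000.0),
    "faster ramp to full productivity": three_year_net(midpoint=5.0),
    "lower salary/productivity ratio": three_year_net(salary=3800.0),
    "higher overall productivity": three_year_net(full_value=6300.0),
}

for name, value in scenarios.items():
    print(f"{name}: {value - baseline:+.0f} versus baseline")
```

Because each scenario changes a single argument, the printed differences give a crude sensitivity check on which lever matters most under these assumptions.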

Unfortunately, few companies have any idea of what these cost and benefit numbers are for any given role. Many have worked out the lifetime value of a customer to 5 decimal places, but few have ever considered the lifetime value of an employee. And not all roles are “producers” like sales reps or factory workers - for example, what is the monthly corporate contribution of a data scientist? Data Science may be “the sexiest job of the 21st century”, but no one really knows how much we “make it rain.”

At Talent Analytics, we have found it simpler to evaluate employee cost relative to a potential performance level. Simple heuristics can begin to build the curves defined in **Figure 1**. The shocker comes when we subtract (benefit - cost) and take the cumulative sum to find a break-even point.

In this stylized example, the employee starts providing net monthly value (benefit exceeding cost) after 10 months, and **does not break even until after 2.5 years**. By comparison, in our engagements we often see substantial attrition after just 3–6 months.
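The break-even calculation itself is just a running total. The following Python sketch shows the mechanics under assumed, hypothetical cost and benefit curves; it is not the model behind the figures quoted above, so its months will not match the 10-month and 2.5-year numbers.

```python
import math
from itertools import accumulate

# Hypothetical curves: front-loaded cost, sigmoid ramp in benefit.
def monthly_cost(month, salary=4000.0, upfront=10000.0, decay=0.5):
    return salary + upfront * math.exp(-decay * month)

def monthly_benefit(month, full_value=6000.0, midpoint=6.0, steepness=0.8):
    return full_value / (1.0 + math.exp(-steepness * (month - midpoint)))

def first_month(values, condition):
    """Index of the first month whose value satisfies condition, else None."""
    return next((m for m, v in enumerate(values) if condition(v)), None)

net = [monthly_benefit(m) - monthly_cost(m) for m in range(37)]
cumulative = list(accumulate(net))  # running total of (benefit - cost)

first_positive_month = first_month(net, lambda v: v > 0)
break_even_month = first_month(cumulative, lambda v: v >= 0)

print("first month with positive net value:", first_positive_month)
print("cumulative break-even month:", break_even_month)
```

The gap between the two printed months is the point of the exercise: an employee can be "profitable" month to month long before the accumulated early deficit has been paid back.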

Customers provide profit right away, so customer churn analytics is just trying to keep the gravy train rolling. Employee churn analytics is more like trying to get the train to run long enough to provide any value at all.

With the employee value proposition laid out, we can begin to crack this nut and save the business some money. We are looking for signals that will let us score the likelihood that a person will stay in a role through a given time window. By deploying the right predictive model, we can decrease the impact of one or more of the “scissor points” above.
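One way to connect a "likelihood to stay" score to money is to weight each month's net value by the probability that the employee is still there. The Python sketch below assumes a constant monthly attrition probability, which is a deliberate simplification (real attrition is rarely constant), and reuses hypothetical cost and benefit curves; `monthly_net` and `expected_net_value` are illustrative names only.

```python
import math

def monthly_net(month, salary=4000.0, upfront=10000.0, full_value=6000.0):
    """Benefit minus cost for one month, using hypothetical curve shapes."""
    cost = salary + upfront * math.exp(-0.5 * month)
    benefit = full_value / (1.0 + math.exp(-0.8 * (month - 6.0)))
    return benefit - cost

def expected_net_value(monthly_attrition, horizon=37):
    """Expected cumulative net value over the horizon.

    Assumes the chance of still being employed in month m is
    (1 - monthly_attrition) ** m, i.e. a constant monthly hazard.
    """
    return sum(
        (1.0 - monthly_attrition) ** month * monthly_net(month)
        for month in range(horizon)
    )

for rate in (0.0, 0.02, 0.05):
    print(f"monthly attrition {rate:.0%}: "
          f"expected 3-year net value {expected_net_value(rate):,.0f}")
```

In practice the survival probabilities would come from a fitted model rather than a fixed rate, but even this crude version shows how a small change in monthly attrition shifts the expected value of a hire.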

Hint: The most powerful place to solve this problem is before you cut the first paycheck.

There is much more to this subject. In future installments, we will consider:

- Differentiating “good” and “bad” churn
- Variables, time windows, analytical methods and black boxes
- Survival analysis
- Intervention and uplift modeling - what is the employee analogy to “Sleeping Dogs” and “Persuadables” in marketing churn?
- Using cost information to tune models - are false negatives or false positives more expensive?

As an experiment, we are putting the R code for this cost model and its plots on GitHub. It is a public project for all to try, modify, and share at https://github.com/talentanalytics/churn201 . Feel free to “pull request” any improvements to make this even better. We will build up this toy model as an engine for this series. Please engage!

© 2019 Data Science Central ®
