*The Statsbot team estimated LTV 592 times for different clients and business models. *

Customer lifetime value, or LTV, is the amount of money that a customer will spend with your business in their “lifetime,” or at least, in the portion of it that they spend in a relationship with you. It’s an important indicator of how much you can spend on acquiring new customers. **For example, your customer acquisition cost (CAC) is $150, and LTV is $600.** You would be able to increase the budget to get more people and grow your business. The balance between CAC and LTV allows you to check any business for market survival.

Estimating LTV is a predictive metric which depends on future purchases, based on past patterns, and allows you to see how much risk you are exposed to as a business, and how much you can afford to spend to acquire new clients. At an individual level, it also enables you to figure out who your highest-value customers are likely to be, not just now, but also in the future.

In order to understand how to estimate LTV, it is useful to first think about evaluating a customer’s lifetime value at the end of their relationship with us. So, say a customer stays with us for 12 months, and spends $50 per month.

The revenue that they generated for our business over their lifetime is then $50*12 = $600. Simple! We can consider the *basic definition of LTV* as a sum of payments from a specific user.

This same principle applies to the group. If we want to see *the average LTV* for the group we can look at total spend divided by the number of customers. When we’re talking about *estimating LTV* for the group*, or predictive LTV,* we need to take into account how long customers stay with us. To get that, we actually look at it “backwards”: we look at how many customers we lose over time, or the churn rate.

At a group level, the basic formula for estimating LTV is this:

Where ARPU is average monthly recurring revenue per user and the churn rate is the rate at which we are losing customers (so the inverse of retention).

This basic formula can be obtained from assumption:

*Next Month Revenue = (Current Month Revenue) * (1 - Churn Rate)*

*Note: When we’re estimating customer lifetime value for SaaS we can neglect Gross Margin, because costs are minor and don’t affect the accuracy of a result. But when we calculate predictive LTV for ecommerce later in this article, we’ll include COGS in our formula.*

The main limitation of the LTV formula above is that it assumes that churn is linear over time, as in: we are as likely to lose a customer between the first month of membership of our service and the second, as we are to lose them much later on. Going deeper into the nature of predictive LTV, we can say that it’s a sum of a geometric series, and linear churn doesn’t look like a straight line (as is shown in many articles about LTV).

In fact, we know that linear churn is usually not the case.

In a flexible subscription model, we lose many people at the very beginning, when they are “testing out” a service, but once they have been with us for a long time, they are less likely to leave.

Ultimately, it depends on the type of contract that exists between customers and the business: for example, annual renewals, where churn is more linear, will result in LTV that is very close to the formula above.

Services which do not have any contracts may lose a high percentage of their new customers, but then churn may slow down.

We can think of this concept graphically:

If the LTV of the group is the area under the line, we can very clearly see that the rate at which we lose customers will impact our LTV estimates very significantly. So we will need to take this into account when we are making our calculations. For a first estimate of LTV, however, it makes sense to go with the simplest formula. After that, we will add levels of complexity.

In order to make the most basic estimate of LTV, we need to look at our transaction history. Then, we can establish the average revenue per customer as well as the churn rate over the period that we are looking at. For simplicity, I’m going to look at the last year.

You can calculate ARPU in 2 steps:

month_ARPU AS(SELECT visit_month, Avg(revenue) AS ARPU

FROM

(SELECT

Cust_id,

Datediff(MONTH, ‘2010-01-01’, transaction_date) AS visit_month,

Sum(transaction_size) AS revenue

FROM transactions

WHERE transaction_date > Dateadd(‘year’, -1, CURRENT_DATE)

GROUP BY

1,

2)

GROUP BY 1)

The results will look like this:

In the case above, that would give us an average monthly spend of **$987.33.**

Calculating churn rate is a bit more complicated, as we need the percentage of people **not** returning from one month to the next, taking each group of customers according to the month of their first visit, and then checking if they came back or not in the following month.

The problem is always that, in a transactional database, we have customers’ visits on separate lines, rather than all on the same line.

The way to fix that problem is to join the transactional database to itself, so that we can see a customer’s behavior on one single line.

In order to isolate those who churned, we take the visits from month 1, and left join the visits from month 2 on the cust_id. The lines where visits from month 2 have a cust_id that is null are the ones where the customer has not returned.

WITH monthly_visits AS (SELECT DISTINCT Datediff(month, ‘2010-01-01’, transaction_date) AS visit_month,

cust_id

FROM transactions

WHERE

transaction_date > dateadd(‘year’, -1, current_date)),

(SELECT

avg(churn_rate)

FROM

(SELECT

current_month,

Count(CASE

WHEN cust_type='churn' THEN 1

ELSE NULL

END)/count(cust_id) AS churn_rate

FROM

(SELECT

past_month.visit_month + interval ‘1 month’ AS current_month,

past_month.cust_id,

CASE

WHEN this_month.cust_id IS NULL THEN 'churn'

ELSE 'retained'

END AS cust_type

FROM

monthly_visits past_month

LEFT JOIN monthly_visits this_month ON

this_month.cust_id=past_month.cust_id

AND this_month.visit_month=past_month.visit_month + interval ‘1 month’

)data

GROUP BY 1)

)

Say this gives us a result of 0.1, just for simplicity.

It is a simple calculation, then, to estimate LTV: we have monthly ARPU and monthly churn, so we just divide one by the other!

$987.33/0.1 = **$9873.3**

As stated earlier, there are limits to this formula, mostly because it makes a series of assumptions that may not hold in the real world. The main one is that retention and churn rates are *stable both across cohorts and across time*.

Stability across cohorts implies that early adopters of your service act in similar ways to late adopters, while stability across time implies that customers’ likelihood of churning out is the same at the beginning of their relationship with you as it is, for example, 2 years in. Depending on how close to the truth these assumptions are, you may need to revise your LTV estimate downwards.

© 2020 Data Science Central ® Powered by

Badges | Report an Issue | Privacy Policy | Terms of Service

**Upcoming DSC Webinar**

- DataOps: How Bell Canada Powers their Business with Data - July 15

Demand for data outstrips the capacity of IT organizations and data engineering teams to deliver. The enabling technologies exist today and data management practices are moving quickly toward a future of DataOps. DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. Register today.

**Most Popular Content on DSC**

To not miss this type of content in the future, subscribe to our newsletter.

- Book: Statistics -- New Foundations, Toolbox, and Machine Learning Recipes
- Book: Classification and Regression In a Weekend - With Python
- Book: Applied Stochastic Processes
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- How to Automatically Determine the Number of Clusters in your Data
- New Machine Learning Cheat Sheet | Old one
- Confidence Intervals Without Pain - With Resampling
- Advanced Machine Learning with Basic Excel
- New Perspectives on Statistical Distributions and Deep Learning
- Fascinating New Results in the Theory of Randomness
- Fast Combinatorial Feature Selection

**Other popular resources**

- Comprehensive Repository of Data Science and ML Resources
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- 100 Data Science Interview Questions and Answers
- Cheat Sheets | Curated Articles | Search | Jobs | Courses
- Post a Blog | Forum Questions | Books | Salaries | News

**Archives:** 2008-2014 |
2015-2016 |
2017-2019 |
Book 1 |
Book 2 |
More

**Upcoming DSC Webinar**

- DataOps: How Bell Canada Powers their Business with Data - July 15

Demand for data outstrips the capacity of IT organizations and data engineering teams to deliver. The enabling technologies exist today and data management practices are moving quickly toward a future of DataOps. DataOps is an automated, process-oriented methodology, used by analytic and data teams, to improve the quality and reduce the cycle time of data analytics. Register today.

**Most popular articles**

- Free Book and Resources for DSC Members
- New Perspectives on Statistical Distributions and Deep Learning
- Time series, Growth Modeling and Data Science Wizardy
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- How to Automatically Determine the Number of Clusters in your Data
- Fascinating New Results in the Theory of Randomness
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions

## You need to be a member of Data Science Central to add comments!

Join Data Science Central