Direct mail campaigns (and their online equivalents) remain a popular way to promote a company's offer to potential customers. All of us have received letters from retail stores, financial institutions and other companies with special offers that prompt us to act quickly to take advantage of a discount, a bonus or a similarly attractive proposition. In most cases, I discard these letters without opening them, and in rare cases I open them before deciding that they don't apply to me. My behavior is not unique; typical response rates for direct mail campaigns hover around 2% to 3%, which means most people who receive these mailers discard them. For companies, this is obviously not good news. Each mailer costs them money (printing and postage), so it is in their best interest not to send a mailer to someone who is unlikely to respond positively to a campaign.
For example, suppose a company sends out 1,000,000 direct mailers and gets a response rate of 1%, that is, 10,000 people respond positively to the campaign. If the company could obtain those 10,000 responses while sending out fewer than 1,000,000 mailers, it would save time, effort and money. But is there a way to do this? Using data mining, companies can. In today's post, we explore the use of a decision list algorithm to solve this problem and, based on fictional data, we evaluate the benefits that a company can realize by optimizing its direct mail campaigns.
As always, we start with the data, which is as follows:
As can be seen, we have 13,504 records in our data set, with each record reflecting the response (0 = negative; 1 = positive) of a customer to a particular campaign. Additionally, we have data about each customer such as age, income, marital status, etc., and in cases where the response was positive, we have information about the product purchased and the date of purchase.
Upon creating a cross tab report to evaluate the success of this campaign, we observe the following result:
What this table indicates is that we sent out 13,504 direct mailers and obtained 1,952 positive responses, for a response rate of 14.45% (remember, this is fictional data!). What we will now attempt to do is use a decision list algorithm to better predict which prospects are likely to respond positively to a campaign, with the aim of arriving at 1,952 positive responses using fewer direct mailers than the original 13,504. To do this, the algorithm studies the historical behavior of the prospects and tries to identify subgroups or segments that show a higher or lower likelihood of a binary (yes or no) outcome relative to the overall sample.
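The response rate quoted above is simple arithmetic on the two totals from the cross tab. A minimal sketch in Python, using only the figures reported in this post (the function name is just an illustrative choice):

```python
# Totals taken from the cross tab described in the post.
mailers_sent = 13_504
positive_responses = 1_952


def response_rate(responses: int, mailed: int) -> float:
    """Return the share of mailed prospects who responded positively."""
    if mailed <= 0:
        raise ValueError("mailed must be a positive number")
    return responses / mailed


rate = response_rate(positive_responses, mailers_sent)
print(f"Response rate: {rate:.2%}")  # prints "Response rate: 14.45%"
```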
By running the data through the Decision List algorithm, we observe the following results:
At the top of the table, we see a summary of the original data, that is, 13,504 direct mailers sent out for a response rate of 14.45%. Below this line, we see that the Decision List algorithm has broken the data set down into 7 distinct segments or subgroups. These subgroups are created based on demographic, economic and behavioral variables present in the data set. For example, we note that only 21.98% of customers in the 30 to 34 age group are likely to respond positively to the campaign, whereas 85.69% of customers with an income exceeding 55,267 are likely to respond positively.
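To make the idea of a decision list concrete: it is an ordered list of rules, and each customer is assigned to the first rule that matches. The sketch below is not the output of the algorithm used in this post. It borrows the two segment definitions mentioned above (income above 55,267 and age 30 to 34); the rule order, the field names `age` and `income`, and the choice to exclude everyone who matches no rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Rule:
    name: str
    matches: Callable[[Dict], bool]  # predicate over one customer record
    mail: bool                       # True = include this segment in the mailing


# Hypothetical rule order; both segments respond above the 14.45% baseline,
# so both are marked for inclusion here.
RULES: List[Rule] = [
    Rule("income > 55,267", lambda c: c["income"] > 55_267, mail=True),
    Rule("age 30-34", lambda c: 30 <= c["age"] <= 34, mail=True),
]


def decide(customer: Dict, rules: List[Rule] = RULES,
           default: bool = False) -> Tuple[str, bool]:
    """Return (segment name, mail decision) from the first matching rule."""
    for rule in rules:
        if rule.matches(customer):
            return rule.name, rule.mail
    # No rule matched: fall through to the remainder segment (assumed excluded).
    return "remainder", default


# Made-up customers, purely for demonstration.
for customer in [
    {"age": 32, "income": 60_000},
    {"age": 32, "income": 40_000},
    {"age": 50, "income": 30_000},
]:
    print(customer, "->", decide(customer))
```

Because the first match wins, the first customer lands in the income segment even though the age rule would also apply. That ordering is what distinguishes a decision list from an unordered set of rules.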
Based on the segments identified above and assuming that each direct mailer costs $1, we observe the following savings for the company:
The savings of $7,750 are realized because we were able to identify a group of 7,750 prospects that were unlikely to respond positively to the campaign and therefore the company could save costs by not mailing the campaign out to them.
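The savings figure follows directly from the numbers already given. As a quick check, here is the arithmetic in Python, assuming (as the post does) a flat cost of $1 per mailer; the variable names are illustrative:

```python
# Figures from the post; cost per mailer is the post's stated assumption.
original_mailers = 13_504
excluded_prospects = 7_750
cost_per_mailer = 1.00  # dollars

# Mailers that would still be sent after dropping the unlikely responders.
remaining_mailers = original_mailers - excluded_prospects

# Cost avoided by not mailing the excluded group.
savings = excluded_prospects * cost_per_mailer

# Savings as a share of the original campaign cost.
savings_share = savings / (original_mailers * cost_per_mailer)

print(f"Mailers still sent: {remaining_mailers:,}")
print(f"Savings: ${savings:,.2f} ({savings_share:.1%} of the original cost)")
```

In other words, the company would mail 5,754 prospects instead of 13,504, cutting the mailing cost by a little more than half.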
Another way of reviewing the success of this algorithm is to review the gains chart:
From this chart, we can clearly see the value in using a decision list algorithm to segment the overall mailing list and sending out the direct mailer only to those prospects that are most likely to respond positively to a campaign. From the chart above, we see that in order to obtain 80% of the responses of the original campaign, we only need to send out the mailer to approximately 25% of the overall prospect list.
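For readers who want to see how the points on a gains chart are derived, the sketch below ranks prospects from most to least likely to respond and tracks what share of all positive responses has been captured at each mailing depth. The scores and responses are made-up toy values, not the data behind the chart in this post, and `cumulative_gains` is a hypothetical helper name.

```python
from typing import List, Sequence, Tuple


def cumulative_gains(scores: Sequence[float],
                     responses: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (share of list mailed, share of responses captured) points.

    Prospects are ranked by score, highest first, so the curve shows what
    happens when the most promising prospects are mailed first.
    """
    total = sum(responses)
    if total == 0:
        raise ValueError("need at least one positive response")
    ranked = sorted(zip(scores, responses), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    captured = 0
    for i, (_, responded) in enumerate(ranked, start=1):
        captured += responded
        points.append((i / len(ranked), captured / total))
    return points


# Toy data: eight prospects with model scores and actual responses (1 = positive).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
responses = [1, 1, 0, 1, 0, 0, 0, 0]

points = cumulative_gains(scores, responses)
for mailed, gained in points:
    print(f"mailed {mailed:.0%} of list -> captured {gained:.0%} of responses")
```

A curve that rises steeply at the left, well above the diagonal, means the model is concentrating the responders at the top of the ranked list.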
As the analysis above shows, applying data mining to optimize a direct mail campaign can significantly improve results and produce tangible cost savings, as several direct mail companies can readily attest.