Survival Analysis for Online Marketing Channels

Valuing your customers can take many forms. LTV generally does this well, but often you need more depth on the quality of your customers. A strong indicator of quality is how quickly customers purchase again. For instance, customers who make their second purchase sooner are generally much more likely to become high-value (high-LTV) customers. This is where Survival Analysis comes in handy.

What is Survival Analysis?

Survival Analysis deals with the length of time until one or more events happen, such as death in biological organisms. It has not traditionally been used in marketing and has only appeared there very recently. Implementing it can be relatively simple with the right SQL queries and an R package.

The Business Case

You know the LTVs of your segmented customer base, but you also want insight into how you can acquire more of those high-value customers. You also know that the sooner the second purchase is made, the more likely customers are to become great customers. But how soon does that purchase need to be made? Can you influence that second purchase? And which channels, or segments, provide the customers most likely to purchase sooner?

Implementing the Analysis

Take a sample large enough to give statistically significant results for each segment. Ensure that there aren’t any factors that could skew the data. For instance, for one client I knew that they had implemented a new shipping threshold that fundamentally changed customers’ purchasing habits, so I deliberately selected only customers who had been exposed to the new shipping policy.

Then, segment these customers using dummy variables. The segments should be useful indicators, such as the channel a customer was acquired through (e.g. Google CPC, Facebook), but they could also record whether customers subscribe to your email newsletters, purchase particular products, or belong to a club. Your data should look something like this, where ‘time’ is in days and ‘event’ is 1 if the second purchase took place. In this case, customers with a time of 50 never reached the event, because the sample only observes each customer for 50 days (in survival-analysis terms, they are censored):

time  event  googlecpc  referrals  affiliate  emailsub
16    1      0          0          0          0
50    0      0          0          1          0
50    0      1          0          0          0
50    0      0          0          0          0
50    0      0          0          0          0
50    0      0          0          1          0
50    0      0          1          0          0
50    0      0          0          0          1
28    1      0          0          0          1
50    0      0          0          0          0
50    0      0          1          0          0
35    1      0          0          0          1
50    0      0          0          1          0
50    0      0          0          0          0
50    0      1          0          0          0
4     1      0          0          1          0
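
If your raw data is a list of purchase dates per customer, rows like these could be derived along the following lines. This is only a sketch: the function name `to_survival_row`, the dictionary layout, and the example dates are hypothetical, and treating a second purchase on exactly day 50 as an event is an assumption you may want to change.

```python
from datetime import date

OBSERVATION_DAYS = 50  # each customer in the sample is followed for 50 days


def to_survival_row(first_purchase, second_purchase, segments):
    """Build one row with 'time', 'event' and the segment dummy variables.

    second_purchase is None when the customer has not purchased again.
    """
    if second_purchase is not None:
        days = (second_purchase - first_purchase).days
        # Assumption: a purchase on the last observed day still counts.
        if days <= OBSERVATION_DAYS:
            return {"time": days, "event": 1, **segments}
    # No second purchase inside the window: the customer is censored.
    return {"time": OBSERVATION_DAYS, "event": 0, **segments}


# Hypothetical customers, for illustration only.
returned = to_survival_row(date(2014, 1, 1), date(2014, 1, 17), {"emailsub": 0})
never_back = to_survival_row(date(2014, 1, 1), None, {"emailsub": 1})
too_late = to_survival_row(date(2014, 1, 1), date(2014, 3, 22), {"emailsub": 0})

for row in (returned, never_back, too_late):
    print(row)
```

A customer who returns after the window closes is recorded the same way as one who never returns, which is why every customer must be observed for the same length of time.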

There are several types of Survival Analysis you can now perform. In this post I’m going to focus on a non-parametric method, the Kaplan-Meier estimator. Its curve is a series of horizontal steps of declining height that approaches the true survival function for the population as the sample grows. Your curves will look something like this:

[Figure: Kaplan-Meier survival curves for googlecpc = 1 and googlecpc = 0]

Here the graph compares Google CPC customers (googlecpc = 1) with all other customers (googlecpc = 0). Generally, there should be a distinct drop-off in the first few days as customers return (make sure not to include refunds!), after which the curve gradually levels out as the probability of a return becomes lower and lower. Also notice that customers acquired through Google CPC are more likely to make that second purchase than all other customers.
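
To see the arithmetic behind curves like these, here is a minimal pure-Python sketch of the Kaplan-Meier estimate, applied to the 16 sample rows from the table and split by the emailsub flag. The analysis in this post relies on an R package; the function name `kaplan_meier` and the tuple layout below are illustrative assumptions, not that package's API.

```python
from collections import Counter

# (time, event, emailsub) for the 16 sample rows in the table above.
ROWS = [
    (16, 1, 0), (50, 0, 0), (50, 0, 0), (50, 0, 0),
    (50, 0, 0), (50, 0, 0), (50, 0, 0), (50, 0, 1),
    (28, 1, 1), (50, 0, 0), (50, 0, 0), (35, 1, 1),
    (50, 0, 0), (50, 0, 0), (50, 0, 0), (4, 1, 0),
]


def kaplan_meier(observations):
    """Kaplan-Meier estimate from (time, event) pairs.

    Returns [(t, S(t))] for each time t at which at least one event occurred.
    """
    events = Counter(t for t, e in observations if e == 1)
    leaving = Counter(t for t, _ in observations)  # events and censorings
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events[t]  # second purchases on day t (0 if none)
        if d:
            # Multiply by the share of at-risk customers who did not purchase.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # purchasers and censored customers drop out
    return curve


everyone = kaplan_meier([(t, e) for t, e, _ in ROWS])
subscribers = kaplan_meier([(t, e) for t, e, s in ROWS if s == 1])
others = kaplan_meier([(t, e) for t, e, s in ROWS if s == 0])

for name, curve in [("all", everyone), ("emailsub=1", subscribers), ("emailsub=0", others)]:
    print(name, [(t, round(s, 3)) for t, s in curve])
```

Survival here means "has not yet made a second purchase", so a lower curve is the better outcome. On this sample the overall estimate ends at 0.75, because 12 of the 16 customers had not repurchased by day 50, and the three email subscribers fall further than the rest. With only 16 rows this is purely illustrative.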

Actionable Insights

Insights depend completely on your data and business. Email subscriptions tend to be a very strong indicator of returning customers, so make sure customers subscribe to a channel through which you communicate your offers. Following this analysis for one client, initiatives to subscribe customers to emails sooner ended up increasing the ‘death’ rate in the survival curves, meaning more customers reached their second purchase faster. A shocker was that the referral program actually had one of the worst results, with its customers among the least likely to return. This prompted a review of that acquisition tool, as well as a greater emphasis on other tools while the client tested newer options.

Kaplan-Meier curves also indicate when you should try to communicate with your customers, before they become unlikely ever to return. In the curves above there is a significant drop after 5 days, and after about 40 days customers are very unlikely to return. These could be great times to contact your customers with an opportunity to revisit your site and purchase again.

Conclusion

Survival Analysis is predictive in nature, but remember that when you make changes based on your findings, these curves will fundamentally change if the tools you used to re-engage your customers were effective. Just like other data you collect, use Survival Analysis as a constant gauge to improve your purchasing funnels and acquisition methods with the levers at your disposal.

Big Data: Maximizing the effectiveness and efficiency of your online marketing dollars.

Thank you, everyone, for joining me at The Big Data World Canada 2013. A special thanks to Terrapinn for putting together such a fantastic event. Below is a copy of my presentation on Big Data strategies for online marketing. An audio track to accompany it will follow shortly, and I will keep you updated.