Machine learning and product analytics: Navigating the hype

Machine learning and artificial intelligence have seen an explosion of real-world applications in the last decade. Applications such as targeting personalized content to users in real time have demonstrated impressive results. Increasingly, ML/AI features are also being deployed in product analytics contexts.

Indeed, Mixpanel was very early in building ML features in our product. However, after a lot of experimentation, we have purposefully deemphasized building new ML features and have even deprecated some we used to have. While these features promise a lot and can show impressive results in a demo, most of them come with serious downsides, like lack of transparency and confusing or even misleading results. In this post, we will run through the most common ML features in analytics products and show how non-ML methods actually offer superior results.

Scenario 1: Predicting user outcomes

In this scenario, you have a product-based outcome you want to influence, like retention or purchases, and your goal is to determine what user behaviors are likely to lead to improvement in the outcome. Predictive models are sometimes presented as a solution here. The idea is that if you can predict user outcomes based on their prior behavior, then the behaviors that are most predictive of positive outcomes are the ones that you should try to drive in order to improve the outcome metric across all users.

For example, let’s say you care about subscriber retention. A predictive model could look at all the things subscribers did in your app in one month and then look to see who is still a subscriber one month later. The behaviors most predictive of still being a subscriber would be the ones to focus on if you want to increase subscriber retention.

Unfortunately, using predictive models in this way misunderstands what predictive models are for. Predictive models like this are designed to provide accurate predictions of the future. User-level predictions of the future are useful for personalizing your product but not for understanding what behaviors drive outcomes. Predictive models essentially isolate correlations over time. Certain behaviors will be correlated with other future behaviors, but they will not necessarily cause those behaviors.
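To make the correlation-versus-causation point concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the hidden "engaged" trait, the probabilities, and the function names are hypothetical and are not drawn from any real product data. In the simulated world, using a feature has no effect on retention at all, yet feature users still retain better because engaged users are more likely to do both. When the behavior is assigned at random instead, the gap disappears.

```python
import random


def simulate(n_users, randomize_behavior, seed=0):
    """Return (used_feature, retained) pairs for simulated users.

    Assumption: retention depends only on a hidden 'engaged' trait,
    never on whether the user touched the feature.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_users):
        engaged = rng.random() < 0.5  # hidden trait the analyst cannot see
        if randomize_behavior:
            # Behavior assigned by coin flip, as in an experiment
            used_feature = rng.random() < 0.5
        else:
            # Engaged users are more likely to use the feature on their own
            used_feature = rng.random() < (0.8 if engaged else 0.2)
        retained = rng.random() < (0.7 if engaged else 0.3)
        rows.append((used_feature, retained))
    return rows


def retention_gap(rows):
    """Retention rate of feature users minus that of non-users."""
    users = [retained for used, retained in rows if used]
    non_users = [retained for used, retained in rows if not used]
    return sum(users) / len(users) - sum(non_users) / len(non_users)


observational_gap = retention_gap(simulate(20000, randomize_behavior=False))
randomized_gap = retention_gap(simulate(20000, randomize_behavior=True))
print(f"Observed gap in retention:   {observational_gap:.3f}")
print(f"Gap under random assignment: {randomized_gap:.3f}")
```

A model trained on the observational data would rank the feature as highly predictive of retention, even though pushing more users to it would change nothing in this simulated world.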

Another problem with using predictive models to find behavioral inputs is that the most predictive behaviors are often not actionable. For example, if you want to determine who will have the most sessions next month, the most predictive behavior will almost certainly be the number of sessions this month since past performance is often the best predictor of the future. But then the model is essentially telling you: “If you want to increase the number of sessions, you should focus on increasing the number of sessions.” This is just circular and not useful for driving the outcome you care about.

So what should you do if you want to isolate the causal drivers of an outcome? The following steps will lead you in the right direction:

Look for correlations: Rather than predicting outcomes, skip the complexity and look for behaviors that are correlated with subsequent outcomes. Mixpanel’s Signal report is designed to help you find these correlations.

Form a hypothesis: Among the correlations you find, some may be actionable and some may not. Zero in on one or two correlations that seem actionable (i.e., you think you could influence the behavior in question). These are your hypotheses: “I hypothesize that if we increase [behavior x], we will improve [outcome y].”

Test your hypotheses: The gold standard for testing product hypotheses is a well-designed A/B test. An alternative is to use a causal analysis like Mixpanel’s Impact report.
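For the testing step, one common way to read the result of a simple A/B test on a conversion-style outcome is a two-proportion z-test. The sketch below is a generic textbook version, not a description of how Mixpanel's Impact report works. The function name and the example counts are hypothetical.

```python
import math


def two_proportion_z_test(conversions_a, users_a, conversions_b, users_b):
    """Two-sided z-test for a difference in conversion rate between two groups.

    Returns (z, p_value). Uses the pooled-variance normal approximation,
    which assumes reasonably large groups.
    """
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: control (A) versus a variant that nudges [behavior x] (B)
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the groups is unlikely to be noise. Because users were assigned to the groups at random, that difference can be attributed to the change you made rather than to a pre-existing difference between the users.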

Scenario 2: Segmenting users

In this scenario, you want to understand the different types of users of your product. The idea is usually that you want to categorize users into a set of personas and then see how outcomes vary from persona to persona to ensure that all your different types of users are well served.

ML features that purport to solve this use case are typically based on cluster analysis. The idea behind cluster analysis is that it identifies distinct groups of users that behave similarly to one another. Clustering features typically require the user to specify a number of clusters to find and then look across all events, or sometimes just the most popular events, to find the requested number of clusters. These features will also typically ask you to provide one or more outcomes so you can see how the clusters differ from one another.

There are several problems with doing cluster analysis in this kind of setup. The first problem is that the clustering algorithm is blind to the product and business context of your app. It is scanning simple event counts, and it is therefore missing out on features from user profiles or any kind of complex features. For example, maybe the ratio of two events is what is important rather than the counts. Maybe geographic region is a key part of the clusters. The resulting clustering is likely to be noisy as a result of missing these.

Another problem is that you need to specify how many clusters are in the data, and the only way to do this is by trial and error until you find something that “looks good”. This leads us to the final problem: There is no objective way for you to assess whether the resulting clusters are good or even represent anything resembling actual clusters. A clustering algorithm will divide your users into the requested number of clusters even if all of your users just fall along a continuum rather than clustering into distinct behaviors.
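The continuum problem is easy to reproduce. The sketch below is a bare-bones one-dimensional k-means written for illustration only. It is not the algorithm behind any particular product, and the data is a hypothetical, perfectly even spread of monthly session counts with no gaps or groups in it. Whatever number of clusters you ask for, you get exactly that many back.

```python
def kmeans_1d(values, k, max_iterations=50):
    """Minimal 1-D k-means (Lloyd's algorithm). Returns a list of k clusters."""
    data = sorted(values)
    # Start the centroids at evenly spaced quantiles of the data
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assign each value to its nearest centroid
        clusters = [[] for _ in range(k)]
        for value in data:
            nearest = min(range(k), key=lambda j: abs(value - centroids[j]))
            clusters[nearest].append(value)
        # Move each centroid to the mean of its cluster
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged
        centroids = updated
    return clusters


# Hypothetical users spread evenly from 0 to 299 sessions: a pure continuum
sessions = list(range(300))
for k in (2, 3, 5):
    clusters = kmeans_1d(sessions, k)
    ranges = [(min(c), max(c)) for c in clusters]
    print(f"k={k}: {ranges}")
```

Each run slices the same continuum into tidy, similarly sized ranges that look like "light", "medium", and "heavy" users. Nothing in the output signals that the boundaries are arbitrary.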

All of these problems add up to a clustering analysis that is more often than not partitioning your users into an arbitrary number of clusters based on an arbitrary collection of events. There is no safeguard to ensure that these clusters represent anything real, and whether they differ on the outcomes you select is up to random chance.

If you want to segment your users into reusable clusters or personas, there are two alternatives:

High-investment path: To have an enduring set of user personas you can trust, you should start with qualitative user research where you interview users to understand the different modes or use cases your users fall into. This qualitative analysis should then be synthesized into a set of behavioral features that distinguish the different groups of users. This set of features can then be used in cluster analysis.

Pragmatic path: For most companies we work with, the methodology above is overkill. Most often what you are looking for is the segments of users that are converting well or not converting well on a particular outcome. For example, what segments of users are most likely to successfully purchase after adding an item to a cart? And what segments of users are least likely to do so? These kinds of questions can be answered most directly in Funnels and Retention. Within Funnels, Mixpanel has a Find Interesting Segments feature that will even look through all the ways to segment your users and surface the ones most and least likely to convert.
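The pragmatic path boils down to comparing conversion rates across segments. The sketch below shows that idea in plain Python. It is a simplified illustration and not the implementation of Find Interesting Segments. The property names, the "purchased" outcome field, and the sample users are all hypothetical.

```python
from collections import defaultdict


def conversion_by_segment(users, property_name, outcome_name="purchased", min_users=1):
    """Rank the values of one user property by conversion rate, highest first.

    Returns a list of (segment, conversion_rate, user_count) tuples.
    Segments with fewer than min_users users are dropped as too noisy.
    """
    totals = defaultdict(lambda: [0, 0])  # segment -> [converted, total]
    for user in users:
        segment = user[property_name]
        totals[segment][0] += 1 if user[outcome_name] else 0
        totals[segment][1] += 1
    rates = [
        (segment, converted / total, total)
        for segment, (converted, total) in totals.items()
        if total >= min_users
    ]
    return sorted(rates, key=lambda row: row[1], reverse=True)


# Hypothetical users who added an item to their cart
cart_users = [
    {"platform": "ios", "purchased": True},
    {"platform": "ios", "purchased": True},
    {"platform": "ios", "purchased": False},
    {"platform": "android", "purchased": True},
    {"platform": "android", "purchased": False},
    {"platform": "web", "purchased": False},
    {"platform": "web", "purchased": False},
]
for segment, rate, count in conversion_by_segment(cart_users, "platform"):
    print(f"{segment}: {rate:.0%} of {count} users purchased")
```

Unlike a cluster label, each row here is directly interpretable: it names a segment you can recognize and shows how that segment performs on the outcome you care about.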

Adam Kinney

Head of Analytics @ Mixpanel