Mixpanel Modeling Layer: data superpowers
Last edited: Mar 1, 2022
It’s easier than ever to collect billions of data points, but sheer volume doesn’t imply insight. Enter data modeling: the science of writing data in a way that efficiently yields insightful answers. Practically speaking, there is no perfect data model. To keep up with rapid growth, the most innovative product companies evolve their data model to answer every new question they ask.
At a small company, it’s easy to modify your tracking for each new analysis you want to run. This fails at scale for a simple reason: the team tracking the data (engineering) diverges from the team that uses it (product, marketing, design). These teams may even span different organizations with different priorities. Reaching across the company and waiting weeks to get data leads to stagnation — not innovation.
Large companies solve this problem by staffing a specialized data team, which sits between writers and readers of the data and builds infrastructure to carefully model raw event data for each analyst’s use case. At Mixpanel, we ❤️ data teams, and we built Data Pipelines to seamlessly integrate Mixpanel into their workflows. However, we realized that having a team in between a PM and their analysis can introduce friction and cost to both parties, particularly for common cases.
So we asked ourselves the following question:
What if we gave every PM in the world the ability to model behavior data perfectly for their analysis?
Introducing the Mixpanel Modeling Layer
We’re thrilled to announce the Mixpanel Modeling Layer, a suite of features that empowers PMs to answer tough product questions in seconds, based on the events and properties they already have in Mixpanel.
In this post, we’ll cover four core features: the Segmentation Engine, Custom Properties, Behavioral Analysis, and Custom Sessions. We’ll look at each feature through the lens of a PM at a media company studying their users’ video engagement. This content is also covered in a short video walkthrough by Mixpanel PM Moinak Bandyopadhyay.
Unlimited, retroactive segmentation
Product analytics starts with segmentation. Basic tools let you define a set of key metrics based on your events (Page Views, DAU, Session Count), but limit how deeply you can segment those metrics. In Mixpanel, any metric can be segmented on any property without needing to define the segment ahead of time. This power allows PMs to start with a top-level metric and slice it down to any user or behavioral segment they care about in real time.
For example, we can drill down on a simple count of Watch Video events to observe the number of videos watched by account type, gender, age and platform. These metrics were not calculated ahead of time, but we can define and visualize them in Mixpanel within seconds based on only the raw events tracked.
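To make the idea concrete, here is a minimal Python sketch of segmenting a count of Watch Video events by an arbitrary property at query time. It is an illustration only, not Mixpanel's implementation: the event dictionaries, the property names (`platform`, `account_type`), and the `segment_count` helper are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical raw events; the property names are illustrative only.
events = [
    {"event": "Watch Video", "user_id": "u1", "platform": "iOS", "account_type": "free"},
    {"event": "Watch Video", "user_id": "u2", "platform": "Android", "account_type": "paid"},
    {"event": "Watch Video", "user_id": "u1", "platform": "iOS", "account_type": "free"},
    {"event": "Page View", "user_id": "u3", "platform": "Web", "account_type": "free"},
    {"event": "Watch Video", "user_id": "u3", "platform": "Web", "account_type": "paid"},
]


def segment_count(events, event_name, prop):
    """Count events named `event_name`, grouped by the value of `prop`."""
    return Counter(
        e.get(prop, "(not set)") for e in events if e["event"] == event_name
    )


# Neither breakdown was defined ahead of time; both come from the raw events.
by_platform = segment_count(events, "Watch Video", "platform")
by_account = segment_count(events, "Watch Video", "account_type")
```

Because the grouping property is just an argument, the same raw events can be sliced by any property without changing what was tracked.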
Calculate new properties on the fly with Custom Properties
Next, let’s say you want to study video watch time according to your internal categorization of videos as “short,” “medium,” and “long.” Unfortunately, your app has tracked video duration in minutes and not by your custom definition. The tracked data lacks an important property for analysis. It’s a common roadblock PMs encounter, and it’s especially frustrating when this property can be inferred from the tracked data, either mathematically or by some logic specific to your business.
Previously, you’d have to go back to the developers or data team to recompute that specific property in the tracking code. This is time-consuming, error-prone, and doesn’t apply to historical data.
With Custom Properties, you can compute new properties on the fly using the data you have. You can then use these new properties to filter, segment, or aggregate any metric, as if they had been tracked all along. Using a familiar Excel-like syntax, our media platform PMs can define a Duration Category property based on watch time (mins), and use their own custom logic to bucket durations into “short,” “medium,” and “long.”
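The bucketing logic behind a property like Duration Category can be sketched in a few lines of Python. This stands in for the formula a PM would write and is not Mixpanel's formula syntax; the `watch_time_mins` key and the cut-offs of 5 and 20 minutes are assumptions chosen for illustration.

```python
def duration_category(watch_time_mins):
    """Bucket a watch time in minutes; the cut-offs (5 and 20) are assumed."""
    if watch_time_mins < 5:
        return "short"
    if watch_time_mins < 20:
        return "medium"
    return "long"


def with_custom_property(event):
    """Return a copy of the event with the derived property attached."""
    return {**event, "Duration Category": duration_category(event["watch_time_mins"])}


# Hypothetical tracked events that only carry the raw duration.
events = [
    {"event": "Watch Video", "watch_time_mins": 3},
    {"event": "Watch Video", "watch_time_mins": 12},
    {"event": "Watch Video", "watch_time_mins": 45},
]

# The property is computed when the data is read, so stored events are untouched
# and the definition applies equally to historical data.
categories = [with_custom_property(e)["Duration Category"] for e in events]
```

Computing the property at read time rather than at tracking time is what lets a changed definition apply retroactively.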
Understand your users more deeply with behavioral analysis
Now let’s go a level deeper to understand the distribution of users by various aspects of their behavior, such as the number of videos watched in the last 7 days.
This is easy to calculate if you’ve modeled data for it, for example by incrementing a counter on each user when they watch a video. But if you have a new behavioral question, like the total number of minutes users spend watching videos, you have to go back and track that data.
In Mixpanel, not only can you do rich analysis on tracked user properties, you can do it on functions of events and event properties too. Back to our video example, you can define both the total number of videos watched per user and the total video watch time, quickly seeing distributions of these behavioral metrics across the entire user base. You can also filter or segment these distributions by any property to deeply understand user behavior across relevant segments. All of this is based only on the raw event data (Watch Video events with durations) that was originally tracked to Mixpanel.
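As a rough illustration, the two per-user metrics above (videos watched and total watch time) and the distribution of users across them can be derived from raw events like this. The sketch assumes hypothetical `user_id` and `watch_time_mins` keys and a `per_user_metrics` helper invented for this example; it does not describe how Mixpanel computes these internally.

```python
from collections import Counter, defaultdict

# Hypothetical raw events; no per-user counter was ever tracked.
events = [
    {"event": "Watch Video", "user_id": "u1", "watch_time_mins": 10},
    {"event": "Watch Video", "user_id": "u1", "watch_time_mins": 25},
    {"event": "Watch Video", "user_id": "u2", "watch_time_mins": 5},
    {"event": "Page View", "user_id": "u2"},
    {"event": "Watch Video", "user_id": "u3", "watch_time_mins": 40},
    {"event": "Watch Video", "user_id": "u3", "watch_time_mins": 2},
]


def per_user_metrics(events, event_name="Watch Video"):
    """Aggregate matching events into per-user counts and total minutes."""
    videos = Counter()
    minutes = defaultdict(float)
    for e in events:
        if e["event"] == event_name:
            videos[e["user_id"]] += 1
            minutes[e["user_id"]] += e["watch_time_mins"]
    return dict(videos), dict(minutes)


videos_watched, watch_minutes = per_user_metrics(events)

# Distribution: number of users at each "videos watched" value.
distribution = Counter(videos_watched.values())
```

Restricting `events` to a time window, such as the last 7 days, before aggregating would give the windowed version of the same metric.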
If you are currently a Mixpanel customer and would like to try out this new feature, please reach out to your account representative or email firstname.lastname@example.org.
Tailor session analysis to your domain with custom sessions
Counting sessions rather than events helps uncover patterns in user behavior. The challenge, however, is that most tools provide a one-size-fits-all session definition that is either time-based (e.g. ending after a 30-minute period of inactivity) or navigation-based (ending when the user closes the app).
This rigidity is insufficient for sophisticated product teams. A gaming app and a banking app shouldn’t be pigeonholed into the same session definition. As a workaround, teams often have to track custom session events, leading to redundancy and maintenance overhead.
Mixpanel offers sessions that are codeless, retroactive, and customizable, meaning you can build your own session definitions after you’ve already started tracking events. You can designate any event to define the start or end of a session or use a timeout-based definition set to any value from 1 minute to 1 day. This lets you experimentally determine the precise definition appropriate to your business, based on the events you’ve been tracking all along.
Mixpanel also attaches metadata to each session based on the events that comprise it, including event count and session duration. This allows PMs to drill down into custom session events with the same flexibility and precision as any other events they’ve tracked.
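A timeout-based session definition with the metadata mentioned above (event count and duration) can be sketched as follows. This covers only the timeout variant, not sessions bounded by designated start or end events, and the `build_sessions` helper and sample timestamps are assumptions for illustration rather than Mixpanel's algorithm.

```python
from datetime import datetime, timedelta


def build_sessions(timestamps, timeout=timedelta(minutes=30)):
    """Group one user's event timestamps into sessions split by inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1]["end"] <= timeout:
            # Still within the inactivity window: extend the current session.
            sessions[-1]["end"] = ts
            sessions[-1]["event_count"] += 1
        else:
            # Gap exceeded the timeout (or first event): start a new session.
            sessions.append({"start": ts, "end": ts, "event_count": 1})
    for s in sessions:
        s["duration"] = s["end"] - s["start"]
    return sessions


# Hypothetical event times for a single user on one day.
day = datetime(2022, 3, 1)
timestamps = [
    day.replace(hour=10, minute=0),
    day.replace(hour=10, minute=10),
    day.replace(hour=10, minute=20),
    day.replace(hour=11, minute=30),
    day.replace(hour=11, minute=35),
]

# The same raw events yield different sessions under different definitions.
default_sessions = build_sessions(timestamps)
longer_sessions = build_sessions(timestamps, timeout=timedelta(minutes=90))
```

Since sessions are derived from timestamps that were already tracked, trying a different timeout is a matter of recomputing, not re-instrumenting.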
The features described above are just a small taste of the value Mixpanel’s Modeling Layer brings to truly self-serve product analytics, and our work doesn’t end here. Over time, this powerful class of features will empower product teams to model their data perfectly for every analysis without blockers. Combined with our best-in-class segmentation engine, our modeling layer gives PMs the power to answer tough product questions in seconds rather than weeks.
Check out this video from Moinak, a Product Manager at Mixpanel, to see the Modeling Layer in action.
New: We have also released the ability to enrich existing data by uploading a table of entities to provide more ways to filter and segment your data. Think: upload a CSV of video_id, genre, language, and creator and automatically join it against all historical events that had video_id. This unlocks analysis on video genres without having ever tracked them in the past.
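Conceptually, this kind of enrichment is a join between events and a lookup table keyed on `video_id`. The sketch below shows the idea in plain Python with an inline CSV; the sample rows, the `enrich` helper, and the choice to let tracked properties win on a name clash are assumptions for this example, not a description of Mixpanel's upload format or join semantics.

```python
import csv
import io
from collections import Counter

# Hypothetical lookup table using the columns named above.
lookup_csv = (
    "video_id,genre,language,creator\n"
    "v1,comedy,en,alice\n"
    "v2,drama,es,bob\n"
)
lookup = {row["video_id"]: row for row in csv.DictReader(io.StringIO(lookup_csv))}

# Hypothetical historical events that only ever tracked video_id.
events = [
    {"event": "Watch Video", "video_id": "v1"},
    {"event": "Watch Video", "video_id": "v2"},
    {"event": "Watch Video", "video_id": "v1"},
    {"event": "Watch Video", "video_id": "v9"},  # no matching row in the table
]


def enrich(event, lookup):
    """Merge lookup columns into the event; tracked properties take precedence."""
    extra = lookup.get(event.get("video_id"), {})
    return {**extra, **event}


enriched = [enrich(e, lookup) for e in events]
by_genre = Counter(e.get("genre", "(not set)") for e in enriched)
```

Events whose `video_id` has no row in the table simply stay unenriched, so they fall into a "(not set)" bucket here instead of being dropped.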
If you are currently a Mixpanel Growth or Enterprise customer and are interested in the beta, please reach out to email@example.com.