Implementing data and analytics for product-led growth - Mixpanel
How we implemented data and analytics for product-led growth

Last edited: Aug 22, 2022 Published: Feb 15, 2022
Adam Kinney & Madhu Palani

We adopted a product-led growth (PLG) strategy at Mixpanel in 2021. The idea behind PLG is that you want potential customers to try the product for free and realize the value it provides for themselves before starting a sales cycle. So as new free users sign up, you want to measure how far along they progress towards realizing value. When they do achieve value, you want to alert your sales team to that and route them to the appropriate account executive.

We’ve found that while there’s a lot of great content on how a PLG strategy works conceptually, there’s very little in the way of concrete examples or best practices for how data and analytics teams should implement it. This blog is the first in a series that will show how we’re doing it at Mixpanel. We don’t have all the answers, but we’ve made a lot of progress on bringing PLG to life in Mixpanel over the past six months, and we’re excited to start a conversation about how others are solving these problems, too.

Mixpanel’s PLG approach to new business looks like this:

Mixpanel's PLG approach

When new potential customers land on our site organically or from paid marketing, they can find a variety of high-level information about Mixpanel, like technical information in the developer docs and practical product analytics concepts from blogs and ebooks. But there are just two paths they can take towards becoming a Mixpanel customer: They can sign up for an account and try out Mixpanel, or they can request a live product demo or other contact from our sales team.

Since the latter is a signal that the customer is interested in starting a sales process, we route them immediately to the right sales team. However, when they elect to sign up on their own, they can either purchase a paid plan online without sales being involved or they can get started with Mixpanel’s free tier. Users on the free tier will then pass a series of product usage milestones indicating that they are getting real value from using Mixpanel. At that point, they can signal they are ready to talk to sales, and then they’re routed to the right sales team. Additionally, we want sales to reach out to three kinds of accounts: those that are about to hit their free plan usage cap, those that see a surge of interest in the form of multiple signups in a short period of time, and high-value accounts right at signup or after attending a webinar. Sales cycles can also kick off from referrals from our partner ecosystem and from the sales team reaching out directly to potential customers.

From the data and analytics perspective, we needed to do two things to make this process work:

  1. We needed to track potential customers along all the paths they may take in the diagram above and route them to sales when appropriate with the right contextual information. While we already had the ability to route leads who request to talk to sales, we needed to add the ability to detect when accounts are about to hit their plan’s usage caps, when there’s an interest surge from an account, and when accounts hit the product milestones that indicate they are getting value from the product.
  2. We needed to make data and analysis on how accounts are progressing through the entire diagram above available self-serve to everyone at Mixpanel. The goal here was to make it possible to do deep analysis on how accounts are progressing through the system so that we can optimize every step and become more and more efficient at getting potential customers to value and winning their business. Perhaps unsurprisingly, we decided to use Mixpanel to accomplish this.
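To make the detection signals in the first point concrete, here is a minimal Python sketch of two of them: an interest surge (several signups from one account in a short window) and an account nearing its usage cap. The function names and all thresholds are assumptions for illustration only, not Mixpanel's actual rules.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds, chosen only for illustration.
SURGE_MIN_SIGNUPS = 3
SURGE_WINDOW = timedelta(days=7)
CAP_WARNING_RATIO = 0.8


def is_interest_surge(signup_times, min_signups=SURGE_MIN_SIGNUPS, window=SURGE_WINDOW):
    """Return True if at least `min_signups` signups fall within any `window`."""
    times = sorted(signup_times)
    start = 0
    for end in range(len(times)):
        # Advance the left edge until the span fits inside the window.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 >= min_signups:
            return True
    return False


def is_near_usage_cap(usage, cap, ratio=CAP_WARNING_RATIO):
    """Return True once usage reaches `ratio` of the plan's cap."""
    return cap > 0 and usage >= ratio * cap
```

In practice, checks like these would more likely run as scheduled queries in the warehouse; the Python version only shows the logic.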

The big challenge with implementing a PLG strategy is that you need to make use of product data, marketing data, and sales data all together. Marketing and sales data are often joined in CRMs like Salesforce, but product data is often siloed and difficult to join with marketing and sales data. Solving this challenge required a new data infrastructure, and the rest of this post will detail how we did that. In a following post, we’ll describe how we structured PLG data for our new internal Mixpanel project for business analysis and walk through the kinds of analysis we do to optimize our business.

Data infrastructure for PLG

The recent convergence of data warehouses (Snowflake, BigQuery), simpler tools for moving data into them (Fivetran), the ability to transform the data (dbt), and the ease of moving the data out of the warehouse (Census, Hightouch) have made it more feasible than ever to adopt a PLG strategy.

Getting started with the PLG effort was very similar to building products here at Mixpanel. We started with the types of questions we wanted to ask from the data and worked backwards from there. So the goals that we set for our data infrastructure were:

  1. It should provide us with an ability to drill down on a particular customer and view their activity across marketing, product, and sales and support functions
  2. It should provide us with the ability to do behavioral analysis across our customer base

Naturally, Mixpanel was our analysis tool of choice, and since we needed a universal ID to track user activity across all these functions, we chose the Salesforce Account ID (SFDC ID) as the distinct identifier (distinct ID) for all of the events.

💡 Tracking events with the SFDC ID as the distinct ID, along with the user ID as a property on the events, lets us do funnel/retention analysis at the account level while still being able to filter down to a particular user when required. For example, a typical sales funnel involves multiple users from the customer’s organization, so the distinct ID cannot be tied to a single user.
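As a rough sketch of this account-level tracking, an event could be assembled as below. The helper name `build_account_event`, the `user_id` property name, and the sample IDs are hypothetical; the payload loosely follows the event/properties layout Mixpanel events use.

```python
def build_account_event(event_name, sfdc_account_id, user_id, properties=None):
    """Build an event keyed on the Salesforce account, with the user as a property."""
    event_properties = dict(properties or {})
    # The account, not the individual, is the unit of analysis.
    event_properties["distinct_id"] = sfdc_account_id
    # Hypothetical property name; keeps per-user filtering possible.
    event_properties["user_id"] = user_id
    return {"event": event_name, "properties": event_properties}


# Two users from the same (made-up) account land on the same distinct ID.
first = build_account_event("Demo Requested", "001-EXAMPLE", user_id=101)
second = build_account_event("Report Created", "001-EXAMPLE", user_id=102)
```

Because both events share a distinct ID, a funnel from "Demo Requested" to "Report Created" converts at the account level even though different people performed each step.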

With that in mind, below is how we set up our data infrastructure.

Mixpanel's data stack

Although setting up this infrastructure was relatively straightforward, let’s talk about a couple of key issues and how we solved them in order to leverage this infrastructure to fuel our PLG machine.

Problem 1: Identity resolution

Identity resolution is the problem of tracking a user across disparate data sources. At Mixpanel, we have three types of data across which we had to resolve identities:

Product data

Product data is a combination of transactional data that is stored in your application database and clickstream data tracked through client or server side SDKs.

Transactional data

This is the critical data that powers your product and is typically stored in a transactional database like MySQL/Postgres as a set of normalized tables. This is highly critical and accurate application data and is a source of truth for your product and business. The IDs here are typically auto-generated by your database and are integers. Typical examples are UserID, AccountID, and ProjectID.

Mixpanel transactional data

Clickstream data

These are actions that are tracked directly from the client (browser and/or Android/iOS apps) and represent user interactions with your application. They are tracked to a CDP or directly to Mixpanel. This data is less resilient and typically lossy due to ad blockers, connection issues, etc., but provides valuable information about the user’s in-product behavior. The IDs here are your distinct IDs, which are typically UUID4 strings.

💡 At Mixpanel, our users only interact with us through our website (browser), and we use the UserID (from the transactional database) as the distinct ID. This helps us tie the clickstream data to the transactional data.

We use Mixpanel’s pipelines product to move both our transactional data and clickstream data into BigQuery.

Sales data

Salesforce is our CRM tool and we use it both as an operational tool as well as a system of record for sales activity.

Over the years, we have found that many people just want to use our product and explore it on their own rather than going through a traditional sales experience. So we wanted to put the product front and center while keeping the human sales touch readily available to our users when required, all without losing context about the user or their behavior within the product. To achieve these smooth transitions, it is critical to tie users and their activities in the product to Salesforce accounts.

💡 Any new user signup creates a Salesforce lead, with the internal IDs (userID, organizationID) from the application database stored as attributes on the lead record. The lead is then converted into a contact by matching to an existing Salesforce account (using email domain) or by creating a new account.

All of these Salesforce tables are pulled into BigQuery using Fivetran. Additionally, we maintain mapping tables between Mixpanel Organizations (from the application database) and Salesforce accounts. This helps us easily convert signups through invite flows as contacts on the Salesforce account.
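The lead-to-contact conversion described above can be sketched as follows. This is an illustrative model using plain dictionaries, not the Salesforce API; the field names and the `new-<domain>` account ID format are assumptions.

```python
def email_domain(email):
    """Return the lowercase domain part of an email address."""
    return email.rsplit("@", 1)[-1].strip().lower()


def convert_lead(lead, accounts_by_domain):
    """Attach a lead to the account matching its email domain, creating one if needed.

    Returns the resulting contact and a flag saying whether an account was created.
    """
    domain = email_domain(lead["email"])
    account = accounts_by_domain.get(domain)
    created = account is None
    if created:
        # Hypothetical placeholder ID; a real CRM would assign its own.
        account = {"account_id": f"new-{domain}", "domain": domain}
        accounts_by_domain[domain] = account
    contact = {
        "email": lead["email"],
        # Internal IDs carried over from the application database.
        "user_id": lead["user_id"],
        "organization_id": lead["organization_id"],
        "account_id": account["account_id"],
    }
    return contact, created
```

Carrying `user_id` and `organization_id` onto the contact is what later allows product activity to be joined back to the account.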

Marketing data

Marketing tools generally have tight integration with CRMs, and in our case we use Marketo. Marketo has a tight integration with Salesforce and syncs all marketing campaigns along with leads and activities directly into Salesforce. For example:

  • All the links in our marketing material have UTM parameters which, when clicked, are tracked to Marketo + Salesforce and attributed to campaigns.
  • All activity on our marketing pages (blog posts and developer documentation), including any forms on our website such as “request a demo” and “contact sales,” is tracked to Marketo. Marketo then creates leads and converts them into contacts in Salesforce.
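For readers unfamiliar with UTM parameters, they are ordinary query-string fields on a link. A minimal sketch of pulling them out of a landing-page URL with Python's standard library (the example URL and the helper name are made up):

```python
from urllib.parse import parse_qs, urlparse


def extract_utm_params(url):
    """Return the utm_* query parameters of a URL as a flat dict."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


params = extract_utm_params(
    "https://example.com/blog?utm_source=newsletter&utm_campaign=plg&ref=x"
)
# params == {"utm_source": "newsletter", "utm_campaign": "plg"}
```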

Additionally, we pull our marketing data directly from Marketo into BigQuery using Fivetran to get more detailed insights about the pre-conversion activities that were performed by the user, such as opening a marketing email, clicking a link, landing on a page, or viewing marketing content.

Tying them all together

  • Transactional ⇔ Clickstream ⇒ On signup, identify/alias your anonymous UUID to your user ID from the transactional database
  • Product ⇔ Sales ⇒ Create/update a lead on signup and make sure to tag the user ID on the lead/contact
  • Sales ⇔ Marketing ⇒ Most marketing tools already track marketing users to Salesforce leads and accounts
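Chained together, the links above let an anonymous clickstream ID be resolved all the way to a Salesforce account. The sketch below models each link as a simple lookup table; the table names and sample IDs are assumptions, and in a warehouse these would be joins between mapping tables.

```python
def resolve_account(anonymous_id, alias_map, user_to_org, org_to_sfdc):
    """Resolve an anonymous clickstream ID to a Salesforce account ID, or None."""
    # Clickstream -> transactional: the alias recorded at signup.
    user_id = alias_map.get(anonymous_id)
    if user_id is None:
        return None
    # Transactional: the user's organization in the application database.
    org_id = user_to_org.get(user_id)
    if org_id is None:
        return None
    # Product -> sales: the organization-to-account mapping table.
    return org_to_sfdc.get(org_id)


alias_map = {"uuid-abc": 101}          # anonymous UUID -> user ID
user_to_org = {101: 7}                 # user ID -> organization ID
org_to_sfdc = {7: "001-EXAMPLE"}       # organization ID -> SFDC account ID

account_id = resolve_account("uuid-abc", alias_map, user_to_org, org_to_sfdc)
```

Returning `None` at any broken link makes unresolved activity easy to count, which is a useful health check on the mappings themselves.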

By setting up our infrastructure this way, we are able to track our users across different data silos and tie all their activity to a Salesforce account.

Problem 2: Focus

Once we tied the data across these systems and got the pipelines flowing, we realized that we had started to overload our sales team with lots of low-quality leads from signups. For context, our signups have been growing at a rate of >60% YoY! So we needed to get smarter about which leads we route to sales.

Ideal customer profile (ICP)

We wanted our sales team to only focus on the customers most likely to get value from Mixpanel, based on the following:

  • Industry or vertical
  • VC funding dates and amounts
  • Android/iOS app unique user counts
  • Web traffic count

Domain is key

We began pulling company information from Crunchbase, Apptopia, and Alexa web rankings into BigQuery using cloud functions and joining them based on domain. We use domains extracted from the website URL, Crunchbase URL, and email addresses to match the records from these different datasets. Further, we started employing a similar matching algorithm to connect these datasets to our Salesforce tables.
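A domain-based join of this kind could look roughly like the sketch below. It assumes each record carries a `domain_source` field holding a website URL, bare hostname, or email address; that field name, the other field names, and the sample records are hypothetical, and this is not the actual matching algorithm.

```python
from urllib.parse import urlparse


def normalize_domain(value):
    """Reduce a URL, bare hostname, or email address to a lowercase domain."""
    value = value.strip().lower()
    if "@" in value and "://" not in value:
        return value.rsplit("@", 1)[-1]
    if "://" not in value:
        # urlparse only recognizes the host when a scheme or "//" is present.
        value = "//" + value
    host = urlparse(value).hostname or ""
    return host[4:] if host.startswith("www.") else host


def join_by_domain(*datasets):
    """Merge records from several datasets that share a normalized domain."""
    merged = {}
    for records in datasets:
        for record in records:
            domain = normalize_domain(record["domain_source"])
            if not domain:
                continue  # skip records with no usable domain
            fields = {k: v for k, v in record.items() if k != "domain_source"}
            merged.setdefault(domain, {"domain": domain}).update(fields)
    return merged


funding = [{"domain_source": "https://acme.com", "total_funding_usd": 5_000_000}]
apps = [{"domain_source": "www.acme.com", "monthly_app_users": 120_000}]
companies = join_by_domain(funding, apps)
```

Normalizing first (lowercasing, dropping `www.`) is what lets records that spell the same company's domain differently land on one key.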

We then started tagging all of our accounts and leads based on whether they fit our ICP criteria and pushing that data back into Salesforce using Census.
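To illustrate what such tagging might involve, here is a minimal sketch of an ICP check over the enriched company data. Every threshold, the list of target industries, and the way the criteria are combined are invented for the example; the real criteria are not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Company:
    domain: str
    industry: Optional[str] = None
    total_funding_usd: Optional[float] = None
    monthly_app_users: Optional[int] = None
    monthly_web_visits: Optional[int] = None


# Hypothetical criteria, for illustration only.
TARGET_INDUSTRIES = {"software", "e-commerce", "fintech"}
MIN_FUNDING_USD = 5_000_000
MIN_APP_USERS = 50_000
MIN_WEB_VISITS = 100_000


def fits_icp(company):
    """Return True if the company is in a target industry and shows funding or reach."""
    if company.industry not in TARGET_INDUSTRIES:
        return False
    has_funding = (company.total_funding_usd or 0) >= MIN_FUNDING_USD
    has_reach = (
        (company.monthly_app_users or 0) >= MIN_APP_USERS
        or (company.monthly_web_visits or 0) >= MIN_WEB_VISITS
    )
    return has_funding or has_reach
```

Missing values are treated as zero, so a company with sparse enrichment data simply fails the check rather than raising an error.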

Getting to value

Although ICP tagging is useful, we wanted to route new users to sales only once they had gotten value from our product. But how do we measure “getting value”?

Here at Mixpanel, we use our own product very heavily to understand how our customers use us, so we ran some analysis and identified key value moments (behaviors) across our customer base that correlated with conversion to paid accounts.

Finally, we set up a Census sync to tag leads/accounts when they perform these key behaviors and pushed that information back to our Salesforce system as product-qualified leads.
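Conceptually, the tagging step reduces to checking each account's events against the set of value moments. In the sketch below, the event names and the "at least two" rule are placeholders rather than the behaviors actually identified, and the returned tags stand in for the fields a reverse-ETL sync would write to the CRM.

```python
# Hypothetical value moments and threshold, for illustration only.
VALUE_MOMENTS = {"Report Created", "Teammate Invited", "Dashboard Shared"}
MIN_VALUE_MOMENTS = 2


def tag_product_qualified(events_by_account, value_moments=VALUE_MOMENTS,
                          minimum=MIN_VALUE_MOMENTS):
    """Tag each account as product-qualified once it hits enough value moments."""
    tags = {}
    for account_id, events in events_by_account.items():
        reached = value_moments & set(events)
        tags[account_id] = {
            "is_pql": len(reached) >= minimum,
            "value_moments": sorted(reached),
        }
    return tags


tags = tag_product_qualified({
    "001-A": ["Signed Up", "Report Created", "Teammate Invited"],
    "001-B": ["Signed Up", "Report Created"],
})
# tags["001-A"]["is_pql"] is True; tags["001-B"]["is_pql"] is False
```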


We have barely gotten started on our PLG journey and we are already seeing a massive impact on key company metrics! We are doubling down on our data infrastructure efforts to enable even tighter integrations between marketing, product, sales and support! If you are interested in these problems and like having massive impact on the business, come join us—we are hiring!
