How to build more complete and more accurate user profiles, faster - Mixpanel

Sylvain Giuliani, Head of Growth, Census

Your cloud data warehouse may secretly be your second most important product analytics tool after Mixpanel. It’s essentially a huge repository for clean, accurate, and actionable user data. The problem is getting that user data out and into Mixpanel so it can be put to use.

This can be a big issue for many businesses: If your user profiles don’t include data from many different sources, you’re not getting the full picture of your users and how they’re interacting with your product. Without that full picture, anyone relying on those user profiles risks making the wrong decisions.

Fortunately, your cloud data warehouse can place data from just about any source into your user profiles. You just need the right infrastructure and the right tools.

Connecting Mixpanel to your cloud data warehouse using Census can unlock your data warehouse’s product analytics power. Mixpanel and Census, along with a hub-and-spoke data infrastructure, allow you to use a process called operational analytics to sync user data from just about any data source in order to build accurate user profiles quickly and reliably.

Put data into action with operational analytics

Operational analytics is the key to getting modeled user data out of your data warehouse, with minimal effort, and into tools where it’ll actually get used. In the case of user profiles, this means taking any user data out of your data warehouse and automatically placing it within Mixpanel, where it can become event data and profile data within a unified user profile.

Census uses a process called reverse ETL (extract, transform, load) to connect to your cloud data warehouse, validate modeled data, and, with a few lines of SQL, place it directly into Mixpanel. Once in Mixpanel, this data is easy for just about anyone in your organization to explore and act on.
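As a rough illustration of the reverse ETL idea (not Census’s actual implementation), the sketch below uses an in-memory SQLite database as a stand-in for a cloud data warehouse, selects modeled rows with a few lines of SQL, and shapes each row into a profile update. The `modeled_users` table, its columns, and the `build_profile_updates` helper are hypothetical names chosen for this example. The `$token`, `$distinct_id`, and `$set` keys follow the general shape of Mixpanel profile-update payloads, and nothing is sent over the network.

```python
import sqlite3

# Hypothetical modeled table; a real setup would query a cloud warehouse, not SQLite.
SETUP_SQL = """
CREATE TABLE modeled_users (user_id TEXT PRIMARY KEY, plan TEXT, sales_territory TEXT);
INSERT INTO modeled_users VALUES
    ('u1', 'pro', 'EMEA'),
    ('u2', 'free', 'APAC');
"""

# The "few lines of SQL" that choose which modeled data to sync.
SYNC_SQL = "SELECT user_id, plan, sales_territory FROM modeled_users ORDER BY user_id"


def build_profile_updates(conn, token):
    """Turn each modeled row into a profile-update payload (no network calls)."""
    conn.row_factory = sqlite3.Row
    updates = []
    for row in conn.execute(SYNC_SQL):
        # Every column except the ID becomes a profile property.
        props = {key: row[key] for key in row.keys() if key != "user_id"}
        updates.append(
            {"$token": token, "$distinct_id": row["user_id"], "$set": props}
        )
    return updates


conn = sqlite3.connect(":memory:")
conn.executescript(SETUP_SQL)
updates = build_profile_updates(conn, token="YOUR_PROJECT_TOKEN")
```

In practice a tool like Census handles the extraction, batching, and delivery; the point here is only that the SQL decides what data leaves the warehouse, and the sync step decides how it lands in a profile.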

How the integration fits into the modern data stack

An overview of the modern data stack with Census and Mixpanel

Operational analytics is the process of taking data out of your data warehouse and putting it into action. The entire process starts with data sources (aka the “spokes”) where raw user data is generated. Then, an ETL tool collects all that data and sends it to the data warehouse (aka the “hub”), where it gets modeled.

The big innovation that operational analytics introduces is the ability to send that modeled data back into individual tools (more “spokes”). Instead of languishing in your data warehouse or in an outdated dashboard, the data is placed directly in a tool that gets used for daily operational decision-making.

Say you have data from Segment, Salesforce, Google Play Ads, and SDK connections that you want to place within your user profiles. Here’s an overview of the steps you’d take using the hub-and-spoke approach, Census, and Mixpanel:

  1. Using the hub-and-spoke model, your data team can integrate all your user data into your data warehouse to clean and model it.
  2. Once cleaned and modeled, your data team can use Census to automatically validate your data and deploy it to Mixpanel.
  3. In Mixpanel, this clean, accurate data becomes either event data or profile data in each user’s profile.
  4. Once implemented, this process is nearly automatic and real-time. The only regular work would be the modeling done by the data team, which is made easy with dbt.
  5. Your user profiles in Mixpanel can now automatically include any relevant data from any source.

Salesforce data can assign a profile property about which sales territory the user is in, Segment can place an event property with landing page form-fill data, Google Ads can place event data to inform mobile attribution, and on and on.
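To make the distinction between profile data and event data concrete, here is a minimal sketch of how a modeled record might be routed into one shape or the other. The `ModeledRecord` class, its fields, and the `to_mixpanel_shape` helper are assumptions made for this example, and the output omits fields such as the project token and timestamp that a real request would need.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ModeledRecord:
    """One modeled row from the warehouse (hypothetical shape)."""
    user_id: str
    source: str        # e.g. "salesforce" or "segment"
    kind: str          # "profile" or "event"
    name: str          # profile property name, or event name
    value: Any = None  # property value, or a dict of event properties


def to_mixpanel_shape(record):
    """Shape a record as either a profile update or an event."""
    if record.kind == "profile":
        return {"$distinct_id": record.user_id, "$set": {record.name: record.value}}
    if record.kind == "event":
        properties = {"distinct_id": record.user_id, "source": record.source}
        properties.update(record.value or {})
        return {"event": record.name, "properties": properties}
    raise ValueError(f"unknown kind: {record.kind!r}")


records = [
    ModeledRecord("u1", "salesforce", "profile", "Sales Territory", "West"),
    ModeledRecord("u1", "segment", "event", "Form Submitted", {"landing_page": "/pricing"}),
]
shaped = [to_mixpanel_shape(r) for r in records]
```

Both records belong to the same user, so the Salesforce territory ends up as a property on the profile while the Segment form fill shows up as an event in that user’s activity.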

Without any work on their end, product, marketing, and any other departments with access to Mixpanel can explore and act on user data pulled from across the tech stack on the fly. They never have to leave Mixpanel—all the data they need is right there, ready for use.

Automatically sync data for more actionable and accurate user profiles

The automation aspect of operational analytics is the most important part of syncing data to user profiles quickly. Without it, this process of pulling data out of a data warehouse becomes both an unmanageable slog for data teams and a source of potentially outdated or inaccurate data.

The amount of work it used to take to get data out of a data warehouse is one of the reasons why it was mainly reserved for backward-looking business intelligence dashboards and reports. Business intelligence relies on SQL queries to pull data out of a data warehouse, visualize it, then create a report. If you’re looking to answer ad-hoc questions about your users in the moment, this process not only requires technical expertise (i.e., fluency in SQL), but it may simply take too long.

Instead of writing SQL queries or managing complex ETL flows to produce reports and dashboards, your data team can use the hub-and-spoke model along with operational analytics powered by Census to place data into the context where it’ll get used.

There are two important places operational analytics introduces automation:

  1. Modeling using dbt
  2. Deployment using Census

Essentially, dbt helps your data team create a “library” of data models. So, when they see a common type of data come through, they don’t have to reinvent the wheel every time. This approach allows data people to work much more efficiently, to the point where the data team at Drizly described their data analysts as being able to “think and work more like software engineers.”

From there, Census kicks into gear by automatically validating the work the data team has done and, with less SQL than it takes to create a dashboard, automatically deploying the clean, modeled data to Mixpanel. The process of pulling data out of a data warehouse, which used to be very manual and time-intensive, is now standardized, replicable, and mostly automated.
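The article doesn’t spell out which checks Census runs, so the following is only a sketch of the kind of validation that could sit between modeling and deployment. It assumes two simple rules: every row needs a user ID, and no ID may appear twice. The `validate_rows` function and the `user_id` field are hypothetical.

```python
def validate_rows(rows, id_field="user_id"):
    """Split rows into (valid, rejected) using two basic checks.

    Assumed rules, for illustration only:
    - a row without an ID is rejected
    - a row repeating an already-seen ID is rejected
    """
    seen = set()
    valid, rejected = [], []
    for row in rows:
        user_id = row.get(id_field)
        if not user_id:
            rejected.append((row, "missing id"))
        elif user_id in seen:
            rejected.append((row, "duplicate id"))
        else:
            seen.add(user_id)
            valid.append(row)
    return valid, rejected


rows = [
    {"user_id": "u1", "plan": "pro"},
    {"user_id": None, "plan": "free"},
    {"user_id": "u1", "plan": "team"},
    {"user_id": "u2", "plan": "free"},
]
valid, rejected = validate_rows(rows)
```

Only the rows in `valid` would move on to the sync step, and the reasons in `rejected` give the data team something specific to fix in the model.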

Now user data from many different sources can automatically flow right into your Mixpanel user profiles. Not only is this data more robust, but it’s also clean and accurate, thanks to automated modeling and validation. Armed with a complete and accurate understanding of your users in Mixpanel, every team—from engineering to product to growth marketing—can explore, identify, and act on insights.

Start with the hub-and-spoke model

A hub-and-spoke model for data infrastructure—where your cloud data warehouse is a central “hub” and data sources are “spokes”—makes managing user data from many different sources easier on your data team. This is in contrast to a web of point-to-point data connections, where one tool connects to another in a messy, ad-hoc way instead of going through one central location. Using the hub-and-spoke model is key to creating a pipeline of accurate, clean user data to flow into your user profiles in Mixpanel.

In the hub-and-spoke model, your cloud data warehouse serves as a central hub for all your data—user data included—to flow through. Without this central place, your data team will have to manage connections between individual tools and databases, which can be hard to keep track of and manage.

The difference between point-to-point and hub-and-spoke approaches to data infrastructure
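A quick back-of-the-envelope calculation shows why the point-to-point web gets hard to manage. The sketch below assumes the worst case, where every tool is wired directly to every other tool, and compares it with one connection per tool in a hub-and-spoke setup. The function names are made up for this example.

```python
def point_to_point_links(tool_count):
    """Worst case: every tool is connected directly to every other tool."""
    return tool_count * (tool_count - 1) // 2


def hub_and_spoke_links(tool_count):
    """Each tool has a single connection to the central warehouse."""
    return tool_count


for n in (4, 8, 12):
    print(n, point_to_point_links(n), hub_and_spoke_links(n))
# prints:
# 4 6 4
# 8 28 8
# 12 66 12
```

Real stacks rarely connect every pair of tools, but the trend holds: direct connections grow much faster than the number of tools, while the hub-and-spoke count grows one connection at a time.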

Once all your user data is in one place, it’s much easier for your data team to standardize, clean, and model user data using a tool like dbt. At this stage, the data team can specifically model user data to fit right into your existing user profiles.

If a high-priority account is located in Bora Bora, the data team can prepare that data to become a profile property along the lines of “Bora Bora Territory.” Your data team could then do the same with Stripe transaction data, Segment website event data, and HubSpot email campaign data.
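A minimal sketch of that modeling step might look like the following. The lookup table, the `Sales Territory` property name, and the `territory_property` helper are all assumptions for illustration; in a real project this logic would more likely live in a dbt model inside the warehouse.

```python
# Hypothetical lookup from an account's location to a territory label.
TERRITORY_BY_LOCATION = {
    "Bora Bora": "Bora Bora Territory",
}


def territory_property(account, default="Unassigned"):
    """Build a profile update that records the account's sales territory."""
    location = account.get("location")
    territory = TERRITORY_BY_LOCATION.get(location, default)
    return {
        "$distinct_id": account["user_id"],
        "$set": {"Sales Territory": territory},
    }


update = territory_property({"user_id": "acct-1", "location": "Bora Bora"})
```

Accounts with an unknown or missing location fall back to the `default` label, which keeps the property present on every profile instead of leaving gaps.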

One issue with the hub-and-spoke approach in the past was that it could be difficult to get this modeled user data out of the data warehouse and back into a tool like Mixpanel, where it could be used. Historically, this required your data team to manage complex ETL flows, and it took attention away from all the other duties they had on their plate.

Often, the reward of more accurate and robust user profiles informed by your entire tech stack wasn’t worth the effort. Fortunately, operational analytics significantly reduces the effort involved in putting your modeled data into action.

Accurate user data helps you build a better product

To build a better product, you need to understand how your users engage with it. Accurate user data from sources across the entire customer journey provides that deep understanding.

Your marketing team will gain insight into how effective their campaigns are, the product team can track the impact of a new feature throughout the sales funnel, the customer success team can see the actions a user takes before reaching out for support, and more. As every team iterates based on this accurate user data, they can continually improve the experience each user has with your product.

As an extensible platform, Mixpanel is made more powerful using integrations with tools like Census. Schedule a Mixpanel demo or learn how to integrate Census with Mixpanel today.
