Post-mortem: Data inconsistency on July 19, 2016

Neil Rahilly

On July 19, 2016, at 1:40am PDT, we deployed our JavaScript library (v2.9.0) with a bug affecting the Mixpanel cookie. The bug caused the default cookie, which stores important information including the distinct_id and registered super properties, to expire too quickly. For a subset of customers, this could have led to a spike in uniques (though total event counts remain accurate). In other words, it might look like these customers had many new anonymous visitors arriving every six minutes when, in fact, existing visitors’ distinct_ids were simply being reset.

This impacted customers who use the default configuration of our JavaScript library, specifically v2.9.0 – v2.9.3. For more detail on how this bug might have affected you, please see the “Customer impact” section of this post.

What happened and why

We updated a utility function in our JavaScript library to interpret an argument in seconds rather than days, and failed to correctly update all callers of the function to reflect these new semantics. The Mixpanel cookie is intended to expire in one year, but this change caused it to expire in 365 seconds (approximately six minutes) by default, which resulted in events with new (and thus incorrect) distinct_ids and missing super properties. We discovered the issue on July 22, 2016, at 12pm PDT and reverted the change shortly thereafter at 2:11pm (v2.9.4).
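To make the unit mismatch concrete, here is a minimal sketch of this class of bug. The function and variable names are hypothetical and are not taken from the library's source; the sketch only assumes what is described above, namely a helper whose lifetime argument changed from days to seconds and a caller that kept passing 365.

```javascript
// Hypothetical names throughout; this is not the library's actual code.
const SECONDS_PER_DAY = 24 * 60 * 60;

// Helper after the change: the lifetime argument is interpreted in seconds
// (it was previously interpreted in days).
function cookieExpiry(lifetimeSeconds, now) {
    return new Date(now.getTime() + lifetimeSeconds * 1000);
}

// July 19, 2016, 1:40am PDT expressed in UTC (PDT is UTC-7).
const now = new Date(Date.UTC(2016, 6, 19, 8, 40, 0));

// A caller that was not updated still passes 365, meaning "days".
const buggyExpiry = cookieExpiry(365, now);

// A caller updated for the new semantics converts days to seconds first.
const intendedExpiry = cookieExpiry(365 * SECONDS_PER_DAY, now);

const buggyMinutes = (buggyExpiry - now) / (60 * 1000);
const intendedDays = (intendedExpiry - now) / (SECONDS_PER_DAY * 1000);

console.log(buggyMinutes.toFixed(2) + ' minutes'); // a little over six minutes
console.log(intendedDays + ' days');               // one year
```

Both calls are valid as far as the helper is concerned, which is why a change like this is easy to miss unless every caller is audited or the unit is carried in the parameter name.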

We tested these changes with our own internal use of the JavaScript library for over a month. Unfortunately, the bug went unnoticed by us because the Mixpanel integrations on mixpanel.com all use mixpanel.set_config({persistence: 'localStorage'}) to avoid the use of cookies altogether.

Customer impact

Any event tracked with the JavaScript library v2.9.0 – v2.9.3 (this includes snippets that refer to mixpanel-2-2.js and mixpanel-2-latest.js) that used the default storage behavior (“cookie”) between July 19, 2016, at 1:40am PDT and July 22, 2016, at 2:11pm PDT could have been affected by having a newly generated distinct_id and missing super properties.

What might this look like in your Mixpanel projects? Here’s a breakdown of what you might see in your reports during the date range if your users were affected:

  • Segmentation might show a spike in uniques from new distinct_ids.
  • More users might have completed the first step in a funnel, with far fewer users having completed subsequent steps.
  • There could be a much higher initial cohort in Retention with lower rates of retention.
  • It’s also possible you might have duplicate people profiles, depending on your use of mixpanel.identify(). The best practice is to always pass through your unique identifier to the identify() calls to ensure that activity and people properties are set on the correct distinct_id. However, if your implementation has people.set() calls with an empty identify() call on the same page, the property would be set on the distinct_id that existed in the cookie at that time, thus creating a people profile. If the cookie reset during the user’s session, any subsequent people.set() and identify() calls would reference a different distinct_id.
  • Any custom super properties in addition to the super properties that Mixpanel’s JavaScript library collects by default would have reset every six minutes. This would be most obvious for attribution information (e.g., initial referring domain, referrer, etc.).
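The duplicate-profile case can be illustrated with a toy model. The tracker below is a hypothetical stand-in written for this example, not Mixpanel's implementation; it only assumes the behavior described above: an empty identify() keeps whatever distinct_id the cookie currently holds, and an expired cookie produces a fresh anonymous distinct_id.

```javascript
// Toy stand-in for illustration only; not the real library.
function createToyTracker() {
    let counter = 0;
    let cookieId = null;          // distinct_id held in the "cookie"
    const profiles = new Map();   // distinct_id -> people properties

    // Returns the current distinct_id, generating a new anonymous one
    // if the cookie has expired.
    const currentId = () => {
        if (cookieId === null) {
            counter += 1;
            cookieId = 'anon-' + counter;
        }
        return cookieId;
    };

    return {
        identify(id) { cookieId = id === undefined ? currentId() : id; },
        peopleSet(props) {
            const id = currentId();
            profiles.set(id, Object.assign({}, profiles.get(id), props));
        },
        expireCookie() { cookieId = null; },
        profileIds() { return Array.from(profiles.keys()); },
    };
}

// Empty identify(): a cookie reset mid-session yields a second profile.
const anonymous = createToyTracker();
anonymous.identify();
anonymous.peopleSet({ plan: 'free' });
anonymous.expireCookie();
anonymous.identify();
anonymous.peopleSet({ plan: 'free' });
console.log(anonymous.profileIds()); // [ 'anon-1', 'anon-2' ]

// identify(userId): both writes land on the same profile despite the reset.
const known = createToyTracker();
known.identify('user-42');
known.peopleSet({ plan: 'free' });
known.expireCookie();
known.identify('user-42');
known.peopleSet({ plan: 'free' });
console.log(known.profileIds()); // [ 'user-42' ]
```

In this model, passing your own identifier makes the profile independent of the cookie's lifetime, which is why implementations that always do so would not have created duplicates.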

You can check the impact on your project by adding a series of filters to see every event that was affected by the bug during the date range of July 19, 2016 – July 22, 2016. Go to the Segmentation report, view Top Events, and then add the following property filters: “Mixpanel Library” equals “web”, AND “Library Version” contains “2.9.” AND “Library Version” does not equal “2.9.4”.
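If you work with exported raw events rather than the report UI, the same filter can be sketched as a predicate. This is an illustration under assumptions: it assumes the “Mixpanel Library” and “Library Version” report properties correspond to the raw property names mp_lib and $lib_version, and that time is a Unix timestamp in seconds in UTC (check how your export represents time before relying on this). The function name is hypothetical.

```javascript
// Incident window, converted from PDT (UTC-7) to UTC.
const WINDOW_START = Date.UTC(2016, 6, 19, 8, 40);  // July 19, 2016, 1:40am PDT
const WINDOW_END = Date.UTC(2016, 6, 22, 21, 11);   // July 22, 2016, 2:11pm PDT

// Hypothetical helper: true if the event matches the library filters
// above and falls inside the incident window.
function wasPossiblyAffected(event) {
    const props = event.properties;
    const version = String(props.$lib_version || '');
    const timeMs = props.time * 1000; // assumption: Unix seconds, UTC
    return props.mp_lib === 'web'
        && version.includes('2.9.')
        && version !== '2.9.4'
        && timeMs >= WINDOW_START
        && timeMs <= WINDOW_END;
}

// Example event shape (values are made up for illustration).
const sample = {
    event: 'Page View',
    properties: {
        mp_lib: 'web',
        $lib_version: '2.9.2',
        time: Date.UTC(2016, 6, 20, 12, 0) / 1000,
    },
};
console.log(wasPossiblyAffected(sample)); // true
```

Like the report filter, this only narrows events by library version and time. It cannot tell whether a given page used cookie or localStorage persistence, so a match means the event could have been affected, not that it was.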

Next steps

We care deeply about data accuracy, and we apologize for the inconvenience. In the immediate term, we are strengthening our automated tests and casting a wider net by running multiple configurations of our library while dogfooding our changes. In the long term, we’ll focus on minimizing impact when making big changes by performing incremental rollouts, and we are considering publishing such changes as a new release instead of automatically updating the library. If you have any questions regarding this incident, please don’t hesitate to reach out to Mixpanel Support at support@mixpanel.com.
