How is retention calculated?
The retention calculation in Mixpanel is designed to create a normalized, actionable metric to assess user engagement over arbitrary periods of time. It is our implementation of what is known as "cohort analysis." Cohort analysis involves breaking customers apart into groups (or cohorts) based on when they completed an action. In Mixpanel this translates into grouping based on when that customer sent a certain event.
Mixpanel can cohortize (break users into groups) based on any single event, with time granularity of days, weeks or months. Or, if you select the "anything" parameter, we form a cohort of users who sent at least one event of any type during the given time period. If a user sent the event during the selected time window (you will see the windows running along the left side of the report), the user will be included in that cohort. The count of users for any cohort can be seen along the left next to the date we are cohortizing on.
We count only unique users in cohorts, not event totals. In other words, a cohort includes every unique user who sent the cohortizing event in that time window, which starts at 0:00 on the first day and ends at midnight on the last day. A customer can be counted only once per cohort, but can be included in more than one cohort. For example, if you are cohortizing based on your "item purchased" event and creating weekly cohorts, a customer who purchased at least one item each week will be in every cohort, not just the cohort for their first purchase.
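As a rough illustration of the counting rule above, here is a minimal Python sketch that groups users into weekly cohorts. It is not Mixpanel's implementation: the event tuple layout, the function names, and the choice of Monday as the first day of the week are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def week_start(ts):
    """Return 0:00 on the Monday of the week containing ts (Monday is an assumption)."""
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return day - timedelta(days=day.weekday())

def build_weekly_cohorts(events, cohort_event):
    """Map each week's start to the set of unique users who sent cohort_event that week.

    events is an iterable of hypothetical (user_id, event_name, timestamp) tuples.
    """
    cohorts = defaultdict(set)
    for user_id, name, ts in events:
        if name == cohort_event:
            # A set keeps each user at most once per cohort,
            # but the same user can land in several weeks' cohorts.
            cohorts[week_start(ts)].add(user_id)
    return dict(cohorts)
```

A customer who purchases twice in one week appears once in that week's cohort, and a customer who purchases in two different weeks appears in both cohorts.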
For each cohort, the buckets that follow (marked 1, 2, 3, etc. along the top of the retention report) depend on when each customer sent the cohortizing event. Let's stick with the weekly "item purchased" cohort example, and say we are looking at people coming back and sending any event whatsoever (using the "anything" parameter). In that case, the numbers marked 1, 2, 3, etc. along the top of the report mark week-long time buckets. The percentage under each is the share of customers in the cohort who came back to send any event during that bucket.
Now listen up, this is the complicated part: the beginning and end of each bucket will be different for each customer in the cohort. The time bucket marked "1" runs from 1 to 2 time periods after the cohortizing event was sent. If a customer sends the cohortizing event at 3pm on Tuesday, then bucket "1" for him begins at 3pm on Tuesday of the following week. That customer's bucket 1 extends from 7 to 14 days after the time of the cohortizing event.
Likewise, in daily retention, in order to be counted in the bucket marked 1, the customer must send whatever event we are looking for in the retention report some time between 24 and 48 hours after he sent the cohortizing event. If he sends the event between 0 and 1 time periods after the cohortizing event, that event is not reflected in the report.
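The per-customer bucket rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the period is a fixed-length day or week (months vary in length and are not handled), and whether an event landing exactly on a bucket boundary belongs to the earlier or later bucket is a guess, since the text above does not say.

```python
from datetime import datetime, timedelta

def bucket_index(cohort_event_at, return_event_at, period):
    """Count whole periods elapsed between the cohortizing event and a later event.

    0 means the event came back within the first period, 1 means between
    1 and 2 periods later, and so on. Boundary handling is an assumption.
    """
    elapsed = return_event_at - cohort_event_at
    return elapsed // period  # timedelta // timedelta yields an int

def is_reported(cohort_event_at, return_event_at, period):
    """Events in bucket 0 (less than one period later) are not shown in the report."""
    return bucket_index(cohort_event_at, return_event_at, period) >= 1
```

For a cohortizing event at 3pm on a Tuesday with a one-week period, an event six days later falls in bucket 0 and is not reported, while an event at 3pm the following Tuesday or any time in the seven days after that falls in bucket 1.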
Thus, the buckets are actually a little further forward in time than you might think, and can take up to nearly two full lengths of time (days, weeks, months) to settle completely, since some users' bins can extend nearly two units into the future.
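To see where the "nearly two full lengths" comes from, the sketch below works out the full span a given bucket can cover across everyone in one cohort. It is derived from the per-customer rule described above rather than taken from Mixpanel's code, and the function name and fixed-length period are assumptions.

```python
from datetime import datetime, timedelta

def bucket_span(cohort_start, period, n):
    """Return (earliest_open, latest_close) for bucket n across a whole cohort.

    The cohort window is assumed to run from cohort_start to cohort_start + period.
    A customer who sent the cohortizing event at the very start of the window has
    bucket n opening at cohort_start + n * period. A customer who sent it just
    before the window closed has bucket n closing just before
    cohort_start + (n + 2) * period.
    """
    earliest_open = cohort_start + n * period
    latest_close = cohort_start + (n + 2) * period
    return earliest_open, latest_close
```

For a weekly cohort starting on 2024-01-01, bucket 1 can begin filling on 2024-01-08 and does not fully settle until 2024-01-22, a span of two weeks rather than one.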