Exporting Raw Data

Every data point sent to Mixpanel is stored as JSON in our data store. This export API allows you to download your raw event data as it is received and stored within Mixpanel, complete with all event properties (including distinct_id) and the exact timestamp the event was fired. This returned raw JSON can then be used for a variety of tasks.

All data returned from the export API is real-time.

Prerequisites

The /export endpoint is served from data.mixpanel.com, which differs from the host of our main API (mixpanel.com). Keep the following in mind when using it:

  • API authentication is the same with this endpoint as the main API.
  • You can utilize the same client libraries but you will have to change the endpoint to data.mixpanel.com instead of just mixpanel.com.

API details

  • Due to queueing, iOS and Android data can take up to 5 days to enter the raw data store.
  • For this API, returned timestamps are expressed in seconds since January 1, 1970 in your project's timezone, not UTC. Most epoch converters assume timestamps are in UTC, so converting the raw exported timestamps with them produces incorrect offsets. You must add back the offset between project time and UTC before storing or processing the data. For example, if your project is set to Pacific time, add 7 hours during daylight saving time (7 hours * 60 min * 60 secs = 25,200 seconds) or 8 hours otherwise (28,800 seconds) to convert the timestamp to UTC.
  • This endpoint uses gzip to compress the transfer; as a result, raw exports should not be processed until the file is received in its entirety. While this process is normally quick and results in a smaller file size, some large exports can take a few minutes to generate. Ensure the timeout set on the receiving client is large enough to account for this process (e.g. larger than 60 seconds).
  • Data returned from this endpoint is JSONL (newline-delimited JSON). Most client libraries assume they will get a single JSON string back and attempt to decode it automatically. The output of this API is not valid JSON in aggregate, but each line of it is a valid JSON object. Raw exports, once received in full, should therefore be parsed line by line rather than as an array of JSON objects.
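
The last two points (wait for the full gzip transfer, then decode line by line) can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper name parse_export and the sample events are hypothetical, and the sketch assumes the complete response body is already in memory as bytes. Some HTTP clients decompress gzip transparently, so it checks for the gzip magic bytes before decompressing.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def parse_export(body: bytes) -> list:
    """Parse a fully received raw export body into a list of event dicts."""
    # Decompress only if the HTTP client has not already done so.
    if body[:2] == GZIP_MAGIC:
        body = gzip.decompress(body)
    events = []
    # The payload is JSONL: decode each non-empty line on its own,
    # never the whole body with a single json.loads call.
    for line in body.splitlines():
        if line.strip():
            events.append(json.loads(line))
    return events


# Hypothetical two-event payload, compressed the way the transfer would be.
sample = (
    b'{"event":"Viewed report","properties":{"distinct_id":"foo","time":1329263748}}\n'
    b'{"event":"Viewed report","properties":{"distinct_id":"bar","time":1329263801}}\n'
)
events = parse_export(gzip.compress(sample))
print(len(events), events[0]["properties"]["distinct_id"])
```

Splitting on bytes rather than on decoded text keeps the line boundaries limited to newline characters, which cannot appear unescaped inside a JSON string.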

Example use cases

  • You see a spike of 10K events, notice that only a few users contributed to it, and want to dive deeper into the data.
  • You are buying mobile ads and want to look at the exact UDIDs to see which users you can really attribute to an install.
  • You are doing some very custom analysis that Mixpanel cannot currently do. If this is the case, please email support@mixpanel.com so we can either improve our product or possibly show you how you can do it with us.

Export API reference

URI: https://data.mixpanel.com/api/2.0/export/

Please note: The URI is data.mixpanel.com and not just mixpanel.com.

Description:

Get a "raw dump" of tracked events over a time period.

Parameters:

  • from_date (string): The first date, in yyyy-mm-dd format, to return events for. This date is inclusive.
  • to_date (string): The last date, in yyyy-mm-dd format, to return events for. This date is inclusive.
  • event (array): The event or events that you wish to get data for, encoded as a JSON array. Example format: '["play song", "log in", "add playlist"]'
  • where (string): An expression to filter events by. See the expression section on the main data export API page.

Example URL:

https://data.mixpanel.com/api/2.0/export/?from_date=2012-02-14&to_date=2012-02-14&where=properties%5B%22%24os%22%5D+%3D%3D+%22Linux%22&event=%5B%22Viewed+report%22%5D
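
As a rough sketch, a URL like the one above can be assembled with Python's standard library. The helper name build_export_url is hypothetical, and authentication parameters are deliberately left out because this section does not describe them. The event list is serialized as a JSON array before URL-encoding, as the parameter description requires.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://data.mixpanel.com/api/2.0/export/"


def build_export_url(from_date, to_date, events=None, where=None):
    """Build an export URL; dates are yyyy-mm-dd strings, both inclusive."""
    params = {"from_date": from_date, "to_date": to_date}
    if where is not None:
        params["where"] = where
    if events is not None:
        # The event parameter must be a JSON-encoded array of names.
        params["event"] = json.dumps(events)
    # urlencode percent-escapes brackets, quotes and "$", and turns spaces into "+".
    return BASE_URL + "?" + urlencode(params)


url = build_export_url(
    "2012-02-14",
    "2012-02-14",
    events=["Viewed report"],
    where='properties["$os"] == "Linux"',
)
print(url)
```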

Return format:

{"event":"Viewed report","properties":{"distinct_id":"foo","time":1329263748,"origin":"invite","origin_referrer":"http://mixpanel.com/projects/","$initial_referring_domain":"mixpanel.com","$referrer":"https://mixpanel.com/report/3/stream/","$initial_referrer":"http://mixpanel.com/","$referring_domain":"mixpanel.com","$os":"Linux","origin_domain":"mixpanel.com","tab":"stream","$browser":"Chrome","Project ID":"3","mp_country_code":"US"}}

One event per line, sorted by increasing timestamp. Each line is a valid JSON object, although the response as a whole is not valid JSON but JSONL. Timestamps are expressed in seconds since January 1, 1970 in your project's timezone, not in UTC as a true epoch timestamp would be. This means that converting the raw exported timestamps with most epoch converters yields times with the wrong offset. For example, if your project is set to Pacific time, add 7 hours during daylight saving time or 8 hours otherwise (8 hours * 60 min * 60 secs = 28,800 seconds) to convert the timestamp to UTC.
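
The timestamp correction can be sketched in Python as follows. The function name project_time_to_utc_epoch is hypothetical, and the fixed Pacific offsets are assumptions used for illustration. The idea is to read the exported value as if it were UTC, which recovers the project-local wall clock, and then reinterpret that wall clock in the project's timezone.

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Fixed offsets for a project set to Pacific time (illustrative assumption).
PST = timezone(timedelta(hours=-8))  # standard time
PDT = timezone(timedelta(hours=-7))  # daylight saving time


def project_time_to_utc_epoch(exported_ts: int, project_tz: tzinfo) -> int:
    """Convert an exported project-local timestamp to a true UTC epoch timestamp."""
    # Decoding the value as UTC yields the wall-clock time in the project's timezone.
    wall_clock = datetime.fromtimestamp(exported_ts, tz=timezone.utc)
    # Relabel that wall clock with the project's timezone (no arithmetic happens here).
    local = wall_clock.replace(tzinfo=project_tz)
    # timestamp() on an aware datetime returns seconds since the UTC epoch.
    return int(local.timestamp())


# The sample event above was recorded in February, so standard time applies.
print(project_time_to_utc_epoch(1329263748, PST))  # 1329292548, i.e. 8 hours added
```

Passing zoneinfo.ZoneInfo("America/Los_Angeles") (Python 3.9+) instead of a fixed offset should select 7 or 8 hours according to the date of each event, provided the system has a timezone database installed.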
