If you ever need help, please email us at support@mixpanel.com

Exporting raw data you inserted into Mixpanel

Updated July 11, 2015

What is this API?

Every data point sent to Mixpanel is stored as JSON in our data store. This export API allows you to download your raw event data as it is received and stored within Mixpanel, complete with all event properties (including distinct_id) and the exact timestamp the event was fired. This returned raw JSON can then be used for a variety of tasks.

Things to keep in mind

  • The Raw Data Export API is not real-time. The /export endpoint begins to update every day at midnight according to your project's timezone. Generally the complete data set for the previous day will be available around 6am project time.
  • For this API, returned timestamps are expressed in seconds since January 1, 1970 in your project's timezone, not UTC. Most epoch converters assume that epoch timestamps are in UTC, so converting the raw exported timestamps directly will produce times with an incorrect offset. You must add back the offset between project time and UTC before storing or processing the data. For example, if your project is set to Pacific time, you would need to add 8 hours (8 hours * 60 min * 60 secs = 28,800 seconds) to the timestamp in order to convert it into UTC epoch time.
  • This endpoint uses gzip to compress the transfer; as a result, raw exports should not be processed until the file is received in its entirety. While this process is normally quick and results in a smaller file size, some large exports can take a few minutes to generate. Ensure the timeout set on the receiving client is large enough to account for this process (e.g. larger than 60 seconds).
  • Data returned from this endpoint is JSONL (newline-delimited JSON). Most receiving client libraries will automatically assume they are getting a single JSON string back and attempt to decode it. The output of this API is not valid JSON in aggregate, but each line of the output is a valid JSON object on its own. Thus, raw exports, once received in full, should be parsed line by line instead of as an array of JSON objects.
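As an illustration of the line-by-line parsing described in the last point, here is a minimal Python sketch. It assumes the export has already been downloaded in full and decompressed into a string; the function name parse_export and the sample rows are hypothetical and are not part of the Mixpanel API.

```python
import json

def parse_export(body: str) -> list:
    """Parse a fully received raw export (JSONL) into a list of event dicts."""
    events = []
    # Split on "\n" only: each event occupies exactly one line of the output.
    for line in body.split("\n"):
        line = line.strip()
        if not line:
            # Skip blank lines, such as the one after a trailing newline.
            continue
        # Each non-empty line is a complete JSON object on its own.
        events.append(json.loads(line))
    return events

# Hypothetical two-event export body, used here only for demonstration.
sample = (
    '{"event": "Viewed report", "properties": {"distinct_id": "foo", "time": 1329263748}}\n'
    '{"event": "Viewed report", "properties": {"distinct_id": "bar", "time": 1329263750}}\n'
)

events = parse_export(sample)
print(len(events), events[0]["properties"]["distinct_id"])  # 2 foo
```

Passing the whole body to json.loads in one call would fail, because the concatenated lines do not form a single JSON document.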

Requirements before using this API

The specific URL endpoint for /export is data.mixpanel.com, which differs from that of our main API. Some considerations around using this:

  • API authentication is the same with this endpoint as the main API.
  • You can utilize the same client libraries but you will have to change the endpoint to data.mixpanel.com instead of just mixpanel.com.

Useful examples of using this API

  • If you receive a spike of 10K events but notice that only a few users contributed to it and would like to dive deeper into the data.
  • If you are buying mobile ads and would like to dive deeper into the exact UDIDs and see who you really can attribute to the install.
  • If your boss does not believe Mixpanel and wants absolute proof!
  • If you are doing some very custom analysis Mixpanel cannot currently do. If this is the case, please email support@mixpanel.com so we can either improve our product or possibly show you how you can do it with us.

Method: export

URI: https://data.mixpanel.com/api/2.0/export/

Please note: The URI is data.mixpanel.com and not just mixpanel.com

Description

Get a "raw dump" of tracked events over a time period.

Parameters

Required Name Type Description
required from_date string

The date in yyyy-mm-dd format from which to begin querying for events. This date is inclusive.

required to_date string

The date in yyyy-mm-dd format at which to stop querying for events. This date is inclusive.

optional event array

The event or events that you wish to get data for, encoded as a JSON array.

Example format: '["play song", "log in", "add playlist"]'

optional where string

An expression to filter events by. See the expression section on the main data export API page.

optional bucket string

[Platform] - the specific data bucket you would like to query.

Example URL

https://data.mixpanel.com/api/2.0/export/?from_date=2012-02-14&expire=1329760783&sig=bbe4be1e144d6d6376ef5484745aac45
&to_date=2012-02-14&api_key=f0aa346688cee071cd85d857285a3464&
where=properties%5B%22%24os%22%5D+%3D%3D+%22Linux%22&event=%5B%22Viewed+report%22%5D
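
The query-string encoding used in the example URL can be reproduced with Python's standard library. The sketch below is only an illustration: the helper name build_export_url is hypothetical, and the authentication parameters (api_key, expire, sig) are deliberately left out because their computation is covered by the main API documentation rather than by this page.

```python
import json
from urllib.parse import urlencode

EXPORT_URL = "https://data.mixpanel.com/api/2.0/export/"

def build_export_url(from_date, to_date, events=None, where=None):
    """Build an unauthenticated /export URL from the documented parameters."""
    params = {"from_date": from_date, "to_date": to_date}
    if events:
        # The event parameter is a JSON array encoded as a string.
        params["event"] = json.dumps(events)
    if where:
        params["where"] = where
    # urlencode percent-encodes reserved characters and turns spaces into "+".
    return EXPORT_URL + "?" + urlencode(params)

url = build_export_url(
    "2012-02-14",
    "2012-02-14",
    events=["Viewed report"],
    where='properties["$os"] == "Linux"',
)
print(url)
```

The resulting event and where values match the encoded forms shown in the example URL above; a real request would still need the authentication parameters appended.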

Return format

{"event":"Viewed report","properties":{"distinct_id":"foo","time":1329263748,"origin":"invite",
"origin_referrer":"http://mixpanel.com/projects/","$initial_referring_domain":"mixpanel.com",
"$referrer":"https://mixpanel.com/report/3/stream/","$initial_referrer":"http://mixpanel.com/",
"$referring_domain":"mixpanel.com","$os":"Linux","origin_domain":"mixpanel.com","tab":"stream",
"$browser":"Chrome","Project ID":"3","mp_country_code":"US"}}

One event per line, sorted by increasing timestamp. Each line is a valid JSON object, although the return as a whole is not valid JSON but JSONL. Timestamps are expressed in seconds since January 1, 1970 in your project's timezone, not in UTC as a true epoch timestamp would be. This means that converting the raw exported timestamps with most epoch converters will produce times with the incorrect offset. For example, if your project is set to Pacific time, you would need to add 8 hours (60 min * 60 secs * 8 hours) to the timestamp in order to convert it into UTC.
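
The timestamp correction can be sketched in Python as follows. This is a minimal example under stated assumptions: the function name export_time_to_utc is hypothetical, and the project's UTC offset is supplied by the caller as a fixed number of hours (-8 for the Pacific example). A project in a zone that observes daylight saving time would presumably need the offset in effect on the event's date, which this sketch does not look up.

```python
from datetime import datetime, timezone

def export_time_to_utc(exported_seconds: int, utc_offset_hours: float) -> datetime:
    """Convert a project-time timestamp from the export into an aware UTC datetime.

    utc_offset_hours is the project's offset from UTC, e.g. -8 for the
    Pacific time example in the text (an assumption supplied by the caller).
    """
    # The exported value is shifted by the project's offset from UTC,
    # so subtracting that offset recovers the true UTC epoch value.
    true_epoch = exported_seconds - int(utc_offset_hours * 3600)
    return datetime.fromtimestamp(true_epoch, tz=timezone.utc)

# The "time" value from the sample event above, with a UTC-8 project offset.
utc_time = export_time_to_utc(1329263748, -8)
print(int(utc_time.timestamp()))  # 1329292548
print(utc_time.isoformat())       # 2012-02-15T07:55:48+00:00
```

The first printed value is the original timestamp plus 28,800 seconds, which is the 8-hour correction described above.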


Please let us know if you are interested in other return formats.