Replay API
replay-stream

Replay API

Methods 

Method Description
GET /replay/:stream_type Connect to the replay stream. For realtime PowerTrack, the Stream Type is 'powertrack'. For Volume Streams, Stream Types include 'sample10' (i.e. decahose), 'firehose', 'mentions', and 'compliance'.

Authentication 

All requests to the Replay API must use HTTP Basic Authentication, constructed from a valid email address and password combination used to log into your account at console.gnip.com. Credentials must be passed as the Authorization header for each request.


GET /replay 

Establishes a connection to the Replay data stream. Tweet data will be delivered for the time period specified, and user profile objects will reflect the referenced users at the time when the Replay API is running.

Please see HERE for details on consuming streaming data after the connection is established.

Request Method HTTP GET
Connection Type Keep-Alive

This should be specified in the header of the request.
URL Found on the stream's API Help page of your dashboard, the URL is built with Stream Type, Account Name and Stream Label tokens. For realtime PowerTrack, the Stream Type is 'powertrack'. For Volume Streams, Stream Types include 'sample10' (i.e. decahose), 'firehose', 'mentions', and 'compliance'.

Replay URLs have the following pattern:

https://gnip-stream.gnip.com/replay/{STREAM_TYPE}/accounts/{ACCOUNT_NAME}/publishers/twitter/{STREAM_LABEL}.json

For example, the Replay URL for realtime PowerTrack has the following pattern:

https://gnip-stream.gnip.com/replay/powertrack/accounts/{ACCOUNT_NAME}/publishers/twitter/{STREAM_LABEL}.json

For example, the Replay URL for Decahose has the following pattern:

https://gnip-stream.gnip.com/replay/sample10/accounts/{ACCOUNT_NAME}/publishers/twitter/{STREAM_LABEL}.json
Compression Gzip. To connect to the stream using Gzip compression, simply send an Accept-Encoding header in the connection request. The header should look like the following:

Accept-Encoding: gzip
Character Encoding UTF-8
Response Format JSON. The header of your request should specify JSON format for the response.
Rate Limit 5 requests per 5 minutes.
fromDate The oldest (starting) UTC timestamp from which the activities will be provided, must be in 'YYYYMMDDHHMM' format. Timestamp is in minute granularity and is inclusive (i.e. 12:00 includes the 00 minute). Valid times must be within the last 5 days, UTC time, and no more recent than 31 minutes before the current point in time. It's recommended that the fromDate and toDate should be within ~2 hours.
toDate The latest (ending) UTC timestamp to which the activities will be provided, must be in 'YYYYMMDDHHMM' format. Timestamp is in minute granularity and is exclusive (i.e. 12:30 does not include the 30th minute of the hour). Valid times must be within the last 5 days, UTC time, and no more recent than 30 minutes before the current point in time. It's recommended that the fromDate and toDate should be within ~2 hours.
Read Timeout Set a read timeout on your client, and ensure that it is set to a value beyond 30 seconds.
Support for Tweet edits Since all Replay requests are for Tweets posted at least 30 minutes ago, all Tweets returned by Replay will reflect their final edit state. All Tweet objects will include metadata that describes its edit history. See the "Edit Tweets" fundamentals page for more details.

Responses

The following responses may be returned by the API for these requests. Most error codes are returned with a string with additional details in the body. For non-200 responses, clients should attempt to reconnect.

Status Text Description
200 Success The connection was successfully opened, and new activities will be sent through until the end of the requested time period is reached, and a "Replay Request Completed" message is sent.
401 Unauthorized HTTP authentication failed due to invalid credentials. Log in to console.gnip.com with your credentials to ensure you are using them correctly with your request.
406 Not Acceptable Generally, this occurs where your client either fails to properly include the headers to accept gzip encoding from the stream, or specifies an unacceptable fromDate or toDate.

Will contain a JSON message indicating the issue -- e.g. "This connection requires compression. To enable compression, send an 'Accept-Encoding: gzip' header in your request and be ready to uncompress the stream as it is read on the client end." or "Invalid date for query parameter 'toDate'. Can't ask for tweets from within the past 30 minutes."
429 Rate Limited Your app has exceeded the limit on connection requests.
503 Service Unavailable Twitter server issue. Reconnect using an exponential backoff pattern. If no notice about this issue has been posted on the Twitter API Status Page, contact support.

"Request Completed" Message

Once a request has been completed, a "Replay Request Completed" message will be delivered through the stream prior to disconnecting inside a "info" JSON message. If your stream is disconnected prior to receiving this message, the request was not completed, and you will need to re-run the missing portion of the request.

A premature disconnection may occur especially where your client is not consuming activities quickly enough. In this scenario, the connection may send the "Completed" message, but the connection may close prior to your client receiving it due to the slow rate of consumption. In this scenario, your client should re-request the end-portion of the data to ensure completeness, based on the timestamps of the last Tweets received.

The "info" JSON message has the following structure:

{
  "info": {
    "message": "Replay Request Completed",
    "sent": "2016-05-27T22:15:50+00:00",
    "activity_count": 8874
  }
}

If any errors are associated with a completed Replay request, the "info" message will indicate that errors occurred and also list the minutes that were effected in the "minutes_failed" field. Here is an example:

{
  "info": {
    "message": "Replay Request Completed with Errors",
    "sent": "2016-05-27T16:00:02+00:00",
    "activity_count": 56333,
    "minutes_failed": [
      "2013-02-20T00:05:00+00:00",
      "2013-02-20T00:06:00+00:00"
    ]
  }
}

Users (or their client applications) should monitor for complete success of the Replay stream, and submit new Replay requests for any minutes that failed.

"Request Failed to Complete" Message

If a Replay request fails to complete, the "info" message will indicate the failure and also list the time range was was not processed. Here is an example:

{
  "info": {
    "message": "Replay Request Failed to Complete",
    "sent": "2016-06-27T16:37:13+00:00",
    "unprocessed_range": {
      "fromDate": "2016-06-26T00:00:00+00:00",
      "toDate": "2016-06-26T00:01:00+00:00"
    },
    "activity_count": 1822
  }
}

If this message is received another Replay request should be made based on the "fromDate" and "toDate" included in the "unprocessed_range" attribute.

Example curl Request

The following example request is accomplished using cURL on the command line, and requests the first hour of data from June 1, 2016.

curl --compressed -v -uexample@customer.com "https://gnip-stream.gnip.com/replay/powertrack/accounts/{ACCOUNT_NAME}/publishers/twitter/{STREAM_LABEL}json?fromDate=201606010000&toDate=201606010100"


Sample streams Replay Examples (Stream Types include 'sample10' (i.e. decahose), 'firehose', 'mentions')

Decahose, firehose, mentions note- All partitions from volume streams are delievered in a single Replay connection.

    curl --compressed -v -uexample@customer.com 
"https://gnip-stream.gnip.com/replay/sample10/accounts/{ACCOUNT_NAME}/publishers/twitter/{STREAM_LABEL}.json?fromDate=201712312330&toDate=201801010130"

Compliance Replay Examples

Compliance note- All partitions from Compliance Firehose are delievered in a single Replay connection.

curl --compressed -v -uexample@customer.com 
"https://gnip-stream.gnip.com/replay/compliance/accounts/{ACCOUNT_NAME}/publishers/twitter/{STREAM_LABEL}.json?fromDate=201712312330&toDate=201801010130"

PowerTrack Replay Examples

Connection to Replay to complete data during the 2018 New Year's eve disconnection:

    curl --compressed -v -uexample@customer.com 
"https://gnip-stream.gnip.com/replay/powertrack/accounts/{ACCOUNT_NAME}/publishers/twitter/{STREAM_LABEL}.json?fromDate=201712312330&toDate=201801010130"

Important Note: When using PowerTrack Replay, you must first add or manage the rules currently on the replay stream. PowerTrack rules are not automatically added to a Replay stream from a normal PowerTrack stream. Rules can be managed through the Rules API for a Replay stream. Please see the PowerTrack Rules API for specific details on managing rules.

Rules management on the PowerTrack replay:

curl -v -X POST -uexample@customer.com 
"https://gnip-api.x.com/rules/powertrack-replay/accounts/{ACCOUNT_NAME}/publishers/twitter/{STREAM_LABEL}.json" 
-d '{"rules":[{"value":"rule1","tag":"tag1"},{"value":"rule2"}]}'