Understanding the Alerts delivery mechanism in Elektron Data Platform

 

Introduction

Elektron Data Platform (EDP) is a cloud-based service that provides unified access to Refinitiv content and services. EDP offers a single point of API-driven access to content and is backed by standard authentication and entitlements.
The EDP distribution layer exposes standard REST-based APIs to clients. The requested data is delivered using the mechanism best suited to that particular data set. In this article we briefly look at the EDP delivery mechanisms and then take an in-depth look at Alerts delivery, which is a means of getting asynchronous events from EDP.

Delivery mechanisms

The data from an EDP service can be delivered through a variety of mechanisms, depending on the content set. The major content delivery types are:

  1. Request - Response - the most common type, where the data is delivered through a RESTful web service. An application uses a web request (HTTP GET, POST, PUT or DELETE) to convey the request message and parameters, and the EDP service responds with data synchronously.
  2. Alerts - a mechanism to receive asynchronous updates (alerts) that match the criteria of a subscription.
  3. Bulk - a mechanism used to deliver very large payloads, such as end-of-day pricing data for a whole venue.
  4. Streaming - a mechanism used for real-time or quasi-real-time delivery of messages, such as tick-by-tick data for an equity instrument.

This article explains Alerts delivery with a specific focus on News alerts, although the same concepts apply to Research alerts or any other service.

Alerts Delivery

The News dataset comprises news headlines and stories. An application can use the News Request-Response REST API to get the last few headlines. A link in the headline points to a news story, which can also be requested through a REST API call. Large stories are paginated (broken up) across multiple requests.

A more common use case for News, however, is that an application starts a subscription and then listens for updates to headlines and/or stories as they happen, in an asynchronous manner. This could feed a News blotter on a web page, or a streaming headline display on a large wall-board monitor. This is achieved through the Alerts delivery mechanism as shown in the image below:

Here,

  • The user application starts by expressing an interest in alerts to the News Source.
  • The News service creates a queue in the cloud and starts pushing messages that match the request criteria. A message can be a News headline or a story, depending on the request criteria.
  • The user application polls the queue for any available messages. Waiting messages are retrieved and removed from the queue.
  • Individual messages are intended for the requesting application only and have to be decrypted to get the content payload. No other application can read the messages, even if it gains access to the queue in the cloud.

Workflow to subscribe to alerts

Let's take a closer look at the API calls for starting and subscribing to news alerts. An application has to perform the following high-level tasks:

  1. Login
  2. Alerts subscription
  3. Get the cloud credentials
  4. Polling and retrieving messages
  5. Decrypting the message

1. Log in to the Elektron Data Platform and get an access token for the News service

Login is the first step and is common to all EDP API requests. EDP supports OAuth 2.0 authentication with the Password and Refresh grants. You can learn more about Login messages and implementation details from the EDP tutorials.
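
As a rough illustration of the two grant types, the sketch below builds the form-encoded request bodies that OAuth 2.0 (RFC 6749) defines for the Password and Refresh grants. Only the standard field names are shown. Any EDP-specific fields (such as client identifiers or scopes), the token endpoint URL and the response handling are not covered in this article and should be taken from the EDP tutorials. The function names are hypothetical.

```python
from urllib.parse import urlencode


def password_grant_body(username: str, password: str) -> str:
    """Form body for the OAuth 2.0 Password grant (RFC 6749, section 4.3)."""
    return urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
    })


def refresh_grant_body(refresh_token: str) -> str:
    """Form body for the OAuth 2.0 Refresh grant (RFC 6749, section 6)."""
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    })


if __name__ == "__main__":
    # Placeholder values only; real credentials come from your EDP account.
    print(password_grant_body("user@example.com", "secret"))
    print(refresh_grant_body("previous-refresh-token"))
```

The body would be sent as application/x-www-form-urlencoded in an HTTP POST; the access token returned is then used as the Bearer token in the later steps.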

2. Alerts subscription

The EDP News Alerts (or Research Alerts) service is invoked with request parameters that convey the delivery endpoint and the type of alerts to receive. An example request for breaking, English-language News for Apple Inc is:

{
    "transport": {
        "transportType": "AWS-SQS"
    },
    "filter":    {
        "urgency":    "1",
        "language" : "en",
        "subjects": "R:AAPL.O"
    }
}

Here, the user is asking for real-time top news and intends to use Amazon's Simple Queue Service (SQS) as the message distributor. Currently AWS-SQS is the only supported transport. Please refer to the News User Guide for all the options that can be used in the request. The News Headline service REST endpoint is: api.refinitiv.com/alerts/<version>/news-headlines-subscriptions.
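
A minimal sketch of how an application might assemble this request in Python. The transport and filter fields are the ones shown in the example request; the function names, the `version` argument and the `https://` scheme are assumptions for illustration. The block only builds the request and does not send it.

```python
import json


def build_subscription_body(subjects: str, language: str = "en", urgency: str = "1") -> str:
    """Return the JSON body for a news alert subscription request."""
    body = {
        "transport": {"transportType": "AWS-SQS"},  # currently the only supported transport
        "filter": {
            "urgency": urgency,
            "language": language,
            "subjects": subjects,
        },
    }
    return json.dumps(body)


def subscription_url(version: str, resource: str = "news-headlines-subscriptions") -> str:
    """Build the endpoint URL; take `version` from the News User Guide."""
    # The https:// scheme is assumed; the article gives only host and path.
    return f"https://api.refinitiv.com/alerts/{version}/{resource}"


if __name__ == "__main__":
    print(subscription_url("<version>"))
    print(build_subscription_body("R:AAPL.O"))
```

Passing a different `resource` value selects one of the other subscription types.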

There are three news content subscriptions available currently:

  news-headlines-subscriptions - for headline + metadata only
  news-stories-subscriptions - for headline + metadata + story body
  news-subscriptions - provides one headline message and one story message for each item, i.e. duplicate updates

On the EDP side, this request will:

  1. Create a queue in AWS.
  2. Start pumping messages that meet the request criteria (English news headlines for Apple Inc) into the queue.
  3. Return the parameters of the newly created queue, such as:

a. endpoint - the URL of the queue that was created for us.
b. cryptographyKey - the alert messages are encrypted before being pushed into the queue. This key is required for decryption; more on this below.
c. subscriptionID - a unique identifier that is used to close the subscription.

3. Get the credentials to a cloud queue

Applications need credentials to access the third-party cloud-based message queue. EDP, which owns the queue, can generate these client credentials. In this step, the application invokes the EDP cloud-credentials service and passes in the URL of the previously created queue endpoint.

EDP responds with an accessKeyId, a secretKey and a sessionToken, which together form the cloud credentials. These credentials expire every hour, so the application has to renew them periodically.

Sample interaction (HTTP headers omitted and JSON prettified for display):

GET https://emea1.apps.cp.thomsonreuters.com/eds/auth/cloud-credentials/beta1/?endpoint=<<QUEUE ENDPOINT>> HTTP/1.1
Host: emea1.apps.cp.thomsonreuters.com
Authorization: Bearer ****

 

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 1541
{
    "credentials": {
        "accessKeyId": "***",
        "secretKey": "****",
        "sessionToken": "****",
        "expiration": "2018-09-27T16:14:24.000Z"
    },
    "endpoint": "https://sqs.us-east-1.amazonaws.com/****"
}
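
To illustrate the renewal requirement, the sketch below parses a response shaped like the sample above and decides whether the credentials should be refreshed. The field names come from the sample; the function names and the five-minute safety margin are assumptions.

```python
import json
from datetime import datetime, timedelta, timezone


def parse_cloud_credentials(response_text: str) -> dict:
    """Extract the credential fields and the queue endpoint from the response."""
    data = json.loads(response_text)
    creds = data["credentials"]
    # Timestamps in the sample look like 2018-09-27T16:14:24.000Z (UTC).
    expiry = datetime.strptime(creds["expiration"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "accessKeyId": creds["accessKeyId"],
        "secretKey": creds["secretKey"],
        "sessionToken": creds["sessionToken"],
        "expiration": expiry.replace(tzinfo=timezone.utc),
        "endpoint": data["endpoint"],
    }


def needs_renewal(creds: dict, now: datetime,
                  margin: timedelta = timedelta(minutes=5)) -> bool:
    """True when the credentials expire within `margin` of `now`."""
    return now >= creds["expiration"] - margin
```

An application could call `needs_renewal(creds, datetime.now(timezone.utc))` before each poll and request fresh credentials from the cloud-credentials service when it returns True.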

 

4. Polling and retrieving messages

The cloud credentials enable an application to perform inquiry and retrieval operations on the queue. It is now the application's responsibility to:

  1. Poll the queue endpoint to see if any messages are waiting.
  2. Receive the messages, if available.
  3. Delete each message from the queue once it has been processed, to avoid receiving duplicates.

Any message that is not retrieved from the queue automatically expires after 14 days and is lost permanently.

The semantics of message retrieval are specific to the cloud provider. For an Amazon SQS queue, please refer to this documentation. AWS also provides value-add libraries in various programming languages, which abstract the underlying heavy lifting of login and XML processing into simple API calls and callbacks.
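
The poll, receive and delete responsibilities listed above can be sketched as a small polling function. The sketch assumes a boto3-style SQS client and uses its `receive_message` and `delete_message` calls; creating that client from the EDP cloud credentials is shown only as a comment. `poll_once` and `handle` are hypothetical names.

```python
from typing import Callable


def poll_once(sqs, queue_url: str, handle: Callable[[str], None]) -> int:
    """Receive one batch of messages, process each one, then delete it.

    `sqs` is expected to behave like a boto3 SQS client.
    Returns the number of messages processed.
    """
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # SQS returns at most 10 messages per call
        WaitTimeSeconds=20,      # long polling: wait up to 20 s for a message
    )
    messages = response.get("Messages", [])  # key is absent when nothing is waiting
    for message in messages:
        handle(message["Body"])  # body is still encrypted and base64 encoded
        # Delete only after processing, so a failure does not lose the message.
        sqs.delete_message(QueueUrl=queue_url,
                           ReceiptHandle=message["ReceiptHandle"])
    return len(messages)


# With boto3 installed, the client could be created from the cloud credentials:
#   import boto3
#   sqs = boto3.client(
#       "sqs",
#       region_name="us-east-1",  # assumed from the sample queue endpoint
#       aws_access_key_id=creds["accessKeyId"],
#       aws_secret_access_key=creds["secretKey"],
#       aws_session_token=creds["sessionToken"],
#   )
```

Calling `poll_once` in a loop, and recreating the client whenever the hourly credentials are renewed, gives a basic alert listener.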

Sample interaction for AWS-SQS (HTTP headers omitted and XML prettified for display):

POST https://sqs.us-east-1.amazonaws.com/ HTTP/1.1
Authorization: AWS4-HMAC-SHA256 Credential=****
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
x-amz-security-token: ****

Action=ReceiveMessage&MaxNumberOfMessages=10&QueueUrl=<<QUEUE ENDPOINT>>&Version=2012-11-05&WaitTimeSeconds=20

 

HTTP/1.1 200 OK
Content-Type: text/xml
Content-Length: 8543

<?xml version="1.0"?>
<ReceiveMessageResponse xmlns="http://queue.amazonaws.com/doc/2012-11-05/">
  <ReceiveMessageResult>
    <Message>
      <MessageId>5c9e6038-9265-4fd5-b9c9-eca9bc8aa0eb</MessageId>
      <ReceiptHandle>****</ReceiptHandle>
      <MD5OfBody>ddd69f288ee1cd2f6a1c2ddacc3b91c7</MD5OfBody>
      <Body>__encrypted_and_encoded_messages__</Body>
    </Message>
  </ReceiveMessageResult>
  <ResponseMetadata>
    <RequestId>2419fb87-557a-52dd-9a75-afa1a36d30aa</RequestId>
  </ResponseMetadata>
</ReceiveMessageResponse>
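
For applications that call the SQS HTTP API directly instead of using an AWS library, the fields of interest can be pulled out of a response like the one above with the Python standard library. This is a sketch; the function name is hypothetical, and the namespace is the one shown in the sample response.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the ReceiveMessageResponse element in the sample.
SQS_NS = {"sqs": "http://queue.amazonaws.com/doc/2012-11-05/"}


def extract_messages(xml_text: str) -> list:
    """Return (receipt_handle, body) pairs from a ReceiveMessageResponse document."""
    root = ET.fromstring(xml_text)
    pairs = []
    for message in root.findall("sqs:ReceiveMessageResult/sqs:Message", SQS_NS):
        handle = message.findtext("sqs:ReceiptHandle", namespaces=SQS_NS)
        body = message.findtext("sqs:Body", namespaces=SQS_NS)
        pairs.append((handle, body))
    return pairs
```

The receipt handle is what a later DeleteMessage call needs; the body is the encrypted, base64-encoded alert.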

5. Decrypting the message

The messages in the queue are encrypted with a key (the cryptographyKey) that is known only to the application that created the request. This ensures that even if another application or party were to get hold of the cloud-based queue and retrieve messages from it, the payload would be undecipherable. The messages are first encrypted using AES-256 in GCM mode, and then base64 encoded before being pushed into the queue.

The key parameters with which an alert message is encrypted are:

AES Key Length = 256 bits (32 bytes)
GCM AAD Length = 16 bytes
GCM TAG Length = 16 bytes
GCM NONCE Length = 12 bytes

The 256-bit AES cryptographyKey was provided to the application in Step 2 above.

The message structure after base64 decoding is shown below:

To decrypt the message, the application should:

  1. Prepare the AES-256 secret key by base64 decoding the cryptographyKey.
  2. Prepare the cipher text bytes by base64 decoding the message body.
  3. Remove the Tag from the cipher text.
  4. Prepare the AAD by copying the first 16 bytes from the cipher text.
  5. Prepare the NONCE (IV) by copying the last 12 bytes of the AAD (bytes 4 to 15, counting from zero).
  6. Decrypt with AES-GCM, tag length = 16 bytes, using the inputs from steps 1, 3, 4 and 5.

Most programming languages provide cryptography libraries, such as javax.crypto.* in Java and pycryptodome in Python, to handle the underlying task of decoding and decryption.
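
The decryption steps can be sketched in Python as follows. The byte layout is an assumption where the text does not spell it out: the sketch takes the tag to be the last 16 bytes, the encrypted payload to lie between the AAD and the tag, and the nonce to be the last 12 bytes of the AAD. The split uses only the standard library; the decryption itself relies on pycryptodome (`AES.new`, `update`, `decrypt_and_verify`), imported inside the function so the splitting logic works without it. The function names are hypothetical.

```python
import base64
import json

GCM_AAD_LENGTH = 16
GCM_TAG_LENGTH = 16
GCM_NONCE_LENGTH = 12


def split_message(body_b64: str):
    """Base64 decode a queue message and split it into (aad, nonce, ciphertext, tag)."""
    raw = base64.b64decode(body_b64)
    if len(raw) < GCM_AAD_LENGTH + GCM_TAG_LENGTH:
        raise ValueError("message too short to contain AAD and tag")
    aad = raw[:GCM_AAD_LENGTH]
    nonce = aad[-GCM_NONCE_LENGTH:]   # bytes 4..15 of the AAD
    tag = raw[-GCM_TAG_LENGTH:]       # assumed: tag is appended at the end
    ciphertext = raw[GCM_AAD_LENGTH:-GCM_TAG_LENGTH]  # assumed: payload sits in between
    return aad, nonce, ciphertext, tag


def decrypt_message(cryptography_key_b64: str, body_b64: str) -> dict:
    """Decrypt one alert message and return the JSON payload as a dict."""
    # pycryptodome; imported here so split_message can be used without it.
    from Crypto.Cipher import AES

    key = base64.b64decode(cryptography_key_b64)  # 32 bytes for AES-256
    aad, nonce, ciphertext, tag = split_message(body_b64)
    cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
    cipher.update(aad)  # the AAD is authenticated but not encrypted
    plaintext = cipher.decrypt_and_verify(ciphertext, tag)
    return json.loads(plaintext)
```

`decrypt_and_verify` raises a ValueError if the tag does not match, which is the signal that the key, the layout assumption or the message itself is wrong.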

Message Payload

The decrypted message payload is a JSON data structure that contains all the information for that particular alert, whether News or Research. Please refer to the News User Guide or the Research API User Guide for information on this structure.

The JSON structure contains metadata about the alert message, and either a payload key or an href key. The payload key is used when all the information pertaining to the alert can be included in the packed message. For very large payloads, such as a news story or company report, a file is made available for download in the cloud service, and the href key contains the claim check (URL) of this large payload. A claim check is typically valid for 14 days. The application should be designed to follow the URL in the href key to retrieve the content.
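
A sketch of how an application might handle the two cases. The `payload` and `href` keys are the ones described above; `resolve_content` and the `fetch` callable (which would perform the actual download of the claim check URL) are hypothetical.

```python
from typing import Callable


def resolve_content(message: dict, fetch: Callable[[str], str]):
    """Return the alert content, following the claim check when there is no inline payload."""
    if "payload" in message:
        # Content was small enough to be packed into the alert itself.
        return message["payload"]
    if "href" in message:
        # Large content: download it from the claim check URL.
        return fetch(message["href"])
    raise KeyError("message has neither 'payload' nor 'href'")
```

Keeping the download behind a callable lets the application decide how to authenticate and retry the request, and makes the branching easy to test without network access.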

Sample News structure with payload:

{
    "attributes": [{
            "domain": {
                "type": "string",
                "value": "headline"
            }
        }
    ],
    "envelopeVersion": "1.0",
    "ecpMessageID": "urn:newsml:reuters.com:20181016:nL8N1WW4V6:2",
    "sourceSeqNo": 2539219,
    "distributionSeqNo": 4,
    "sourceTimestamp": "2018-10-16T13:45:00.000Z",
    "receiveTimestamp": "2018-10-16T13:45:00.167Z",
    "distributionTimestamp": "2018-10-16T13:45:00.496Z",
    "payloadVersion": "1.0",
    "subscriptionID": "d61b1c7c-670f-419e-8d48-161f0ccba823",
    "payload": {
        "newsMessage": {
            "_xmlns:xs": "http://www.w3.org/2001/XMLSchema",
            "_xsi:schemaLocation": "... http://www.reuters.com/ns/2003/08/content ...",
            "_xmlns": "http://iptc.org/std/nar/2006-10-01/",
            .
            .
            .

 

Sample News structure with claim check:

{
    "attributes": [{
            "domain": {
                "type": "string",
                "value": "story"
            }
        }
    ],
    "envelopeVersion": "1.0",
    "ecpMessageID": "urn:newsml:reuters.com:20180718:nASX572h10:1",
    "sourceSeqNo": 150877,
    "distributionSeqNo": 127,
    "sourceTimestamp": "2018-07-18T08:15:49.236Z",
    "receiveTimestamp": "2018-07-18T08:15:51.263Z",
    "distributionTimestamp": "2018-07-18T09:29:36.169Z",
    "payloadVersion": "1.0",
    "subscriptionID": "b32c64f7-c82b-4d23-9055-257cdd1b3e51",
    "href": "https://***/file-store/files/3b5d1a9c-9eee-4928-9640-18e8a06b1d8a/stream...",
    .

 

Sample Research structure:

{
    "envelopeVersion": "1.0",
    "ecpMessageID": "33393f03-0b75-3785-8737-634e41f1ce24",
    "sourceSeqNo": 15402695631422336,
    "contentServicePermID": 9000001,
    "distributionSeqNo": 1,
    "sourceTimestamp": "2018-10-23T17:29:15.417Z",
    "receiveTimestamp": "2018-10-23T17:29:15.833Z",
    "distributionTimestamp": "2018-10-23T17:29:16.170Z",
    "payloadVersion": "1.0",
    "subscriptionID": "fb3a2228-7b9b-413d-aca5-5fa099d78a89",
    "payload": {
        "rptStylesResp": [{
                "uid": 150000002,
                "value": "RPT_CO"
            }
        ],
        "fileName": "83216082.pdf",
        "authorRating": 3,
        "companyName": "**** Capital Markets",
        "fileExt": "pdf",
        "pages": 19,
        .
        .
        .

Output

The following is sample output from an alerts service. You can follow the code used to produce it in the EDP tutorial on Alerts implementation sample in Python.

{
 "attributes": [{
   "domain": {
    "type": "string",
    "value": "headline"
   }
  }
 ],
 "envelopeVersion": "1.0",
 "ecpMessageID": "urn:newsml:reuters.com:20181113:nFWN1XO0WI:1",
 "sourceSeqNo": 408579,
 "distributionSeqNo": 1,
 .
 .
 .
 "payload": {

       "itemClass": {
        "_rtr:msgType": "S",
        "_qcode": "ninat:text"
       },
       "versionCreated": {
        "$": "2018-11-13T19:18:47.25Z"
       },
       "generator": [{
         "$": "UCDP:parsn_alerts_rsf_10.54.4.67_1.2.39621:",
         "_versioninfo": "1.0.39621"
        }
       ],
       "title": [{
         "$": "LG OBERLIN, LLC FILES TO SAY IT HAS RAISED $22.2 MLN IN EQUITY FINANCING - SEC FILING"
        }
       ],
	   .
	   .
}

 

References

EDP tutorial on Alerts implementation sample in Python
News User Guide
Research API User Guide