Open Calais website: www.opencalais.com
Calais is a centralized web service for extracting structured information from unstructured text. It processes the text you submit and returns Entities, Topic codes, Events, Facts and SocialTags. It also links the identified entities to Thomson Reuters entity masters, allowing easy integration of proprietary unstructured content, such as news or client portfolios, with Thomson Reuters data.
Calais processes textual documents and provides automated, consistent metadata tagging, using standard Thomson Reuters identifiers and schemes. Once processed by Calais, various content types, such as news, research and company filings, can be easily linked to each other across products, because they all "speak the same language".
The metadata tagging process combines a number of methods that analyze the semantic content of a document in order to achieve high levels of accuracy. These include statistical and machine-learning methods as well as custom pattern-based methods, all developed in-house by the Text Metadata Services Natural Language Processing team.
Make a request to Calais to obtain entity and event annotation on your documents, linking your unstructured content to Thomson Reuters Open PermID.
Send a request to tag a document with the Open PermIDs of Entities and Events. The request can be used in synchronous mode. Provide your document in the body of the request.
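As an illustration of placing the document in the request body, here is a minimal Python sketch. The endpoint URL, the header names (X-AG-Access-Token, Content-Type, outputFormat) and their values are assumptions not stated in this description, so check them against the current Calais API guide before use. The helper names build_tagging_request and send_tagging_request are hypothetical.

```python
import json
import urllib.request

# Assumption: the endpoint URL is not given in this description.
CALAIS_URL = "https://api.thomsonreuters.com/permid/calais"


def build_tagging_request(text, token, url=CALAIS_URL):
    """Build (but do not send) a POST request with the document in the body."""
    headers = {
        "X-AG-Access-Token": token,          # assumed name of the API key header
        "Content-Type": "text/raw",          # assumed content type for plain text
        "outputFormat": "application/json",  # assumed header selecting JSON output
    }
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),  # the document itself is the request body
        headers=headers,
        method="POST",
    )


def send_tagging_request(request, timeout=30):
    """Send the request synchronously and parse the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call would look like send_tagging_request(build_tagging_request(document_text, "YOUR_API_KEY")), where the key is a placeholder for your own access token.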
The text to tag can be sent in several formats. The tagging output contains the Entities, Events and Categories identified in the story. Metadata items may carry a relevance or importance indicator describing how meaningful they are in the story: is the story centered on this entity or event, or does it merely mention it?
When applicable, metadata items contain a mapping to Thomson Reuters Open PermID, allowing you to connect and browse further information about the entities identified in text.
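The relevance indicator and the PermID mapping can be used together to keep only the items a story is really about. The sketch below assumes a JSON response shaped as a dictionary of metadata items, each with optional "name", "_type", "relevance" and "resolutions" fields, where a resolution may hold a "permid". These field names are assumptions about the output format, the function relevant_items is hypothetical, and the sample data is invented purely for illustration.

```python
def relevant_items(response, min_relevance=0.5):
    """Return items at or above min_relevance, most relevant first."""
    selected = []
    for item in response.values():
        if not isinstance(item, dict):
            continue
        relevance = item.get("relevance")
        if relevance is None or relevance < min_relevance:
            continue
        # Assumed layout: PermIDs, when present, sit inside "resolutions".
        permids = [r["permid"] for r in item.get("resolutions", []) if "permid" in r]
        selected.append({
            "name": item.get("name"),
            "type": item.get("_type"),
            "relevance": relevance,
            "permid": permids[0] if permids else None,
        })
    return sorted(selected, key=lambda entry: entry["relevance"], reverse=True)


# Invented sample response, for illustration only.
sample = {
    "item-1": {"_type": "Company", "name": "Example Corp", "relevance": 0.8,
               "resolutions": [{"permid": "0000000001"}]},
    "item-2": {"_type": "Person", "name": "Jane Doe", "relevance": 0.2},
}

for entry in relevant_items(sample):
    print(entry["name"], entry["relevance"], entry["permid"])
```

With the default threshold, only the first sample item is kept, because the second is mentioned with low relevance. Items without a PermID mapping are still returned, with "permid" set to None.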