Thrift module: communication

Module: communication
Services: (none)
Data types: Communication, CommunicationSet, CommunicationTagging
Constants: (none)

Data structures

Struct: Communication

Key Field Type Description Requiredness Default value
1 id string Stable identifier for this communication, identifying both the name of the source corpus and the document that it corresponds to in that corpus. required
2 uuid uuid.UUID Universally unique identifier for this communication instance. This is generated randomly, and can *not* be mapped back to the source corpus. It is used as a target for symbolic "pointers". required
3 type string A short, corpus-specific term characterizing the nature of the communication; may change in a future version of Concrete. Often used for filtering. For example, Gigaword uses the type "story" to distinguish typical news articles from weekly summaries ("multi"), editorial advisories ("advis"), etc. At present, this value is typically a literal form from the originating corpus; as a result, a type marked "other" may have different meanings across different corpora. required
4 text string The full text contents of this communication in its original form, or in the least-processed form available, if the original is not available. optional
5 startTime i64 The time when this communication started, in Unix time (UTC), i.e., seconds since January 1, 1970. optional
6 endTime i64 The time when this communication ended, in Unix time (UTC), i.e., seconds since January 1, 1970. optional
7 communicationTaggingList list< CommunicationTagging > A list of CommunicationTagging objects that can support this Communication. CommunicationTagging objects can be used to annotate Communications with topics, gender identification, etc. optional
8 metadata metadata.AnnotationMetadata Annotation metadata for this particular communication. Communications derived from other communications should record in this metadata object their dependency on the original communication's ID. required
9 keyValueMap map< string , string > A catch-all store of keys and values. Use sparingly! optional
10 lidList list< language.LanguageIdentification > Theories about the languages that are present in this communication. optional
11 sectionList list< structure.Section > Theory about the block structure of this communication. optional
12 entityMentionSetList list< entities.EntityMentionSet > Theories about which spans of text are used to mention entities in this communication. optional
13 entitySetList list< entities.EntitySet > Theories about what entities are discussed in this communication, with pointers to individual mentions. optional
14 situationMentionSetList list< situations.SituationMentionSet > Theories about what situations are explicitly mentioned in this communication. optional
15 situationSetList list< situations.SituationSet > Theories about what situations are asserted in this communication. optional
16 originalText string Optional original text field that points back to an original communication. This field can be populated for the sake of convenience when creating "perspective" communications (communications that are based on highly destructive changes to an original communication, e.g., via machine translation). It allows developers to quickly access the original text on which this perspective communication is based. optional
20 sound audio.Sound The full audio contents of this communication in its original form, or in the least-processed form available, if the original is not available. optional
21 communicationMetadata metadata.CommunicationMetadata Metadata about this specific Communication, such as information about its author, information specific to this Communication or Communications like it (info from an API, for example), etc. optional

A single communication instance, containing linguistic content generated by a single speaker or author. This type is used for both inter-personal communications (such as phone calls or conversations) and third-party communications (such as news articles). Each communication instance is grounded by its original (unannotated) contents, which should be stored in either the "text" field (for text communications) or the "sound" field (for audio communications). If the communication is not available in its original form, then these fields should store the communication in the least-processed form available.
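As an illustration of the required fields and the Unix-time convention described above, here is a minimal sketch. It uses plain Python dataclasses as stand-ins, not the Thrift-generated classes. The `AnnotationMetadata` fields (`tool`, `timestamp`), the helper names `to_unix_seconds` and `new_communication`, and all sample values are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional
from uuid import uuid4


@dataclass
class AnnotationMetadata:
    """Hypothetical stand-in for metadata.AnnotationMetadata (fields assumed)."""
    tool: str
    timestamp: int  # seconds since the Unix epoch, UTC


@dataclass
class Communication:
    """Illustrative mirror of a subset of the Communication fields."""
    id: str                          # key 1, required
    uuid: str                        # key 2, required (uuid.UUID in the struct)
    type: str                        # key 3, required
    metadata: AnnotationMetadata     # key 8, required
    text: Optional[str] = None       # key 4, optional
    startTime: Optional[int] = None  # key 5, optional, Unix seconds
    endTime: Optional[int] = None    # key 6, optional, Unix seconds
    keyValueMap: Dict[str, str] = field(default_factory=dict)  # key 9, optional


def to_unix_seconds(moment: datetime) -> int:
    """Convert a timezone-aware datetime to whole seconds since 1970-01-01 UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return int(moment.timestamp())


def new_communication(comm_id: str, comm_type: str, text: str,
                      started: datetime, tool: str) -> Communication:
    """Build a text communication with a freshly generated random UUID."""
    now = to_unix_seconds(datetime.now(timezone.utc))
    return Communication(
        id=comm_id,
        uuid=str(uuid4()),  # random, so it cannot be mapped back to the corpus
        type=comm_type,
        metadata=AnnotationMetadata(tool=tool, timestamp=now),
        text=text,
        startTime=to_unix_seconds(started),
    )


# Hypothetical sample values, for illustration only.
comm = new_communication(
    comm_id="example_corpus_doc_0001",
    comm_type="story",
    text="Hello, world.",
    started=datetime(2000, 1, 1, tzinfo=timezone.utc),
    tool="example-ingester",
)
```

Rejecting naive datetimes is a design choice of this sketch: it keeps `startTime` unambiguous, since the field is defined relative to UTC.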

Struct: CommunicationSet

Key Field Type Description Requiredness Default value
1 communicationIdList list< uuid.UUID > A list of Communication UUIDs that this CommunicationSet represents. This field may be absent if this CommunicationSet represents a large corpus. If absent, the 'corpus' field should be present. optional
2 corpus string The name of a corpus or other document body that this CommunicationSet represents. Should be present if 'communicationIdList' is absent. optional
3 entityMentionClusterList list< cluster.Clustering > A list of Clustering objects that represent a group of EntityMentions that are a part of this CommunicationSet. optional
4 entityClusterList list< cluster.Clustering > A list of Clustering objects that represent a group of Entities that are a part of this CommunicationSet. optional
5 situationMentionClusterList list< cluster.Clustering > A list of Clustering objects that represent a group of SituationMentions that are a part of this CommunicationSet. optional
6 situationClusterList list< cluster.Clustering > A list of Clustering objects that represent a group of Situations that are a part of this CommunicationSet. optional

A structure that represents a collection of Communications.
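Both 'communicationIdList' and 'corpus' are optional at the Thrift level, but the field descriptions say that 'corpus' should be present when 'communicationIdList' is absent. The sketch below shows one way a consumer might check that convention. It uses a plain dataclass as a stand-in for the generated class, and the helper name `identifies_members` is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CommunicationSet:
    """Illustrative mirror of the two identifying CommunicationSet fields."""
    communicationIdList: Optional[List[str]] = None  # key 1, optional
    corpus: Optional[str] = None                     # key 2, optional


def identifies_members(comm_set: CommunicationSet) -> bool:
    """Return True if the set names its members by UUID list or by corpus."""
    return comm_set.communicationIdList is not None or comm_set.corpus is not None


# Hypothetical sample values, for illustration only.
by_corpus = CommunicationSet(corpus="example_corpus")
by_ids = CommunicationSet(communicationIdList=["00000000-0000-0000-0000-000000000001"])
empty = CommunicationSet()
```

Thrift itself will not enforce this rule, since both fields are declared optional, so a check like this would have to live in application code.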

Struct: CommunicationTagging

Key Field Type Description Requiredness Default value
1 uuid uuid.UUID A unique identifier for this CommunicationTagging object. required
2 metadata metadata.AnnotationMetadata AnnotationMetadata to support this CommunicationTagging object. required
3 taggingType string A string that captures the type of this CommunicationTagging object. For example: 'topic' or 'gender'. required
4 tagList list< string > A list of strings that represent different tags related to the taggingType. For example, if the taggingType is 'topic', some example tags might be 'politics', 'science', etc. optional
5 confidenceList list< double > A list of doubles, parallel to the list of strings in tagList, that gives the confidence of each tag. optional

A structure that represents a 'tagging' of a Communication. These might be labels or annotations on a particular communication. For example, this structure might be used to describe the topics discussed in a Communication. The taggingType might be 'topic', and the tagList might include 'politics' and 'science'.
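Because 'confidenceList' is described as parallel to 'tagList', a reader typically pairs the two by index. The following is a minimal sketch of that pairing. It uses a plain dataclass holding only the three relevant fields rather than the generated class, and the helper name `tags_with_confidence` and the sample scores are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CommunicationTagging:
    """Illustrative mirror of a subset of the CommunicationTagging fields."""
    taggingType: str                              # key 3, required
    tagList: Optional[List[str]] = None           # key 4, optional
    confidenceList: Optional[List[float]] = None  # key 5, optional


def tags_with_confidence(
    tagging: CommunicationTagging,
) -> List[Tuple[str, Optional[float]]]:
    """Pair each tag with its confidence, or with None if no scores are given."""
    tags = tagging.tagList or []
    if tagging.confidenceList is None:
        return [(tag, None) for tag in tags]
    if len(tagging.confidenceList) != len(tags):
        raise ValueError("confidenceList must be parallel to tagList")
    return list(zip(tags, tagging.confidenceList))


# Hypothetical sample values, for illustration only.
topics = CommunicationTagging(
    taggingType="topic",
    tagList=["politics", "science"],
    confidenceList=[0.75, 0.25],
)
pairs = tags_with_confidence(topics)
```

Raising on a length mismatch is a choice made here so that a malformed tagging is caught early instead of being silently truncated by `zip`.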