Thrift module: entities

Module: entities
Data types: Entity, EntityMention, EntityMentionSet, EntitySet

Data structures

Struct: Entity

Key Field Type Description Requiredness Default value
1 uuid uuid.UUID Unique identifier for this entity. required
6 id string A corpus-specific and stable id such as a Freebase mid or a DBpedia id. optional
2 mentionIdList list< uuid.UUID > A list of pointers to all of the mentions of this Entity's referent. (type=EntityMention) required
7 rawMentionList list< structure.TokenRefSequence > A list of pointers to all of the sentences which contain a mention of this Entity. optional
3 type string The basic type of this entity's referent. optional
4 confidence double Confidence score for this individual entity. You can also set a confidence score for an entire EntitySet using the EntitySet's metadata. optional
5 canonicalName string A string containing a representative, canonical, or "best" name for this entity's referent. This string may match one of the mentions' text strings, but it is not required to. optional
8 propertyList list< property.Property > For multi-label tasks, more than one property can be attached to a single entity. A list of these properties can be stored in this field. optional

A single referent (or "entity") that is referred to at least once
in a given communication, along with pointers to all of the
references to that referent. The referent's type (e.g., is it a
person, or a location, or an organization, etc.) is also recorded.

Because each Entity contains pointers to all references to a
referent within a given communication, an Entity can be
thought of as a coreference set.
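
The coreference-set view can be illustrated with a minimal Python sketch. The dataclasses below are hypothetical stand-ins that mirror only a few of the fields listed above; they are not the Thrift-generated classes, and UUIDs are simplified to plain strings. The helper name coreference_set is likewise an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EntityMention:
    # Hypothetical stand-in: only the fields needed for this sketch.
    uuid: str
    text: Optional[str] = None


@dataclass
class Entity:
    # Hypothetical stand-in mirroring fields 1, 2, 3 and 5 of the struct.
    uuid: str
    mentionIdList: List[str]
    type: Optional[str] = None
    canonicalName: Optional[str] = None


def coreference_set(entity: Entity,
                    mentions_by_uuid: Dict[str, EntityMention]) -> List[EntityMention]:
    """Resolve an Entity's mention pointers into the mentions themselves."""
    return [mentions_by_uuid[mention_id] for mention_id in entity.mentionIdList]


mentions = [
    EntityMention("m1", "Barack Obama"),
    EntityMention("m2", "he"),
    EntityMention("m3", "Chicago"),
]
index = {m.uuid: m for m in mentions}

obama = Entity(uuid="e1", mentionIdList=["m1", "m2"],
               type="PER", canonicalName="Barack Obama")

chain = coreference_set(obama, index)
print([m.text for m in chain])
```

Here the mentions "Barack Obama" and "he" form one coreference set because the same Entity points to both, while "Chicago" would belong to a different Entity.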

Struct: EntitySet

Key Field Type Description Requiredness Default value
1 uuid uuid.UUID Unique identifier for this set. required
2 metadata metadata.AnnotationMetadata Information about where this set came from. required
3 entityList list< Entity > List of entities in this set. required
4 linkingList list< linking.Linking > Entity linking annotations associated with this EntitySet. optional
5 mentionSetId uuid.UUID An optional UUID pointer to an EntityMentionSet. If this field is present, consumers can assume that all Entity objects in this EntitySet have EntityMentions that are included in the named EntityMentionSet. optional

A theory about the set of entities that are present in a
message. See also: Entity.
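
The guarantee attached to mentionSetId can be checked mechanically. The following Python sketch assumes simplified, hypothetical stand-ins for the structs (plain-string UUIDs, and a list of mention UUIDs in place of the full mentionList); the function name dangling_mention_ids is an illustrative assumption, not part of any library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    # Hypothetical stand-in: UUIDs are simplified to plain strings.
    uuid: str
    mentionIdList: List[str]


@dataclass
class EntityMentionSet:
    # Hypothetical stand-in: holds only the UUIDs of the mentions in mentionList.
    uuid: str
    mentionIds: List[str]


@dataclass
class EntitySet:
    uuid: str
    entityList: List[Entity]
    mentionSetId: Optional[str] = None


def dangling_mention_ids(entity_set: EntitySet,
                         mention_set: EntityMentionSet) -> List[str]:
    """Return mention ids used by the EntitySet but absent from the mention set."""
    if entity_set.mentionSetId != mention_set.uuid:
        raise ValueError("EntitySet does not point at this EntityMentionSet")
    known = set(mention_set.mentionIds)
    return [mention_id
            for entity in entity_set.entityList
            for mention_id in entity.mentionIdList
            if mention_id not in known]


mention_set = EntityMentionSet(uuid="ms1", mentionIds=["m1", "m2", "m3"])
entity_set = EntitySet(
    uuid="es1",
    entityList=[Entity("e1", ["m1", "m2"]), Entity("e2", ["m3", "m9"])],
    mentionSetId="ms1",
)

dangling = dangling_mention_ids(entity_set, mention_set)
print(dangling)
```

An empty result means the EntitySet honors the assumption consumers are allowed to make when mentionSetId is present; any id in the result is a mention pointer that the named EntityMentionSet does not contain.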

Struct: EntityMention

Key Field Type Description Requiredness Default value
1 uuid uuid.UUID required
8 id string A corpus-specific and stable id akin to a Freebase mid or a DBpedia id. optional
2 tokens structure.TokenRefSequence Pointer to sequence of tokens. Special note: In the case of PRO-drop, where there is no explicit mention, but an EntityMention is needed for downstream Entity analysis, this field should be set to a TokenRefSequence with an empty tokenIndexList and the anchorTokenIndex set to the head/only token of the verb/predicate from which the PRO was dropped. required
3 entityType string The type of referent that is referred to by this mention. optional
4 phraseType string The phrase type of the tokens that constitute this mention. optional
5 confidence double A confidence score for this individual mention. You can also set a confidence score for an entire EntityMentionSet using the EntityMentionSet's metadata. optional
6 text string The text content of this entity mention. This field is typically redundant with the string formed by cross-referencing the 'tokens.tokenIndexList' field with this mention's tokenization. This field may not be generated by all analytics. optional
7 childMentionIdList list< uuid.UUID > A list of pointers to the "child" EntityMentions of this EntityMention. optional
9 propertyList list< property.Property > For multi-label tasks, more than one property can be attached to a single entity mention. A list of these properties can be stored in this field. optional

A span of text with a specific referent, such as a person,
organization, or time. Things that can be referred to by a mention
are called "entities."

It is left up to individual EntityMention taggers to decide which
referent types and phrase types to identify. For example, some
EntityMention taggers may only identify proper nouns, or may only
identify EntityMentions that refer to people.

Each EntityMention consists of a sequence of tokens. This sequence
is usually annotated with information about the referent type
(e.g., is it a person, or a location, or an organization, etc.) as
well as the phrase type (is it a name, pronoun, common noun, etc.).

EntityMentions typically consist of a single noun phrase; however,
other phrase types may also be marked as mentions. For
example, in the phrase "French hotel," the adjective "French" might
be marked as a mention for France.
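
The PRO-drop convention described for the tokens field, and the relationship between tokens and text, can be sketched in Python. The TokenRefSequence class below is a hypothetical stand-in with only the two fields named in this section. Both helper functions are assumptions made for illustration, and joining tokens with single spaces is only an approximation of the original text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TokenRefSequence:
    # Hypothetical stand-in with only the two fields discussed in this section.
    tokenIndexList: List[int] = field(default_factory=list)
    anchorTokenIndex: Optional[int] = None


def is_pro_drop(token_refs: TokenRefSequence) -> bool:
    """True when the mention has no tokens of its own but is anchored on a predicate."""
    return not token_refs.tokenIndexList and token_refs.anchorTokenIndex is not None


def mention_text(token_refs: TokenRefSequence, tokens: List[str]) -> str:
    """Approximate the mention's text by joining its referenced tokens with spaces."""
    return " ".join(tokens[i] for i in token_refs.tokenIndexList)


# Spanish "Hablo español" ("I speak Spanish"): the subject pronoun is dropped.
tokens = ["Hablo", "español"]

# Explicit mention covering the token "español".
explicit = TokenRefSequence(tokenIndexList=[1], anchorTokenIndex=1)

# Dropped subject: empty tokenIndexList, anchored on the verb "Hablo".
dropped = TokenRefSequence(tokenIndexList=[], anchorTokenIndex=0)

print(is_pro_drop(explicit), repr(mention_text(explicit, tokens)))
print(is_pro_drop(dropped), repr(mention_text(dropped, tokens)))
```

A PRO-drop mention yields an empty string from its tokens, so consumers that rely on the optional text field or on token spans may want to test for this case explicitly.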

Struct: EntityMentionSet

Key Field Type Description Requiredness Default value
1 uuid uuid.UUID Unique identifier for this set. required
2 metadata metadata.AnnotationMetadata Information about where this set came from. required
3 mentionList list< EntityMention > List of mentions in this set. required
4 linkingList list< linking.Linking > Entity linking annotations associated with this EntityMentionSet. optional

A theory about the set of entity mentions that are present in a
message. See also: EntityMention.

This type does not represent a coreference relationship, which is handled by Entity.
This type is meant to represent the output of an entity-mention identifier,
which is often a part of an in-doc coreference system.