When Can Annotation Standards Help Annotation Consumers the Most?
1. When more than one source of annotation is being used in a single application.
There are at least two types of cases to consider:
- A system is trying to train on hand-coded data from multiple annotation efforts that chose the same phenomenon as a target. For example, a developer would like to train a part-of-speech tagger on all available corpora in a particular language that have been hand-annotated for part of speech, and then use the system to automatically tag new text using any of the input tagsets. Such a system would need to find some common ground between all of the encodings.
- A system is attempting to use multiple types of annotation simultaneously for some application. ACE, GALE, CONLL and systems for other shared tasks often fall into this category. Typically, systems use the output of several transducers, some of which may be trained on hand-annotated corpora, and some of which may be unsupervised or rule-based. The more closely synchronized these outputs are, the easier it will be to make generalizations. If the coreference system identifies noun phrases or head nouns in the same way as the semantic role labeler, it will be possible to identify nouns or noun phrases that bear a particular relation to a particular type of verb. For example, suppose one had compatible analyses of "The Saturn V blew up. Its remains fell in the ocean." Then it might be possible to identify the (PropBank) ARG1 of blew with the antecedent of Its. However, there are several tokenization or head selection decisions that could block generalizations of this type from being recorded even if the coreference and semantic role labeling systems work perfectly. Key questions for this example include: (1) are phrases or heads being selected to represent the constituents?; (2) is blew up a complex verb or is up a particle modifying blew, i.e., is the head blew or blew up?; (3) assuming head selection, does the head of the named entity include both tokens (Saturn V) or, by some convention, only the first (Saturn) or last token (V)?
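The head-selection question in the Saturn V example can be made concrete with a minimal sketch. The token offsets, the three anchor conventions, and the function names below are illustrative assumptions, not the behavior of any particular coreference or semantic role labeling system. The sketch only shows that two systems can agree on the phrase and still fail to line up if they pick different tokens to represent it.

```python
from typing import Callable, Tuple

Span = Tuple[int, int]  # half-open token offsets [start, end)

TOKENS = ["The", "Saturn", "V", "blew", "up", ".",
          "Its", "remains", "fell", "in", "the", "ocean", "."]

NAME_SPAN: Span = (1, 3)  # the named entity "Saturn V"


def text(span: Span) -> str:
    """Return the surface string covered by a token span."""
    start, end = span
    return " ".join(TOKENS[start:end])


# Three hypothetical conventions for choosing the anchor of a multi-token name.
def first_token(span: Span) -> Span:
    start, _ = span
    return (start, start + 1)


def last_token(span: Span) -> Span:
    _, end = span
    return (end - 1, end)


def whole_name(span: Span) -> Span:
    return span


def anchors_match(srl_convention: Callable[[Span], Span],
                  coref_convention: Callable[[Span], Span]) -> bool:
    """True if the ARG1 anchor chosen by the role labeler coincides with
    the antecedent anchor chosen by the coreference system."""
    return srl_convention(NAME_SPAN) == coref_convention(NAME_SPAN)


# Same convention on both sides: the ARG1 of "blew" and the antecedent
# of "Its" can be identified with each other.
same = anchors_match(whole_name, whole_name)

# Different conventions: both systems are "right", but the link is lost.
different = anchors_match(first_token, last_token)
```

Under these assumptions, `first_token` selects Saturn and `last_token` selects V, so a consumer that joins the two annotation layers on anchor offsets would miss the connection even though neither system made an error.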
In the latter case, the crucial factor is identifying the basic units: segments, tokens, phrases or heads. Here, the term head needs a little clarification. In the parlance of many theories that assume phrase structure, not all phrases have heads. Headless phrases include named entities, conjoined structures, certain idioms, range structures like 5 to 10 (as in 5 to 10 dollars) and many others. However, many dependency analyses that are popular in computational linguistics depend on the assumption that every phrase has some special word (the head) that links it to the rest of the dependency structure (or else dependency forest analyses of sentences would be more common than dependency tree analyses). Phrase structure based analyses are often supplemented by a similar notion of head of the phrase. However, this notion is not as central, i.e., the phrase still exists whether or not the head is marked. In contrast, the dependency analysis depends on identifying a head or anchor.
This usage forces many analyses to assume that for every phrase, there is some head. To make this distinction clear, we will henceforth use the term anchor to mean this kind of "head". Unfortunately, the choice of anchor is far from consistent among the dependency analyses used by computational linguists. Note that this problem does not disappear if one assumes a phrase structure analysis and identifies key phrases, for two reasons: (1) the notion of head is useful for compiling statistical co-occurrence information (e.g., semantic selection) and ignoring it is “throwing out the baby with the bath water”; and (2) phrase structure analyses are not more consistent than dependency analyses.
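To illustrate why inconsistent anchor choices matter, consider the range structure 5 to 10. The sketch below is an illustration under stated assumptions: the flat attachment scheme and the two anchor conventions are hypothetical and are not taken from any specific treebank. It builds the dependency arcs inside the phrase under each convention and checks how many arcs the two analyses share.

```python
from typing import List, Set, Tuple

Arc = Tuple[str, str]  # (anchor, dependent)


def flat_arcs(tokens: List[str], anchor_index: int) -> Set[Arc]:
    """Attach every other token in the phrase directly to the chosen anchor.

    This flat scheme is an assumption made for illustration only.
    """
    anchor = tokens[anchor_index]
    return {(anchor, tok) for i, tok in enumerate(tokens) if i != anchor_index}


RANGE = ["5", "to", "10"]

# Hypothetical convention A: the first number anchors the range.
arcs_first = flat_arcs(RANGE, 0)

# Hypothetical convention B: the last number anchors the range.
arcs_last = flat_arcs(RANGE, 2)

# Arcs on which the two analyses agree.
shared = arcs_first & arcs_last
```

Under these two conventions the analyses share no arcs, and the word that links the phrase to the rest of the sentence (for example, to dollars) differs as well, so co-occurrence statistics collected from one resource would not line up with those collected from the other.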