POS Standards

From SIGANN


There are several tables for converting one set of POS tags to another. For example, see Table 1 in


This page describes an approach to POS standards, but does not provide a detailed account. That is left to future work. We expect the NYC CLASP meeting to set this sort of effort in motion.

However, one good way of combining all POS tagsets would be to convert each POS tag into a standard feature structure description. Alternative POS tags could then be compared via subsumption relations or merged via unification in the standard ways for those frameworks (The Logic of Typed Feature Structures by B. Carpenter provides formal details sufficient to do anything we would want to do). Here I will give a sample description of noun features and their possible values, and translate tags from a number of tagsets into feature structures using these.
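To make the two operations concrete, here is a minimal sketch in Python. It assumes a deliberately simplified model: a feature structure is a flat dictionary of atomic values, with no type hierarchy, nesting, or reentrancy, so it covers much less than the typed feature structures of Carpenter's book. The function names unify and subsumes are illustrative and do not come from any existing library.

```python
from typing import Dict, Optional

# Simplifying assumption: a feature structure is a flat mapping from
# feature names to atomic values (no nesting, no types, no reentrancy).
FeatureStructure = Dict[str, str]


def unify(a: FeatureStructure, b: FeatureStructure) -> Optional[FeatureStructure]:
    """Merge two structures; return None if any shared feature clashes."""
    merged = dict(a)
    for feature, value in b.items():
        if feature in merged and merged[feature] != value:
            return None  # incompatible values, so unification fails
        merged[feature] = value
    return merged


def subsumes(general: FeatureStructure, specific: FeatureStructure) -> bool:
    """True if every feature-value pair of `general` also holds in `specific`."""
    return all(specific.get(f) == v for f, v in general.items())


coarse = {"CLASS": "NOUN"}
singular = {"CLASS": "NOUN", "NUMBER": "SINGULAR"}
plural = {"CLASS": "NOUN", "NUMBER": "PLURAL"}

print(subsumes(coarse, singular))  # the coarse tag is more general
print(unify(singular, plural))     # the NUMBER values clash
print(unify(coarse, {"SUBTYPE": "COMMON"}))
```

Under this model, a coarse tag subsumes every finer tag that agrees with it, and two tags from different tagsets can be merged whenever they do not assign different values to the same feature.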

Let's assume that words of CLASS: NOUN permit the following features, listed with the full set of feature values (for English), where NIL indicates that the feature is optional.

  • ORTHOGRAPHY: any word marked as a noun by some POS tagger
  • NUMBER: {SINGULAR, PLURAL}
  • SUBTYPE: {PROPER, COMMON, DEF-PRONOUN, INDEF-PRONOUN, REFL-PRONOUN, INTEROG-PRONOUN, EXPLETIVE}
  • CASE: {SBJ, OBJ, NIL}
  • SEMCLASS: {ADVERBIAL, NIL}
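The feature inventory above can be written down as a small schema and used to check a fully specified noun description. The following Python sketch makes two assumptions that go beyond the text: NIL is represented by a missing key (Python's None), and any feature not in the list is rejected. The names NOUN_FEATURES and validate_noun are hypothetical.

```python
from typing import Dict, List

# Allowed values per feature, copied from the list above.
# Assumption: NIL (feature is optional) is modelled as None, i.e. the key is absent.
NOUN_FEATURES = {
    "NUMBER": {"SINGULAR", "PLURAL"},
    "SUBTYPE": {"PROPER", "COMMON", "DEF-PRONOUN", "INDEF-PRONOUN",
                "REFL-PRONOUN", "INTEROG-PRONOUN", "EXPLETIVE"},
    "CASE": {"SBJ", "OBJ", None},
    "SEMCLASS": {"ADVERBIAL", None},
}


def validate_noun(fs: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the description is valid."""
    problems = []
    if fs.get("CLASS") != "NOUN":
        problems.append("CLASS must be NOUN")
    if not isinstance(fs.get("ORTHOGRAPHY"), str):
        problems.append("ORTHOGRAPHY must be a string")
    for feature, allowed in NOUN_FEATURES.items():
        value = fs.get(feature)  # None when the feature is absent
        if value not in allowed:
            problems.append(f"{feature}: {value!r} is not an allowed value")
    for feature in fs:
        if feature not in NOUN_FEATURES and feature not in ("CLASS", "ORTHOGRAPHY"):
            problems.append(f"{feature}: unknown feature")
    return problems


entry = {"CLASS": "NOUN", "ORTHOGRAPHY": "tomorrow", "NUMBER": "SINGULAR",
         "SUBTYPE": "COMMON", "SEMCLASS": "ADVERBIAL"}
print(validate_noun(entry))
```

A description derived from a single tag may leave required features such as NUMBER unspecified, so a check like this is best applied after tags from several tagsets have been unified.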

Example:

Word: tomorrow, Penn Treebank tag=NN, CLAWS tag=NN0, Brown tag=NR, ICE tag=N.com.sing

Feature structure unifying the feature structures implicit in the above tags:

[CLASS: NOUN
ORTHOGRAPHY: tomorrow
NUMBER: SINGULAR
SUBTYPE: COMMON
SEMCLASS: ADVERBIAL]
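The example can be reproduced with a short Python sketch. The mapping from each tag to a partial feature structure is my own reading of the tag definitions and is an assumption, not part of any tagset's documentation. In particular, Penn Treebank NN ("noun, singular or mass") is treated here as SINGULAR, and CLAWS NN0 is treated as leaving NUMBER unspecified. TAG_FEATURES and unify_tags are hypothetical names.

```python
from typing import Dict, List, Tuple

# Assumed translation of each (tagset, tag) pair into a partial feature structure.
TAG_FEATURES: Dict[Tuple[str, str], Dict[str, str]] = {
    ("Penn Treebank", "NN"): {"CLASS": "NOUN", "NUMBER": "SINGULAR", "SUBTYPE": "COMMON"},
    ("CLAWS", "NN0"): {"CLASS": "NOUN", "SUBTYPE": "COMMON"},
    ("Brown", "NR"): {"CLASS": "NOUN", "NUMBER": "SINGULAR", "SEMCLASS": "ADVERBIAL"},
    ("ICE", "N.com.sing"): {"CLASS": "NOUN", "NUMBER": "SINGULAR", "SUBTYPE": "COMMON"},
}


def unify_tags(word: str, tags: List[Tuple[str, str]]) -> Dict[str, str]:
    """Unify the structures implied by each tag; raise ValueError on a clash."""
    result = {"ORTHOGRAPHY": word}
    for key in tags:
        for feature, value in TAG_FEATURES[key].items():
            if result.get(feature, value) != value:
                raise ValueError(
                    f"{key} assigns {feature}={value}, "
                    f"which conflicts with {result[feature]}"
                )
            result[feature] = value
    return result


tomorrow = unify_tags("tomorrow", [
    ("Penn Treebank", "NN"),
    ("CLAWS", "NN0"),
    ("Brown", "NR"),
    ("ICE", "N.com.sing"),
])
print(tomorrow)
```

No single tag supplies every feature: only the Brown tag contributes SEMCLASS, and the CLAWS tag contributes no NUMBER, so the unified structure is more informative than any one of its inputs.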
