List of morpho-syntactic features

PoS tags are enriched by means of a closed set of morpho-syntactic features. Every PoS tag has at least two features: ``token'' and ``lemma''. The PoS tagger provides the values of these two features for each word. For instance, if the word ``eggs'' is tagged as NOUN, the features ``token'' and ``lemma'' are assigned the values ``eggs'' and ``egg'', respectively. In addition, each PoS tag has its own set of features. Tables 1.1 to 1.10 show the specific features of each tag (first column), the possible values of each feature (second column), and the meaning of each value (third column). To simplify the description, the tables below do not show the null value ``0'', which is automatically assigned to a feature when the current PoS tagger does not provide that information.
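As an illustration of the feature structure described above, the following Python sketch builds the features of the word ``eggs'' tagged as NOUN. It is not the parser's own code: the names NULL, NOUN_FEATURES and make_features are hypothetical, and the feature list is simply copied from Table 1.4. Any tag-specific feature for which no value is supplied receives the null value ``0''.

```python
NULL = "0"  # null value assigned when the tagger gives no information

# Tag-specific features for NOUN, copied from Table 1.4 (hypothetical constant).
NOUN_FEATURES = ("type", "genre", "number", "person")


def make_features(token, lemma, tag_features, **known):
    """Build a feature dictionary; features missing from `known` get NULL."""
    unknown = set(known) - set(tag_features)
    if unknown:
        raise KeyError(f"features not defined for this tag: {sorted(unknown)}")
    # "token" and "lemma" are present for every tag.
    features = {"token": token, "lemma": lemma}
    for name in tag_features:
        features[name] = known.get(name, NULL)
    return features


# The word "eggs" tagged as a common, plural noun; genre and person unknown.
eggs = make_features("eggs", "egg", NOUN_FEATURES, type="C", number="P")
print(eggs)
# {'token': 'eggs', 'lemma': 'egg', 'type': 'C', 'genre': '0', 'number': 'P', 'person': '0'}
```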


Table 1.1: List of features and values for tag ADJ (adjectives).
features values description
type Q qualifying
  O ordinal
degree A augmentative
  S superlative
genre M masculine
  F feminine
number S singular
  P plural
function P participle



Table 1.2: List of features and values for tag ADV (adverbs).
features values description
type G general
  N negative



Table 1.3: List of features and values for tag DT (determiners).
features values description
type D demonstrative
  P possessive
  T interrogative
  E exclamative
  I indefinite
  A article
person 1 first
  2 second
  3 third
genre M masculine
  F feminine
  N neuter
number S singular
  P plural
  N invariable
possessor S singular
  P plural



Table 1.4: List of features and values for tag NOUN (nouns).
features values description
type C common
  P proper name
genre M masculine
  F feminine
number S singular
  P plural
person 1 first
  2 second
  3 third



Table 1.5: List of features and values for tag VERB (verbs).
features values description
type M main
  A auxiliary
  S semi-auxiliary
mode I indicative
  S subjunctive
  M imperative
  N infinitive
  G gerund
  P participle
tense P present
  I imperfect
  F future
  S past
  C conditional
person 1 first
  2 second
  3 third
number S singular
  P plural
  N invariable
genre M masculine
  F feminine



Table 1.6: List of features and values for tag PRO (pronouns).
features values description
type D demonstrative
  P personal
  T interrogative
  E exclamative
  I indefinite
  X possessive
  R relative
  W wh-word
person 1 first
  2 second
  3 third
genre M masculine
  F feminine
  N neuter
number S singular
  P plural
  N invariable
possessor S singular
  P plural
case N nominative
  A accusative
  D dative
  O oblique
politness P polite



Table 1.7: List of features and values for tag CONJ (conjunctions).
features values description
type C coordinate
  S subordinate



Table 1.8: List of features and values for tag I (interjections).
features values description
(no features) (no values)  



Table 1.9: List of features and values for tag PRP (prepositions).
features values description
type P preposition



Table 1.10: List of features and values for tag CARD (cardinals).
features values description
genre M masculine
  F feminine
number S singular
  P plural
person 1 first
  2 second
  3 third


Unfortunately, most PoS taggers used by the parser leave many features at the null value ``0''. The exception is Freeling for Spanish and Galician, which provides all the morpho-syntactic information required by the parser.
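To gauge how much morpho-syntactic information a given tagger supplies, one could count how many tag-specific features are left at the null value ``0''. The following Python sketch shows the idea. The function name null_ratio is hypothetical, and the sample feature structure is made up for illustration; it is not the output of any particular tagger.

```python
NULL = "0"  # null value assigned when the tagger gives no information


def null_ratio(features):
    """Fraction of tag-specific features left at the null value."""
    # "token" and "lemma" are always filled in, so they are not counted.
    specific = {k: v for k, v in features.items() if k not in ("token", "lemma")}
    if not specific:
        return 0.0  # tags without specific features, such as I (interjections)
    nulls = sum(1 for value in specific.values() if value == NULL)
    return nulls / len(specific)


# Illustrative NOUN entry with two of its four specific features unknown.
sparse = {"token": "eggs", "lemma": "egg",
          "type": "C", "genre": "0", "number": "P", "person": "0"}
print(null_ratio(sparse))  # 0.5
```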

Pablo Gamallo 2009-01-19