List of morpho-syntactic features

PoS tags are enriched by means of a closed set of morpho-syntactic features. All PoS tags have at least three features: “token”, “lemma”, and “pos” (position). The values of these three features are provided by the PoS tagger for each word. For instance, if the word “eggs” is tagged as NOUN in the third position of the sentence, the features “token”, “lemma”, and “pos” are assigned the values “eggs”, “egg”, and “2”, respectively (note that the first position is “0”). In addition, each PoS tag has its own set of features. Tables 1.2 to 1.11 show the specific features of each tag (first column) and the possible values of each feature (second column). The meaning of each value is described in the third column. To simplify the description, the tables below do not show the null value “0”, which is automatically assigned to a feature when the current PoS tagger does not provide that information.
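The “eggs” example above can be sketched in Python as follows. This is only an illustration: the dictionary layout, the helper name `make_entry`, and the choice to store “pos” as a string are assumptions, not the parser's actual internal representation. The NOUN feature names come from Table 1.5.

```python
# Hypothetical sketch: a tagged word as a plain dictionary of features.
NULL = "0"  # null value assigned when the tagger provides no information

# Tag-specific features of NOUN, as listed in Table 1.5.
NOUN_FEATURES = ("type", "gender", "number", "person")


def make_entry(tag, token, lemma, pos, specific_features, **known):
    """Build the feature dictionary for one tagged word.

    Features the tagger did not provide default to the null value "0".
    """
    entry = {"tag": tag, "token": token, "lemma": lemma, "pos": str(pos)}
    for name in specific_features:
        entry[name] = known.get(name, NULL)
    return entry


# "eggs" tagged as NOUN in the third position of the sentence (index 2,
# because the first position is 0). Here we assume the tagger only
# provided the type (common) and the number (plural).
eggs = make_entry("NOUN", "eggs", "egg", 2, NOUN_FEATURES, type="C", number="P")
print(eggs)
```

In this sketch the features “gender” and “person” receive the null value “0” because no value was supplied for them.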

Unfortunately, many PoS taggers used by the parser provide only the null value for many features. The exception is Freeling for Spanish, Galician, and Portuguese, which provides all the morpho-syntactic information required by the parser.


Table 1.2: List of features and values for tag ADJ (adjectives).
features    values  description
type        Q       qualifying
            O       ordinal
degree      A       augmentative
            S       superlative
gender      M       masculine
            F       feminine
number      S       singular
            P       plural
function    P       participle



Table 1.3: List of features and values for tag ADV (adverbs).
features    values  description
type        G       general
            N       negative



Table 1.4: List of features and values for tag DT (determiners).
features    values  description
type        D       demonstrative
            P       possessive
            T       interrogative
            E       exclamative
            I       indefinite
            A       article
person      1       first
            2       second
            3       third
gender      M       masculine
            F       feminine
            N       neutral
number      S       singular
            P       plural
            N       invariable
possessor   S       singular
            P       plural



Table 1.5: List of features and values for tag NOUN (nouns).
features    values  description
type        C       common
            P       proper name
gender      M       masculine
            F       feminine
number      S       singular
            P       plural
person      1       first
            2       second
            3       third



Table 1.6: List of features and values for tag VERB (verbs).
features    values  description
type        M       main
            A       auxiliary
            S       semiauxiliary
mode        I       indicative
            S       subjunctive
            M       imperative
            N       infinitive
            G       gerund
            P       participle
tense       P       present
            I       imperfect
            F       future
            S       past
            C       conditional
person      1       first
            2       second
            3       third
number      S       singular
            P       plural
            N       invariable
gender      M       masculine
            F       feminine
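
Because the set of features and values is closed, a feature assignment can be checked against the table. The following Python sketch does this for the VERB tag, using the values of Table 1.6. The names `VERB_VALUES` and `invalid_features` are hypothetical and are not part of the parser.

```python
# Hypothetical sketch: checking a VERB entry against the closed set of
# features and values listed in Table 1.6.
NULL = "0"  # null value, always allowed

VERB_VALUES = {
    "type": {"M", "A", "S"},
    "mode": {"I", "S", "M", "N", "G", "P"},
    "tense": {"P", "I", "F", "S", "C"},
    "person": {"1", "2", "3"},
    "number": {"S", "P", "N"},
    "gender": {"M", "F"},
}


def invalid_features(features, allowed=VERB_VALUES):
    """Return the names of unknown features or features with unknown values."""
    bad = []
    for name, value in features.items():
        if name not in allowed:
            bad.append(name)  # feature does not belong to this tag
        elif value != NULL and value not in allowed[name]:
            bad.append(name)  # value is outside the closed set
    return bad


# A main verb in the indicative present, third person singular; the
# gender is left at the null value.
ok_verb = {"type": "M", "mode": "I", "tense": "P",
           "person": "3", "number": "S", "gender": NULL}
bad_verb = {"type": "M", "mode": "X", "degree": "S"}

print(invalid_features(ok_verb))
print(invalid_features(bad_verb))
```

The first call reports no problems, while the second flags “mode” (unknown value) and “degree” (a feature that does not belong to VERB).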



Table 1.7: List of features and values for tag PRO (pronouns).
features    values  description
type        D       demonstrative
            P       personal
            T       interrogative
            E       exclamative
            I       indefinite
            X       possessive
            R       relative
            W       wh-word
person      1       first
            2       second
            3       third
gender      M       masculine
            F       feminine
            N       neutral
number      S       singular
            P       plural
            N       invariable
possessor   S       singular
            P       plural
case        N       nominative
            A       accusative
            D       dative
            O       oblique
politeness  P       polite
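
The third column of the table can also be used to turn the one-letter codes back into readable descriptions. The Python sketch below does this for the PRO tag, using the values of Table 1.7. The names `PRO_DESCRIPTIONS` and `describe` are hypothetical, and the output format is only one possible choice.

```python
# Hypothetical sketch: decoding PRO feature values with Table 1.7.
NULL = "0"  # null value, skipped in the description

PRO_DESCRIPTIONS = {
    "type": {"D": "demonstrative", "P": "personal", "T": "interrogative",
             "E": "exclamative", "I": "indefinite", "X": "possessive",
             "R": "relative", "W": "wh-word"},
    "person": {"1": "first", "2": "second", "3": "third"},
    "gender": {"M": "masculine", "F": "feminine", "N": "neutral"},
    "number": {"S": "singular", "P": "plural", "N": "invariable"},
    "possessor": {"S": "singular", "P": "plural"},
    "case": {"N": "nominative", "A": "accusative", "D": "dative",
             "O": "oblique"},
    "politeness": {"P": "polite"},
}


def describe(features, table=PRO_DESCRIPTIONS):
    """Return a readable description of the non-null features, in table order."""
    parts = []
    for name, meanings in table.items():
        value = features.get(name, NULL)
        if value != NULL:
            # Values outside the closed set are reported as "unknown".
            parts.append(f"{name}={meanings.get(value, 'unknown')}")
    return ", ".join(parts)


# A personal pronoun, first person singular, nominative case.
pronoun = {"type": "P", "person": "1", "number": "S", "case": "N"}
print(describe(pronoun))
```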



Table 1.8: List of features and values for tag CONJ (conjunctions).
features    values  description
type        C       coordinating
            S       subordinating



Table 1.9: List of features and values for tag I (interjections).
features values description
(no features) (no values)  



Table 1.10: List of features and values for tag PRP (prepositions).
features    values  description
type        P       preposition



Table 1.11: List of features and values for tag CARD (cardinals).
features    values  description
gender      M       masculine
            F       feminine
number      S       singular
            P       plural
person      1       first
            2       second
            3       third