The ProLNat Group has developed the Galician and Portuguese PoS-tagging files.
The Galician FreeLing PoS-tagger was trained on a 237,000-token corpus (from Gari-Coter) containing scientific and journalistic texts. We also expanded the lexicon of the Seminario de Lingüística Informática, coordinated by Xavier Guinovart, which was included in the previous release of the Galician FreeLing. The current version of the lexicon contains about 428,000 forms. In addition, we have updated and enriched the verbal lemmatization rules. The latest version dates from 2009.
The European Portuguese FreeLing PoS-tagger has been trained with the following linguistic resources:
We also developed the suffix rules for verbal and nominal lemmatization. The latest version dates from 2010.
We have also adapted the European Portuguese modules of FreeLing to Brazilian Portuguese, although this adaptation is still unstable.
This version may improve the analysis of Brazilian Portuguese texts, as well as texts written according to the Portuguese Language Orthographic Agreement of 1990 (Acordo Ortográfico); however, it has not been widely evaluated.