PoS-Tagger for Spanish and Portuguese

CitiusTagger and CitiusNec

A PoS-Tagger and Named Entity Classification tool for Portuguese, English, Galician, and Spanish

CitiusTagger / CitiusNec is open-source software, written in Perl, that performs both PoS tagging and Named Entity Classification for Portuguese, English, Galician, and Spanish. Since 2017, it has been the PoS tagger module of the LinguaKit project. It was developed at CITIUS by the ProLNat@GE group.

It makes use of the same tagset as FreeLing.

You can test it in our DEMO.

You can find a description of this tool in the following articles:

  • Gamallo, Pablo, Juan Carlos Pichel, Marcos Garcia, José Manuel Abuín and Tomás Fernández Pena (2014): Análisis morfosintáctico y clasificación de entidades nombradas en un entorno Big Data. Procesamiento del Lenguaje Natural.
  • Garcia, M. and Gamallo, P. (2015): Yet another suite of multilingual NLP tools. Symposium on Languages, Applications and Technologies (SLATE 2015), pp. 81-90.

Requirements

    The 'Storable' Perl module (needed for the Galician language). You can install it from the CPAN shell:
    # cpan> install Storable
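
Modern Perl distributions generally bundle Storable as a core module, so it may already be installed. The following is a minimal sketch for checking before you run CPAN; `has_perl_module` is a hypothetical helper name and is not part of CitiusTool:

```shell
# Hypothetical helper (not shipped with CitiusTool):
# succeeds if Perl can load the module named in the first argument.
has_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if has_perl_module Storable; then
    echo "Storable is available"
else
    echo "Storable is missing; install it with CPAN" >&2
fi
```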

    How to install

    # tar xzvf CitiusTool.tar.gz
    # cd CitiusTool
    # sh install-citiustool.sh

    How to use

    # sh nec.sh
    Syntax: nec.sh language file

    language = pt, es, en, or gl
    file = path to the input file
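
As an illustration of the syntax above, here is a small wrapper sketch that checks both arguments before calling nec.sh. The function names `is_supported_lang` and `run_nec` are hypothetical, and the sketch assumes it is run from the CitiusTool directory, where nec.sh lives:

```shell
# Hypothetical helpers (not part of CitiusTool).

# Succeeds only for the language codes listed in the syntax description.
is_supported_lang() {
    case "$1" in
        pt|es|en|gl) return 0 ;;
        *) return 1 ;;
    esac
}

# Validates the arguments, then calls nec.sh as: nec.sh language file
run_nec() {
    lang=$1
    file=$2
    if ! is_supported_lang "$lang"; then
        echo "unsupported language: $lang" >&2
        return 2
    fi
    if [ ! -r "$file" ]; then
        echo "cannot read input file: $file" >&2
        return 3
    fi
    sh nec.sh "$lang" "$file"
}
```

For example, `run_nec pt input.txt` would process a Portuguese text stored in `input.txt` (a placeholder file name). Checking the arguments up front gives a clear error message before the tool is started.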

    Spanish PoS-Tagger

    The Spanish PoS-tagger has been trained with the AnCora corpus. The current version of the lexicon contains the same forms as FreeLing.

    Portuguese PoS-Tagger

    The European Portuguese FreeLing POS-tagger has been trained with the following linguistic resources:

    Galician PoS-Tagger

    The Galician PoS-tagger has been trained with the XIADA corpus. The current version of the lexicon contains the same forms as FreeLing.
