Using DepPattern to Correct the PoS Tagged Input Text

DepPattern includes tools for correcting errors in the PoS-tagged input text. It allows a linguist to write syntactic rules that correct systematic mistakes made by the PoS tagger. For this purpose, it provides three new elements:

Let's see an example. Suppose that the PoS tagger systematically tags the word “that” as a subordinate conjunction when it follows a noun, even though in this context “that” is usually a relative pronoun. To solve the problem, we can write a rule as follows:

Single : [NOUN] CONJ<lemma:that&type:S>

Corr: tag:PRO, type:R

%

In this way, the information introduced by the operator “Corr” is used to change the head expression of the unary relation “Single”. It replaces the information contained in the head (tag CONJ and type S) with tag PRO and type R. More precisely, this rule identifies as head a subordinate conjunction with lemma “that” following a noun (its context), and transforms this head entry into a relative pronoun. Notice that there is no dependent expression involved in the rule, since the relation type of “Single” is Head.
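To make the effect of the rule concrete, the following Python sketch imitates it on a small list of tagged entries. It is only an illustration of the behaviour described above, not DepPattern's own implementation: the function name correct_that, the dictionary layout of an entry, and the sample sentence are all assumptions made for this example.

```python
# Illustrative model only: none of these names come from DepPattern itself.

def correct_that(entries):
    """Retag 'that' as a relative pronoun when it directly follows a noun.

    Mirrors the rule: Single : [NOUN] CONJ<lemma:that&type:S>
                      Corr: tag:PRO, type:R
    """
    # Work on copies so the tagger output passed in is left untouched.
    corrected = [dict(entry) for entry in entries]
    for context, head in zip(corrected, corrected[1:]):
        context_matches = context["tag"] == "NOUN"
        head_matches = (
            head["tag"] == "CONJ"
            and head["lemma"] == "that"
            and head.get("type") == "S"
        )
        if context_matches and head_matches:
            # Only the head is rewritten; the context noun is never changed.
            head["tag"] = "PRO"
            head["type"] = "R"
    return corrected


# Hypothetical tagger output for "book that fell".
sentence = [
    {"token": "book", "lemma": "book", "tag": "NOUN"},
    {"token": "that", "lemma": "that", "tag": "CONJ", "type": "S"},
    {"token": "fell", "lemma": "fall", "tag": "VERB"},
]
result = correct_that(sentence)
```

In this sketch, a “that” preceded by a verb (as in “said that”) keeps its CONJ tag, because the noun context is part of the condition.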

“Corr” also allows correcting attributes by using the values of other attributes:

Corr: lemma:=token

This means that the value of the lemma becomes the value of the token. In other words, the lemma attribute inherits the value of the token attribute.
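The difference between the two forms of “Corr” seen so far (a literal value after “:” and a copied value after “:=”) can be sketched in Python as follows. This is a simplified model written for this explanation; the function apply_corr, the way it splits the string, and the sample entry are assumptions, and DepPattern's actual parser may treat the syntax differently.

```python
# Illustrative model only: apply_corr is a hypothetical helper, not a DepPattern API.

def apply_corr(entry, corr):
    """Apply a Corr-like specification to one tagged entry.

    'attr:value'   sets attr to the literal value.
    'attr:=source' copies the value of another attribute of the same entry.
    """
    updated = dict(entry)  # leave the original entry unchanged
    for part in corr.split(","):
        part = part.strip()
        if ":=" in part:
            target, source = part.split(":=", 1)
            # Inherit the value from the source attribute of the original entry.
            updated[target.strip()] = entry[source.strip()]
        else:
            attr, value = part.split(":", 1)
            updated[attr.strip()] = value.strip()
    return updated


# Hypothetical entry whose lemma was left unresolved by the tagger.
entry = {"token": "Xerox", "lemma": "<unknown>", "tag": "NOUN"}
fixed = apply_corr(entry, "lemma:=token")
```

Here fixed["lemma"] takes the value of the token, while apply_corr(entry, "tag:PRO, type:R") would instead write the literal values PRO and R.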