Grupo de Gramática del Español

Main research paths

Syntactic structures of Spanish

This line encompasses both the study of syntactic patterns with their functional constituents and the characteristics of verbs as predicates of clauses. It stems from the exploitation of the resources first generated by the group, such as the Spanish syntactic database (BDS), which records the features of about 160,000 clauses from a corpus of American and European Spanish.

Historical syntax of the clause and the sentence

Historical syntax of the clause

This line focuses on the diachronic study of the functional transformations of the clause. The following aspects are analysed: the change in verbal government from Latin to Romance as a trigger for the mutation of syntactic functions and the reorganization of clausal patterns; and the alternation between residual syntactic functions and new Romance functions, with their respective processes of extinction and grammaticalization. Furthermore, this line studies the changes derived from the disappearance of the synthetic passive, the formation of the compound tenses with ser for the expression of the middle voice (with or without reflexive form), and the alternation between compound and simple tenses, as well as the extinction of the periphrastic forms.

Historical syntax of the sentence

Diachronic study of the sentence and its subtypes, with special attention to concessives and conditionals. Since both the Latin sentence structure and the relational semantics of its subtypes survived in Spanish, we aim to analyse the diachronic changes that led to the near disappearance of Latin conjunctions, the grammaticalization processes of Romance conjunctions, the transformations of the Latin verbal system, and the diachrony of conditional and concessive Romance patterns.

Development of corpora and linguistic resources

Compilation of spoken and written textual corpora for linguistic studies, as well as electronic linguistic resources for the development and exploitation of the corpora and for other types of applications. Following the construction of The Syntactic Database for modern Spanish / Base de datos sintácticos del español (BDS) from the ARTHUS corpus, the latest corpora compiled by the group are the Corpus of learners of Spanish / Corpus de aprendices de español (CAES), the Syntactically parsed corpus / Corpus sintácticamente analizado (CSA) and the Corpus for the study of spoken Spanish / Corpus para el estudio del español oral (ESLORA). The work also includes the development of statistical tagging and of applications for online consultation. This line of research further encompasses the group's participation in the development of the Corpus del Español del Siglo XXI (CORPES XXI) and the Corpus de referencia do galego actual (CORGA).

Derivational morphology of Spanish

This line focuses on how derivational mechanisms work and how productive they are, both in contemporary and in earlier Spanish, with special attention to the structure of the processes and their morphological features. The most recent work concerns the design and development of an online application that shows the relationships between words in the same lexical family. The application, MORFOGEN, has two modules: a) a database of Spanish morphology and b) a visualization tool that offers a global picture of the lexical family of each word. A lexical family brings together all the words that are directly or indirectly related to the same root. These may be learned words, popular (inherited) words or loanwords, regardless of the degree of formal and semantic transparency of their derivational relationship.
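The idea of a lexical family as the set of words "directly or indirectly" related to the same root can be pictured as a connected component in a graph of derivational links. The following Python sketch is only an illustration under that assumption; it is not MORFOGEN's implementation, and the function name lexical_families and the sample word pairs are hypothetical.

```python
from collections import defaultdict


def lexical_families(derivations):
    """Group words into families, treating each (base, derived) pair
    as an undirected link and each connected component as one family."""
    graph = defaultdict(set)
    for base, derived in derivations:
        graph[base].add(derived)
        graph[derived].add(base)

    seen = set()
    families = []
    for word in sorted(graph):
        if word in seen:
            continue
        # Depth-first traversal collects every word reachable from `word`,
        # i.e. every word directly or indirectly linked to it.
        family = set()
        stack = [word]
        while stack:
            current = stack.pop()
            if current in family:
                continue
            family.add(current)
            stack.extend(graph[current] - family)
        seen |= family
        families.append(sorted(family))
    return families


# Illustrative links only (assumed for the example, not taken from any database).
DERIVATIONS = [
    ("mar", "marino"),
    ("marino", "marinero"),
    ("mar", "marítimo"),
    ("flor", "florecer"),
    ("florecer", "florecimiento"),
]
FAMILIES = lexical_families(DERIVATIONS)
```

In this sketch "marinero" ends up in the family of "mar" even though it is linked to it only through "marino", which is the indirect relationship mentioned above.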

Relationships between grammar and lexicon

Study of lexical units of different classes and subclasses, focusing on their semantic features and formal properties. The research aims to identify and characterize the different subclasses: event and result nouns, relational and qualifying adjectives, shell nouns, adjectives with modal meanings, etc.

Computational linguistics and language technologies

This line focuses on natural language processing. It covers both analysis (part-of-speech tagging, syntactic parsing) and information extraction (extraction of semantic relationships, terminology, multiword expressions and named entities), as well as the development of applications such as question-answering systems, summarizers, grammar checkers and sentiment analysers. This work, carried out by the ProLNat@GE team, aims to develop linguistic models for Galician, Spanish, Portuguese and English and has produced a free software project, Linguakit. For further information, see

The Spanish spoken in Galicia

Research on spoken Spanish through the development of new tools for the study and analysis of its constitutive aspects and its social function. The studies are based on the Corpus for the study of spoken Spanish / Corpus para el estudio del español oral (ESLORA), which comprises spontaneous colloquial conversations and semi-directed interviews carried out in Santiago de Compostela as part of the pan-Hispanic project PRESEEA. Part of the research is devoted to morphosyntactic tagging and to the development of the search engine. The results allow the dynamics and constitution of oral interactions to be studied from a multidimensional perspective: lexico-grammatical, discursive, pragmatic, sociolinguistic and ideological. This line also addresses techniques for eliciting spoken samples and other complementary resources for compiling corpora. Additionally, the research offers insights into the possibilities of using spoken corpora in the teaching and learning of Spanish as a second language.

Lexical availability

Studies of lexical availability originated in French linguistics in the 1950s and 1960s; their objective is to gather and analyse the available lexicon of a given speech community. In the Hispanic tradition, pioneering work on lexical availability was carried out by López Morales (1973), but it was not until the 1990s that contributions in this domain developed much further, thanks to the Pan-Hispanic project. This project brings together research groups from both sides of the Atlantic that work on the same methodological basis in order to gain insight into the available lexicon of pre-university students and to build dictionaries of lexical availability for the different regions of the Hispanic world.

Historiography of Linguistics

Historiographic analysis of the Spanish and Galician linguistic traditions, with emphasis on the history of grammar. Some group members have published works on the current state of Spanish linguistic historiography, the history of Spanish grammaticography, the history of Spanish orthographic theory and the history of Galician linguistics. In addition, the presence of linguistic ideologies in the development of Spanish linguistics is currently being researched. Materials related to the teaching of Spanish as a first, second or foreign language are also being analysed and evaluated, both from a historiographic perspective and with regard to current technological applications.

Design and compilation of bibliographic databases

Development of databases for storing updated bibliographical information on linguistic and philological works referring to Spanish and Galician. In the design of these tools the challenges posed by new information technologies are taken into account:

  1. Dissemination over the Internet, which requires transforming and enriching the parameters used until now in comparable printed works, so that each user can more easily retrieve the information they are looking for.

  2. The creation of resources open to ongoing improvements.

  3. The inclusion of digital editions as new ways of disseminating the results of scientific research.

Contrastive lexico-syntactic studies

A set of contrastive studies with theoretical and applied interests and the following main objectives: a) establishing the combinatory potential of semantically similar German, Spanish and Portuguese lexemes, with attention to both interlinguistic and intralinguistic contrasts; b) applying the results, which are especially relevant in the field of second and foreign language acquisition and, more specifically, in the contexts of production and translation. The theoretical interest centres on the relationship between lexical meaning and combinatory characteristics.

Spanish as a foreign language

This line of research focuses on the following aspects:

  1. Study of the different aspects related to language teaching: learning factors and strategies, methodological indications, components of communicative competence, development of linguistic skills, etc.

  2. Analysis and development of teaching and learning materials.

  3. Use and exploitation of linguistic and learner corpora (spoken and written) as a source for describing how the language works, for describing the interlanguage of non-native speakers, and for developing teaching and learning materials.

  4. Analysis of different aspects of the teaching of Spanish as a foreign language: a) the attention paid to the intra-linguistic diversity in didactic materials; b) the learning strategies and the promotion of autonomous work through ICT; c) student mobility and language learning.

It is worth highlighting the experience of some team members in the development of a corpus of learners of Spanish, the Corpus of learners of Spanish / Corpus de aprendices de español (CAES), a project supported and funded by the Instituto Cervantes.


Our phraseological research focuses on phraseological syntactic structures: recurring constructional patterns made up of certain fixed constituents (usually grammar words) and of others which, although they can be considered free slots, are subject to certain semantic and combinatory restrictions within the structure. From a semantic point of view, these constructions do not have a fully compositional meaning, as they usually carry an additional pragmatic component that cannot be deduced from the sum of the partial meanings of their constituents.
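The combination of fixed constituents and restricted free slots can be pictured as a small pattern-matching structure. The sketch below, in Python, is a hypothetical illustration and not a tool or formalism used by the group: the Slot class, the match function, the example pattern "de N en N" and the short word list standing in for a semantic restriction are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """A free slot; `allowed` is a toy stand-in for a semantic restriction."""
    name: str
    allowed: frozenset


def match(pattern, tokens):
    """Return the slot fillers if `tokens` instantiate `pattern`, else None."""
    if len(pattern) != len(tokens):
        return None
    fillers = {}
    for item, token in zip(pattern, tokens):
        if isinstance(item, Slot):
            # The filler must satisfy the slot's restriction.
            if token not in item.allowed:
                return None
            # A slot name that occurs twice must be filled by the same word.
            if fillers.setdefault(item.name, token) != token:
                return None
        elif item != token:
            # Fixed constituents must match exactly.
            return None
    return fillers


# Hypothetical pattern: fixed "de" and "en", with one noun slot used twice.
NOUNS = frozenset({"puerta", "casa", "flor"})
N = Slot("N", NOUNS)
PATTERN = ["de", N, "en", N]

OK = match(PATTERN, "de puerta en puerta".split())
DIFFERENT_NOUNS = match(PATTERN, "de puerta en casa".split())
OUTSIDE_RESTRICTION = match(PATTERN, "de mesa en mesa".split())
```

Here only the first string is accepted: the second fills the repeated slot with two different words, and the third uses a word outside the toy restriction. The sketch captures form only; the additional pragmatic component of such constructions is not modelled.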