Grammar & Resources

The group focuses on modeling linguistic knowledge, integrating the interfaces between different areas of grammar with knowledge about how language is used. Joint work in formal phonology, the lexicon, syntax and semantics makes it possible to build an integrated model of grammar that considers both how grammar is represented in the human mind and how it can be modeled computationally; research on L1 and L2 acquisition is at the core of this effort. Models of language representation and models of language use are integrated through the study of corpora.

The group produces corpora and resources in order to document and describe contemporary European Portuguese, as well as understudied contact languages and varieties (Portuguese-based creoles, national varieties of Portuguese in Africa and Asia). The group also produces resources for the study of L1 and L2 acquisition in different settings. The group is part of CLARIN LP.

Research on L1 and L2 acquisition contributes to CLUL’s general purpose of effectively linking fundamental and applied research, particularly in the areas of Educational Linguistics and Clinical Linguistics.

General goals:

- To produce new resources for the study of Portuguese and Portuguese-based creoles;

- To pursue basic research on natural language modeling, integrating knowledge on interfaces between language modules;

- To continue the documentation and description of understudied creoles and new varieties of Portuguese that emerged in a context of language contact;

- To develop the study of language acquisition with an emphasis on language contact situations (see the new international Heritage Language Consortium) and on the comparison between typical and atypical development;

- To explore the potential of comparative linguistics in the production of resources for translation and to promote connections with the industry in the area of translation.

Project	Date	Funding
ParlaMint II	2022
PALMA - Possession and Location: Microvariation in African Varieties of Portuguese	2019 - 2022	FCT
RECAP - Resources for Portuguese Learning	2017 - 2018	FCG
CLARIN	2017 - 2021
Documentation of Sri Lanka Portuguese	2017 - 2019
LeCIEPLE - Learner Corpus: da investigação ao ensino de Português Língua Estrangeira/Língua Segunda	2014 - 2015	FCG
Portuguese-based creoles of the Dravidian space: Diachrony and synchrony	2013 - 2018
TAXE - Parataxis, Hypotaxis and Interface Syntax-Discourse	2013 - 2020
COPAS - Contrast and Parallelism in Speech	2012 - 2015
CLAP - Complement clauses in the Acquisition of Portuguese	2012	FCT
SynExtract - Automatic extraction of synonymy relations for a cost-effective acquisition of language resources	2012 - 2014	FCT
SemiAutLex.PT - Semi-automatic construction of relational lexica for Portuguese	2012 - 2014	FCT

Resources	Type
A Lexicon of Child European Portuguese - CEPLEXicon	Lexicon
A Portuguese Native Language Identification Dataset - NLI-PT	Database
Acquisition of European Portuguese Databank - AcEP	Database
Child-Adult Interaction Corpus - CAI	Corpus
Child-Adult interaction European Portuguese	Database
Consonantic Sequences Oral and Written Production Tasks - PORESC	Tool
Controlled Portuguese - CLG	Database
Corpora of PLE	Corpus
Corpus Almeida - European Portuguese / French	Corpus
Corpus Angolar	Corpus
Corpus C-ORAL-ROM	Corpus
Corpus CCF	Corpus
Corpus CINTIL	Corpus
Corpus Fadambo	Corpus
Corpus Leiria (1991)	Corpus
Corpus of Cape Verdean Portuguese	Corpus
Corpus of Sri Lanka Portuguese	Corpus
Corpus of the Diaries of the Portuguese Parliament annotated with PoS - PTPARL	Corpus
Corpus PESTRA	Corpus
Corpus Português Fundamental - Corpus PF	Corpus
Corpus Principense	Corpus
Corpus REDIP	Corpus
Corpus Santome	Corpus
Corpus SANTOS - European Portuguese	Corpus
Crosslinguistic Child Phonology Project - Português Europeu - CLCP-PE	Tool
Dados Orais de Cabo Verde - CV Words	Database
Demo de Subespecificação e Desambiguação de Escopo	Tool
Dictionary of Hindi-Portuguese-Hindi	Database
Diu Indo-Portuguese Data Set	Database
EP-Plurals	Tool
Learner Corpus of Portuguese L2 - COPLE2	Corpus
LT Corpus (Literary Corpus) - LT Corpus	Corpus
Modality Lexicon - MODAL-LEX-PT	Lexicon
Multifunctional Computational Lexicon of Contemporary Portuguese	Lexicon
Named Entity Recognizer - CRPC-NER	Tool
Nominal Multiword Lexical Units in European Portuguese	Lexicon
NPChunks: Corpus of 1000 sentences annotated with PoS and nominal chunks - NPChunks	Corpus
Online Corpus of Writing and Speech of Children in the Early Years of Schooling - EFFE-On	Corpus
Online Dictionary Portuguese-Slovak/Slovak-Portuguese	Database
Pereira&Freitas - EP	Corpus
Person-Machine Interaction in Natural Language - INQUER	Database
PhonoDis	Corpus
Phonological Awareness Tasks for First Grade School Children - TCFC	Tool
Portuguese Biographies - Bio-PT	Database
Portuguese Corpus Annotated for Modality - MODAL	Corpus
Portuguese Lexicon of Discourse Markers - LDM-PT	Lexicon
Portuguese Technical Lexica - LEXTEC	Lexicon
Portuguese Discourse Bank - CRPC-DB	Corpus
Quotations database - CRPC-quotations	Database
Ramalho – EP	Corpus
Reference Corpus of Contemporary Portuguese - CRPC	Corpus
Santome Structure Dataset	Database
Spoken Corpus Mozambique 1986-87 - SCM	Corpus
Spoken Portuguese - Geographical and Social Varieties	Corpus
Vocatives in European Portuguese	Corpus
Word Combination in European Portuguese - LEX-MWE-PT	Lexicon
WordNet.PT	Lexicon

Book Chapter
Pinto, J., & Alexandre, N. (2023). Ensinar Português a Falantes de Espanhol Língua Materna/ Língua Segunda: para uma consciencialização lexical dos aprendentes. In C. Castro & A. Madeira (Eds.), Desenvolvimento de Materiais Didáticos para Português Língua Não Materna (pp. 148-165). Lidel.
Gonçalves, A., & Vieira, S. (2022). Avaliação do conhecimento sintático. In M. J. Freitas, M. Lousada, & D. C. Alves (Eds.), Linguística clínica: Modelos, avaliação e intervenção (pp. 293-320). Berlin: Language Science Press. http://doi.org/10.5281/zenodo.7233235
Li, X., Santos, A. L., & Lobo, M. (2023). L3 Acquisition of Portuguese Clefts by L1-Mandarin L2-English speakers. In M. M. Brown-Bousfield, S. Flynn, & E. Fernández-Berkes (Eds.), L3 after the Initial State. John Benjamins. https://doi.org/10.1075/sibil.65.08li
Gonçalves, R., Duarte, I., & Hagemeijer, T. (2023). Objetos diretos em variedades africanas do Português: um estudo de caso de microvariação. In S. F. Brandão & S. Vieira (Eds.), Para o Estudo Comparativo de Variedades do Português (pp. 53-83). Berlin: De Gruyter. https://doi.org/10.1515/9783110670257-005
Hagemeijer, T. (2024). São Tomé and Príncipe. In U. Reutner (Ed.), Manual of Romance Languages in Africa (pp. 609-623). Berlin: De Gruyter. http://doi.org/10.1515/9783110628869-027
Pinto, J. (2023). Are Teachers Developing Strategies to Enhance the Use of DLC in the Learning of Portuguese as a Foreign Language in English-Dominant Classrooms? In L. Aronin & S. Melo-Pfeifer (Eds.), Language Awareness and Identity (pp. 155-172). Springer International Publishing. http://doi.org/10.1007/978-3-031-37027-4_8
Cruz, M., Sendra, V. C., Castelo, J., & Frota, S. (2022). Asking questions across Portuguese varieties. In M. Cruz & S. Frota (Eds.), Prosodic variation (with)in languages: Intonation, phrasing and segments (series Studies in Phonetics and Phonology, edited by Martin J. Ball and Pascal van Lieshout) (pp. 36-70). Equinox Publishing. Retrieved from https://www.researchgate.net/publication/360321546_Asking_questions_across_Portuguese_varieties
Cruz, M., & Frota, S. (2022). Introduction. In M. Cruz & S. Frota (Eds.), Prosodic variation (with)in languages: Intonation, phrasing and segments (series Studies in Phonetics and Phonology, edited by Martin J. Ball and Pascal van Lieshout) (pp. 1-6). Equinox Publishing. Retrieved from https://www.researchgate.net/publication/360321546_Asking_questions_across_Portuguese_varieties
Costa, A., & Batalha, J. (2019). Para um mapa das fronteiras e das pontes na investigação em didática da gramática. In A linguística na formação do professor : das teorias às práticas (pp. 61-80). Faculdade de Letras da Universidade do Porto e Centro de Linguística da Universidade do Porto. http://doi.org/10.21747/978-989-8969-20-0/linga5
Duarte, I. (2024). Ibero-Romance I: Portuguese and Galician. In M. Loporcaro (Ed.), Romance Linguistics (part of the Oxford Research Encyclopedia of Linguistics, ed. by M. Aronoff). Oxford University Press. https://doi.org/10.1093/acrefore/9780199384655.013.717
Castelo, A. (2023). A construção participada de materiais didáticos: de experiências com aprendentes chineses a um quadro orientador. In Desenvolvimento de materiais didáticos para Português como Língua Não Materna: experiências e desafios (pp. 122-136). Lisboa: LIDEL.
Castelo, A., & Braz, A. (2023). Português de Viva Voz / Portuguese Live: strengths and challenges of teaching a non-native language online. In Inovação e Tecnologia no Ensino de Línguas: pedagogias, práticas e recursos digitais (pp. 233-261). Lisboa: Universidade Aberta, Coleção Ciência e Cultura, Nº 23.
Mendes, A. (2024). The Reference Corpus of Contemporary Portuguese: Corpus Design and Case Study on Discourse Markers. In M. C. Campos & G. Vaamonde (Eds.), Linguistic Corpora and Big Data in Spanish and Portuguese (pp. 145-178). Berlin / Boston: Peter Lang. http://doi.org/10.1515/9783110781465 https://www.degruyter.com/document/doi/10.1515/9783110781465/html
Cardoso, H. C. (2020). Contact and Portuguese-lexified creoles. In R. Hickey (Ed.), The Handbook of Language Contact (2nd ed., pp. 469-488). Hoboken, NJ: Wiley-Blackwell.
Cardoso, H. C. (2022). Descrições portuguesas das línguas de Timor-Leste na transição dos séculos XIX e XX. In R. Roque (Ed.), Timor Etnográfico: Etnografias Coloniais Portuguesas no Século XX (pp. 141-183). Lisboa: Imprensa de Ciências Sociais.
Cardoso, H. C. (2022). Indo-Portuguese contact seen from Goa. In R. S. Newman & D. C. da Silva (Eds.), Traces on the Sea: Portuguese Interaction with Asia (pp. 63-92). Coimbra: Imprensa da Universidade de Coimbra.
Pinto, J., & Alexandre, N. (2025). What can a corpus do for foreign language teaching? An activity proposal for Chinese learners of Portuguese. In Grammatical Categories in Linguistics and Education (pp. 145-166). De Gruyter. http://doi.org/10.1515/9783111140803-007
Hagemeijer, T. (2025). Tomtomming stop epenthesis in Santome. In Areas, families, and pools aplenty: a Festschrift for Tom Güldemann (pp. 329-339). Humboldt-Universität. https://doi.org/10.18452/32608
Lejeune, P., & Mendes, A. (2025). Portuguese defence proceedings and court applications: aiming for clarity and inclusiveness. In J. Visconti (Ed.), The Language of Lawyers (pp. 383-398). De Gruyter. http://doi.org/10.1515/9783111340982-026
Lejeune, P., & Mendes, A. (2024). A Study on the French Functional Equivalents of some Modal and Metadiscursive Uses of the European Portuguese Marker lá. In C. Popescu (Ed.), Discourse Markers in Romance Languages: Cross-linguistic approaches in Romance and beyond (pp. 39-54). Peter Lang.
Mendes, A., & Lejeune, P. (2024). Discourse markers in Portuguese. In M.-B. Mosegaard Hansen & J. Visconti (Eds.), Manual of Discourse Markers in Romance (pp. 563-594). De Gruyter. http://doi.org/10.1515/9783110711202-019
Santos, A. L., Lobo, M., & Grolla, E. (2026). L1 Acquisition. In A. M. Carvalho & L. Oushiro (Eds.), The Oxford Handbook of the Portuguese Language (pp. 525-541). Oxford University Press. https://doi.org/10.1093/9780191954481.003.0030

Book Edition
(2009). Gradual creolization: Studies celebrating Jacques Arends. (R. Selbach, H. C. Cardoso, & M. van den Berg, Eds.). Amsterdam: John Benjamins. https://doi.org/10.1075/cll.34
Matos, G., Miguel, M., Duarte, I., & Faria, I. H. (1995). Interfaces in Linguistic Theory. Selected papers from the International Conference on Interfaces in Linguisti. Associação Portuguesa de Linguística / Ed. Colibri.
(2021). Multilingualism and third language acquisition: learning and teaching trends. (J. Pinto & N. Alexandre, Eds.). Language Science Press. http://doi.org/10.5281/zenodo.4449726

Book Review
Hagemeijer, T. (2007). Review of Huber, Magnus & Viveka Velupillai (eds.). 2007. Synchronic and diachronic perspectives on contact languages (Creole Language Library, Vol. 32). Amsterdam, Philadelphia: John Benjamins.
Hagemeijer, T. (2020). Review of V. Déprez & F. Henri (Eds.), Negation and negative concord: The view from creoles. Journal of Pidgin and Creole Languages. Amsterdam / Philadelphia: John Benjamins Publishing Company.
Hagemeijer, T. (2010). Review of Maurer, Philippe. Principense. Grammar, Texts, And Vocabulary Of The Afro-Portuguese Creole Of The Island Of Príncipe, Gulf Of Guinea. London: Battlebridge Publications. Journal Of Language Contact: Varia.

Article in Proceedings
Cruz, M., & Frota, S. (2014). Rhythm in central-southern varieties of European Portuguese: production and perception. In M. F. Bacelar do Nascimento, M. A. Mota, L. Segura, & A. Mendes (Eds.), Textos Selecionados do XXIX Encontro Nacional da Associação Portuguesa de Linguística (Vol. III, pp. 213-230). Lisbon: Fundação Calouste Gulbenkian.
Cruz, M., & Frota, S. (2013). Textos Seleccionados. In Textos Seleccionados do XXVIII Encontro Nacional da Associação Portuguesa de Linguística. Lisboa: APL.
