Grammar & Resources

The group focuses on modeling linguistic knowledge, integrating the interfaces between different areas of grammar with knowledge about how language is put to use. Joint work in formal phonology, the lexicon, syntax and semantics makes it possible to build an integrated model of grammar that considers both how grammar is represented in the human mind and how it can be modeled computationally; research on L1 and L2 acquisition is at the core of this effort. Models of language representation and models of language use are integrated through the study of corpora.

The group produces corpora and resources in order to document and describe contemporary European Portuguese as well as understudied contact languages and varieties (Portuguese-based creoles, national varieties of Portuguese in Africa and Asia). The group also produces resources for the study of L1 and L2 acquisition in different settings. The group is part of CLARIN LP.

Research on L1 and L2 acquisition contributes to CLUL’s general purpose of effectively linking fundamental and applied research, particularly in the areas of Educational Linguistics and Clinical Linguistics.

General goals:

- To produce new resources for the study of Portuguese and Portuguese-based creoles;

- To pursue basic research on natural language modeling, integrating knowledge on interfaces between language modules;

- To continue the documentation and description of understudied creoles and new varieties of Portuguese that emerged in a context of language contact;

- To develop the study of language acquisition, with an emphasis on language contact situations (see the new international Heritage Language Consortium) and on the comparison between typical and atypical development;

- To explore the potential of comparative linguistics in the production of resources for translation and to promote connections with industry in the area of translation.

 

| Resource | Type |
| --- | --- |
| A Lexicon of Child European Portuguese - CEPLEXicon | Lexicon |
| A Portuguese Native Language Identification Dataset - NLI-PT | Database |
| Acquisition of European Portuguese Databank - AcEP | Database |
| Child-Adult Interaction Corpus - CAI | Corpus |
| Child-Adult interaction European Portuguese | Database |
| Consonantic Sequences Oral and Written Production Tasks - PORESC | Tool |
| Controlled Portuguese - CLG | Database |
| Corpora of PLE | Corpus |
| Corpus Almeida - European Portuguese / French | Corpus |
| Corpus Angolar | Corpus |
| Corpus C-ORAL-ROM | Corpus |
| Corpus CCF | Corpus |
| Corpus CINTIL | Corpus |
| Corpus Fadambo | Corpus |
| Corpus Leiria (1991) | Corpus |
| Corpus of Cape Verdean Portuguese | Corpus |
| Corpus of Sri Lanka Portuguese | Corpus |
| Corpus of the Diaries of the Portuguese Parliament annotated with PoS - PTPARL | Corpus |
| Corpus PESTRA | Corpus |
| Corpus Português Fundamental - Corpus PF | Corpus |
| Corpus Principense | Corpus |
| Corpus REDIP | Corpus |
| Corpus Santome | Corpus |
| Corpus SANTOS - European Portuguese | Corpus |
| Crosslinguistic Child Phonology Project - Português Europeu - CLCP-PE | Tool |
| Dados Orais de Cabo Verde - CV Words | Database |
| Demo de Subespecificação e Desambiguação de Escopo | Tool |
| Dictionary of Hindi-Portuguese-Hindi | Database |
| Diu Indo-Portuguese Data Set | Database |
| Learner Corpus of Portuguese L2 - COPLE2 | Corpus |
| LT Corpus (Literary Corpus) - LT Corpus | Corpus |
| Modality Lexicon - MODAL-LEX-PT | Lexicon |
| Multifunctional Computational Lexicon of Contemporary Portuguese | Lexicon |
| Named Entity Recognizer - CRPC-NER | Tool |
| Nominal Multiword Lexical Units in European Portuguese | Lexicon |
| NPChunks: Corpus of 1000 sentences annotated with PoS and nominal chunks - NPChunks | Corpus |
| Online Corpus of Writing and Speech of Children in the Early Years of Schooling - EFFE-On | Corpus |
| Online Dictionary Portuguese-Slovak/Slovak-Portuguese | Database |
| Pereira&Freitas - EP | Corpus |
| Person-Machine Interaction in Natural Language - INQUER | Database |
| PhonoDis | Corpus |
| Phonological Awareness Tasks for First Grade School Children - TCFC | Tool |
| Portuguese Biographies - Bio-PT | Database |
| Portuguese Corpus Annotated for Modality - MODAL | Corpus |
| Portuguese Lexicon of Discourse Markers - LDM-PT | Lexicon |
| Portuguese Technical Lexica - LEXTEC | Lexicon |
| Portuguese Discourse Bank - CRPC-DB | Corpus |
| Quotations database - CRPC-quotations | Database |
| Ramalho - EP | Corpus |
| Reference Corpus of Contemporary Portuguese - CRPC | Corpus |
| Santome Structure Dataset | Database |
| Spoken Corpus Mozambique 1986-87 - SCM | Corpus |
| Spoken Portuguese - Geographical and Social Varieties | Corpus |
| Vocatives in European Portuguese | Corpus |
| Word Combination in European Portuguese - LEX-MWE-PT | Lexicon |
| WordNet.PT | Lexicon |

Articles in Proceedings
Alexandre, N. (2010). Uma análise de CP não expandido para o sistema de complementadores do Crioulo de Cabo Verde. In Textos Seleccionados do XXV ENAPL 2009 (A. Costa, P. Barbosa & I. Falé, Eds., pp. 111-126). Lisboa: Colibri.
Costa, A., Alexandre, N., Santos, A. L., & Soares, N. (2008). Efeitos de modelização no input: o caso da aquisição de conectores. In Textos Seleccionados do XXIII ENAPL 2007 (S. Frota & A. L. Santos, Eds., pp. 131-142). Lisboa: Colibri.
Alexandre, N. (2007). Interrogativas-Q em Crioulo de Cabo Verde: Movimento explícito/implícito ou sem movimento? In Textos Seleccionados do XXII ENAPL 2006 (M. Lobo & M. Coutinho, Eds., pp. 41-55). Lisboa: Colibri.
Alexandre, N. (2006). Processos de relativização e marcadores relativos em Crioulo de Cabo Verde. In Textos Seleccionados do XXI ENAPL 2005 (F. Oliveira & J. Barbosa, Eds., pp. 83-95). Lisboa: Colibri.
Alexandre, N., Soares, V., & Verdial Soares, N. (2005). O Domínio Nominal em CCV: o puzzle dos Bare Nouns. In XX Encontro Nacional da APL (pp. 337-350). Lisboa: Fundação Calouste Gulbenkian.
Alexandre, N., & Hagemeijer, T. (2004). The Nominal Domain in Santome. In Los Criollos de Base Ibérica: ACBLPE 2003 (M. Fernández & N. Vázquez, Eds., pp. 85-100). Madrid/Frankfurt: Iberoamericana & Vervuert.
Alexandre, N., & Hagemeijer, T. (2002). Pronomes resumptivos e abandono de preposição nos crioulos atlânticos de base lexical portuguesa. In XVII Encontro Nacional da APL (pp. 17-29). Lisboa: Colibri.
Alexandre, N. (2001). Proposta de representação dos DPs relativizados: a análise [NP CP NP]. In XVI Encontro Nacional da APL (pp. 35-46). Lisboa: Colibri.
Alexandre, N. (2000). Reflexões sobre a estrutura dos DPs relativizados: a análise [DP Dº CP] de Kayne 1994. In XV Encontro da APL (pp. 55-74). Braga: Gráfica de Coimbra.
Alexandre, N. (1999). Estratégias de Relativização em Português Europeu: o caso das relativas resumptivas. In XIV Encontro da APL (pp. 29-39). Braga: Gráfica de Coimbra.
del Río, I., & Mendes, A. (2018). Error annotation in a Learner Corpus of Portuguese. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan: European Language Resources Association (ELRA).
Mendes, A., del Río, I., Stede, M., & Dombek, F. (2018). A Lexicon of Discourse Markers for Portuguese - LDM-PT. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Miyazaki, Japan: European Language Resources Association (ELRA).
Lejeune, P., & Mendes, A. (2018). Discourse relations with explicit and implicit arguments: The case of European Portuguese aliás. In Proceedings of the Cross-Linguistic Discourse Annotation: Applications and Perspectives, Final Action Conference TextLink. Toulouse.
Zeyrek, D., Mendes, A., & Kurfalı, M. (2018). Multilingual Extension of PDTB-Style Annotation: The Case of TED Multilingual Discourse Bank. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), May 7-12, 2018. Miyazaki, Japan: European Language Resources Association (ELRA).
Sequeira, J., Gonçalves, T., Quaresma, P., Mendes, A., & Hendrickx, I. (2018). A Multi- versus a Single-classifier Approach for the Identification of Modality in the Portuguese Language. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), May 7-12, 2018. Miyazaki, Japan: European Language Resources Association (ELRA).
Santos, A. L., Jesus, A., & Abalada, S. (2019). How do children interpret novel control verbs? In Proceedings of the 43rd annual Boston University Conference on Language Development (Megan M. Brown & Brady Dailey, Eds., pp. 585-598). Somerville, MA: Cascadilla Press.
Abalada, S., Cardoso, A., & Cabarrão, V. (2010). Proposta de Classificação Semântica de Unidades Lexicais Multipalavra Nominais. In XXV Encontro Nacional da Associação Portuguesa de Linguística. Textos Seleccionados (Ana Maria Brito, Fátima Silva, João Veloso & Alexandra Fiéis, Eds., pp. 81-94). Porto: Edições Colibri/APL.
Abalada, S., Cardoso, A., & Cabarrão, V. (2011). O Vocativo em Português Europeu: Estudo de Parâmetros Prosódicos em Vocativos com Diferentes Distribuições. In XXVI Encontro Nacional da Associação Portuguesa de Linguística. Textos Seleccionados (Armanda Costa, Isabel Falé & Pilar Barbosa, Eds., pp. 1-16). Lisboa: Edições Colibri/APL.
Abalada, S. (2012). Aquisição das Periferias Esquerda e Direita em Português Europeu. In XXVII Encontro Nacional da Associação Portuguesa de Linguística. Textos Seleccionados (Armanda Costa, Cristina Flores & Nélia Alexandre, Eds., pp. 45-65). Lisboa: Edições Colibri/APL.
Abalada, S. (2013). Acquisition of the Left and Right Peripheries in European Portuguese. In Advances in Language Acquisition: Proceedings of GALA 2011 (Stavroula Stavrakaki, Polyxeni Konstantinopoulou & Marina Lalioti, Eds., pp. 4-13). Cambridge: Cambridge Scholars Publishing.
Santos, A. L., Généreux, M., Cardoso, A., Agostinho, C., & Abalada, S. (2014). A corpus of European Portuguese child and child-directed speech. In Proceedings of the 9th International Conference on Language Resources and Evaluation (LREC 2014) (pp. 1488-1491). Reykjavik: European Language Resources Association (ELRA).
Abalada, S., & Cardoso, A. (2015). Prosodic Effects of Syntactic Distribution in Vocatives in European Portuguese. In Parenthetical verbs (Stefan Schneider, Julie Glikman & Mathieu Avanzi, Eds., pp. 4-13). Berlin: De Gruyter.
Martins, A., Santos, A. L., & Duarte, I. (2018). Comprehension of relative clauses vs. control structures in SLI and ASD children. In Proceedings of the 42nd annual Boston University Conference on Language Development (Anne B. Bertolini & Maxwell J. Kaplan, Eds., pp. 493-506). Somerville, MA: Cascadilla Press.
Romeo, L., Mendes, S., & Bel, N. (2014). Using unmarked contexts in nominal lexical semantic classification. In 25th International Conference on Computational Linguistics - COLING 2014 (pp. 508-519). Dublin, Ireland.
Marrafa, P., Amaro, R., & Mendes, S. (2014). LexTec - a rich language resource for technical domains in Portuguese. In 9th International Conference on Language Resources and Evaluation - LREC 2014 (pp. 1044-1050). Reykjavik, Iceland.
Necsulescu, S., Mendes, S., & Bel, N. (2014). Combining dependency information and generalization in a pattern-based approach to the classification of lexical-semantic relation instances. In 9th International Conference on Language Resources and Evaluation - LREC 2014 (pp. 4308-4315). Reykjavik, Iceland.
Romeo, L., Mendes, S., & Bel, N. (2014). A cascade approach for complex-type classification. In 9th International Conference on Language Resources and Evaluation - LREC 2014 (pp. 4451-4458).
Romeo, L., Mendes, S., & Bel, N. (2013). Towards the automatic classification of complex-type nominals. In 6th International Conference on Generative Approaches to the Lexicon - GL 2013 (pp. 21-28). Pisa, Italy.
Amaro, R., & Mendes, S. (2012). Towards merging common and technical lexicon wordnets. In 3rd Workshop on Cognitive Aspects of the Lexicon (CogALex-III) at the 24th International Conference on Computational Linguistics - COLING 2012 (pp. 147-160). Mumbai, India.
Romeo, L., Mendes, S., & Bel, N. (2012). Using Qualia Information to Identify Lexical Semantic Classes in an Unsupervised Clustering Task. In 24th International Conference on Computational Linguistics - COLING 2012 (pp. 1029-1038). Mumbai, India.