Grammar & Resources

The group focuses on modeling linguistic knowledge, integrating the interfaces between different areas of grammar with knowledge about how language is put to use. Joint work in formal phonology, the lexicon, syntax and semantics makes it possible to build an integrated model of grammar that considers both how grammar is represented in the human mind and how it can be computationally modeled; research on L1 and L2 acquisition is at the core of this effort. Models of language representation and models of language use are integrated through the study of corpora.

The group produces corpora and resources in order to document and describe contemporary European Portuguese as well as understudied contact languages and varieties (Portuguese-based creoles and national varieties of Portuguese in Africa and Asia). It also produces resources for the study of L1 and L2 acquisition in different settings. The group is a member of CLARIN LP.

Research on L1 and L2 acquisition contributes to CLUL’s general aim of effectively linking fundamental and applied research, particularly in the areas of Educational Linguistics and Clinical Linguistics.

General goals:

- To produce new resources for the study of Portuguese and Portuguese-based creoles;

- To pursue basic research on natural language modeling, integrating knowledge on interfaces between language modules;

- To continue the documentation and description of understudied creoles and new varieties of Portuguese that emerged in a context of language contact;

- To develop the study of language acquisition, with an emphasis on language contact situations (see the new international Heritage Language Consortium) and on the comparison between typical and atypical development;

- To explore the potential of comparative linguistics in the production of resources for translation and to promote connections with the industry in the area of translation.

 

| Resource | Type |
| --- | --- |
| A Lexicon of Child European Portuguese - CEPLEXicon | Lexicon |
| A Portuguese Native Language Identification Dataset - NLI-PT | Database |
| Acquisition of European Portuguese Databank - AcEP | Database |
| Child-Adult Interaction Corpus - CAI | Corpus |
| Child-Adult interaction European Portuguese | Database |
| Consonantic Sequences Oral and Written Production Tasks - PORESC | Tool |
| Controlled Portuguese - CLG | Database |
| Corpora of PLE | Corpus |
| Corpus Almeida - European Portuguese / French | Corpus |
| Corpus Angolar | Corpus |
| Corpus C-ORAL-ROM | Corpus |
| Corpus CCF | Corpus |
| Corpus CINTIL | Corpus |
| Corpus Fadambo | Corpus |
| Corpus Leiria (1991) | Corpus |
| Corpus of Cape Verdean Portuguese | Corpus |
| Corpus of Sri Lanka Portuguese | Corpus |
| Corpus of the Diaries of the Portuguese Parliament annotated with PoS - PTPARL | Corpus |
| Corpus PESTRA | Corpus |
| Corpus Português Fundamental - Corpus PF | Corpus |
| Corpus Principense | Corpus |
| Corpus REDIP | Corpus |
| Corpus Santome | Corpus |
| Corpus SANTOS - European Portuguese | Corpus |
| Crosslinguistic Child Phonology Project - Português Europeu - CLCP-PE | Tool |
| Dados Orais de Cabo Verde - CV Words | Database |
| Demo de Subespecificação e Desambiguação de Escopo | Tool |
| Dictionary of Hindi-Portuguese-Hindi | Database |
| Diu Indo-Portuguese Data Set | Database |
| Learner Corpus of Portuguese L2 - COPLE2 | Corpus |
| LT Corpus (Literary Corpus) - LT Corpus | Corpus |
| Modality Lexicon - MODAL-LEX-PT | Lexicon |
| Multifunctional Computational Lexicon of Contemporary Portuguese | Lexicon |
| Named Entity Recognizer - CRPC-NER | Tool |
| Nominal Multiword Lexical Units in European Portuguese | Lexicon |
| NPChunks: Corpus of 1000 sentences annotated with PoS and nominal chunks - NPChunks | Corpus |
| Online Corpus of Writing and Speech of Children in the Early Years of Schooling - EFFE-On | Corpus |
| Online Dictionary Portuguese-Slovak/Slovak-Portuguese | Database |
| Pereira&Freitas - EP | Corpus |
| Person-Machine Interaction in Natural Language - INQUER | Database |
| PhonoDis | Corpus |
| Phonological Awareness Tasks for First Grade School Children - TCFC | Tool |
| Portuguese Biographies - Bio-PT | Database |
| Portuguese Corpus Annotated for Modality - MODAL | Corpus |
| Portuguese Lexicon of Discourse Markers - LDM-PT | Lexicon |
| Portuguese Technical Lexica - LEXTEC | Lexicon |
| Portuguese Discourse Bank - CRPC-DB | Corpus |
| Quotations database - CRPC-quotations | Database |
| Ramalho – EP | Corpus |
| Reference Corpus of Contemporary Portuguese - CRPC | Corpus |
| Santome Structure Dataset | Database |
| Spoken Corpus Mozambique 1986-87 - SCM | Corpus |
| Spoken Portuguese - Geographical and Social Varieties | Corpus |
| Vocatives in European Portuguese | Corpus |
| Word Combination in European Portuguese - LEX-MWE-PT | Lexicon |
| WordNet.PT | Lexicon |

Article in Proceedings
Costa, A., Faria, I. H., & Matos, G. (1999). Competitive information sources in referential ambiguity resolution. In Psycholinguistics on the Threshold of the Year 2000 - Proceedings of 5th International Congress of the International Society of Applied Psycholinguistics (ISAPL 97) (Pinto, M. G.; Veloso, J.; Maia, B., pp. 133-138). Porto: Faculdade de Letras da Universidade do Porto. Retrieved from https://apl.pt/wp-content/uploads/2017/12/1997-16.pdf
Matos, G. (1995). Estruturas Binárias e Monocêntricas em Sintaxe — algumas observações sobre a coordenação de projecções máximas. In Actas do X Encontro Nacional da Associação Portuguesa de Linguística, 1994 (pp. 301-315). Évora: Edições Colibri, APL. Retrieved from https://apl.pt/wp-content/uploads/2017/12/1994-23.pdf
Costa, A., Faria, I. H., & Matos, G. (1998). Ambiguidade referencial na identificação do sujeito em estruturas coordenadas. In Actas do XIII Encontro Nacional da Associação Portuguesa de Linguística, 1997 (Mota, M. A.; Marquilhas, R., pp. 173-188). Lisboa: Edições Colibri / APL. Retrieved from https://apl.pt/wp-content/uploads/2017/12/1997-16.pdf
Matos, G., Miguel, M., & Freitas, J. (1997). Functional Categories in Early Acquisition of European Portuguese. In Proceedings of the GALA '97 Conference on Language Acquisition (Sorace, A.; Heycock, C.; Shillcock, R., pp. 115-120).
Matos, G. (1996). A Sintaxe e a Morfo-Sintaxe nas Gramáticas Descritivas do Século XX. In Actas do XI Encontro Nacional da Associação Portuguesa de Linguística, 1995 (Duarte, I.; Miguel, M., pp. 105-121). Lisboa: Edições Colibri / APL. Retrieved from https://apl.pt/wp-content/uploads/2017/12/1995-10-2.pdf
Matos, G. (1989). Elipse do SV em estruturas predicativas com ser e estar. In Actas do IV Encontro Nacional da Associação Portuguesa de Linguística (pp. 41-67). Lisboa: Reprografia da Associação de Estudantes da Faculdade de Letras de Lisboa. Retrieved from https://apl.pt/wp-content/uploads/2017/12/1988-5.pdf
Kurfalı, M., Ozer, S., Zeyrek, D., & Mendes, A. (2020). TED-MDB Lexicons: Tr-EnConnLex, Pt-EnConnLex. In Proceedings of the First Workshop on Computational Approaches to Discourse (Chloé Braud et al., Eds., pp. 148-153). Association for Computational Linguistics.
Crible, L., & Mendes, A. (2018). Designing a corpus-based lexicon for spoken DRDs: semantic considerations. In Proceedings of the Cross-Linguistic Discourse Annotation: Applications and Perspectives, Final Action Conference TextLink (L.M. Ho-Dac & Phillip Mueller, Eds., pp. 29-33). University of Toulouse.
Freitas, M. J., Vigário, M., & Frota, S. (2004). The acquisition of the Prosodic Word in European Portuguese. In Second Lisbon Meeting on Language Acquisition. Lisboa.
Hagemeijer, T., Mendes, A., Gonçalves, R., Cornejo, C., Madureira, R., & Généreux, M. (2022). The PALMA Corpora of African Varieties of Portuguese. In N. Calzolari, Béchet, F., Blache, P., Choukri, K., Declerck, T., Goggi, S., et al. (Eds.), Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), Marseille, 20-25 June 2022 (pp. 5047-5053). Paris: European Language Resources Association (ELRA).
Segura, L. (1996). Aspectos fonéticos do Barlavento do Algarve: as vogais finais acentuadas. In I. Duarte & Leiria, I. (Eds.), Actas do Congresso Internacional sobre o Português Vol. II (1994) (pp. 345-358). Lisboa: APL e Eds Colibri.
Rodrigues, C., & Gomes, J. (2023). "Otraves" o mesmo "faitico": a proficiência ortográfica nos dígrafos e de crianças alentejanas e transmontanas do 2.º ano de escolaridade. In C. Amorim & Zhou, C. (Eds.), Atas do II Phonoshuttle OPO-LIS: Ponte aérea de fonologia (pp. 53-62). Retrieved from https://ler.letras.up.pt/uploads/ficheiros/19671.pdf
Edited Proceedings
Mendonca, V., Sardinha, A., Coheur, L., & Santos, A. L. (2020). Query Strategies, Assemble! Active Learning with Expert Advice for Low-resource Natural Language Processing. 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). IEEE. http://doi.org/10.1109/fuzz48607.2020.9177707
Dataset
Gonçalves, R., Hagemeijer, T., Cornejo, C., Alcântara, C., Madureira, R., Généreux, M., & Mendes, A. (2021). PALMA Corpus São Tomé e Príncipe. Lisboa: Centro de Linguística da Universidade de Lisboa.
Hagemeijer, T., Madureira, R., Cornejo, C., Justino, V., Campos, M., Gonçalves, R., et al. (2021). PALMA Corpus Moçambique. Lisboa: Centro de Linguística da Universidade de Lisboa.
Miguel, A., Cornejo, C., Madureira, R., Silva, D., Hagemeijer, T., Gonçalves, R., et al. (2021). PALMA Corpus Angola. Lisboa: Centro de Linguística da Universidade de Lisboa.
E-edition
Colaço, M., Gonçalves, A., Freitas, M. J., & Gomes, J. (2022). A casa na quinta: das palavras às frases. Lisboa: Direção Geral de Educação. Retrieved from https://redge.dge.mec.pt/ilha/por4/
Journal Paper
Flores, C., Santos, A. L., Jesus, A., & Marques, R. (2017). Age and input effects in the acquisition of mood in Heritage Portuguese. Journal of Child Language, 44(4), 795-828. http://doi.org/10.1017/s0305000916000222
Santos, A. L., Gonçalves, A., & Hyams, N. (2014). Complementos de verbos percetivos, causativos e de controlo de objeto em Português Europeu: dados da aquisição. In XXIX Encontro Nacional da APL, 2013.
Almeida, M. C. (2006). Blend-Bildungen - und was dahinter steckt. Portugiesisch kontrastiv gesehen und Anglizismen weltweit, 10, 241-259.
Duarte, I. (2013). Construções de Topicalização. In Gramática do Português, Vol. I (pp. 401-426).
Hagemeijer, T., & Holm, J. (2008). On the Creole Portuguese of São Tomé (West Africa). Annotated translation from the German of “Ueber das Negerportugiesische von S. Thomé (Westafrika).” Sitzungsberichte der kaiserlichen Akademie der Wissenschaften zu Wien 101(2): 889-917 [1882]. Contact Languages: Critical Concepts in Linguistics, I, 131-156.
Marques, R. (2013). Construções de grau. In Eduardo Paiva Raposo et al., Gramática do Português (Cap. 40, pp. 2139-2163). Lisboa: Fundação Calouste Gulbenkian.
Marques, R. (2012). Covert Modals and (Non-)Implicative Readings of too/enough Constructions. In W. Abraham & E. Leiss, Covert Patterns of Modality (pp. 238-266). Cambridge: Cambridge Scholars Publishing.
Marques, R. (2003). Semantic and Pragmatic Constraints on Mood Selection. In Meaning Through Language Contrast, Vol. 1 (pp. 129-146).
Matos, G., & Brito, A. M. (2013). The alternation between improper indirect questions and restrictive relatives. Linguistik Aktuell/Linguistics Today, 197, 83-116. https://doi.org/10.1075/la
Matos, G., & Brito, A. (2008). Comparative clauses and cross linguistic variation: a syntactic approach. Empirical Issues in Syntax and Semantics (Bonami, O.; Hofherr, P., Eds.), 7, 307–329. Retrieved from http://www.cssp.cnrs.fr/eiss7/
Mota, M. A., Rodrigues, C., & Soalheiro, E. (2003). Padrões flexionais nos pretéritos fortes, PE falado setentrional. In Razão e Emoções, II - Volume de Homenagem a Maria Helena Mira Mateus, 129-155.
Vigário, M., Frota, S., & Freitas, M. J. (2009). Phonetics and Phonology: Interactions and Interrelations. Current Issues in Linguistic Theory, 306.
Costa, J., Fiéis, A., Freitas, M. J., Lobo, M., & Santos, A. L. (2014). New Directions in the Acquisition of Romance Languages: Selected Proceedings of The Romance Turn V. Cambridge Scholars Publishing.