Project PI
João Malaca Casteleiro


(1996-1998 - European Commission Programme - DGXIII, Telematics Application of Common Interest - Contract LE2 - 4017)

LE-PAROLE is a project that draws on linguistic and computational resources already available in European countries to build corpora and lexicons according to integrated models of composition and description. The use of common tools makes cross-language links possible and supports a wide range of applications. For each language, a 20-million-word corpus was built with harmonized design, composition and encoding, including a 250,000-word tagged subcorpus. Each language's lexicon comprises 20,000 entries with syntactic and morphosyntactic information.

These materials are available for sale in ELDA's catalogue:

  • a 3-million-word corpus with the following composition: newspapers (65%), books (20%), magazines (5%) and miscellaneous texts (10%); this corpus includes a 250,000-word subcorpus (with approximately the same distribution as the main corpus) that is morphosyntactically annotated following the standard criteria of the PAROLE project;
  • a lexicon of 20,000 lemmas with morphosyntactic and syntactic information.

Consorzio Pisa Ricerche
CLUL - Centro de Linguística da Universidade de Lisboa
Det Danske Sprog- og Litteraturselskab
Fundación Bosch Gimpera Universitat de Barcelona
Göteborgs Universitet - Institutionen för Svenska Språket
Institiúid Teangeolaíochta Éireann
Institut d'Estudis Catalans
Institut für Deutsche Sprache
Institute for Language and Speech Processing
Instituut voor Nederlandse Lexicologie
University of Birmingham
University of Helsinki
Université de Liège
INESC - Instituto de Engenharia de Sistemas e Computadores