(1996-1998 - European Commission Programme - DGXIII, Telematics Application of Common Interest - Contract LE2-4017)
LE-PAROLE is a project that uses linguistic and computational resources already available in the European countries to build corpora and lexicons according to integrated models for composing and describing the materials. The use of common tools makes connections across languages possible and serves a wide range of applications. For each language, a 20-million-word corpus was built with harmonized design, composition and encoding, including a 250,000-word tagged subcorpus. Each language's lexicon is composed of 20,000 entries with syntactic and morphosyntactic information.
The following materials are available for sale in ELDA's catalogue:
- a 3-million-word corpus with the following composition: newspapers (65%), books (20%), magazines (5%) and varia (10%); this corpus includes a 250,000-word subcorpus (with approximately the same distribution as the main corpus), morphosyntactically annotated following the standard criteria of the PAROLE project;
- a lexicon of 20,000 lemmas with morphosyntactic and syntactic information.