REsources Developed At CLLE CLLE: Cognition, Langues, Langage, Ergonomie

Glawinette is a derivational lexicon of French built from the GLAWI machine-readable dictionary. The entries of Glawinette are pairs of morphologically related lexemes like accomplir_V:accomplissement_N. Glawinette provides the word family (morphological family) of each of its entries and a characterization of the derivational relation that holds between the two lexemes of the pair.
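
As an illustration of the pair notation above, the following minimal Python sketch splits a string such as accomplir_V:accomplissement_N into its two lexemes. It assumes that every pair follows the lemma_POS:lemma_POS pattern of that single example; the names Lexeme and parse_pair are hypothetical and are not part of Glawinette, whose actual distribution format may differ.

```python
from typing import NamedTuple, Tuple


class Lexeme(NamedTuple):
    """A lemma with its part-of-speech label (hypothetical helper type)."""
    lemma: str
    pos: str


def _parse_lexeme(text: str) -> Lexeme:
    # Split on the LAST underscore, so a lemma that itself contains
    # an underscore would still be kept whole.
    lemma, sep, pos = text.rpartition("_")
    if not sep or not lemma or not pos:
        raise ValueError(f"expected 'lemma_POS', got {text!r}")
    return Lexeme(lemma, pos)


def parse_pair(pair: str) -> Tuple[Lexeme, Lexeme]:
    """Split 'lemma_POS:lemma_POS' into two Lexeme records.

    Assumption: the colon separates the two lexemes and occurs exactly once.
    """
    left, right = pair.split(":")
    return _parse_lexeme(left), _parse_lexeme(right)


first, second = parse_pair("accomplir_V:accomplissement_N")
print(first.lemma, first.pos)
print(second.lemma, second.pos)
```

Splitting on the last underscore rather than the first is a defensive choice, made here without knowing whether lemmas in the real data can contain underscores.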

ENGLAWI ENGLAWI is a free English machine-readable dictionary encoded in XML format. It is a structured and normalized version of the English edition of Wiktionary. ENGLAWI includes simple words, compounds and multiword expressions (inflected forms and lemmas), etymologies, pronunciations, definitions, translations, semantic and morphological relations, etc.
ENGLAFF ENGLAFF is an inflectional lexicon extracted from ENGLAWI, which contains more than 1,179,000 entries, each including an inflected form, its lemma, and a morphosyntactic tag.
DIVAE DIVAE (Diatopic Variation of English) is a lexicon extracted from Wiktionary, which includes 30,822 entries that correspond to 19,468 distinct words used in 74 English-speaking areas.
WIND WIND (Wiktionary INclusion Dates) contains the inclusion dates of Wiktionary and Wiktionnaire headwords in their nomenclatures.
Lexeur Lexeur is a lexicon which contains 5974 derivational families of nouns ending in -eur. It has been built semi-automatically, from the TLFi word list and Web queries.
DiCo Corpus DiCo (Dictionnaires Comparés) contains information on several French dictionaries (Académie, Hachette, Larousse, Robert), such as the headwords added to or removed from each dictionary's nomenclature.
Treelex++ Treelex++ is a syntactico-semantic lexicon based on TreeLex. It contains 1161 single-frame verbs (i.e. verbs with a single syntactic structure) present in TreeLex with human-added aspectual properties.
DinaVmouv The DinaVmouv base is a lexicon including more than a thousand motion verbs in French. In addition to the list of verbs, the DinaVmouv base provides a minimal semantic description of each verb (type of motion conveyed, lexical aspect, manner), as well as its definitions extracted from two electronic dictionaries, namely the TLFi and/or GLAWI.
PREF-IT PREF-IT (PREFixed ITalian verbs lexicon) is composed of 1680 Italian verbs morphologically constructed from nominal or adjectival bases by prefixation.
FOULOPHONIE FOULOPHONIE is a lexicon including 7757 words used in French-speaking areas and countries, together with their definitions in those places. This resource is extracted from Wiktionnaire, the French edition of Wiktionary.
GLAFF-IT GLAFF-IT is a free large-scale morphophonological Italian lexicon extracted from GLAWIT. Each entry includes a wordform, its morphosyntactic description, its lemma and phonetic transcriptions.
GLAWIT GLAWIT is an Italian machine-readable dictionary encoded in XML format. It is a structured and normalized version of Wikizionario (the Italian language edition of Wiktionary).
WIKIMORPH-SR WIKIMORPH-SR is a morphosyntactic lexicon for Serbian that can be used in POS-tagging, parsing and lemmatisation. The lexicon was developed as a part of the ParCoLab project. It was mainly extracted from the Serbo-Croatian edition of Wiktionary. It contains 1 226 638 different wordforms corresponding to 117 445 different lemmas and to 3 066 214 unique combinations <wordform, lemma, morphosyntactic description>.
GLAWI GLAWI is a free XML French Machine-Readable Dictionary extracted from Wiktionnaire, the French Wiktionary. Standing for "GLÀFF and WiktionaryX", GLAWI is made up of the data embedded in the two resources: lemmas and inflected forms (simple forms, compounds and multiword expressions), together with their etymologies, pronunciations, definitions, translations, semantic and morphological relations, etc.
Démonette A morphological lexical database of French organized as a derivational network, in which entries are pairs of words (Word1, Word2) belonging to the same morphological family. An entry is described by 31 fields including the morphosyntactic category and the semantic type of both words, as well as the definition of Word1 with respect to Word2.
GLAFF GLÀFF is a large French versatile lexicon grounded on Wiktionnaire, the French Wiktionary. GLÀFF contains, for each entry, its morphosyntactic description, its lemma and a phonetic transcription.
PsychoGLÀFF is a version of GLÀFF especially designed for psycholinguistics research.
WiktionaryX Wiktionary, the Wikimedia Foundation's free collaborative dictionary, contains definitions, semantic relations and translation links. We have converted the French and English editions into an XML format: WiktionaryX.
Verbaction French lexicon of action nouns morphologically related to verbs. This lexicon contains verb:noun pairs such that:
  • the noun is morphologically related to the verb;
  • the noun can be used to denote the action or activity denoted by the verb.
Morphonette a morphological network of French. Morphonette is a set of morphological filaments of the form entry:relative:sub-series such that:
  • the relative is a member of the entry's morphological family;
  • the sub-series is a set of words which enter into analogies with the entry and its relative.
Famorpho-Fr morphological families of the TLF dictionary's headwords starting with FR-.
Treelex TreeLex is a subcategorisation lexicon automatically extracted from a syntactically annotated corpus (Treebank). It contains about 2000 contemporary French verbs (types) and 2000 adjectives with their valence frames.