REsources Developed At CLLE-ERSS CLLE-ERSS research unit

DiCo Corpus The DiCo Corpus contains information on several French dictionaries (Académie, Hachette, Larousse, Robert), such as the headwords added to or removed from each dictionary's nomenclature.
Treelex++ Treelex++ is a syntactico-semantic lexicon based on TreeLex. It contains 1161 single-frame verbs (i.e. verbs with a single syntactic structure) present in TreeLex with human-added aspectual properties.
DinaVmouv The DinaVmouv base is a lexicon including more than a thousand motion verbs in French. In addition to the list of verbs, the DinaVmouv base provides a minimal semantic description of each verb (type of motion conveyed, lexical aspect, manner), as well as its definitions extracted from two electronic dictionaries, namely the TLFi and/or GLAWI.
PREF-IT PREF-IT (PREFixed ITalian verbs lexicon) is composed of 1680 Italian verbs morphologically constructed from nominal or adjectival bases by prefixation.
FOULOPHONIE FOULOPHONIE is a lexicon including 7757 words used in French-speaking areas and countries, together with their definitions in those places. This resource is extracted from Wiktionnaire, the French edition of Wiktionary.
GLAFFIT GLAFF-IT is a free large-scale morphophonological Italian lexicon extracted from GLAWIT. Each entry includes a wordform, its morphosyntactic description, its lemma and phonetic transcriptions.
GLAWIT GLAWIT is an Italian machine-readable dictionary encoded in XML format. It is a structured and normalized version of Wikizionario (the Italian language edition of Wiktionary).
WIKIMORPH-SR wikimorph-sr is a morphosyntactic lexicon for Serbian that can be used in POS-tagging, parsing and lemmatisation. The lexicon was developed as part of the ParCoLab project. It was mainly extracted from the Serbo-Croatian edition of Wiktionary. It contains 1 226 638 different wordforms corresponding to 117 445 different lemmas and to 3 066 214 unique combinations <wordform, lemma, morphosyntactic description>.
GLAWI GLAWI is a free XML French Machine-Readable Dictionary extracted from Wiktionnaire, the French Wiktionary. Standing for "GLÀFF and WiktionaryX", GLAWI is made up of data embedded in the two resources: lemmas and inflected forms (simple forms, compounds and multiword expressions), together with their etymologies, pronunciations, definitions, translations, semantic and morphological relations, etc.
Démonette A morphological lexical database of French organized as a derivational network, in which entries are pairs of words (Word1, Word2) belonging to the same morphological family. An entry is described by 31 fields including the morphosyntactic category and the semantic type of both words, as well as the definition of Word1 with respect to Word2.
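The three counts given for wikimorph-sr measure different things: distinct wordforms, distinct lemmas, and distinct <wordform, lemma, morphosyntactic description> combinations. A minimal Python sketch of how such triples might be indexed for lookup-based lemmatisation follows. The function name `build_index` and the placeholder triples are assumptions made for illustration; they are not taken from the lexicon, and the actual distribution format of wikimorph-sr is not described here.

```python
from collections import defaultdict

def build_index(triples):
    """Group (wordform, lemma, msd) triples by wordform.

    Hypothetical helper: the real lexicon may use a different layout.
    """
    index = defaultdict(set)
    for wordform, lemma, msd in triples:
        index[wordform].add((lemma, msd))
    return index

# Placeholder triples, invented for illustration only.
triples = [
    ("form_1", "lemma_a", "TAG_X"),
    ("form_1", "lemma_a", "TAG_Y"),
    ("form_2", "lemma_a", "TAG_Z"),
    ("form_1", "lemma_b", "TAG_X"),
]

index = build_index(triples)

# The three kinds of counts reported for the lexicon.
n_wordforms = len(index)
n_lemmas = len({lemma for _, lemma, _ in triples})
n_triples = len(set(triples))

# Candidate lemmas for one wordform; more than one means ambiguity.
candidates = sorted({lemma for lemma, _ in index["form_1"]})
```

A wordform can carry several analyses, which is why the number of combinations exceeds the number of wordforms; a tagger or parser would then pick one analysis in context.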
GLÀFF is a large French versatile lexicon grounded on Wiktionnaire, the French Wiktionary. GLÀFF contains, for each entry, its morphosyntactic description, its lemma and a phonetic transcription.
PsychoGLÀFF is a version of GLÀFF especially designed for psycholinguistics research.
WiktionaryX Wiktionary, the Wikimedia Foundation's free collaborative dictionary, contains definitions, semantic relations and translation links. We have converted the French and English editions into an XML format: WiktionaryX.
Verbaction French lexicon of action nouns morphologically related to verbs. This lexicon contains verb:noun pairs such that:
  • the noun is morphologically related to the verb;
  • the noun can be used to denote the action or activity denoted by the verb.
Morphonette a morphological network of French. Morphonette is a set of morphological filaments of the form entry:relative:sub-series such that:
  • the relative is a member of the entry's morphological family;
  • the sub-series is a set of words which enter into analogies with the entry and its relative.
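The entry:relative:sub-series form can be read as a three-field record. The Python sketch below parses one such filament under stated assumptions: one filament per line, and sub-series members separated by commas. That separator, the names `Filament` and `parse_filament`, and the placeholder words are hypothetical and are not taken from the Morphonette distribution.

```python
from typing import NamedTuple

class Filament(NamedTuple):
    """One filament: an entry, a relative, and a sub-series of words."""
    entry: str
    relative: str
    sub_series: frozenset

def parse_filament(line, series_sep=","):
    """Parse 'entry:relative:sub-series' into a Filament.

    series_sep is an assumed separator for the sub-series members.
    """
    # Split on the first two colons only; the rest is the sub-series.
    entry, relative, series = line.strip().split(":", 2)
    members = frozenset(
        word.strip() for word in series.split(series_sep) if word.strip()
    )
    return Filament(entry, relative, members)

# Placeholder words, invented for illustration only.
filament = parse_filament("entry_word:relative_word:word_a, word_b,word_c")
```

Storing the sub-series as a set reflects the description above, in which it is a set of words rather than an ordered list.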
Famorpho-Fr morphological families of the TLF dictionary's head words starting with FR-.
Treelex TreeLex is a subcategorisation lexicon automatically extracted from a syntactically annotated corpus (Treebank). It contains about 2000 contemporary French verbs (types) and 2000 adjectives with their valence frames.