REDAC
REsources Developed At CLLE-ERSS CLLE-ERSS research unit






PsychoGLÀFF
a large-scale inflectional French lexicon designed for psycholinguistics
Description
PsychoGLÀFF is a large-scale French inflectional lexicon built from Wiktionnaire, the French-language edition of Wiktionary. It is derived from GLÀFF and is oriented towards psycholinguistics. Each line of PsychoGLÀFF describes one entry. An entry consists of the following fields, separated by the | character:
  1. the wordform
  2. the morphosyntactic tag in GRACE format
  3. the lemma
  4. the phonological transcription encoded in IPA
  5. the phonological transcription encoded in SAMPA
  6. the absolute frequency of the categorized form in Frantext 20e corpus
  7. the relative frequency (per million words) of the categorized form in Frantext 20e corpus
  8. the absolute frequency of the categorized lemma in Frantext 20e corpus
  9. the relative frequency (per million words) of the categorized lemma in Frantext 20e corpus
  10. the absolute frequency of the categorized form in LM10 corpus
  11. the relative frequency (per million words) of the categorized form in LM10 corpus
  12. the absolute frequency of the categorized lemma in LM10 corpus
  13. the relative frequency (per million words) of the categorized lemma in LM10 corpus
  14. the absolute frequency of the categorized form in FrWac corpus
  15. the relative frequency (per million words) of the categorized form in FrWac corpus
  16. the absolute frequency of the categorized lemma in FrWac corpus
  17. the relative frequency (per million words) of the categorized lemma in FrWac corpus
  18. the length of the wordform (number of characters)
  19. the length of the phonological transcription (number of phonemes)
  20. the syllabification and the CV structure of the word
  21. the number of syllables
  22. the ratio between the number of syllables and the consonants composing the phonological form
  23. the geometric mean of the conditional character probabilities of bigrams (calculated on the wordform)
  24. the geometric mean of the conditional character probabilities of trigrams (calculated on the wordform)
  25. the geometric mean of the conditional character probabilities of 4-grams (calculated on the wordform)
  26. the geometric mean of the conditional phoneme probabilities of bigrams (calculated on the phonological transcription)
  27. the geometric mean of the conditional phoneme probabilities of trigrams (calculated on the phonological transcription)
  28. the geometric mean of the conditional phoneme probabilities of 4-grams (calculated on the phonological transcription)
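
As an illustration of the line format described above, here is a minimal Python sketch that splits one entry on the | character and names the first five fields. The `Entry` class, the `parse_line` function and the sample line are hypothetical: the sample values are placeholders, not taken from the actual resource. The sketch also assumes that every line carries exactly the 28 fields listed above and that no field contains a | character itself.

```python
from dataclasses import dataclass
from typing import List

EXPECTED_FIELDS = 28  # number of fields in the list above


@dataclass
class Entry:
    wordform: str
    tag: str          # morphosyntactic tag in GRACE format
    lemma: str
    ipa: str          # phonological transcription in IPA
    sampa: str        # phonological transcription in SAMPA
    rest: List[str]   # fields 6 to 28, kept as raw strings


def parse_line(line: str) -> Entry:
    """Split one pipe-separated line into an Entry (hypothetical helper)."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != EXPECTED_FIELDS:
        raise ValueError(
            f"expected {EXPECTED_FIELDS} fields, got {len(fields)}"
        )
    return Entry(*fields[:5], rest=fields[5:])


# Placeholder line: the tag, transcriptions and numeric values are invented
# for illustration and are NOT copied from PsychoGLÀFF.
sample = "|".join(["chats", "Ncmp", "chat", "ʃa", "Sa"] + ["0"] * 23)
entry = parse_line(sample)
print(entry.wordform, entry.lemma, len(entry.rest))
```

The remaining fields are left as strings so that the caller can decide how to convert frequencies, lengths and probabilities. If the distributed file has a header line or a different number of fields, `EXPECTED_FIELDS` and the parsing logic would need to be adjusted.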

Quantitative analyses of the orthographic and phonological neighborhood of the wordform are currently being implemented and will be released in the next version.


Authors
Basilio Calderone, Nabil Hathout and Franck Sajous

Person in charge
Basilio Calderone
Contact :

License
PsychoGLÀFF is released under the Creative Commons BY-SA 3.0 license (Attribution-ShareAlike 3.0 Unported).

Download
References
Basilio Calderone, Nabil Hathout and Franck Sajous (2014). From GLÀFF to PsychoGLÀFF: a large psycholinguistics-oriented French lexical resource. Proceedings of the 16th EURALEX Conference. Bolzano, Italy.