REsources Developed At CLLE-ERSS CLLE-ERSS research unit


This documentation is based on (Sajous and Hathout, 2015) and (Hathout and Sajous, 2016).



Inside pos elements, the definitions (plural) element may include several definition (singular) children, each describing one word sense. A definition contains a gloss and, optionally, one or more usage examples. Glosses and examples may carry labels that give attitudinal, diatopic, diachronic or diafrequential information, or that indicate that the word belongs to a specialized language.

Each gloss and example is available in four versions:

  • the original wikicode;
  • a plain text version;
  • an XML version that formally encodes specific information: markup encodes wiki typesetting (boldface, italic, etc.), dates, foreign words, mathematical/chemical formulae and external/internal links (see the further description);
  • a syntactic parsing of the text in CoNLL format produced by the Talismane parser.
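As an illustration of the last version, the sketch below reads a CoNLL block into a list of token records. It assumes tab-separated fields in the standard CoNLL-X column order; the exact columns of the parsed version are not stated on this page, so the column names, the parse_conll helper and the sample parse are all hypothetical.

```python
# Assumed CoNLL-X column order; the actual layout in GLAWI may differ.
CONLL_COLUMNS = [
    "id", "form", "lemma", "cpostag", "postag",
    "feats", "head", "deprel", "phead", "pdeprel",
]


def parse_conll(block):
    """Turn a CoNLL block (one token per line, tab-separated) into dicts."""
    tokens = []
    for line in block.splitlines():
        if not line.strip():
            continue  # blank lines separate sentences
        fields = line.split("\t")
        tokens.append(dict(zip(CONLL_COLUMNS, fields)))
    return tokens


# Hypothetical parse of the text "Petit animal" (made up for illustration,
# not taken from the resource).
sample = (
    "1\tPetit\tpetit\tA\tADJ\t_\t2\tmod\t_\t_\n"
    "2\tanimal\tanimal\tN\tNC\t_\t0\troot\t_\t_\n"
)

tokens = parse_conll(sample)
# (form, governor index, dependency label) for each token
heads = [(t["form"], t["head"], t["deprel"]) for t in tokens]
```

All values are kept as strings, so the governor index needs an explicit int() conversion before it is used to look up the governing token.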

XML structure

<!ELEMENT definitions (definition)*>
<!ELEMENT definition (gloss?, example*)>
<!ELEMENT gloss (labels?, wiki?, xml, txt, parsed?)>
<!ELEMENT example (labels?, wiki?, xml, txt, parsed?)>
<!ELEMENT labels (label)*>
<!ELEMENT label EMPTY>
<!ATTLIST label
    type (attitudinal|diachronic|diafrequential|diatopic|domain|gram|loan|other|sem|usage) "other"
    value CDATA #REQUIRED>

The wiki, xml, txt and parsed elements are described separately. Labels are also described on a separate page.
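The structure declared above can be walked with Python's standard xml.etree.ElementTree module. The following is a minimal sketch, not an official reader: the sample fragment, its label values and the read_definitions helper are invented for illustration, and the content of the xml element is simplified to plain text. Since ElementTree does not apply DTD defaults, the code supplies the "other" default for the type attribute itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment that follows the DTD; not taken from the resource.
SAMPLE = """
<definitions>
  <definition>
    <gloss>
      <labels>
        <label type="domain" value="zoologie"/>
        <label value="rare"/>
      </labels>
      <xml>Mammifère domestique.</xml>
      <txt>Mammifère domestique.</txt>
    </gloss>
    <example>
      <xml>Le chat dort.</xml>
      <txt>Le chat dort.</txt>
    </example>
  </definition>
</definitions>
"""


def read_definitions(definitions):
    """Collect the plain-text gloss, its labels and the examples of each sense."""
    senses = []
    for definition in definitions.findall("definition"):
        gloss = definition.find("gloss")  # optional according to the DTD
        labels = []
        if gloss is not None:
            # The DTD default for a missing type attribute is "other".
            labels = [
                (label.get("type", "other"), label.get("value"))
                for label in gloss.findall("labels/label")
            ]
        senses.append({
            "gloss": gloss.findtext("txt") if gloss is not None else None,
            "labels": labels,
            "examples": [ex.findtext("txt") for ex in definition.findall("example")],
        })
    return senses


senses = read_definitions(ET.fromstring(SAMPLE))
```

Here senses holds one dictionary per definition element. The same approach extends to the wiki and parsed versions by replacing "txt" with the corresponding element name and allowing for a missing element, since both are optional.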
