This documentation is based on (Sajous and Hathout, 2015) and (Hathout and Sajous, 2016).


94% of GLAWI's entries contain one or more phonemic transcriptions in IPA. They correspond to pronunciations that occur at different places in Wiktionnaire's articles:¹

  • inside POS sections, transcriptions may appear:
    • in the ligne de forme, the line just below the main syntactic category heading, after the written form (this line also carries grammatical features);
    • in the paradigm table, which gives the inflected forms of a lemma and their transcriptions;
  • outside POS sections, after them, in a Prononciation section.

The two screenshots below illustrate the pronunciation elements that can be found in the entry sorcière 'witch':

Example of the entry sorcière: transcriptions at the POS (noun) level:

Example of the entry sorcière: transcriptions at the article level:

As the second screenshot shows, transcriptions in the Prononciation section at the end of an article may record diatopic variation. In the example above, national variants distinguish hexagonal French from Canadian (Quebec) French. Regional variants may also be recorded, as the screenshot below illustrates: moins 'minus, less' is pronounced differently in Paris, whose pronunciation is perceived as "standard" French, and in the Marseille area (Haut Languedoc):

1 471 missing pronunciations have been taken from the resource developed by Boris New. These pronunciations are identified by the attribute in the dev version of GLAWI.

XML structure

pronunciations

  <!ELEMENT pronunciations (pron*)>
  <!ELEMENT pron (#PCDATA)>
  <!ATTLIST pron area CDATA #IMPLIED>

The pronunciations element may occur as a child of text and pos elements. The optional area attribute of pron elements, which indicates diatopic variation, may occur when the pronunciations element is a direct child of text.
Pronunciations may also be found in inflectional paradigms (see the corresponding section of the pos elements' documentation).
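
As a rough illustration of the two attachment points described above, the following Python sketch collects pron elements from under both text and pos. It relies only on the element and attribute names given in the DTD (pronunciations, pron, area) and on the parent names text and pos. Everything else is an assumption made for the example: the enclosing article element, the nesting of pos inside text, the area values, and the transcriptions themselves are illustrative and are not copied from GLAWI.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment built from the DTD above. The <article> wrapper,
# the nesting of <pos> inside <text>, the area values and the
# transcriptions are assumptions for illustration, not GLAWI data.
SAMPLE = """\
<article>
  <text>
    <pos>
      <pronunciations>
        <pron>sɔʁ.sjɛʁ</pron>
      </pronunciations>
    </pos>
    <pronunciations>
      <pron area="France">sɔʁ.sjɛʁ</pron>
      <pron area="Canada">sɔʁ.sjaɛʁ</pron>
    </pronunciations>
  </text>
</article>
"""


def collect_pronunciations(root):
    """Return (parent_tag, area, transcription) triples for every pron
    whose pronunciations block is a direct child of text or pos."""
    triples = []
    for parent in root.iter():
        if parent.tag not in ("text", "pos"):
            continue
        # A bare tag name in findall() matches direct children only.
        for block in parent.findall("pronunciations"):
            for pron in block.findall("pron"):
                area = pron.get("area")  # None when the attribute is absent
                triples.append((parent.tag, area, (pron.text or "").strip()))
    return triples


if __name__ == "__main__":
    for parent_tag, area, ipa in collect_pronunciations(ET.fromstring(SAMPLE)):
        print(parent_tag, area or "-", ipa)
```

Keeping the parent tag in each triple makes it possible to separate POS-level transcriptions from article-level ones, where the area attribute may carry the diatopic label.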
Below is the XML structure corresponding to the noun sorcière discussed above:

