REDAC
REsources Developed At CLLE (CLLE: Cognition, Langues, Langage, Ergonomie)






GLAWI

Documentation
This documentation is based on Sajous and Hathout (2015) and Hathout and Sajous (2016).

Paradigms and inflections

Description

When a POS section describes one or several inflected forms, the inflectionInfos element enumerates the morphosyntactic features of these forms and their lemmas. When it describes a lemma, the paradigm element and its inflection children give all the inflected forms of the paradigm that are present in Wiktionnaire.


XML structure

  • inflectionInfos

    <!ELEMENT inflectionInfos (inflectedForm)*>
    <!ELEMENT inflectedForm EMPTY>
    <!ATTLIST inflectedForm
        gracePOS CDATA #REQUIRED
        lemma CDATA #IMPLIED>

    The inflectionInfos element enumerates the morphosyntactic features of the forms and their lemmas. As illustrated above by the form mousse of the verb mousser ‘to foam’, this information is reported in the gracePOS and lemma attributes of the inflectedForm children. It is extracted either from the plain-text definitions, such as Première personne du singulier du présent de l'indicatif de mousser ‘first-person singular present indicative of mousser’ or Deuxième personne du singulier de l'impératif de mousser ‘second-person singular imperative of mousser’, or from the inflection tables shown on the page, which are generated by wikicode templates such as:

    {{fr-verbe-flexion|mousser|ind.p.1s=oui|ind.p.3s=oui|sub.p.1s=oui|sub.p.3s=oui|imp.p.2s=oui}}
  • paradigm

    <!ELEMENT paradigm (inflection|equivMasc|equivFem)*>
    <!ELEMENT inflection EMPTY>
    <!ATTLIST inflection
        form CDATA #IMPLIED
        gracePOS CDATA #REQUIRED
        prons CDATA #IMPLIED>
    <!ELEMENT equivMasc EMPTY>
    <!ATTLIST equivMasc
        form CDATA #IMPLIED
        gracePOS CDATA #REQUIRED
        prons CDATA #IMPLIED>
    <!ELEMENT equivFem EMPTY>
    <!ATTLIST equivFem
        form CDATA #IMPLIED
        gracePOS CDATA #REQUIRED
        prons CDATA #IMPLIED>

    In the example mousse, the paradigm is given for the first noun section (singular and plural). A paradigm can be extracted from the wikicode's agreement templates (e.g. the {{fr-rég|mus}} template that produces the Singulier/Pluriel ‘Singular/Plural’ table on the Wiktionnaire page), from pages dedicated to inflected forms (e.g. the page describing the form mousses states that mousses is the plural form of the noun mousse), as well as two inflected forms of the verb mousser.

    The inflection element is used for all adjective inflections (i.e. masculine plural or feminine forms) and for noun plurals.
    In Wiktionnaire, masculine/feminine equivalents are often treated as inflected forms of the same paradigm. In GLAWI, they appear in equivMasc and equivFem elements. For example, the noun POS section of sorcier ‘wizard’ gives the following four forms in its "agreement table":

    • sorcier (masculine, singular)
    • sorciers (masculine, plural)
    • sorcière (feminine, singular)
    • sorcières (feminine, plural)

    These forms are reported in the XML as follows:

    The three elements inflection, equivMasc and equivFem have the same attributes:

    • form: the inflection's written form;
    • gracePOS: the morphosyntactic features, in GRACE format;
    • prons: transcriptions (optional). Multiple transcriptions are separated by semicolons.
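
As a minimal sketch of how these attributes might be read, the Python snippet below parses a hand-written paradigm fragment with the standard library's xml.etree.ElementTree. The fragment is an illustrative assumption built from the DTD above and the sorcier forms. It is not a verbatim excerpt from GLAWI: the transcriptions, the gracePOS values and the choice of element for each form are hypothetical, as is the helper name read_paradigm.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment written by hand from the DTD above.
# It is NOT copied from GLAWI: forms, tags and transcriptions are assumptions.
SAMPLE = """
<paradigm>
  <inflection form="sorciers" gracePOS="Ncmp" prons="sɔʁ.sje"/>
  <equivFem form="sorcière" gracePOS="Ncfs" prons="sɔʁ.sjɛʁ"/>
  <equivFem form="sorcières" gracePOS="Ncfp" prons="sɔʁ.sjɛʁ;sɔʁ.sjɛːʁ"/>
</paradigm>
"""

PARADIGM_CHILDREN = ("inflection", "equivMasc", "equivFem")


def read_paradigm(xml_text):
    """Return one dict per inflection/equivMasc/equivFem child of a paradigm."""
    root = ET.fromstring(xml_text)
    entries = []
    for child in root:
        if child.tag not in PARADIGM_CHILDREN:
            continue
        prons = child.get("prons")  # optional attribute (#IMPLIED)
        entries.append({
            "kind": child.tag,
            "form": child.get("form"),          # optional (#IMPLIED)
            "gracePOS": child.get("gracePOS"),  # required (#REQUIRED)
            # Multiple transcriptions are separated by semicolons.
            "prons": prons.split(";") if prons else [],
        })
    return entries


for entry in read_paradigm(SAMPLE):
    print(entry)
```

Returning an empty list when prons is absent lets callers iterate over transcriptions without first checking whether the attribute was present.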

GRACE format

The gracePOS attribute given in inflectedForm and inflection follows the GRACE format (Rajman et al., 1997). It is used for nouns, verbs and adjectives as described below.
Nouns

Code            Description
Nc[mf][sp]      Common nouns
                + gender (m: masculine, f: feminine)
                + number (s: singular, p: plural)

Adjectives

Code            Description
Afp[mf][sp]     Positive adjective
                + gender (m: masculine, f: feminine)
                + number (s: singular, p: plural)

All adjectives have been tagged as 'qualificative positive' (Afp).

Verbs

Code                        Description
Vmn----                     Infinitive
Vmpp---                     Present participle
Vmps-[sp][mf]               Past participle
                            + number (s: singular, p: plural)
                            + gender (m: masculine, f: feminine)
Vm[ism][pifs][123][sp]-     Inflected verb form
                            + mood (i: indicative, s: subjunctive, m: imperative)
                            + tense (p: present, i: imperfect, f: future, s: past)
                            + person (1, 2 or 3)
                            + number (s: singular, p: plural)

Question marks may occur when a feature is not mentioned in Wiktionnaire. For example, a common noun mentioned as masculine with no information about its number will result in a Ncm? attribute value.
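
The patterns above can be turned into a small decoder. The following Python sketch is only an illustration under stated assumptions: the function name decode_grace and the output dictionaries are hypothetical, the question mark is accepted only in the gender and number positions of nouns and adjectives (following the Ncm? example), and past participles are left out to keep the sketch short.

```python
import re

GENDER = {"m": "masculine", "f": "feminine", "?": None}
NUMBER = {"s": "singular", "p": "plural", "?": None}
MOOD = {"i": "indicative", "s": "subjunctive", "m": "imperative"}
TENSE = {"p": "present", "i": "imperfect", "f": "future", "s": "past"}

# Nc[mf][sp] and Afp[mf][sp]; "?" marks a feature missing in Wiktionnaire
# (assumption: only gender and number can be unknown here).
NOMINAL = re.compile(r"(Nc|Afp)([mf?])([sp?])")
# Vm[ism][pifs][123][sp]-
FINITE = re.compile(r"Vm([ism])([pifs])([123])([sp])-")


def decode_grace(tag):
    """Decode a gracePOS value; return None for patterns not covered here."""
    if tag == "Vmn----":
        return {"pos": "verb", "form": "infinitive"}
    if tag == "Vmpp---":
        return {"pos": "verb", "form": "present participle"}
    match = NOMINAL.fullmatch(tag)
    if match:
        prefix, gender, number = match.groups()
        return {
            "pos": "noun" if prefix == "Nc" else "adjective",
            "gender": GENDER[gender],
            "number": NUMBER[number],
        }
    match = FINITE.fullmatch(tag)
    if match:
        mood, tense, person, number = match.groups()
        return {
            "pos": "verb",
            "mood": MOOD[mood],
            "tense": TENSE[tense],
            "person": int(person),
            "number": NUMBER[number],
        }
    return None


print(decode_grace("Ncm?"))
print(decode_grace("Vmip1s-"))
```

Unknown features are mapped to None rather than raising an error, so that partially specified values such as Ncm? can still be processed.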

Reference
Rajman, M., Lecomte, J., and Paroubek, P. (1997). Format de description lexicale pour le français. Partie 2 : Description morpho-syntaxique. Technical report, EPFL & INaLF. GRACE GTR-3-2.1.
