GLAWI
Documentation
This documentation is based on (Sajous and Hathout, 2015)
and (Hathout and Sajous, 2016).
Paradigms and inflections
Description
When a POS section describes one or several inflected forms,
the inflectionInfos element enumerates the morphosyntactic features of the forms and their lemmas.
When it describes a lemma,
the inflectional paradigm element and its
inflection children
give all the inflected forms of the paradigm (when they are present in Wiktionnaire).
XML structure
- inflectionInfos
The inflectionInfos element enumerates the morphosyntactic features of the forms and their lemmas.
As illustrated by the verb forms mousse
of the verb mousser ‘to foam’ above, this information is reported in
the attributes gracePOS and lemma of
inflectedForm children.
This information is extracted either from the plain-text definitions, such as Première personne du singulier du présent de l'indicatif de mousser,
Deuxième personne du singulier de l'impératif de mousser, etc. or from the
inflection tables (as seen in the page), generated by templates (in the wikicode) such as:
{{fr-verbe-flexion|mousser|ind.p.1s=oui|ind.p.3s=oui|sub.p.1s=oui|sub.p.3s=oui|imp.p.2s=oui}}
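As a rough illustration of how such a template call relates to the gracePOS values, the sketch below splits the template shown above into its parameters and builds one GRACE-style tag per parameter set to oui. The function name and both mapping tables are hypothetical and are not taken from GLAWI's extraction code; only the keys that occur in the example are covered.

```python
# Hypothetical helper, not part of GLAWI: it maps parameter keys of the
# fr-verbe-flexion template (e.g. "ind.p.1s") to GRACE-style verb tags.
MOODS = {"ind": "i", "sub": "s", "imp": "m"}  # assumed key-to-mood mapping
TENSES = {"p": "p"}  # only the present-tense key appears in the example


def template_to_tags(template):
    """Return (lemma, [tags]) for a {{fr-verbe-flexion|...}} call."""
    fields = template.strip().strip("{}").split("|")
    name, lemma, params = fields[0], fields[1], fields[2:]
    if name != "fr-verbe-flexion":
        raise ValueError("unexpected template: " + name)
    tags = []
    for param in params:
        key, _, value = param.partition("=")
        parts = key.split(".")
        if value != "oui" or len(parts) != 3:
            continue  # ignore parameters this sketch does not understand
        mood, tense, person_number = parts
        if mood not in MOODS or tense not in TENSES:
            continue  # keys outside the two tables above are skipped
        # Layout: V + m + mood + tense + person + number + "-"
        tags.append("Vm" + MOODS[mood] + TENSES[tense] + person_number + "-")
    return lemma, tags


lemma, tags = template_to_tags(
    "{{fr-verbe-flexion|mousser|ind.p.1s=oui|ind.p.3s=oui"
    "|sub.p.1s=oui|sub.p.3s=oui|imp.p.2s=oui}}"
)
print(lemma, tags)
```

Each resulting tag would correspond to one inflectedForm child whose lemma attribute is mousser.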
- paradigm
In the example mousse, the paradigm is given for the first noun section (singular and plural).
A paradigm can be extracted from the wikicode's agreement templates (e.g. the {{fr-rég|mus}} template
that produces the Singulier/Pluriel ‘Singular/Plural’ table in Wiktionnaire's page),
from pages dedicated to inflected forms (e.g. the page describing the form mousses
states that mousses is the plural form of the noun mousse),
as well as two inflected forms of the verb mousser.
The inflection element is used for all adjective inflections (i.e. masculine plural or feminine forms)
and noun plurals.
In Wiktionnaire, masculine/feminine equivalents are often treated as inflected forms of the same paradigm.
In GLAWI, they appear in equivMasc and equivFem elements.
For example, the noun POS section
of sorcier ‘wizard’
gives the following four forms in its "agreement table":
- sorcier (masculine, singular)
- sorciers (masculine, plural)
- sorcière (feminine, singular)
- sorcières (feminine, plural)
These forms are reported in the XML as follows:
The three elements
inflection,
equivMasc
and equivFem have the same attributes:
- form: the inflection's written form;
- gracePOS: the morphosyntactic features, in GRACE format;
- prons: transcriptions (optional). Multiple transcriptions are separated by semicolons.
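The three attributes can be read with any XML parser. The following is a minimal sketch using Python's standard library. The element and attribute names come from the description above, but the fragment itself is hypothetical: the nesting under paradigm, the gracePOS values and the placeholder transcriptions are assumptions for illustration, not an excerpt from GLAWI.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only: the nesting and all attribute values are
# assumptions; "PRON-A" and "PRON-B" are placeholders, not real transcriptions.
SAMPLE = """
<paradigm>
  <inflection form="sorciers" gracePOS="Ncmp" prons="PRON-A;PRON-B"/>
  <equivFem form="sorcière" gracePOS="Ncfs" prons="PRON-A"/>
  <equivFem form="sorcières" gracePOS="Ncfp"/>
</paradigm>
"""


def read_forms(paradigm):
    """Collect form, gracePOS and transcriptions from a paradigm element."""
    rows = []
    for child in paradigm:
        if child.tag not in ("inflection", "equivMasc", "equivFem"):
            continue
        prons = child.get("prons")  # optional attribute, may be absent
        rows.append({
            "element": child.tag,
            "form": child.get("form"),
            "gracePOS": child.get("gracePOS"),
            # Multiple transcriptions are separated by semicolons.
            "prons": prons.split(";") if prons else [],
        })
    return rows


rows = read_forms(ET.fromstring(SAMPLE))
for row in rows:
    print(row["element"], row["form"], row["gracePOS"], row["prons"])
```

Returning an empty list when prons is absent lets calling code iterate over transcriptions without a separate check for the optional attribute.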
GRACE format
The gracePOS attribute given in inflectedForm and inflection
follows the GRACE format (Rajman et al., 1997). It is used for nouns, verbs and adjectives as described below.
Nouns
Code | Description
Nc[mf][sp] | Common noun + gender (m: masculine, f: feminine) + number (s: singular, p: plural)
Adjectives
Code | Description
Afp[mf][sp] | Positive adjective + gender (m: masculine, f: feminine) + number (s: singular, p: plural)
All adjectives have been tagged as 'qualificative positive' (Afp).
Verbs
Code | Description
Vmn---- | Infinitive
Vmpp--- | Present participle
Vm-ps-[sp][mf] | Past participle + number (s: singular, p: plural) + gender (m: masculine, f: feminine)
Vm[ism][pifs][123][sp]- | Inflected verb form + mood (i: indicative, s: subjunctive, m: imperative) + tense (p: present, i: imperfect, f: future, s: past) + person (1, 2, 3) + number (s: singular, p: plural)
A question mark may appear in place of a feature that is not mentioned in Wiktionnaire.
For example, a common noun mentioned as masculine with no information about its number will result in an Ncm? attribute value.
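The codes listed above can be decoded with a handful of regular expressions. The sketch below is one possible reading of the tables, not code shipped with GLAWI: the function name and the returned dictionary layout are hypothetical, and it assumes that a question mark occupies exactly the position of the missing feature, as in the Ncm? example.

```python
import re

GENDER = {"m": "masculine", "f": "feminine", "?": "unknown"}
NUMBER = {"s": "singular", "p": "plural", "?": "unknown"}
MOOD = {"i": "indicative", "s": "subjunctive", "m": "imperative", "?": "unknown"}
TENSE = {"p": "present", "i": "imperfect", "f": "future", "s": "past", "?": "unknown"}


def decode_grace(tag):
    """Decode a gracePOS value into a dict, or return None if unrecognised."""
    m = re.fullmatch(r"Nc([mf?])([sp?])", tag)
    if m:
        return {"pos": "common noun",
                "gender": GENDER[m.group(1)], "number": NUMBER[m.group(2)]}
    m = re.fullmatch(r"Afp([mf?])([sp?])", tag)
    if m:
        return {"pos": "positive adjective",
                "gender": GENDER[m.group(1)], "number": NUMBER[m.group(2)]}
    if tag == "Vmn----":
        return {"pos": "verb", "form": "infinitive"}
    if tag == "Vmpp---":
        return {"pos": "verb", "form": "present participle"}
    # Past participle: number comes before gender in this code.
    m = re.fullmatch(r"Vm-ps-([sp?])([mf?])", tag)
    if m:
        return {"pos": "verb", "form": "past participle",
                "number": NUMBER[m.group(1)], "gender": GENDER[m.group(2)]}
    m = re.fullmatch(r"Vm([ism?])([pifs?])([123?])([sp?])-", tag)
    if m:
        person = m.group(3) if m.group(3) != "?" else "unknown"
        return {"pos": "verb", "form": "inflected",
                "mood": MOOD[m.group(1)], "tense": TENSE[m.group(2)],
                "person": person, "number": NUMBER[m.group(4)]}
    return None


print(decode_grace("Ncm?"))
print(decode_grace("Vmip1s-"))
```

Returning None for unrecognised values, rather than raising an exception, keeps the function usable on attribute values that fall outside the tables above.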
Reference
Rajman, M., Lecomte, J., and Paroubek, P. (1997).
Format de description lexicale pour le français. Partie 2 : Description morpho-syntaxique. Technical report, EPFL & INaLF. GRACE GTR-3-2.1.