REDAC
REsources Developed At CLLE
CLLE: Cognition, Langues, Langage, Ergonomie

ENGLAWI - documentation of the labels element

Linguistic labels

Description

Linguistic labels are indicators found in definitions that signal a particular usage of a word: period, geographic area, specialized domain or subculture slang, etc.
We inventoried and normalized labels in order to:

  • remove labels from the textual content of glosses and examples;
  • encode them formally in dedicated child elements of the gloss and example elements (see the illustration on the definitions page).

We assigned each linguistic label to a category (diatopic, diachronic, attitudinal, etc.); these categories are not encoded in Wiktionary. The category is stored in the label's type attribute. Examples of values for each type are given below.
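
As an illustration of the two steps listed above (removing labels from the text and encoding them with a type), here is a minimal Python sketch. It assumes, hypothetically, that a gloss starts with a parenthesized, comma-separated label group such as "(chiefly UK, slang)". The function name split_labels, the LABEL_TYPES lookup and the sample gloss are invented for illustration; this is not ENGLAWI's actual extraction code, and falling back to the "other" type for unknown values is also an assumption.

```python
import re

# Hypothetical lookup, limited to a few values listed in the tables of this page.
LABEL_TYPES = {
    "UK": "diatopic",
    "US": "diatopic",
    "slang": "attitudinal",
    "obsolete": "diachronic",
    "rare": "diafreq",
}

# A small subset of the modifiers enumerated in the DTD.
MODIFIERS = ("chiefly", "often", "sometimes", "usually")

# Assumed surface form: one "(...)" group at the very start of the gloss.
LEADING_LABELS = re.compile(r"^\(([^)]*)\)\s*")


def split_labels(gloss):
    """Return (labels, remaining_text) for a gloss with a leading '(...)' group."""
    match = LEADING_LABELS.match(gloss)
    if match is None:
        return [], gloss
    labels = []
    for raw in match.group(1).split(","):
        raw = raw.strip()
        if not raw:
            continue
        words = raw.split(" ", 1)
        if len(words) == 2 and words[0] in MODIFIERS:
            modifier, value = words
        else:
            modifier, value = None, raw
        labels.append({
            "type": LABEL_TYPES.get(value, "other"),  # assumed fallback
            "modifier": modifier,
            "value": value,
        })
    return labels, gloss[match.end():]


labels, text = split_labels("(chiefly UK, slang) A foolish person.")
```

With this sample, labels holds one diatopic entry (modifier "chiefly", value "UK") and one attitudinal entry (value "slang"), and text holds the gloss without its label group.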

XML structure of labels

<!ELEMENT labels (label)*>
<!ELEMENT label (#PCDATA)>
<!ATTLIST label
  type (attitudinal|diachronic|diafreq|diatopic|domain|gram|of|other|sem) #REQUIRED
  modifier (almost|chiefly|generally|largely|less|mildly|more|moreOften|now|nowChiefly
           |nowLargely|nowLess|nowMore|nowMostly|nowOften|nowOnly|nowUsually|often|only
           |originally|particularly|perhaps|possibly|primarily|slightly|sometimes
           |somewhat|usually|very) #IMPLIED
  value CDATA #REQUIRED>
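
To show how a labels element declared this way might be read, here is a short Python sketch using the standard library's xml.etree.ElementTree. The sample fragment is hypothetical: its attribute names and type values follow the DTD, but the content is invented, and the label elements are left empty because this page does not say what their text content holds. The function name read_labels is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: attribute names follow the DTD, the content is invented.
SAMPLE = """
<labels>
  <label type="diatopic" modifier="chiefly" value="UK"/>
  <label type="attitudinal" value="slang"/>
</labels>
"""

# The values enumerated for the type attribute in the DTD.
ALLOWED_TYPES = {
    "attitudinal", "diachronic", "diafreq", "diatopic",
    "domain", "gram", "of", "other", "sem",
}


def read_labels(xml_text):
    """Return a list of (type, modifier, value) tuples; modifier may be None."""
    root = ET.fromstring(xml_text.strip())
    result = []
    for label in root.findall("label"):
        label_type = label.get("type")
        value = label.get("value")
        # type and value are #REQUIRED in the DTD; modifier is #IMPLIED (optional).
        if label_type not in ALLOWED_TYPES or value is None:
            raise ValueError(f"invalid label: type={label_type!r}, value={value!r}")
        result.append((label_type, label.get("modifier"), value))
    return result


parsed = read_labels(SAMPLE)
```

ElementTree does not validate against a DTD, so the sketch checks the two required attributes by hand; a validating parser would be the more robust choice.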

Main label types and values

The figures given in the following tables are those found in the version of GLAWI extracted from Wiktionary's 01/06/2017 dump.

Type / label                                   Count

diafrequential                                  9473
  rare                                          9151
  nonce word                                     274
  common                                          48

diachronic                                     37779
  obsolete                                     19812
  archaic                                       7828
  dated                                         4822
  historical                                    4774
  neologism                                      451
  no longer productive                            44
  1800s                                           28
  19th century                                    20

semantics                                       4779
  figuratively                                  2567
  by extension                                  1493
  literally                                      259
  especially                                     182
  specifically                                    89
  hyperbolic                                      65
  loosely                                         46
  metonymically                                   40
  metaphor                                        20
  by analogy                                      18

attitudinal                                    32615
  slang                                        11597
  informal                                      7892
  vulgar                                        2339
  derogatory                                    2106
  humorous                                      1530
  non-standard                                  1517
  pejorative                                    1480
  euphemistic                                    928
  poetic                                         701
  offensive                                      668
  childish                                       316
  literary                                       303
  ethnic slur                                    302
  proscribed                                     259
  formal                                         218
  eye dialect                                     74
  hypercorrect                                    73
  sarcastic                                       66
  affectionate                                    59
  leetspeak                                       56
  and 6 other attitudinal labels                 131

of X (selectional restriction)                  1066
  of a person                                    349
  of an animal                                    53
  of food                                         40
  of leaves                                       39
  of a woman                                      38
  of a plant                                      38
  of a man                                        36
  of a horse                                      35
  of a function                                   32
  of clothing                                     30
  of a sound                                      26
  of people                                       25
  of an antibody                                  24
  of an organism                                  21
  of a verb                                       20
  of a vehicle                                    20
  of a liquid                                     20
  of a person or animal                           19
  of an object                                    19
  and 11 other selectional restriction labels    182

diatopic                                       32594
  UK                                            9205
  US                                            8428
  dialect                                       3589
  Australia                                     2295
  Scotland                                      1971
  Canada                                        1146
  New Zealand                                   1100
  Ireland                                        840
  Northern England                               581
  India                                          571
  South Africa                                   570
  regional                                       301
  Geordie                                        245
  Southern US                                    188
  Cockney rhyming slang                          141
  Singapore                                      135
  Philippines                                     81
  Yorkshire                                       78
  Jamaica                                         66
  Caribbean                                       66
  and 50 other areas                             997

domain                                        134694
  organic chemistry                             8862
  zoology                                       6484
  chemistry                                     6122
  medicine                                      5872
  computing                                     5494
  mineralogy                                    4523
  biochemistry                                  4351
  mathematics                                   4275
  physics                                       4255
  anatomy                                       4221
  law                                           4184
  botany                                        3551
  music                                         3396
  biology                                       3351
  nautical                                      2559
  pathology                                     2117
  military                                      2070
  Internet                                      2007
  astronomy                                     1999
  linguistics                                   1917
  and 364 other domains                        81610

grammatical                                    52427
  transitive                                   19865
  idiomatic                                     9264
  intransitive                                  7981
  uncountable                                   4191
  countable                                     3816
  colloquial                                    3399
  plural                                         980
  ambitransitive                                 484
  plurale tantum                                 392
  combination                                    390
  attributively                                  365
  not comparable                                 290
  reflexive                                      160
  comparable                                      91
  ergative                                        81
  postpositive                                    79
  onomatopoeia                                    65
  singulare tantum                                61
  imperative                                      58
  conjunctive                                     52
  cardinal                                        49
  and 13 other grammatical labels                314
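
Each per-type figure above is the sum of the figures for its values (for example, 9151 + 274 + 48 = 9473 for diafrequential labels). The following Python sketch shows one way such tallies could be recomputed from XML data. It is only an illustration: the entries wrapper element, the sample content and the function name tally are hypothetical, and only the label element with its type and value attributes comes from the DTD on this page.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical sample; "entries" is an invented wrapper element.
SAMPLE = """<entries>
  <labels>
    <label type="diachronic" value="obsolete"/>
    <label type="diatopic" value="UK"/>
  </labels>
  <labels><label type="diachronic" value="archaic"/></labels>
  <labels><label type="diachronic" value="obsolete"/></labels>
</entries>"""


def tally(xml_text):
    """Count labels per type and per (type, value) pair."""
    root = ET.fromstring(xml_text)
    by_type = Counter()
    by_value = Counter()
    # iter() walks the whole tree, so nesting depth does not matter.
    for label in root.iter("label"):
        label_type = label.get("type")
        by_type[label_type] += 1
        by_value[(label_type, label.get("value"))] += 1
    return by_type, by_value


by_type, by_value = tally(SAMPLE)
```

For a full-size resource file, a streaming parser such as ET.iterparse would likely be preferable to loading the whole document in memory.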

