Linguistic labels
Description
Linguistic labels are indicators found in definitions
that signal a particular usage of a word: period, geographic area, specialized domain or subculture slang, etc.
We inventoried and normalized labels in order to:
- remove labels from the textual content of glosses and examples;
- encode them formally in dedicated child elements of the gloss and example elements
(see the illustration on the definitions page).
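The first step, separating labels from the gloss text, can be sketched as follows. This is a minimal illustration, not the extraction code used for ENGLAWI: it assumes the raw gloss begins with a single Wiktionary-style `{{lb|en|...}}` (or `{{label|en|...}}`) template, and the function name `split_labels` is hypothetical.

```python
import re

# Assumption: labels appear as one leading template such as
# "{{lb|en|obsolete|UK}}"; any other notation is left untouched.
LABEL_TEMPLATE = re.compile(r"^\s*\{\{(?:lb|label)\|en\|([^{}]*)\}\}\s*")


def split_labels(raw_gloss):
    """Return (labels, cleaned_text) for one raw gloss string."""
    match = LABEL_TEMPLATE.match(raw_gloss)
    if match is None:
        # No leading label template: the gloss is returned unchanged.
        return [], raw_gloss.strip()
    # Template arguments are separated by "|"; empty arguments are dropped.
    labels = [part.strip() for part in match.group(1).split("|") if part.strip()]
    return labels, raw_gloss[match.end():].strip()


labels, text = split_labels("{{lb|en|obsolete|UK}} A small cask.")
print(labels)  # ['obsolete', 'UK']
print(text)    # A small cask.
```

The labels returned this way are still raw strings. Normalizing them and assigning each one to a category is a separate step.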
We assigned the linguistic labels to categories (diatopic, diachronic, attitudinal, etc.)
that are not encoded in Wiktionary.
Each label's category is recorded in its type attribute.
Examples of attribute values, per type, are given below.
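Counts per type and value could be tallied along the following lines. This is only a sketch: the `type` attribute comes from the description above, but the element names (`definitions`, `gloss`, `labels`, `label`, `txt`) and the `value` attribute are assumptions made for the example, not the documented ENGLAWI schema. The sample entries are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment: every name here except the "type" attribute
# is an assumption made for this sketch.
SAMPLE = """
<definitions>
  <gloss>
    <labels>
      <label type="diachronic" value="obsolete"/>
      <label type="diatopic" value="UK"/>
    </labels>
    <txt>A small cask.</txt>
  </gloss>
  <gloss>
    <labels>
      <label type="attitudinal" value="slang"/>
      <label type="diatopic" value="UK"/>
    </labels>
    <txt>A friend.</txt>
  </gloss>
</definitions>
"""


def count_labels(xml_text):
    """Count (type, value) pairs over all label elements in the fragment."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for label in root.iter("label"):
        counts[(label.get("type"), label.get("value"))] += 1
    return counts


counts = count_labels(SAMPLE)
for (label_type, value), n in counts.most_common():
    print(label_type, value, n)
```

Keying the counter on the (type, value) pair keeps labels of different types apart even if they share a value.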
XML structure of labels
Main label types and values
The figures given in the following tables are those found in the version of GLAWI extracted
from Wiktionary's 01/06/2017 dump.
Label | Count | Label | Count |
rare | 9151 | UK | 9205 |
nonce word | 274 | US | 8428 |
common | 48 | dialect | 3589 |
| | Australia | 2295 |
obsolete | 19812 | Scotland | 1971 |
archaic | 7828 | Canada | 1146 |
dated | 4822 | New Zealand | 1100 |
historical | 4774 | Ireland | 840 |
neologism | 451 | Northern England | 581 |
no longer productive | 44 | India | 571 |
1800s | 28 | South Africa | 570 |
19th century | 20 | regional | 301 |
| | Geordie | 245 |
figuratively | 2567 | Southern US | 188 |
by extension | 1493 | Cockney rhyming slang | 141 |
literally | 259 | Singapore | 135 |
especially | 182 | Philippines | 81 |
specifically | 89 | Yorkshire | 78 |
hyperbolic | 65 | Jamaica | 66 |
loosely | 46 | Caribbean | 66 |
metonymically | 40 | and 50 other areas | 997 |
metaphor | 20 | | |
by analogy | 18 | organic chemistry | 8862 |
| | zoology | 6484 |
slang | 11597 | chemistry | 6122 |
informal | 7892 | medicine | 5872 |
vulgar | 2339 | computing | 5494 |
derogatory | 2106 | mineralogy | 4523 |
humorous | 1530 | biochemistry | 4351 |
non-standard | 1517 | mathematics | 4275 |
pejorative | 1480 | physics | 4255 |
euphemistic | 928 | anatomy | 4221 |
poetic | 701 | law | 4184 |
offensive | 668 | botany | 3551 |
childish | 316 | music | 3396 |
literary | 303 | biology | 3351 |
ethnic slur | 302 | nautical | 2559 |
proscribed | 259 | pathology | 2117 |
formal | 218 | military | 2070 |
eye dialect | 74 | Internet | 2007 |
hypercorrect | 73 | astronomy | 1999 |
sarcastic | 66 | linguistics | 1917 |
affectionate | 59 | and 364 other domains | 81610 |
leetspeak | 56 | | |
and 6 other attitudinal labels | 131 | transitive | 19865 |
| | idiomatic | 9264 |
of a person | 349 | intransitive | 7981 |
of an animal | 53 | uncountable | 4191 |
of food | 40 | countable | 3816 |
of leaves | 39 | colloquial | 3399 |
of a woman | 38 | plural | 980 |
of a plant | 38 | ambitransitive | 484 |
of a man | 36 | plurale tantum | 392 |
of a horse | 35 | combination | 390 |
of a function | 32 | attributively | 365 |
of clothing | 30 | not comparable | 290 |
of a sound | 26 | reflexive | 160 |
of people | 25 | comparable | 91 |
of an antibody | 24 | ergative | 81 |
of an organism | 21 | postpositive | 79 |
of a verb | 20 | onomatopoeia | 65 |
of a vehicle | 20 | singulare tantum | 61 |
of a liquid | 20 | imperative | 58 |
of a person or animal | 19 | conjunctive | 52 |
of an object | 19 | cardinal | 49 |
and 11 other selectional restriction labels | 182 | and 13 other grammatical labels | 314 |