REDAC
REsources Developed At CLLE-ERSS CLLE-ERSS research unit






WiktionaryX
XML version of the free collaborative dictionary
An updated version of WiktionaryX, with additional lexical information, is available: see GLAWI's page.

Description

Wiktionary is the lexical companion to Wikipedia. This multilingual dictionary includes glosses, examples, semantic relations and translation links that anyone can modify.

Its content is available as dumps that include the wiki code of all articles. This format is unstructured and poorly suited to NLP use. The dumps of the English and French Wiktionaries have been converted into a workable XML format.


Person in charge
Franck Sajous
Contact :

Licence
WiktionaryX is available under a Creative Commons By-SA licence (the same licence as Wiktionary, from which it has been extracted).

Download


Sample
Below is a sample from the English dump, reproducing part of the XML element for the entry "computer":
<entry form="computer" pageid="2798">
        <lexeme pos="N" id="en_N_computer#0">
                <defs>
                        <toplevel-def>
                                <gloss domain="Computing">A programmable device that performs mathematical calculations and logical operations, especially one that can process, store and retrieve large amounts of data very quickly.</gloss>
                        </toplevel-def>
                        <toplevel-def>
                                <gloss register="Dated">A person employed to perform computations.</gloss>
                                <example>(1927, J. B. S. Haldane, Possible Worlds and Other Essays, p. 173)Only a few years ago Mr. Powers, an American computer, disproved a hypothesis about prime numbers which had held the field for more than 250 years.</example>
                        </toplevel-def>
                </defs>
                <syn>
                        <item target="en_N_automatic data processing machine">automatic data processing machine</item>
                        <item target="en_N_processor">processor</item>
                        <item target="en_N_'puter">'puter</item>
                        <item target="en_N_machine">machine</item>
                </syn>
                <hypo>
                        <item target="en_N_desktop">desktop</item>
                        <item target="en_N_laptop">laptop</item>
                        <item target="en_N_computer">computer</item>
                </hypo>
                <trans>
                        <item lang="ar" target="ar_N_حاسوب">حاسوب</item>
                        <item lang="ar" target="ar_N_كمبيوتر">كمبيوتر</item>
                        <item lang="hy" target="hy_N_համակարգիչ">համակարգիչ</item>
                        <item lang="bn" target="bn_N_কম্পিউটার">কম্পিউটার</item>
                        <item lang="bg" target="bg_N_компютър">компютър</item>
                        <item lang="cmn" target="cmn_N_電腦">電腦</item>
                        <item lang="cmn" target="cmn_N_电脑">电脑</item>
                        <item lang="cmn" target="cmn_N_計算機">計算機</item>
                        <item lang="cmn" target="cmn_N_计算机">计算机</item>
                        <item lang="hr" target="hr_N_računalo">računalo</item>
                        <item lang="hr" target="hr_N_kompjutor">kompjutor</item>
                        <item lang="cs" target="cs_N_počítač">počítač</item>
                        <item lang="nl" target="nl_N_computer">computer</item>
                        <item lang="eo" target="eo_N_komputilo">komputilo</item>
                        <item lang="et" target="et_N_arvuti">arvuti</item>
                        <item lang="et" target="et_N_kompuuter">kompuuter</item>
                        <item lang="et" target="et_N_raal">raal</item>
                        <item lang="fi" target="fi_N_tietokone">tietokone</item>
                        <item lang="fr" target="fr_N_ordinateur">ordinateur</item>
                        <item lang="de" target="de_N_Computer">Computer</item>
                        <item lang="de" target="de_N_Rechner">Rechner</item>
                        <item lang="el" target="el_N_υπολογιστής">υπολογιστής</item>
                        <item lang="el" target="el_N_ηλεκτρονικός υπολογιστής">ηλεκτρονικός υπολογιστής</item>
                        <item lang="el" target="el_N_ΗΥ">ΗΥ</item>
                        <item lang="he" target="he_N_מחשב">מחשב</item>
                        <item lang="hi" target="hi_N_संगणक">संगणक</item>
                        <item lang="hi" target="hi_N_कंप्यूटर">कंप्यूटर</item>
                        <item lang="hu" target="hu_N_számítógép">számítógép</item>
                        <item lang="id" target="id_N_komputer">komputer</item>
                        <item lang="ga" target="ga_N_ríomhaire">ríomhaire</item>
                        <item lang="it" target="it_N_calcolatore">calcolatore</item>
                        <item lang="it" target="it_N_computer">computer</item>
                        <item lang="it" target="it_N_elaboratore">elaboratore</item>
                        <item lang="ja" target="ja_N_コンピュータ">コンピュータ</item>
                        <item lang="ja" target="ja_N_電子計算機">電子計算機</item>
                        <item lang="kam" target="kam_N_kompiuta">kompiuta</item>
                        <item lang="ki" target="ki_N_mompyuta">mompyuta</item>
                        <item lang="ki" target="ki_N_kompyuta">kompyuta</item>
                        <item lang="kis" target="kis_N_ekompyuta">ekompyuta</item>
                        <item lang="ko" target="ko_N_컴퓨터">컴퓨터</item>
                        <item lang="ko" target="ko_N_전자계산기">전자계산기</item>
                        <item lang="ko" target="ko_N_電子計算機">電子計算機</item>
                        <item lang="la" target="la_N_computatrum">computatrum</item>
                        <item lang="la" target="la_N_ordinatrum">ordinatrum</item>
                        <item lang="lv" target="lv_N_dators">dators</item>
                        <item lang="lv" target="lv_N_kompjūters">kompjūters</item>
                        <item lang="lt" target="lt_N_kompiuteris">kompiuteris</item>
                        <item lang="luy" target="luy_N_ekompyuta">ekompyuta</item>
                        <item lang="mk" target="mk_N_сметач">сметач</item>
                        <item lang="mk" target="mk_N_компјутер">компјутер</item>
                        <item lang="ms" target="ms_N_komputer">komputer</item>
                        <item lang="mt" target="mt_N_kompjuter">kompjuter</item>
                        <item lang="mi" target="mi_N_rorohiko">rorohiko</item>
                        <item lang="mer" target="mer_N_kompyuta">kompyuta</item>
                        <item lang="nv" target="nv_N_béésh bee akʼeʼelchíhí tʼáá bí nitsékeesígíí">béésh bee akʼeʼelchíhí tʼáá bí nitsékeesígíí</item>
                        <item lang="no" target="no_N_datamaskin">datamaskin</item>
                        <item lang="fa" target="fa_N_رایانه">رایانه</item>
                        <item lang="fa" target="fa_N_کامپیوتر">کامپیوتر</item>
                        <item lang="pl" target="pl_N_komputer">komputer</item>
                        <item lang="pt" target="pt_N_computador">computador</item>
                        <item lang="ro" target="ro_N_computer">computer</item>
                        <item lang="ro" target="ro_N_calculator">calculator</item>
                        <item lang="ru" target="ru_N_компьютер">компьютер</item>
                        <item lang="se" target="se_N_dihtor">dihtor</item>
                        <item lang="sa" target="sa_N_अभिकलित्र">अभिकलित्र</item>
                        <item lang="gd" target="gd_N_coimpiutair">coimpiutair</item>
                        <item lang="gd" target="gd_N_rianadair">rianadair</item>
                        <item lang="gd" target="gd_N_annalair">annalair</item>
                        <item lang="sk" target="sk_N_počítač">počítač</item>
                        <item lang="sl" target="sl_N_računalnik">računalnik</item>
                        <item lang="st" target="st_N_khomputa">khomputa</item>
                        <item lang="es" target="es_N_computador">computador</item>
                        <item lang="es" target="es_N_computadora">computadora</item>
                        <item lang="es" target="es_N_ordenador">ordenador</item>
                        <item lang="sw" target="sw_N_tarakilishi">tarakilishi</item>
                        <item lang="sv" target="sv_N_dator">dator</item>
                        <item lang="tl" target="tl_N_kompyuter">kompyuter</item>
                        <item lang="tg" target="tg_N_компутар">компутар</item>
                        <item lang="tg" target="tg_N_компютер">компютер</item>
                        <item lang="ta" target="ta_N_கணினி">கணினி</item>
                        <item lang="th" target="th_N_คอมพิวเตอร์">คอมพิวเตอร์</item>
                        <item lang="tr" target="tr_N_bilgisayar">bilgisayar</item>
                        <item lang="vi" target="vi_N_máy vi tính">máy vi tính</item>
                        <item lang="vi" target="vi_N_máy điện toán">máy điện toán</item>
                        <item lang="vi" target="vi_N_máy tính">máy tính</item>
                        <item lang="yi" target="yi_N_קאָמפּיוטער">קאָמפּיוטער</item>
                        <item lang="bg" target="bg_N_изчислител">изчислител</item>
                        <item lang="cs" target="cs_N_sčítač">sčítač</item>
                        <item lang="eo" target="eo_N_komputisto">komputisto</item>
                        <item lang="pl" target="pl_N_rachmistrz">rachmistrz</item>
                </trans>
        </lexeme>
</entry>
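
As an illustration of how an entry of this shape might be read, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not part of WiktionaryX. The embedded SAMPLE is an abridged copy of the entry above, the function names parse_target and summarize_entry are hypothetical, and the reading of a target attribute as "language_POS_form" is an assumption inferred from the sample rather than a documented rule.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Abridged copy of the sample entry (most <item> elements omitted).
SAMPLE = """\
<entry form="computer" pageid="2798">
  <lexeme pos="N" id="en_N_computer#0">
    <defs>
      <toplevel-def>
        <gloss domain="Computing">A programmable device that performs mathematical calculations and logical operations, especially one that can process, store and retrieve large amounts of data very quickly.</gloss>
      </toplevel-def>
      <toplevel-def>
        <gloss register="Dated">A person employed to perform computations.</gloss>
      </toplevel-def>
    </defs>
    <syn>
      <item target="en_N_processor">processor</item>
      <item target="en_N_machine">machine</item>
    </syn>
    <trans>
      <item lang="fr" target="fr_N_ordinateur">ordinateur</item>
      <item lang="de" target="de_N_Computer">Computer</item>
      <item lang="de" target="de_N_Rechner">Rechner</item>
    </trans>
  </lexeme>
</entry>
"""


def parse_target(target):
    """Split a target such as 'en_N_processor' into (lang, pos, form).

    Assumption: targets follow 'lang_POS_form', as in the sample. The split
    is limited to two underscores so that forms containing spaces or
    underscores stay whole.
    """
    lang, pos, form = target.split("_", 2)
    return lang, pos, form


def summarize_entry(entry):
    """Collect glosses, synonyms and translations for each <lexeme>."""
    lexemes = []
    for lexeme in entry.iter("lexeme"):
        glosses = [
            {
                # A gloss carries either a domain or a register in the sample.
                "label": gloss.get("domain") or gloss.get("register"),
                "text": "".join(gloss.itertext()).strip(),
            }
            for gloss in lexeme.iter("gloss")
        ]
        synonyms = [item.text for item in lexeme.findall("syn/item")]
        translations = defaultdict(list)
        for item in lexeme.findall("trans/item"):
            translations[item.get("lang")].append(item.text)
        lexemes.append(
            {
                "id": lexeme.get("id"),
                "pos": lexeme.get("pos"),
                "glosses": glosses,
                "synonyms": synonyms,
                "translations": dict(translations),
            }
        )
    return lexemes


entry = ET.fromstring(SAMPLE)
summary = summarize_entry(entry)
for lexeme in summary:
    print(lexeme["id"], lexeme["synonyms"], lexeme["translations"])
```

For a complete dump, a streaming parser such as ET.iterparse would probably be preferable to loading the whole file at once. The top-level structure of the dump files is not shown on this page, so that part is left out of the sketch.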