WiktionaryX: XML version of the free collaborative dictionary
An updated version of WiktionaryX, including additional lexical information,
is available: see GLAWI's page.
Description
Wiktionary is the lexical companion to Wikipedia.
This multilingual dictionary includes glosses, examples, semantic relations and translation links that anyone can modify.
Its content is available as dumps
that include the wiki code of every article.
This format is unstructured and poorly suited to NLP use.
The dumps of the English and French Wiktionaries have been converted into a workable XML format.
E. Navarro, F. Sajous, B. Gaume, L. Prévot, S. Hsieh, I. Kuo, P. Magistry and Chu-Ren Huang (2009). Wiktionary and NLP: Improving synonymy networks.
In Proceedings of the ACL Workshop on The People's Web Meets NLP: Collaboratively Constructed Semantic Resources.
ACL-IJCNLP 2009, Suntec, Singapore.
Sample
Below is a sample from the English dump, reproducing part of the XML element for the entry "computer":
<entry form="computer" pageid="2798">
  <lexeme pos="N" id="en_N_computer#0">
    <defs>
      <toplevel-def>
        <gloss domain="Computing">A programmable device that performs mathematical calculations and logical operations, especially one that can process, store and retrieve large amounts of data very quickly.</gloss>
      </toplevel-def>
      <toplevel-def>
        <gloss register="Dated">A person employed to perform computations.</gloss>
        <example>(1927, J. B. S. Haldane, Possible Worlds and Other Essays, p. 173) Only a few years ago Mr. Powers, an American computer, disproved a hypothesis about prime numbers which had held the field for more than 250 years.</example>
      </toplevel-def>
    </defs>
    <syn>
      <item target="en_N_automatic data processing machine">automatic data processing machine</item>
      <item target="en_N_processor">processor</item>
      <item target="en_N_'puter">'puter</item>
      <item target="en_N_machine">machine</item>
    </syn>
    <hypo>
      <item target="en_N_desktop">desktop</item>
      <item target="en_N_laptop">laptop</item>
      <item target="en_N_computer">computer</item>
    </hypo>
    <trans>
      <item lang="ar" target="ar_N_حاسوب">حاسوب</item>
      <item lang="ar" target="ar_N_كمبيوتر">كمبيوتر</item>
      <item lang="hy" target="hy_N_համակարգիչ">համակարգիչ</item>
      <item lang="bn" target="bn_N_কম্পিউটার">কম্পিউটার</item>
      <item lang="bg" target="bg_N_компютър">компютър</item>
      <item lang="cmn" target="cmn_N_電腦">電腦</item>
      <item lang="cmn" target="cmn_N_电脑">电脑</item>
      <item lang="cmn" target="cmn_N_計算機">計算機</item>
      <item lang="cmn" target="cmn_N_计算机">计算机</item>
      <item lang="hr" target="hr_N_računalo">računalo</item>
      <item lang="hr" target="hr_N_kompjutor">kompjutor</item>
      <item lang="cs" target="cs_N_počítač">počítač</item>
      <item lang="nl" target="nl_N_computer">computer</item>
      <item lang="eo" target="eo_N_komputilo">komputilo</item>
      <item lang="et" target="et_N_arvuti">arvuti</item>
      <item lang="et" target="et_N_kompuuter">kompuuter</item>
      <item lang="et" target="et_N_raal">raal</item>
      <item lang="fi" target="fi_N_tietokone">tietokone</item>
      <item lang="fr" target="fr_N_ordinateur">ordinateur</item>
      <item lang="de" target="de_N_Computer">Computer</item>
      <item lang="de" target="de_N_Rechner">Rechner</item>
      <item lang="el" target="el_N_υπολογιστής">υπολογιστής</item>
      <item lang="el" target="el_N_ηλεκτρονικός υπολογιστής">ηλεκτρονικός υπολογιστής</item>
      <item lang="el" target="el_N_ΗΥ">ΗΥ</item>
      <item lang="he" target="he_N_מחשב">מחשב</item>
      <item lang="hi" target="hi_N_संगणक">संगणक</item>
      <item lang="hi" target="hi_N_कंप्यूटर">कंप्यूटर</item>
      <item lang="hu" target="hu_N_számítógép">számítógép</item>
      <item lang="id" target="id_N_komputer">komputer</item>
      <item lang="ga" target="ga_N_ríomhaire">ríomhaire</item>
      <item lang="it" target="it_N_calcolatore">calcolatore</item>
      <item lang="it" target="it_N_computer">computer</item>
      <item lang="it" target="it_N_elaboratore">elaboratore</item>
      <item lang="ja" target="ja_N_コンピュータ">コンピュータ</item>
      <item lang="ja" target="ja_N_電子計算機">電子計算機</item>
      <item lang="kam" target="kam_N_kompiuta">kompiuta</item>
      <item lang="ki" target="ki_N_mompyuta">mompyuta</item>
      <item lang="ki" target="ki_N_kompyuta">kompyuta</item>
      <item lang="kis" target="kis_N_ekompyuta">ekompyuta</item>
      <item lang="ko" target="ko_N_컴퓨터">컴퓨터</item>
      <item lang="ko" target="ko_N_전자계산기">전자계산기</item>
      <item lang="ko" target="ko_N_電子計算機">電子計算機</item>
      <item lang="la" target="la_N_computatrum">computatrum</item>
      <item lang="la" target="la_N_ordinatrum">ordinatrum</item>
      <item lang="lv" target="lv_N_dators">dators</item>
      <item lang="lv" target="lv_N_kompjūters">kompjūters</item>
      <item lang="lt" target="lt_N_kompiuteris">kompiuteris</item>
      <item lang="luy" target="luy_N_ekompyuta">ekompyuta</item>
      <item lang="mk" target="mk_N_сметач">сметач</item>
      <item lang="mk" target="mk_N_компјутер">компјутер</item>
      <item lang="ms" target="ms_N_komputer">komputer</item>
      <item lang="mt" target="mt_N_kompjuter">kompjuter</item>
      <item lang="mi" target="mi_N_rorohiko">rorohiko</item>
      <item lang="mer" target="mer_N_kompyuta">kompyuta</item>
      <item lang="nv" target="nv_N_béésh bee akʼeʼelchíhí tʼáá bí nitsékeesígíí">béésh bee akʼeʼelchíhí tʼáá bí nitsékeesígíí</item>
      <item lang="no" target="no_N_datamaskin">datamaskin</item>
      <item lang="fa" target="fa_N_رایانه">رایانه</item>
      <item lang="fa" target="fa_N_کامپیوتر">کامپیوتر</item>
      <item lang="pl" target="pl_N_komputer">komputer</item>
      <item lang="pt" target="pt_N_computador">computador</item>
      <item lang="ro" target="ro_N_computer">computer</item>
      <item lang="ro" target="ro_N_calculator">calculator</item>
      <item lang="ru" target="ru_N_компьютер">компьютер</item>
      <item lang="se" target="se_N_dihtor">dihtor</item>
      <item lang="sa" target="sa_N_अभिकलित्र">अभिकलित्र</item>
      <item lang="gd" target="gd_N_coimpiutair">coimpiutair</item>
      <item lang="gd" target="gd_N_rianadair">rianadair</item>
      <item lang="gd" target="gd_N_annalair">annalair</item>
      <item lang="sk" target="sk_N_počítač">počítač</item>
      <item lang="sl" target="sl_N_računalnik">računalnik</item>
      <item lang="st" target="st_N_khomputa">khomputa</item>
      <item lang="es" target="es_N_computador">computador</item>
      <item lang="es" target="es_N_computadora">computadora</item>
      <item lang="es" target="es_N_ordenador">ordenador</item>
      <item lang="sw" target="sw_N_tarakilishi">tarakilishi</item>
      <item lang="sv" target="sv_N_dator">dator</item>
      <item lang="tl" target="tl_N_kompyuter">kompyuter</item>
      <item lang="tg" target="tg_N_компутар">компутар</item>
      <item lang="tg" target="tg_N_компютер">компютер</item>
      <item lang="ta" target="ta_N_கணினி">கணினி</item>
      <item lang="th" target="th_N_คอมพิวเตอร์">คอมพิวเตอร์</item>
      <item lang="tr" target="tr_N_bilgisayar">bilgisayar</item>
      <item lang="vi" target="vi_N_máy vi tính">máy vi tính</item>
      <item lang="vi" target="vi_N_máy điện toán">máy điện toán</item>
      <item lang="vi" target="vi_N_máy tính">máy tính</item>
      <item lang="yi" target="yi_N_קאָמפּיוטער">קאָמפּיוטער</item>
      <item lang="bg" target="bg_N_изчислител">изчислител</item>
      <item lang="cs" target="cs_N_sčítač">sčítač</item>
      <item lang="eo" target="eo_N_komputisto">komputisto</item>
      <item lang="pl" target="pl_N_rachmistrz">rachmistrz</item>
    </trans>
  </lexeme>
</entry>
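To illustrate how an entry of this shape might be consumed, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It assumes the element and attribute names are exactly those of the sample above (entry, lexeme, defs, toplevel-def, gloss, syn, hypo, trans, item, with form, pos, lang attributes); the rest of the dump may contain structures this single sample does not show. The function name summarize_entry and the abridged SAMPLE string are hypothetical, introduced only for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Abridged copy of the "computer" entry from the sample (assumption: the
# real dump uses the same element and attribute names throughout).
SAMPLE = """<entry form="computer" pageid="2798">
  <lexeme pos="N" id="en_N_computer#0">
    <defs>
      <toplevel-def>
        <gloss register="Dated">A person employed to perform computations.</gloss>
      </toplevel-def>
    </defs>
    <syn>
      <item target="en_N_processor">processor</item>
      <item target="en_N_machine">machine</item>
    </syn>
    <hypo>
      <item target="en_N_laptop">laptop</item>
    </hypo>
    <trans>
      <item lang="fr" target="fr_N_ordinateur">ordinateur</item>
      <item lang="de" target="de_N_Computer">Computer</item>
      <item lang="de" target="de_N_Rechner">Rechner</item>
    </trans>
  </lexeme>
</entry>"""


def summarize_entry(xml_text):
    """Return one dict per <lexeme> of a single <entry> element (hypothetical helper)."""
    entry = ET.fromstring(xml_text)
    summaries = []
    for lexeme in entry.findall("lexeme"):
        # Group translations by the language code carried in the lang attribute.
        translations = defaultdict(list)
        for item in lexeme.findall("trans/item"):
            translations[item.get("lang")].append(item.text)
        summaries.append({
            "form": entry.get("form"),
            "pos": lexeme.get("pos"),
            "glosses": [g.text for g in lexeme.findall("defs/toplevel-def/gloss")],
            "synonyms": [i.text for i in lexeme.findall("syn/item")],
            "hyponyms": [i.text for i in lexeme.findall("hypo/item")],
            "translations": dict(translations),
        })
    return summaries


if __name__ == "__main__":
    for lexeme in summarize_entry(SAMPLE):
        print(lexeme["form"], lexeme["pos"], lexeme["synonyms"])
        print(lexeme["translations"])
```

The synonym lists collected this way are the kind of material from which a synonymy network could be built, with the target attribute of each item serving as a candidate node identifier; whether every target resolves to an existing entry is not something the sample alone establishes.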