Looking for XML and dictionaries, I found a free dictionary with said
amount of words. It is based on "Webster's Revised Unabridged
Dictionary" (the version published in 1913) and on WordNet, "a semantic
network created by the Cognitive Science Department of Princeton
University". This electronic dictionary is published under the GPL. It
uses "GCIDE_XML".
There are also standards and tools for this kind of content: ISO 12620;
a Java application from IBM, "Dictionary and Thesaurus API for Java";
and an article on how the OED (Oxford English Dictionary) uses SGML for
its electronic content [http://www.ariadne.ac.uk/issue24/oed-tech/].
I am still only scratching the surface...
http://www.w3.org/2001/sw/Europe/reports/thes/
Semantic web..
The list goes on. What I am now looking for is an article that explains
in simple terms how all these systems relate to each other and
what they are good for.
Thanks,
Gerard
>When you answer e-mail from the mailing list, please do not repeat
>everything in the digest. It makes some people very upset. Please
>feel free to contribute on the wiktionary of your choice. There is much
>work that needs to be done, both in English and your own language.
>Ray (Eclecticology)
Thanks Ray,
I saw my mistake too late. I hope I'll do better next time.
Your contributions seem very interesting but too difficult for me (I'm a simple
linguistics/lexicography PhD). I'll follow the list for the moment, and as soon
as I have something clever enough to say, I'll say it.
If you need something in French or Italian, let me know.
Thank you
Marìka
There is a need for using XML with wiktionary.
The definitions of a word in Wiktionary can be structured in a fixed
way. For each word/phrase you have:
*An indication of what language it is in
*Name of the word/phrase
*Definition of the word/phrase
*Translations
*Pronunciation
*Synonyms
*Antonyms etc
I do not try to be complete here, but my point is that the data is
structured. Other organisations that work with words, for instance
GEMET, already structure their data using XML. When Wiktionary is
structured explicitly, import into and export from Wiktionary become
possible, and other dictionary/glossary projects can get changed
content in an XML format that is specific to working with words. This
will enhance the importance of Wiktionary and it will help achieve our
aim, which is openly accessible dictionary content.
The flip side of the coin is that we can get (changed) content from
other dictionary/glossary projects for inclusion in Wiktionary.
The GEMET data is available as XML, and it would be great to import it
straight in from XML.
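A minimal sketch of what such a structured entry could look like, using the fields listed above. The element and attribute names (entry, name, definition, translation, synonym, antonym, lang) are assumptions made for illustration only; they are not an existing Wiktionary or GEMET schema. The example exports one entry to XML text and imports it again.

```python
import xml.etree.ElementTree as ET


def build_entry(language, name, definition, translations, synonyms=(), antonyms=()):
    """Build one <entry> element. All element names here are hypothetical."""
    entry = ET.Element("entry", {"lang": language})
    ET.SubElement(entry, "name").text = name
    ET.SubElement(entry, "definition").text = definition
    for lang, word in translations.items():
        ET.SubElement(entry, "translation", {"lang": lang}).text = word
    for word in synonyms:
        ET.SubElement(entry, "synonym").text = word
    for word in antonyms:
        ET.SubElement(entry, "antonym").text = word
    return entry


# Export: serialise the entry to XML text.
entry = build_entry(
    "en",
    "house",
    "a building for people to live in",
    {"de": "Haus", "nl": "huis"},
    synonyms=["dwelling"],
)
xml_text = ET.tostring(entry, encoding="unicode")

# Import: parse the text back and read the translations per language.
parsed = ET.fromstring(xml_text)
translations = {t.get("lang"): t.text for t in parsed.findall("translation")}
```

Because every field has its own element, another project could pick out only the parts it needs (for example the translations) without having to understand wiki markup.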
*Issues:
#Using XML standards for dictionary content.
#Importing data / Exporting data using the current wiktionary content.
#Structuring wiktionary using MySQL tables.
#Importing data / Exporting data using the future wiktionary structure.
NB I have posted this on META as well
Thanks,
GerardM
Hi, as I said before, I created a language table to be used to code lists
of terms, so that lists of translations can easily be transferred from
one wiki to another.
I have inserted English and German for now and will add Italian asap. If
Gerard doesn't have time I'll add Dutch as well (please let me know).
So all other languages are needed. I would therefore kindly ask you to
complete the list for your language/wiktionary or to ask for help on the
relevant wiktionary.
For the moment I have put more info on my wiki here:
http://wiki.wesolveitnet.com/wakka.php?wakka=HelpNeeded
The download link is there as well. Please do let me know if you
start working on a language, in order to avoid duplicate work. I am going
to pass this message on to some translators' lists as well, asking for help.
Thank you and have a great day!
Sabine
--
Sabine Cretella
s.cretella(a)wordsandmore.it
www.wordsandmore.it
Meetingplace for translators
www.wesolveitnet.com
A message I just received from Daniele:
************************************************************************
-------------------
XML is the best new standard for exporting information without a fixed format.
It allows us to import the data and display it however we want.
The problem with an XML file is the file's encoding.
We need to know which encoding Wiktionary uses to store its data.
The other problem is the program used to access the XML data.
To create a program that runs on all platforms, the best way is to use Java.
Windows, Mac, Linux, PocketPC, SmartPhone, PalmOS, Symbian, BeOS and
other operating systems all have a version of the Java Virtual Machine.
Embedded devices use J2ME, but that is a subset of J2SE.
-------------------
by
Ing. Lele
*************************************************************************
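To illustrate the encoding point in Daniele's message: an XML file can declare its own encoding in its first line, so a parser can read it correctly whichever encoding the exporting site uses. This is only a sketch with Python's standard library. The element names (glossary, term) are made up for the example, and it says nothing about how Wiktionary actually stores its data.

```python
import xml.etree.ElementTree as ET

# A tiny glossary with a non-ASCII character in it.
root = ET.Element("glossary")
ET.SubElement(root, "term", {"lang": "de"}).text = "intravenös"

# Serialise twice, each time with an explicit XML declaration that
# names the encoding. The result is bytes, not str.
as_utf8 = ET.tostring(root, encoding="utf-8", xml_declaration=True)
as_latin1 = ET.tostring(root, encoding="iso-8859-1", xml_declaration=True)

# The byte sequences differ, but the parser reads the declaration
# and recovers the same text from both.
term_from_utf8 = ET.fromstring(as_utf8).find("term").text
term_from_latin1 = ET.fromstring(as_latin1).find("term").text
```

The practical consequence is that the encoding only has to be stated once, in the file itself, rather than agreed on separately between every pair of projects.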
I just had a look at the Japanese and Chinese versions. As far as I can tell, Wiktionary and Wikipedia use UTF-8 encoding there, but not for the European languages.
Is there a reason for this?
Now let's consider this scenario for a moment:
1) there are different wiktionaries
2) there are different free dictionaries
3) there are different multilingual tables with specific content
Everywhere people are working on one theme: creating Open Content in terms of translations. Work is being done two, three, four or even more times. In my opinion this is not necessary.
When working with, e.g., MySQL databases, XML is not a problem, and if we could concentrate on one single structure that feeds all of them, that would really be great. Effort would no longer be duplicated: instead of double or triple work we would have double or triple results.
Can you think of any how-tos and, moreover, of how to organise this work?
Since many translators don't like working online, I could imagine dictionary software like the one Daniele wants to create for data input and proofreading, maybe combined with OmegaT. We would then take the encoded data and transfer it to Wiktionary.
This way we would have:
1) entries to be used in all wiktionaries (as editing should be in a multilanguage file or at least in part of it)
2) multilanguage tables of specific themes from where we easily can create bilingual tables
3) general dictionaries for the offline-user who can go to wiktionary when he desires additional information
4) what else?
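Point 2 above (bilingual tables derived from one multilingual table) can be sketched in a few lines. The table content and the choice of ISO 639 two-letter codes as column names are assumptions for illustration; a real table would come from a .csv file rather than from a string.

```python
import csv
import io

# Hypothetical multilingual table. Column names are ISO 639 two-letter
# codes; an empty cell means the translation is still missing.
MULTILINGUAL_CSV = """en,de,it,nl
house,Haus,casa,huis
cat,Katze,gatto,kat
mouse,Maus,,muis
"""


def bilingual_pairs(csv_text, source, target):
    """Return (source, target) term pairs, skipping rows with a missing term."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row[source], row[target])
        for row in reader
        if row[source] and row[target]
    ]


pairs_en_it = bilingual_pairs(MULTILINGUAL_CSV, "en", "it")
pairs_de_nl = bilingual_pairs(MULTILINGUAL_CSV, "de", "nl")
```

Any language pair can be produced from the same file, which is the reason for maintaining a single multilingual table instead of many bilingual ones.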
For now: going back to my table :-)
Ciao, Sabine
Ashar Voultoiz wrote:
> Hello !
>
> Some time ago there was a lot of discussion about a database schema
> change. I am wondering if that's a dead-end horse or if we can
> eventually start the discussion again and eventually get the changes in
> for version 1.4 ?
>
> I have some ideas of new fields that can help, and probably a friend
> who is a database designer who is willing to help review our schema
> (although he is writing a book actually).
>
>
> Or maybe we just add new tables and new fields whenever we want new
> features, but I am not sure it is the best solution.
>
If there is room for some new database tables: I am working on some
ideas for Wiktionary. These tables would be relational.
I am thinking about these tables and relations:
Word table
Language table
Meaning table
WordType table
Translation table
Idiom table
Synonym / Antonym etc table
A Wiktionary page has one or more Words
A Word has a Language
A Word has one or more Meanings
A Meaning has a WordType
A Word has one or more Translations
A Translation may have a Word in another Language, and a definition in
the same Language.
Synonym etc has another Word in the same Language and an indication what
kind of relation exists.
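The tables and relations above could be sketched as follows. This is only an illustration under stated assumptions: it uses SQLite through Python's standard library as a stand-in for MySQL, all table and column names are invented here, and the Idiom table and the link from wiki pages to Words are left out to keep it short.

```python
import sqlite3

# Hypothetical schema; names are illustrative, not an agreed design.
SCHEMA = """
CREATE TABLE language  (id INTEGER PRIMARY KEY, code TEXT UNIQUE NOT NULL);
CREATE TABLE word_type (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE word (
    id          INTEGER PRIMARY KEY,
    spelling    TEXT NOT NULL,
    language_id INTEGER NOT NULL REFERENCES language(id)
);
CREATE TABLE meaning (
    id           INTEGER PRIMARY KEY,
    word_id      INTEGER NOT NULL REFERENCES word(id),
    word_type_id INTEGER NOT NULL REFERENCES word_type(id),
    definition   TEXT NOT NULL
);
-- A translation may point at a Word in another language and may carry
-- a note written in the language of the source word.
CREATE TABLE translation (
    id             INTEGER PRIMARY KEY,
    word_id        INTEGER NOT NULL REFERENCES word(id),
    target_word_id INTEGER REFERENCES word(id),
    note           TEXT
);
-- Synonym, antonym etc.: two words plus the kind of relation.
CREATE TABLE word_relation (
    word_id         INTEGER NOT NULL REFERENCES word(id),
    related_word_id INTEGER NOT NULL REFERENCES word(id),
    kind            TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO language (id, code) VALUES (?, ?)", [(1, "en"), (2, "nl")]
)
conn.executemany(
    "INSERT INTO word (id, spelling, language_id) VALUES (?, ?, ?)",
    [(1, "house", 1), (2, "huis", 2)],
)
conn.execute("INSERT INTO translation (word_id, target_word_id) VALUES (1, 2)")

# Look up the translations of word 1 together with their language codes.
rows = conn.execute(
    """
    SELECT target.spelling, language.code
    FROM translation
    JOIN word AS target ON target.id = translation.target_word_id
    JOIN language ON language.id = target.language_id
    WHERE translation.word_id = 1
    """
).fetchall()
```

Since a translation refers to a row in the same word table, the Dutch word entered once is available to every language edition that links to it.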
These are the initial ideas that I have; I am still toying with them. I
think they will allow one database to contain all words in all
languages. It will result in a more integrated Wiktionary, in which
effort spent in one place benefits the whole Wiktionary environment.
NB At this stage I am thinking about structure; I have not yet
considered the user interface. The user interface has to be intuitive
and has to accommodate several languages.
Thanks,
GerardM
I am working on uploading the GEMET data into Wiktionary. The GEMET data
does specify who phrased the definition of a term (typically an
institution). From a GNU FDL point of view we need to attribute where
this information came from.
There are three ways of doing this:
*Upload them with a "GEMET" user, so that the source is them and not me,
and have the source in there as text.
*Upload them with a user that depends on the institution that defined the term.
*Upload them with my own user, and have the source GEMET and the
institution in text.
Personally I think the second one is the most correct. The consequence
is however that many users will be created. Any other ideas?
Thanks,
GerardM
Hi,
I'm Marika. I have subscribed to your mailing list because I'm interested in
dictionaries.
Thank you
Marìka
----- Original Message -----
From: <wiktionary-l-request(a)Wikipedia.org>
To: <wiktionary-l(a)Wikipedia.org>
Sent: Sunday, September 05, 2004 3:31 AM
Subject: Wiktionary-l Digest, Vol 4, Issue 3
> Send Wiktionary-l mailing list submissions to
> wiktionary-l(a)Wikipedia.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://mail.wikipedia.org/mailman/listinfo/wiktionary-l
> or, via email, send a message with subject or body 'help' to
> wiktionary-l-request(a)Wikipedia.org
>
> You can reach the person managing the list at
> wiktionary-l-owner(a)Wikipedia.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Wiktionary-l digest..."
>
>
> Today's Topics:
>
> 1. ISO-639 + Glossaries / vocabulary lists / thematical lists
> (Sabine Cretella)
> 2. Re: ISO-639 + Glossaries / vocabulary lists / thematical
> lists (Gerard Meijssen)
> 3. Re: The need for XML in a wiktionary context (Sabine Cretella)
> 4. Re: ISO-639 + Glossaries / vocabulary lists / thematical
> lists (Sabine Cretella)
> 5. Re: The need for XML in a wiktionary context (Ray Saintonge)
> 6. Re: The need for XML in a wiktionary context (Gerard Meijssen)
> 7. Re: The need for XML in a wiktionary context (Andrew Dunbar)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 04 Sep 2004 11:19:56 +0200
> From: Sabine Cretella <sabine_cretella(a)yahoo.it>
> Subject: [Wiktionary-l] ISO-639 + Glossaries / vocabulary lists /
> thematical lists
> To: Gerard Meijssen <gerardm(a)myrealbox.com>
> Cc: wiktionary-l(a)Wikipedia.org
> Message-ID: <413988BC.4020004(a)yahoo.it>
> Content-Type: text/plain; charset=windows-1252; format=flowed
>
> Hi Gerard and all of you,
>
> thinking about the code I was just considering some points.
>
> What I noted on the page you gave me for the Italian version of the
> ISO-code is that you use a mixed version for language identifiers - the
> two letter code and where there's no two letter code the three letter
> code - is this correct? I also noted that not all languages are present
> in the ISO-3-letter-code - so they are standardised, but not completely.
> This would obviously lead to an own wiktionary standard.
>
> I am asking as I thought about compiling a list of the language
> codes used for wiktionary and then adding the translations of the
> language names, asking friends and colleagues to complete the list.
> Normally in the translation world the two letter code is used.
>
> I'll then add the list to my sourceforge project (wsi-glossary:
> http://sourceforge.net/projects/wsi-glossary/) you can see who is
> contributing right now with integrations to the lists here:
> http://wiki.wesolveitnet.com/wakka.php?wakka=WsiGlossaryContributors.
>
> I should modify the licensing to GNU FDL (mine up to now was the same
> as the one used for the OmegaT manual) - I have to check whether this is
> possible without problems on SourceForge. I am too new to OpenContent
> to know all about this - so another thing to be done immediately.
>
> If you are working on a multilanguage list e.g. of trees, birds,
> vegetables etc. etc. please consider seriously to have these lists
> integrated by other people as well and have it ready somewhere for
> download or just integrate it into wsi-glossary. Certain kinds of work
> can be done even by schools in language lessons - e.g. the Italian
> Thesaurus for OpenOffice.org was created with the help of a school where
> the teachers were the team leaders and during the classes the pupils did
> something that made "sense" to them. Having them work directly in
> wiktionary online is impossible for most schools as computers don't have
> Internet access (or only a few of them) and so working on tables is much
> easier.
>
> If you prefer not to hand out the list: give out single terms or groups
> of terms like this:
>
> I need these term(s)
> house
> cat
> mouse
> etc.
>
> in the following languages:
> German
> French
> Italian
> etc.
>
> I can then either publish these parts on my portal or send the request to
> different lists of translators - so step by step it is possible to
> integrate and improve.
>
> Best wishes from Italy,
>
> Sabine
>
> --
> Sabine Cretella
> s.cretella(a)wordsandmore.it
> www.wordsandmore.it
> Meetingplace for translators
> www.wesolveitnet.com
>
>
>
>
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Sat, 04 Sep 2004 11:45:16 +0200
> From: Gerard Meijssen <gerardm(a)myrealbox.com>
> Subject: [Wiktionary-l] Re: ISO-639 + Glossaries / vocabulary lists /
> thematical lists
> To: Sabine Cretella <sabine_cretella(a)yahoo.it>
> Cc: wiktionary-l(a)Wikipedia.org
> Message-ID: <41398EAC.1000801(a)myrealbox.com>
> Content-Type: text/plain; charset=windows-1252; format=flowed
>
> Wikimedia does use two letter ISO 639 codes and when they do not exist
> they do use the three letter codes. There are missing ISO codes. There
> are also the SIL codes but personally I think mixing these three codes
> makes a mess. Preferably ISO adds missing codes for languages.
>
> For cooperation to work best, things like XML can be considered. GEMET
> uses it; they have people knowledgeable about thesauri, XML and open
> content.
>
> When you have an application that can import and export XML data, you
> can work offline locally and export the data at the end of the day. A
> start might be to take the Italian OpenOffice list, add definitions in
> Italian, export it and share it with the world. An even better start
> might be words in another wikipedia with an Italian translation; the
> translations TO Italian are then already known.
>
> The most important thing is to prevent double work and the continued
> checking of the stuff that is available. Start with producing
> definitions for all the languages. Many translations are available on
> the nl:wiktionary. The articles can be copied to it:wiktionary; just add
> content to some templates. They do need checking as well... :)
>
> Thanks,
> Gerard
>
>
> ------------------------------
>
> Message: 3
> Date: Sat, 04 Sep 2004 11:58:10 +0200
> From: Sabine Cretella <sabine_cretella(a)yahoo.it>
> Subject: Re: [Wiktionary-l] The need for XML in a wiktionary context
> To: wiktionary-l(a)Wikipedia.org
> Message-ID: <413991B2.20806(a)yahoo.it>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Geerd, please have a look at this:
>
>
> *********************************************************
> <glossword>
> <line>
> <term t1="A" t2="AT" id="4">attuatore</term>
> <defn>
> <trns lang="106">Trieb</trns>
> <trns lang="106">Arbeitszylinder</trns>
> Trieb -> azionatore (comandi idraulici); Arbeitszylinder -> ölhydraulisch
> </defn>
> </line>
> <line>
> <term t1="B" t2="BA" id="62">batteriostatico</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">bakteriostatisch</trns>
> </defn>
> </line>
> <line>
> <term t1="B" t2="BO" id="54">bovino</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">bovin</trns>
> </defn>
> </line>
> <line>
> <term t1="C" t2="CE" id="33">cellule ematiche</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">Blutzellen</trns>
> </defn>
> </line>
> <line>
> <term t1="C" t2="CO" id="37">concentrazione serica</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">Serumkonzentration</trns>
> </defn>
> </line>
> <line>
> <term t1="E" t2="EM" id="27">ematico</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">Hämato-</trns>
> <trns lang="106">Blut-</trns>
> </defn>
> </line>
> <line>
> <term t1="E" t2="EN" id="8">endovenoso</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">intravenös</trns>
> </defn>
> </line>
> <line>
> <term t1="E" t2="ET" id="5">etambutolo</term>
> <defn>
> <trns lang="106">Ethambutol</trns>
> <trns lang="103">ethambutol</trns>
> <src>http://www.gesundheit.de/roche/ro10000/r10785.html</src>
> </defn>
> </line>
> <line>
> <term t1="F" t2="FL" id="24">fleboclisi</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">intravenöse Infusion</trns>
> </defn>
> </line>
> <line>
> <term t1="I" t2="IN" id="10">iniezione endovenosa</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">intravenöse Injektion</trns>
> </defn>
> </line>
> <line>
> <term t1="L" t2="LE" id="77">legno impiallacciato</term>
> <defn>
> <abbr lang="025"/>
> <trns lang="106">furniertes Holz</trns>
> <src>eurodicautom</src>
> </defn>
> </line>
> <line>
> <term t1="L" t2="LE" id="71">legno tamburato</term>
> <defn>
> <abbr lang="025"/>
> <trns lang="106">furniertes Holz</trns>
> <src>eurodicautom</src>
> </defn>
> </line>
> <line>
> <term t1="M" t2="MA" id="44">mass media</term>
> <defn>
> <trns lang="106">Massenmedien</trns>
> </defn>
> </line>
> <line>
> <term t1="O" t2="OC" id="51">ocratossina</term>
> <defn>
> <abbr lang="054"/>
> <trns lang="106">Ochratoxin</trns>
> <src>
> http://www.verbraucherministerium.de/forschungsreport/rep2-99/ochra.htm
> </src>
> <src>
> Schimmelpilze sind in der Lage, unter sehr unterschiedlichen Bedingungen
> giftige Substanzen (Mykotoxine) zu bilden, die in der gesamten
> Nahrungskette vorkommen können. Zu den bekanntesten Vertretern zählt das
> Ochratoxin A, das von bestimmten Penicillium- und Aspergillus-Arten
> gebildet wird. Dieses Toxin kann die Nieren und das Immunsystem
> schädigen und zeigt im Tierversuch eine kanzerogene Wirkung. Der Aufgabe
> des Gesetzgebers, einen umfassenden Schutz der Verbraucher zu
> gewährleisten, ist das Bundesministerium für Gesundheit (BMG)
> nachgekommen und hat eine wissenschaftliche Studie über Ochratoxin A
> initiiert und gefördert, die 1996 begonnen und 1999 abgeschlossen wurde.
> Das Projekt hatte zum Ziel, Verzehrsdaten auf epidemiologischer
> Grundlage zu ermitteln und Ochratoxin A in relevanten Lebensmitteln
> sowie im Blutserum von Probanden zu bestimmen. Aus der Verknüpfung der
> Ergebnisse von Verzehrsdaten, Lebensmitteluntersuchungen und
> Blutserum-Analysen können eine fundierte Beurteilung der tatsächlichen
> Exposition der Bevölkerung in Deutschland erstellt und Empfehlungen für
> eine Höchstmengenregelung abgeleitet werden. (weiteres auf der Website)
> </src>
> </defn>
> </line>
> <line>
> <term t1="P" t2="PE" id="17">per via endovenosa</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">intravenös</trns>
> </defn>
> </line>
> <line>
> <term t1="S" t2="SO" id="68">soluzione fisiologica</term>
> <defn>
> <abbr lang="027"/>
> <trns lang="106">Ringer Lösung</trns>
> </defn>
> </line>
> </glossword>
>
> ****************************************************
> This was the original glossary project where I tried to ask people for
> co-operation, but online I had just two members trying to help ...
> that's why I switched to .csv tables that can then easily be converted
> into XML in a second stage (most people don't like working online even
> if they are connected 24 hrs a day ...). Glossword (www.glossword.info)
> is an Open Source project - SourceForge page:
> http://sourceforge.net/projects/glossword/
>
> Instead of ISO language codes Glossword uses numbers to identify
> languages. This is a minor problem as it is easy to change to ISO using
> search and replace.
>
> My installation of the software can be found here:
> www.dict.wesolveitnet.com. If you'd like to try it out there, just let
> me know and I'll create a user account with access to all dictionaries.
> Really, I am not sure if this can be useful ... maybe just for conversion issues?
>
> Ciao, Sabine
>
> *************
>
> Sabine Cretella
> s.cretella(a)wordsandmore.it
> www.wordsandmore.it
> Meetingplace for translators
> www.wesolveitnet.com
>
>
>
> >
>
>
> ------------------------------
>
> Message: 4
> Date: Sat, 04 Sep 2004 12:16:42 +0200
> From: Sabine Cretella <sabine_cretella(a)yahoo.it>
> Subject: [Wiktionary-l] Re: ISO-639 + Glossaries / vocabulary lists /
> thematical lists
> To: Marc Prior <mail(a)marcprior.de>, wiktionary-l(a)Wikipedia.org
> Message-ID: <4139960A.9080909(a)yahoo.it>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
> Hi Gerard,
>
> > Wikimedia does use two letter ISO 639 codes and when they do not exist
> > they do use the three letter codes. There are missing ISO codes. There
> > are also the SIL codes but personally I think mixing these three codes
> > makes a mess. Preferably ISO adds missing codes for languages.
>
> I completely agree with this.
>
> >
> > For cooperation to work best, things like XML can be considered. GEMET
> > uses it, they have people knowledgable regarding thesauri XML open
> > content.
>
> XML is one of the best solutions, as the standard for CAT (computer aided
> translation) software is TMX for translation memories and TBX for
> glossaries - and these are nothing other than "defined" XML formats.
>
> >
> > When you have an application that can import and export XML data, you
> > can work off line locally and export the data at the end of the day.
> > The start might be the Italian Open Office list and add definitions in
> > Italian export it and share it with the world. An even better start
> > might be words in another wikipedia with an Italian translation; the
> > translations TO Italian are then already known.
>
> Hmmm ... data needs to be multilanguage, doesn't it? Or at least must be
> identified by language tags. To edit bilingual data we could use OmegaT
> (that adds language tags to the tmx 1.1 file) - then we could use the
> tmx-file to import data. When new terms are added to a list using the
> old tmx file they will be automatically given as translated so the
> translator just needs to translate the missing part that is stored in a
> new translation memory file in tmx format. OmegaT is java based,
> therefore platform independent and Open Source ... the created files of
> single "words" on the other hand could then be used as glossary files.
> Maybe we could try this out with the translations of the ISO
> languages-table.
>
> >
> > The most important thing is to prevent double work and the continued
> > checking of the stuff that is available. Start with producing
> > definitions for all the Languages. Many translations are available on
> > the nl:wiktionary. The articles can be copied to it:wiktionary just
> > add content to some templates. They do need checking as well... :)
>
> I'll do that right now - I just created a table with the codes and now
> insert the missing English names.
>
> Ciao, Sabine
>
> >
>
>
> ------------------------------
>
> Message: 5
> Date: Sat, 04 Sep 2004 03:57:01 -0700
> From: Ray Saintonge <saintonge(a)telus.net>
> Subject: Re: [Wiktionary-l] The need for XML in a wiktionary context
> To: wiktionary-l(a)Wikipedia.org
> Message-ID: <41399F7D.5020507(a)telus.net>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> The current format is adequate. Your proposal makes no mention of the
> possible sacrifices in terms of ease of editing, a key feature in all
> the wikis. How will flexibility of format be maintained?
>
> Ec
>
>
>
>
>
>
> ------------------------------
>
> Message: 6
> Date: Sat, 04 Sep 2004 13:45:56 +0200
> From: Gerard Meijssen <gerardm(a)myrealbox.com>
> Subject: Re: [Wiktionary-l] The need for XML in a wiktionary context
> To: wiktionary-l(a)Wikipedia.org
> Message-ID: <4139AAF4.3080906(a)myrealbox.com>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Ray Saintonge wrote:
>
> > The current format is adequate. Your proposal makes no mention of the
> > possible sacrifices in terms of ease of editing, a key feature in all
> > the wikis. How will flexibility of format be maintained?
>
> XML is not to be edited by hand. You are absolutely right about that.
> However, the current format is not without its problems. At this moment
> an English word cannot be re-used easily in other languages. Things are
> free formatted at the moment. It would be a good thing if we start
> thinking about creating some database structures for use within
> wiktionary. It would rid us of these dratted templates like {{en}} and
> {{-en-}}. They work, it is the best thing around but they are ugly.
>
> What I propose at this time is to get us thinking about importing and
> exporting in an XML format. And considering changes to enhance the
> functionality within all wiktionaries and the functionality to the
> outside world.
>
> One of the aims of wikimedia is to create open content. By having our
> data in our proprietary format, we do not achieve what can be achieved.
>
> Thanks,
> Gerard
>
>
> ------------------------------
>
> Message: 7
> Date: Sun, 5 Sep 2004 02:30:57 +0100 (BST)
> From: Andrew Dunbar <hippietrail(a)yahoo.com>
> Subject: Re: [Wiktionary-l] The need for XML in a wiktionary context
> To: wiktionary-l(a)Wikipedia.org
> Message-ID: <20040905013057.52937.qmail(a)web53702.mail.yahoo.com>
> Content-Type: text/plain; charset=iso-8859-1
>
> --- Gerard Meijssen <gerardm(a)myrealbox.com> wrote:
> > There is a need for using XML with wiktionary.
> I agree.
>
> > The definitions of a word in wiktionary, can be
> > structured in a fixed way.
> I disagree. But it depends on *how much* structure you
> want.
>
> > For each word/phrase you have a:
> > *Indication what language it is in
> > *Name of the word/phrase
> > *Definition of the word/phrase
> > *Translations
> > *Pronounciation
> > *Synonyms
> > *Antonyms etc
> Actually, if we only wanted to structure these parts it
> would work ok. Many other properties of words and
> phrases are a lot more difficult, such as
> part-of-speech.
>
> > I do not try to be complete here, but my point is,
> > the data is structured. Other organisations that
> > work with words already structure their data using
> > XML for instance GEMET.
> > When Wiktionary is structured exlicitly, the result
> > will be that the import and export from Wiktionary
> > becomes possible and it will become possible for
> > other dictionary/ glossary project to get a changed
> > content in XML format that is specific for working
> > with words. This will enhance the importance of
> > wiktionary and it will help achieve out aim which is
> > open accessible dictionary content.
> >
> > The flip side of the coin is that we can get
> > {changed) content from other dictionary/ glossary
> > projects for inclusion in wiktionary.
> >
> > The GEMET data is available as XML data and, it
> > would be great to import it straight in from XML.
> >
> > *Issues:
> > #Using XML standards for dictionary content.
> > #Importing data / Exporting data using the current
> > wiktionary content.
> > #Structuring wiktionary using MySQL tables.
> > #Importing data / Exporting data using the future
> > wiktionary structure.
> >
> > NB I have posted this on META as well
>
> I do think a dictionary requires structure which an
> encyclopedia does not. A very loose structure like you
> have described would be a benefit for Wiktionary.
> The problems I see are these:
> 1. Once we have some structure people will push for
>    more structure such as part-of-speech, not realizing
>    how difficult that is to get right in a multilingual
>    dictionary.
> 2. To work with the wiki software we can have a tool/
>    script/routine which maps from internal XML into
>    wiki/HTML so it can be displayed.
> 3. People will have to input XML, or we need a friendly
>    interface which can take input from non-expert users
>    and turn it into correct XML.
>
> Number 3 would mean a *lot* of work for developers.
>
> Andrew (hippietrail).
>
> > Thanks,
> > GerardM
> >
> > _______________________________________________
> > Wiktionary-l mailing list
> > Wiktionary-l(a)Wikipedia.org
> >
> http://mail.wikipedia.org/mailman/listinfo/wiktionary-l
> >
>
> =====
> http://linguaphile.sf.net/cgi-bin/translator.pl
> http://www.abisource.com
>
>
>
>
>
>
>
> ------------------------------
>
>
>
> End of Wiktionary-l Digest, Vol 4, Issue 3
> ******************************************
>
There are several namespaces in the Latin Wiktionary that are still untranslated, and if anybody can implement the translations or direct me to how to do it I'd be most grateful:
"Project" namespace, currently "Wikipedia" (!) should be "Victionarium"
Project talk = Disputatio Victionarii
"Template" namespace, currently "Template", should be "Exemplum"
Template talk = Disputatio exempli
"Help" namespace, currently "Help", should be "Auxilium"
Help talk = Disputatio auxilii
"Category" namespace, currently "Category", should be "Categoria"
Category talk = Disputatio categoriae
Apologies if this has been sent before >_<
*Muke!
--
website: http://frath.net/
LiveJournal: http://kohath.livejournal.com/
deviantArt: http://kohath.deviantart.com/
FrathWiki, a conlang and conculture wiki:
http://wiki.frath.net/
Hoi,
I have written
http://meta.wikimedia.org/wiki/Using_templates_in_Wiktionary to explain
how the system with templates works in nl:wiktionary. I have written it
now because of the GEMET data that will probably be copied into many
wiktionaries.
The system described works. It is however a kludge, to be used until
there is a better way of doing what it tries to achieve.
The article is particularly written for Sabine Cretella and Jimbo Wales.
I promised Jimbo last Saturday to try to explain it better.
Thanks,
Gerard