Wiktionary-l September 2004

wiktionary-l@lists.wikimedia.org

13 participants
36 discussions

Wikimedia Fundraising drive 2004
by Angela 18 Sep '04

18 Sep '04

Wikimedia are holding a cross-project fundraising drive from September 20 to October 3. We hope to raise $50,000 during this two-week period. Donations can be made via the new fundraising pages at http://wikimediafoundation.org/wiki/Fundraising which are being finalized this weekend.

This was discussed at a meeting yesterday. A summary is at http://meta.wikimedia.org/wiki/Fundraising_meeting%2C_September_2004 and the full log is at http://meta.wikimedia.org/wiki/Fundraising_meeting%2C_September_2004/Log

The fundraising drive will be publicized via a site notice, which will read: "Wikimedia Fundraising Drive 2004. Help us raise $50,000. See our fundraising page for details."

This default can be translated by editing [[MediaWiki:Sitenotice]] on your own wiki. See http://meta.wikimedia.org/wiki/Fundraising_site_notice for further instructions. If the message is protected and your wiki has no admins, please translate it on [[MediaWiki talk:Sitenotice]] of your own wiki and ask a steward to make the edit for you at http://meta.wikimedia.org/wiki/Requests_for_permissions#Fundraising_notices

Please translate this message and ensure people on your project are aware of it. Thank you.

Angela

1 0

ISO-639 + Glossaries / vocabulary lists / thematical lists
by Sabine Cretella 18 Sep '04

18 Sep '04

Hi Gerard and all of you,

Thinking about the codes, I was just considering some points. What I noted on the page you gave me for the Italian version of the ISO codes is that you use a mixed scheme for language identifiers: the two-letter code, and where there is no two-letter code, the three-letter code. Is this correct? I also noted that not all languages are present in the ISO three-letter code list, so the codes are standardised, but not completely. This would obviously lead to wiktionary having its own standard.

I am asking because I thought about compiling a list of the language codes used in wiktionary and then adding the translations of the language names, asking friends and colleagues to complete the list. Normally in the translation world the two-letter code is used. I'll then add the list to my SourceForge project (wsi-glossary: http://sourceforge.net/projects/wsi-glossary/). You can see who is currently contributing additions to the lists here: http://wiki.wesolveitnet.com/wakka.php?wakka=WsiGlossaryContributors.

I should change the licensing to the GNU FDL (up to now mine was the same as the one used for the OmegaT manual); I have to check whether this is possible without problems on SourceForge. I am too new to OpenContent to know all about this, so this is another thing to be done immediately.

If you are working on a multilingual list, e.g. of trees, birds, vegetables etc., please seriously consider having these lists completed by other people as well, and make them available somewhere for download or just integrate them into wsi-glossary. Certain kinds of work can even be done by schools in language lessons: e.g. the Italian thesaurus for OpenOffice.org was created with the help of a school, where the teachers were the team leaders and during classes the pupils did something that made "sense" to them. Having them work directly in wiktionary online is impossible for most schools, as the computers don't have Internet access (or only a few of them do), so working on tables is much easier.

If you prefer not to hand out the list, give out single terms or groups of terms like this: I need these term(s): house, cat, mouse etc. in the following languages: German, French, Italian etc. I can then publish these parts either on my portal or send the request to different lists of translators, so step by step it is possible to complete and improve the lists.

Best wishes from Italy,
Sabine

--
Sabine Cretella
s.cretella(a)wordsandmore.it
www.wordsandmore.it
Meetingplace for translators
www.wesolveitnet.com
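The mixed scheme asked about above (use the two-letter code, and fall back to the three-letter code where no two-letter code exists) can be sketched as follows. This is only a minimal illustration, not the way any wiktionary actually stores its codes: the table `LANGUAGE_CODES` and the function `language_identifier` are hypothetical names, and the table holds just a handful of sample languages.

```python
from typing import Optional

# Hypothetical sample table: language name -> (two-letter code or None, three-letter code).
# Low German and Sicilian are examples of languages without a two-letter code.
LANGUAGE_CODES = {
    "Italian": ("it", "ita"),
    "Dutch": ("nl", "nld"),
    "Low German": (None, "nds"),
    "Sicilian": (None, "scn"),
}


def language_identifier(name: str) -> Optional[str]:
    """Return the two-letter code if there is one, else the three-letter code.

    Returns None for languages missing from the table, which is the case
    that would force a wiktionary-specific extension of the standard.
    """
    entry = LANGUAGE_CODES.get(name)
    if entry is None:
        return None
    two_letter, three_letter = entry
    return two_letter or three_letter
```

A list compiled this way makes the gaps visible: every language for which the function returns None is one that would need a code agreed on within wiktionary itself.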

4 6

The ultimate Wiktionary RFC
by Gerard Meijssen 17 Sep '04

17 Sep '04

Hoi,

On META I have written a piece: http://meta.wikimedia.org/wiki/The_Ultimate_wikitonary

In it I describe what I think the Ultimate wiktionary would be like. In the article I assume that things like etymology, pronunciation and all the other good stuff that we need in a wiktionary are a given. Therefore this article is about how it works and what functionality makes it possible. The article is intended as a Request For Comments.

I also want to direct your attention to the article http://meta.wikimedia.org/wiki/Proposal_for_a_Wiktionary_proof_of_concept

In it I propose to implement a subset of what is required for a wiktionary that understands XML. I really welcome comments.

Thanks,
GerardM

1 0

How to get more cooperation between wikipedia and wiktionary
by Gerard Meijssen 14 Sep '04

14 Sep '04

On the nl:wikipedia, there were many articles that contained the names of the subject in different languages. There was opposition to this, as this is something you expect in a dictionary and not in an encyclopedia. There were some iterations before we came up with our current solution.

On nl:Wiktionary we refer to articles in wikipedia and use the template {{-info-}} to produce a text with an interlink to the wikipedia article. On Wikipedia we now refer to articles in wiktionary and use the template {{wikt}} to produce a text with an interlink to the wiktionary article.

The benefits are:
*There are more articles in nl:wiktionary as a result
*We did not lose the information that was on nl:wikipedia
*We have richer information overall
*The best bit is: more people contribute to nl:wiktionary.

Everybody is happy with this solution. :)

Thanks,
GerardM

1 0

Chinese, traditional and simplified
by Gerard Meijssen 14 Sep '04

14 Sep '04

There is a big discussion on wikipedia-l about writing Chinese. One thing I gleaned from this discussion is that zh-tw and zh-cn are used to indicate traditional and simplified Chinese respectively. As it is relevant to wiktionary to have both correct spellings, I propose to use these codes as well as the zh code to indicate Chinese words.

I hope someone has a good suggestion for Serbian, which is written in both Cyrillic and Latin script. There are more languages that are written in different character sets. I am looking forward to suggestions.

Thanks,
GerardM
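The proposal above (keep zh for Chinese in general, and add zh-tw and zh-cn for the traditional and simplified spellings) amounts to treating a code as a base language plus an optional variant. A minimal sketch, assuming a simple hyphen-separated form; the function names `split_code` and `spellings_for` and the sample data are illustrative only:

```python
from typing import Dict, Optional, Tuple


def split_code(code: str) -> Tuple[str, Optional[str]]:
    """Split a code such as 'zh-tw' into its base language and variant."""
    base, _, variant = code.partition("-")
    return base, variant or None


def spellings_for(base: str, entries: Dict[str, str]) -> Dict[str, str]:
    """Collect every spelling whose code belongs to the given base language."""
    return {
        code: text
        for code, text in entries.items()
        if split_code(code)[0] == base
    }


# Sample data: the word for the Chinese language in both spellings,
# next to an unrelated Dutch entry.
sample = {"zh-tw": "漢語", "zh-cn": "汉语", "nl": "Chinees"}
chinese = spellings_for("zh", sample)
```

With this split, a lookup for zh finds both spellings, while each spelling still carries its own code.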

4 4

Proposal for a Wiktionary proof of concept
by Gerard Meijssen 13 Sep '04

13 Sep '04

Hoi,

As discussed in [[The need for XML re: wiktionary]] and [[Tables for Wiktionary]] on Meta and on the wiktionary mailing list, it is necessary to structure the wiktionary content in order to be able to share it. The complexity of doing this is huge. Not only do we need to describe all kinds of data to be able to include them in our database; we also have to take care not to make it too difficult for a would-be contributor. We also have to produce XML or something like it to publish our content.

To do all this in one go is a bit much, so I propose to do something simpler first. We may use the GEMET data for inclusion within Wikimedia. This is a rich and important body of knowledge and we can use it not only in wiktionary but also in wikispecies. We have been given the SQL definitions of the GEMET relational database, so we can change them to fit Wikimedia. GEMET has its own XML definition.

We therefore have these important components:
*We have a SQL definition to fit the data
*We have the complete data from the GEMET available in XML format
*It provides a subset of what is required in structuring Wiktionary
*It is an important resource in its own right: the GEMET data
*It gives us the ability to handle open content glossaries/thesauri
*It gives us an idea how Wiktionary could/should evolve

The SQL definitions are posted on Meta. I do not see how to add them on bugzilla.
*[[:Image:Gemet.sql]]
*[[:Image:GEMET_status.sql]]
*[[:Image:GEMET relation type.sql]]

Thanks,
GerardM

1 0

mouses and rats
by Sabine Cretella 11 Sep '04

11 Sep '04

Before going ahead with this example: Italian has a definite word for rat, and topo doesn't mean rat.

topo = mouse
ratto / topo di fogna / in colloquial language "topone" = rat
topo d'acqua = this is a kind of rat and not of mouse

topo becomes rat only in combination with other words, but never alone. To confirm this I just asked a colleague of mine (as even translators can be wrong).

There is a problem of this kind, but normally it happens when, for example, there is no translation for a subspecies in the other language. I don't have an exact example now, but I am sure we will need one. This is also a reason for me to work in lists, as these things normally come out easily when asking colleagues to check the cross translations.

Going back to work. For now I'll read the messages but answer only once I have finished my job, sorry.

Ciao,
Sabine

--
Sabine Cretella
s.cretella(a)wordsandmore.it
www.wordsandmore.it
Meetingplace for translators
www.wesolveitnet.com

6 7

Re: [Wiktionary-l] Opening up the Wiktionary content
by Jimmy (Jimbo) Wales 10 Sep '04

10 Sep '04

Because I am so far behind on everything, I have been unable to study this issue to my satisfaction. I hope to return to it soon. In the meantime, I just wanted to say that I found the discussion interesting, and I advise everyone to move forward cautiously and thoughtfully. Not very helpful, I know, but so long as we do that, I'm sure the right answers will become apparent. --Jimbo

1 0

Opening up the Wiktionary content
by Gerard Meijssen 10 Sep '04

10 Sep '04

Anthere, Jimbo,

At this moment, we are in the position of getting the cooperation of all kinds of outside people with regard to Wiktionary. This is documented to a large extent on META and on the wiktionary-l mailing list. Because of the quality of the data that we are allowed to enter and because of the quality of the persons that are involved, it would really help if we were able to import and also export wiktionary data using XML structures.

The benefits are:
*We will open up to other open dictionary content for inclusion in wiktionary.
*We will open up our content to other interested parties.
*The wiktionary content will not be only in our own "proprietary" format.
*We will be able to import data from one wiktionary into the next. This will not only enhance the quality of the wiktionaries; it will also enhance the reputation of the wiktionaries.
*We will prevent the duplication of effort. Much effort now goes into doing the same job over and over again. An example: the word "Nederlands" in nl:wiktionary has 68 exact translations; these words exist as well. Technically these words can be copied to ALL other wiktionaries. This is not possible at the moment. The human effort is huge and wasted, as it can be automated.

When a consensus is reached on how to do this, it will also mean that some programming will be required. Without your backing, I expect that nothing is going to happen. When we have arrived at a road map, it will mean that a lot of work will be needed to make the wiktionary data conformant for inclusion in the new scheme and to have the tools to get this done. We cannot reasonably ask this of the wiktionary community and the wikitech community when this road map is not co-owned by the wikimedia board.

Questions:
*Acknowledge that you will consider this as being strategic to the development of Wiktionary.
*Give a timeframe in which we will know that we can plan with a reasonable chance of getting things implemented. (Basically: when will we have an answer to the first question?)

Thanks,
Gerard Meijssen (GerardM)
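To make the idea of exporting translations from one wiktionary and importing them into another more concrete, here is a minimal round-trip sketch using Python's standard `xml.etree.ElementTree` module. No XML format has been agreed on in this discussion, so the element and attribute names (`entry`, `word`, `translation`, `lang`) and the functions `export_entry` and `import_entry` are purely hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Tuple


def export_entry(word: str, lang: str, translations: Dict[str, str]) -> str:
    """Serialise one word and its translations to an XML string."""
    # The element names used here are assumptions, not an agreed schema.
    entry = ET.Element("entry", {"lang": lang})
    ET.SubElement(entry, "word").text = word
    for code, text in sorted(translations.items()):
        ET.SubElement(entry, "translation", {"lang": code}).text = text
    return ET.tostring(entry, encoding="unicode")


def import_entry(xml_text: str) -> Tuple[Optional[str], Optional[str], Dict[str, str]]:
    """Read the word, its language code and its translations back from XML."""
    entry = ET.fromstring(xml_text)
    translations = {
        node.get("lang"): node.text for node in entry.findall("translation")
    }
    return entry.findtext("word"), entry.get("lang"), translations


# A few of the translations of "Nederlands", used as sample data.
sample = {"en": "Dutch", "de": "Niederländisch", "it": "olandese"}
xml_text = export_entry("Nederlands", "nl", sample)
word, lang, restored = import_entry(xml_text)
```

The point of the sketch is only that, once the data is in a shared structure, copying the translations of a word to another wiktionary becomes a mechanical step rather than manual copy and paste.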

7 26

Re: [Wiktionary-l] Opening up the Wiktionary content
by Gerard Meijssen 09 Sep '04

09 Sep '04

Reacting to Andrew Dunbar.

A word may have several meanings in a language. Each meaning has its own definition. Your "topo" would probably best translate to rodent, which includes both mice and rats. When there is no exact word or phrase for mouse in Italian, it should not be translated only with the Italian topo. I am sure a description in Italian for a mouse is possible.

At this moment in time we do not have a new quick system. What we are discussing is the need for publishing our content in a re-usable intermediary format like XML; this should allow us to publish our current content. How this XML will be used is what you talk about, and you are right that it should be used with care. However, an en:wiktionary English word, its pronunciation, its translations and its usage can all be validly used in nl:wiktionary; the definitions of the meanings and the etymology need translation. We can make an interwiki link to en: so that these can be found there until we have translated them.

It is not only {{lang}} but also {{-trans-}}, {{-syn-}}, {{-ant-}} etc. that will get their local meaning in the local setup. This is not ideal. These codes and their associated content should end up in a proper database so that all of this can be done by the software.

Now when a good non-en:wiktionary editor starts with a word and comes up with inaccuracies in the en:wiktionary, would you not like to know about that? Would it not be valuable to you that you can benefit from the work?

Articles on wiktionary in META can be found by their inclusion in the category:wiktionary.

As to copying and pasting, this is the technique that is currently open to us. We do not have something better. Because of copy and paste I was able to add loads of translations to English, Japanese and Vietnamese words.

Opening up the wiktionary content will be hard. That is why I want us to discuss it first before we commit to it. Once we decide that we need this, the database will change, the way we enter content will change, and much content will need to be revisited. All this while the basic content stays the same. We still want all the data that we have, but we will be able to share it.

I have asked the wikimedia board to consider open content strategic; this is one way of eventually getting developer attention. There are open content English on-line dictionaries bigger than wiktionary that use XML. Would it not be great if we could cooperate, using their researched content while they use ours?

One aim for content could be to have a definition in wiktionary for all Open Office words...

Thanks,
GerardM

2 1
