Hoi,
As a result of our exploring opportunities for Ultimate Wiktionary, I was pointed to LISA the Localisation Industry Standards Association. I downloaded some information and was asked afterwards for some information. This in turn led to the question if I was willing to write an article about Wikimedia and localisation. So I did. It can be found here: http://www.lisa.org/globalizationinsider/
Thanks, GerardM
Gerard Meijssen wrote:
Hoi,
As a result of our exploring opportunities for Ultimate Wiktionary, I was pointed to LISA the Localisation Industry Standards Association. I downloaded some information and was asked afterwards for some information. This in turn led to the question if I was willing to write an article about Wikimedia and localisation. So I did. It can be found here: http://www.lisa.org/globalizationinsider/
The following is what he says there
Using this imperfect system of templates has taught us that 80% of the lexicological content can be expressed using templates.
What evidence is there of this? Sure, we can get closer to that when translations are viewed as mechanical acts and we can ignore all subtleties of language. In reality such an attitude only goves a lot of pap.
The next step will be for us to combine all the language-independent content in a database.
That's a very small part of the content.
Our challenge will be to translate the user interface in as many languages as possible.
That much is already being done without Gerard's UW
This is the first hurdle to make the Ultimate Wiktionary accessible in any language. The next step is to encourage people with language knowledge to contribute to the Wiktionary by providing descriptions and etymological information for various terms.
This too is already being done in the English Wiktionary. That's what lexicography is all about.
The Ultimate Wiktionary will become extremely relevant, based on the special content that it will contain. For example, we plan to include the GEMET thesaurus <http://www.eionet.eu.int/GEMET>, the ecological resource of the European Community (EC).
Although I have no doubt that this is useful information there should be no preference given to EC terminology. It needs to be made clear that different terminology used in countries which share a language with some member of the EC will be on an equal footing.
It will also be possible for users to add content in other languages, making the original thesaurus even more accessible and more valuable to more people. We hope to be able to cooperate with organizations such as the EC in order to host other glossaries and thesauri. As everyone is invited to contribute content, we envision this content being translated into many more languages and thus resulting in increased trade opportunities for the EC.
This looks like some kind of hidden agenda. Many of us have resisted any appearance of being dominated by the thinking and ideas of the United States. Any attempt to impose EC dominance should be resisted just as strongly.
The current Wiktionaries will be converted to the Ultimate Wiktionary.
You say this with far too much conviction. At other times you appear to make the prediction that participants in the various projects will see your UW as so great that they will melt into your arms. As a sceptic I can accept that statement. It allows me to wait until there is something real to comment about; it allows others with more technical experience to amend your software to suit the needs of Wikimedia. I cannot and will not accept your flat out statement that it "will be converted" any more than I can suspend rationality long enough to accept Christ as my personal saviour.
This means that people will have access to the Dutch Wiktionary with many words in Papiamento, the Italian Wiktionary with many Neapolitan words, and the Kurdish Wiktionary with words in many different Kurdish dialects. The goal with the Ultimate Wiktionary is to overcome the fractured nature of individual Wiktionaries. By combining them into one central repository, people will be able to access a much greater variety of content, thus enabling the Ultimate Wiktionary to be greater than the sum of its separate Wiktionary parts.
This is in sharp contrast to what you said in other parts of the article
Contrary to what the typical LISA Member has available, we do not have an organizational structure that decides what to do next. We do not have policies that determine what content is to be available in all Wikipedias. We do not translate content as a rule.
or again
As the projects grow, we find that they have different values and a different view of “the Truth.” These are the issues where culture comes into play.
What you praise here about the Wikipedias you would condemn in Wiktionary. The difference in values and cultures is just as strong in Wiktionary as it is in Wikipedia. It needs to be respected just as much.
Ec
PS: It is not my usual habit to crosspost my comments, and I seriously considered posting this on only one of the lists, but since Gerard has chosen to put his POV on three lists it seemed appropriate that at least the initial rebuttal should be similarly distributed.
Ray Saintonge wrote:
Gerard Meijssen wrote:
Hoi,
As a result of our exploring opportunities for Ultimate Wiktionary, I was pointed to LISA the Localisation Industry Standards Association. I downloaded some information and was asked afterwards for some information. This in turn led to the question if I was willing to write an article about Wikimedia and localisation. So I did. It can be found here: http://www.lisa.org/globalizationinsider/
The following is what he says there
Using this imperfect system of templates has taught us that 80% of the lexicological content can be expressed using templates.
What evidence is there of this? Sure, we can get closer to that when translations are viewed as mechanical acts and we can ignore all subtleties of language. In reality such an attitude only goves a lot of pap.
The evidence can be found in the practice of the it: or nl: and other wiktionaries. Please explain "goves a lot of pap".
The next step will be for us to combine all the language-independent content in a database.
That's a very small part of the content.
Given that the language-independent content is over 80%, I wonder what a big part is..
Our challenge will be to translate the user interface in as many languages as possible.
That much is already being done without Gerard's UW
There will be a User Interface part to the Ultimate Wiktionary that will have to be localised.
This is the first hurdle to make the Ultimate Wiktionary accessible in any language. The next step is to encourage people with language knowledge to contribute to the Wiktionary by providing descriptions and etymological information for various terms.
This too is already being done in the English Wiktionary. That's what lexicography is all about.
At issue is the scope that you have. If you think the English Wiktionary is everything, you forget about all the other Wiktionaries. If work done on the English Wiktionary is only to benefit the English Wiktionary, your scope is somewhat limited.
The Ultimate Wiktionary will become extremely relevant, based on the special content that it will contain. For example, we plan to include the GEMET thesaurus <http://www.eionet.eu.int/GEMET>, the ecological resource of the European Community (EC).
Although I have no doubt that this is useful information there should be no preference given to EC terminology. It needs to be made clear that different terminology used in countries which share a language with some member of the EC will be on an equal footing.
We have already introduced a glossary with medical information in the Dutch Wiktionary. The GEMET database contains data for many languages, including American English. The definitions are by reputable institutions, including US universities. So please know what you talk about. The GEMET data is given to us to be published under the GNU-FDL. So what is your problem? If we can obtain other worthwhile resources and combine them in UW, we will, because it enriches the UW as a resource.
It will also be possible for users to add content in other languages, making the original thesaurus even more accessible and more valuable to more people. We hope to be able to cooperate with organizations such as the EC in order to host other glossaries and thesauri. As everyone is invited to contribute content, we envision this content being translated into many more languages and thus resulting in increased trade opportunities for the EC.
This looks like some kind of hidden agenda. Many of us have resisted any appearance of being dominated by the thinking and ideas of the United States. Any attempt to impose EC dominance should be resisted just as strongly.
I will be asking the Dutch government whether we can host a glossary of words and their meanings as used in the various Dutch governmental organisations. If you equate explaining what a word means to an organisation with letting that organisation dominate you, you prevent this information from being made public. From my point of view you are barking up the wrong tree: with a dictionary or a glossary you inform, and if the EC is more open than the US, good for them.
The aim of Wiktionary is all words of all languages. Would it not be great if we could have organisations look at their vocabulary, even if that makes plain how far they are from what is commonly meant by it?
The current Wiktionaries will be converted to the Ultimate Wiktionary.
You say this with far too much conviction. At other times you appear to make the prediction that participants in the various projects will see your UW as so great that they will melt into your arms. As a sceptic I can accept that statement. It allows me to wait until there is something real to comment about; it allows others with more technical experience to amend your software to suit the needs of Wikimedia. I cannot and will not accept your flat out statement that it "will be converted" any more than I can suspend rationality long enough to accept Christ as my personal saviour.
As a sceptic, it does not allow you to bide your time and wait for the results. You do what you must. In the meantime I will work to make the UW a success.
This means that people will have access to the Dutch Wiktionary with many words in Papiamento, the Italian Wiktionary with many Neapolitan words, and the Kurdish Wiktionary with words in many different Kurdish dialects. The goal with the Ultimate Wiktionary is to overcome the fractured nature of individual Wiktionaries. By combining them into one central repository, people will be able to access a much greater variety of content, thus enabling the Ultimate Wiktionary to be greater than the sum of its separate Wiktionary parts.
This is in sharp contrast to what you said in other parts of the article
I do not understand what you mean, but let me try to explain. We want a user interface for every language, selectable in the user preferences. When content has been imported into the UW, the content included in the Dutch Wiktionary will make Papiamento, English, Italian etc. content available. With more resources imported into the UW, the information will be enriched. I do not rubbish the accomplishments of the other Wiktionaries; I want to make them available to all people of all languages. When an English word is known because an entry was created for it in association with a word in Italian, it will be available to everyone, never mind what user interface is used.
Contrary to what the typical LISA Member has available, we do not have an organizational structure that decides what to do next. We do not have policies that determine what content is to be available in all Wikipedias. We do not translate content as a rule.
or again
As the projects grow, we find that they have different values and a different view of “the Truth.” These are the issues where culture comes into play.
What you praise here about the Wikipedias you would condemn in Wiktionary. The difference in values and cultures is just as strong in Wiktionary as it is in Wikipedia. It needs to be respected just as much.
I do not see the point that you are making. A dictionary is deterministic; it informs of the meaning(s) of a word and other information that can be found about a word. We will have them all. An encyclopedia tries to explain what things are and how they relate. There is in my mind a huge difference in the amount of culture you will find in an encyclopedia and in a dictionary.
Ec
PS: It is not my usual habit to crosspost my comments, and I seriously considered posting this on only one of the lists, but since Gerard has chosen to put his POV on three lists it seemed appropriate that at least the initial rebuttal should be similarly distributed.
I have published in a periodical. The posting on the three lists was to inform you about this. These postings are as much as anything to prevent the feeling some people have that things are done in secret. They are not. Many people knew I was going to publish in the LISA periodical, and hey, today you can read the result :)
The creation of this article was a lengthy affair. I have discussed issues with many Wikimedians. If Ray is of the opinion that it is my POV, he is correct: I published it and I am proud of it. I hope and expect that in time Ultimate Wiktionary will prove a success. If people are interested in how I envision certain things, they can ask me and we can discuss things. I am eager to know where we need to improve on the ideas that we have. But please use arguments instead of sceptical comments based on... what?
Thanks, GerardM
wikimedia-l@lists.wikimedia.org