I think that GNU FDL is perfectly fine for Ultimate Wiktionary and there is no need to change the license. The license is perfectly compatible with the .DICT format, so there should be no problems at all.
Thinking about how to import data from wiktionary into the ultimate wiktionary may pose a few puzzles with respect to FDL compliance, but I don't see any significant problems. The import script should keep track of who contributed to a chunk of data and take note of that fact. The history may be a little more problematic, and I think we will want to get advice on exactly how to do it.
But changing the license to something else would require throwing away all existing work in wiktionary, which seems quite unwise to me.
--Jimbo
Jimmy Wales wrote:
I think that GNU FDL is perfectly fine for Ultimate Wiktionary and there is no need to change the license. The license is perfectly compatible with the .DICT format, so there should be no problems at all.
I'm glad to hear that.
Thinking about how to import data from wiktionary into the ultimate wiktionary may pose a few puzzles with respect to FDL compliance, but I don't see any significant problems. The import script should keep track of who contributed to a chunk of data and take note of that fact. The history may be a little more problematic, and I think we will want to get advice on exactly how to do it.
Importing histories into Gerard's project will not remove them from the existing Wiktionaries. Simply putting a dated link to the source Wiktionary should do it. That would only become a problem if the existing Wiktionary deletes the article or otherwise makes it unworkable. In the existing transwiki technique article histories are moved with the term in question. This gives a nice list of who edited and when, but records of what changes these people made are unavailable. In some cases, particularly as regards very stubby articles, it has proved more practical to discard the transwikied material and its history completely, and start a whole new article.
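A minimal sketch of the bookkeeping discussed above, assuming the import script receives each article's revision list as (username, timestamp) pairs. The names `ImportedEntry`, `import_entry` and `attribution_note` are hypothetical, and whether a note of this kind is enough for FDL compliance is exactly the point on which advice would be needed.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class ImportedEntry:
    """One chunk of imported data plus the provenance recorded for it."""
    term: str
    text: str
    source_url: str
    imported_on: date
    contributors: List[str] = field(default_factory=list)


def import_entry(term: str, text: str, revisions: List[Tuple[str, str]],
                 source_wiki: str, today: date) -> ImportedEntry:
    """Record who contributed and a dated link back to the source Wiktionary.

    `revisions` is assumed to be a list of (username, timestamp) pairs taken
    from the source article's history, oldest first.
    """
    contributors: List[str] = []
    for user, _timestamp in revisions:
        if user not in contributors:  # keep each contributor once, in first-edit order
            contributors.append(user)
    url = f"http://{source_wiki}.wiktionary.org/wiki/{term}"
    return ImportedEntry(term, text, url, today, contributors)


def attribution_note(entry: ImportedEntry) -> str:
    """Build the dated source link and contributor list as one line of text."""
    names = ", ".join(entry.contributors)
    return (f"Imported from {entry.source_url} on "
            f"{entry.imported_on.isoformat()}; contributors: {names}")
```

This keeps the list of who edited but, like the transwiki technique, not what each person changed; the dated link is what points back to the full history.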
But changing the license to something else would require throwing away all existing work in wiktionary, which seems quite unwise to me.
Not really, since the existing wiktionaries would continue as they have all along.
Ec
Ray Saintonge wrote:
Not really, since the existing wiktionaries would continue as they have all along.
I am not agreeing or disagreeing or taking any position on this at all. Please consider my questions about this to be merely data gathering. :-)
Why should we have such a formal break? Why shouldn't we instead transition from one to the other in a seamless way? I worry a lot about duplication of effort, etc.
Are there objections from old-school wiktionarians to the ultimate wiktionary plans? Is it possible that we could address these objections through code?
I'm very sorry that I don't already know all of this, but I'm sure you can all get me up to speed quite quickly.
--Jimbo
Jimmy Wales wrote:
Ray Saintonge wrote:
Not really, since the existing wiktionaries would continue as they have all along.
I am not agreeing or disagreeing or taking any position on this at all. Please consider my questions about this to be merely data gathering. :-)
Why should we have such a formal break? Why shouldn't we instead transition from one to the other in a seamless way? I worry a lot about duplication of effort, etc.
Are there objections from old-school wiktionarians to the ultimate wiktionary plans? Is it possible that we could address these objections through code?
I'm very sorry that I don't already know all of this, but I'm sure you can all get me up to speed quite quickly.
To a large extent I see the fundamental difference between Gerard and me as rooted in different visions about the role and nature of dictionaries. I'll try not to misrepresent his point of view.
I see Gerard's vision as being based in the need for technical solutions for machine translations. In theory with his method one should be able to look up a word in one's own language and immediately be able to find a corresponding term in whatever other language one desires. I at least agree with him that such an approach would require far more sophisticated software than what we now employ. Much of what he proposes appears to be very highly dependent on templates and technical codification rather than plain language editing, and I'm afraid that that would scare away many potential new contributors who don't feel comfortable with the more technical approach.
For my part translation is a secondary function of a dictionary. Documenting the history of a word, citing quotations that support uses of the word, and commentary on the usages of a word are more interesting and important. I recently did a little of this to raise awareness of the divergence of [[gourmand]] in English and French. I find our present software essentially adequate for the task.
Gerard has been talking about his Ultimate Wiktionary for a long time, but so far I have not seen examples of what Gerard's Wiktionary will look like, how it will work or how it will be editable. Perhaps if he presented more concrete examples attitudes could change.
I guess "old-school" is probably a good term. I have very little involvement on the technical side of things. I do find it a chore to insert a picture or a table, but I figure it out when I have to. When templates appear in an article that I am editing, I need to make extra effort just to track where some of them come from or what they mean. If I, as a person who has been here for over three years, am having trouble with this, it must be worse for a non-technical person who just wants to indulge his love of words.
Ec
On 5/30/05, Ray Saintonge saintonge@telus.net wrote:
I see Gerard's vision as being based in the need for technical solutions for machine translations. In theory with his method one should be able to look up a word in one's own language and immediately be able to find a corresponding term in whatever other language one desires. I at least agree with him that such an approach would require far more sophisticated software than what we now employ. Much of what he proposes appears to be very highly dependent on templates and technical codification rather than plain language editing, and I'm afraid that that would scare away many potential new contributors who don't feel comfortable with the more technical approach.
I think it's a safe assumption that it isn't for machine translation. Plenty of crappy machine translators already :). The problem with Wiktionary as it stands is that there's really one way to view it - as a web page. If you want to query it (translations of the word 'mouse', all Romanian nouns) you can't really. It's not 'machine readable' (perhaps what you meant). It's not flexible: if there's a desire to change the formatting of the articles (entry formatting is a much less obvious and more complicated job than in Wikipedia) it would be a huge undertaking. We'd have to see it to judge it, obviously. It should be decided that it is the way to go or that it is not... we shouldn't ever have two competing Wiktionaries.
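To make the "machine readable" point concrete, here is a small sketch of the two queries mentioned above, assuming entries were stored as structured records instead of free-form wiki text. The `Entry` fields and the handful of sample rows are illustrative assumptions only; nothing here reflects how the UW is actually being built.

```python
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    word: str
    language: str
    part_of_speech: str
    translation_of: str  # English headword this entry translates, "" if none


# Illustrative sample data, not taken from any Wiktionary dump.
ENTRIES = [
    Entry("mouse", "English", "noun", ""),
    Entry("souris", "French", "noun", "mouse"),
    Entry("Maus", "German", "noun", "mouse"),
    Entry("șoarece", "Romanian", "noun", "mouse"),
    Entry("apă", "Romanian", "noun", "water"),
    Entry("a merge", "Romanian", "verb", "go"),
]


def translations_of(headword: str) -> Dict[str, str]:
    """Map language -> word for every entry that translates `headword`."""
    return {e.language: e.word for e in ENTRIES if e.translation_of == headword}


def words_in(language: str, part_of_speech: str) -> List[str]:
    """List all words of one part of speech in one language."""
    return [e.word for e in ENTRIES
            if e.language == language and e.part_of_speech == part_of_speech]
```

With records like these, `translations_of("mouse")` and `words_in("Romanian", "noun")` are one-line lookups; against wiki pages the same questions mean scraping article text.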
For my part translation is a secondary function of a dictionary. Documenting the history of a word, citing quotations that support uses of the word, and commentary on the usages of a word are more interesting and important. I recently did a little of this to raise awareness of the divergence of [[gourmand]] in English and French. I find our present software essentially adequate for the task.
Gerard has been talking about his Ultimate Wiktionary for a long time, but so far I have not seen examples of what Gerard's Wiktionary will look like, how it will work or how it will be editable. Perhaps if he presented more concrete examples attitudes could change.
I guess "old-school" is probably a good term. I have very little involvement on the technical side of things. I do find it a chore to insert a picture or a table, but I figure it out when I have to. When templates appear in an article that I am editing, I need to make extra effort just to track where some of them come from or what they mean. If I, as a person who has been here for over three years, am having trouble with this, it must be worse for a non-technical person who just wants to indulge his love of words.
Ec
Ray Saintonge saintonge@telus.net wrote:
For my part translation is a secondary function of a dictionary. Documenting the history of a word, citing quotations that support uses of the word, and commentary on the usages of a word are more interesting and important. I recently did a little of this to raise awareness of the divergence of [[gourmand]] in English and French. I find our present software essentially adequate for the task.
Gerard has been talking about his Ultimate Wiktionary for a long time, but so far I have not seen examples of what Gerard's Wiktionary will look like, how it will work or how it will be editable. Perhaps if he presented more concrete examples attitudes could change.
I agree. As I have said ... probably many times now, 99% of Wiktionary articles are stubs.[1] Taking these as a model for a stricter dictionary format may relegate the "Ultimate" Wiktionary to the status of a mere glossary. But without even having had a mock-up of what the product will look like or how it will work, no one can tell. A lot of the work for this is apparently being done by a small few behind closed doors, which is not very encouraging at all.
*Muke!
[1] And having five thousand one-word translations attached to each won't make them any less stubby.
Muke Tever wrote:
Ray Saintonge saintonge@telus.net wrote:
For my part translation is a secondary function of a dictionary. Documenting the history of a word, citing quotations that support uses of the word, and commentary on the usages of a word are more interesting and important. I recently did a little of this to raise awareness of the divergence of [[gourmand]] in English and French. I find our present software essentially adequate for the task.
Gerard has been talking about his Ultimate Wiktionary for a long time, but so far I have not seen examples of what Gerard's Wiktionary will look like, how it will work or how it will be editable. Perhaps if he presented more concrete examples attitudes could change.
I agree. As I have said ... probably many times now, 99% of Wiktionary articles are stubs.[1] Taking these as a model for a stricter dictionary format may relegate the "Ultimate" Wiktionary to the status of a mere glossary. But without even having had a mock-up of what the product will look like or how it will work, no one can tell. A lot of the work for this is apparently being done by a small few behind closed doors, which is not very encouraging at all.
*Muke!
[1] And having five thousand one-word translations attached to each won't make them any less stubby.
If you want stubs look at the Ido Wiktionary. 8-)
Ec
Ray Saintonge saintonge@telus.net wrote:
I agree. As I have said ... probably many times now, 99% of Wiktionary articles are stubs.
If you want stubs look at the Ido Wiktionary. 8-)
I hit special:randompage once on en: and io: and got:
http://en.wiktionary.org/wiki/Benignant http://io.wiktionary.org/wiki/Seenessel
Not much difference there, except that en: has a part of speech and io: has a language index category...
nl: doesn't fare much better, returning http://nl.wiktionary.org/wiki/yei which manages language, category, part of speech, and definition... But the word is spelled entirely wrong (according to [1] and en: it should probably be ေရ, though since Google search ignores Burmese text [!!] I can't really confirm that)
(For my part, randompage on la: brought up http://la.wiktionary.org/wiki/Thursday which isn't all that hot either.)
Anyway, this reminds me of why I don't find duplication of effort a problem in general. I don't trust any wiktionary for words outside its native language, for one. Too many people import lists of translations and don't do any fact-checking (I had to respell several Kalaallisuut number words in en recently) or even reality-checking (someone put in "cicňnnia" as the Sardinian for [[stork]] some time ago--I had to hunt down and fix a lot of Sardinian mojibake in several articles imported from the same source when I ran across that one).
nl: I've found to be particularly bad about this, as it won't just add the translation to the lists without checking, they'll actually create full articles for them (like 'yei' above, or another word under the [[nl:water]] list, Dagaare "koO", which appears to be an ASCII rendering for koɔ...).
Will the UW have any way to note that information was added by a non-native speaker?
IMO the more effort put in (can't really say it was _duplicated_, as outside of very specialized technical terms and the communalized SAE semantics, just because something translates an English word doesn't mean it's the best translation of the French, Greek, or Chinese word that also translates the English..., and at the very least that has to be checked), the more chances we have to find discrepancies and make a better dictionary by checking them against each other.
*Muke!
[1] http://www.ayinepan.com/literature/search_word.php?word=w
Muke Tever wrote:
Ray Saintonge saintonge@telus.net wrote:
I agree. As I have said ... probably many times now, 99% of Wiktionary articles are stubs.
If you want stubs look at the Ido Wiktionary. 8-)
I hit special:randompage once on en: and io: and got:
http://en.wiktionary.org/wiki/Benignant http://io.wiktionary.org/wiki/Seenessel
Not much difference there, except that en: has a part of speech and io: has a language index category...
One big difference, the word is spelled binignant not Benignant
nl: doesn't fare much better, returning http://nl.wiktionary.org/wiki/yei which manages language, category, part of speech, and definition... But the word is spelled entirely wrong (according to [1] and en: it should probably be ေရ, though since Google search ignores Burmese text [!!] I can't really confirm that)
If you were a part of the nl.community you would have changed it. As it is, all these wiktionaries are islands, there is no cooperation. We waste our time and effort.
(For my part, randompage on la: brought up http://la.wiktionary.org/wiki/Thursday which isn't all that hot either.)
Anyway, this reminds me of why I don't find duplication of effort a problem in general. I don't trust any wiktionary for words outside its native language, for one. Too many people import lists of translations and don't do any fact-checking (I had to respell several Kalaallisuut number words in en recently) or even reality-checking (someone put in "cicňnnia" as the Sardinian for [[stork]] some time ago--I had to hunt down and fix a lot of Sardinian mojibake in several articles imported from the same source when I ran across that one).
So you spend a lot of effort on the en.wiktionary. From my point of view it does me no good. It does not help. You give excellent arguments why a UW is what we need.
nl: I've found to be particularly bad about this, as it won't just add the translation to the lists without checking, they'll actually create full articles for them (like 'yei' above, or another word under the [[nl:water]] list, Dagaare "koO", which appears to be an ASCII rendering for koɔ...).
When a word is added as a translation, a "placeholder" will be created in the UW; this is just a word with a language. If koO should be koɔ, it is plain wrong, and to spot this you need a big community of people. This is exactly one argument why a UW would be beneficial, as it would increase the size of the community. On your authority I changed koO to koɔ, something you would have done if you felt part of this community.
The argument that there should not be a full lemma is wrong. The aim of the wiktionary is to have all words in all languages.
Will the UW have any way to note that information was added by a non-native speaker?
This is relatively trivial to do. I blogged about this feature; I would tie it in with the "babel" functionality as I use it on my Meta and nl.wikipedia userpage.
IMO the more effort put in (can't really say it was _duplicated_, as outside of very specialized technical terms and the communalized SAE semantics, just because something translates an English word doesn't mean it's the best translation of the French, Greek, or Chinese word that also translates the English..., and at the very least that has to be checked), the more chances we have to find discrepancies and make a better dictionary by checking them against each other.
Again, you give arguments why we need a community to do these kinds of things. It also means that we should share the work. Now everything done in one project will need to be done in another. A huge waste of effort, because at this time we do not make a dictionary, we make too many wiktionaries.
Thanks, GerardM
*Muke!
[1] http://www.ayinepan.com/literature/search_word.php?word=w
Gerard Meijssen gerard.meijssen@gmail.com wrote:
I hit special:randompage once on en: and io: and got:
http://en.wiktionary.org/wiki/Benignant http://io.wiktionary.org/wiki/Seenessel
Not much difference there, except that en: has a part of speech and io: has a language index category...
One big difference, the word is spelled binignant not Benignant
Not in English. From a real dictionary: http://www.bartleby.com/61/96/B0189600.html
Cf. also Latin benignus.
If you're referring to the capitalization, the article states that it is 'benignant'. Capitalized 'Benignant' only appears in the title (as is customary capitalization for English).
nl: doesn't fare much better, returning http://nl.wiktionary.org/wiki/yei which manages language, category, part of speech, and definition... But the word is spelled entirely wrong (according to [1] and en: it should probably be ေရ, though since Google search ignores Burmese text [!!] I can't really confirm that)
If you were a part of the nl.community you would have changed it. As it is, all these wiktionaries are islands, there is no cooperation. We waste our time and effort.
The time and effort is wasted adding things one doesn't know anything about, when the result is only going to be inaccurate.
Anyway, this reminds me of why I don't find duplication of effort a problem in general. I don't trust any wiktionary for words outside its native language, for one. Too many people import lists of translations and don't do any fact-checking (I had to respell several Kalaallisuut number words in en recently) or even reality-checking (someone put in "cicňnnia" as the Sardinian for [[stork]] some time ago--I had to hunt down and fix a lot of Sardinian mojibake in several articles imported from the same source when I ran across that one).
So you spend a lot of effort on the en.wiktionary. From my point of view it does me no good. It does not help. You give excellent arguments why a UW is what we need.
No, I don't put a lot of effort into en:, and haven't for some time really. "cicňnnia" was from when I was an en: regular, yeah. I ran across the errors in the Kalaallisuut when doing a reality check on stuff imported to la: from en:. The words had no Google hits whatever outside of wiktionary (always a bad sign). So I went and found the right ones, fixed the la:, and the en: where I got them from, since it is a wiki after all.
I wouldn't have even been checking kl: words at all if I wasn't working to create a *new* source. If we only had one wiki, "arviniq" et al. would have sat there until 1) some kl-speaker came along to fix it or 2) some user wanted to make a good page for the kl word. (And if it was a nl-type user, they wouldn't have even checked the kl word before creating it, and both the bad translation and the bad article would stay until some kl-speaker came along to fix it.)
nl: I've found to be particularly bad about this, as it won't just add the translation to the lists without checking, they'll actually create full articles for them (like 'yei' above, or another word under the [[nl:water]] list, Dagaare "koO", which appears to be an ASCII rendering for koɔ...).
When a word is added as a translation, a "placeholder" will be created in the UW; this is just a word with a language. If koO should be koɔ, it is plain wrong, and to spot this you need a big community of people. This is exactly one argument why a UW would be beneficial, as it would increase the size of the community. On your authority I changed koO to koɔ, something you would have done if you felt part of this community.
As long as UW "placeholders" are just a word and a language, fine. But nl's "placeholders" are more than just a word with a language. Much of the time they also give a part of speech (which is usually the part of the speech of the Dutch word, regardless what it may be in the actual language -- hint: not always the same, especially for languages outside of Standard Average European) and a definition (which is also usually the same definition as the Dutch word, again regardless of how different it may be; in many cases they are broader).
The argument that there should not be a full lemma is wrong. The aim of the wiktionary is to have all words in all languages.
I prefer that no information be given than wrong information or information that has not been checked. All words in all languages does no good if they are not reliable.
IMO the more effort put in (can't really say it was _duplicated_, as outside of very specialized technical terms and the communalized SAE semantics, just because something translates an English word doesn't mean it's the best translation of the French, Greek, or Chinese word that also translates the English..., and at the very least that has to be checked), the more chances we have to find discrepancies and make a better dictionary by checking them against each other.
Again, you give arguments why we need a community to do these kinds of things. It also means that we should share the work. Now everything done in one project will need to be done in another. A huge waste of effort, because at this time we do not make a dictionary, we make too many wiktionaries.
I would love to see a hundred independently-created dictionaries of Dagaare. A hundred and one, counting UW. :p I think it is a good thing that information be gathered separately by separate projects: it means people may use some discernment in what they add, and don't just copy things blindly; if one project fails on this, another may come through; if different projects disagree on data, we know there is a problem, which we don't know if all we get is one point of view. I'm not saying Ultimate Wiktionary shouldn't exist; I'm just saying that I don't agree with the idea that "duplication of effort" is without merit and something to be done away with entirely. Competition has its own merits.
*Muke!
Timwi timwi@gmx.net wrote:
Muke Tever wrote:
I prefer that no information be given than wrong information or information that has not been checked.
Gosh, you must be hating Wikipedia. ;-)
Only in places. Wikipedians are lousy etymologists, for one ;p
And the wiki process in general.
No, I just want a more reliable Wiktionary :|
*Muke!
Muke Tever wrote:
Ray Saintonge saintonge@telus.net wrote:
I agree. As I have said ... probably many times now, 99% of Wiktionary articles are stubs.
If you want stubs look at the Ido Wiktionary. 8-)
I hit special:randompage once on en: and io: and got:
http://en.wiktionary.org/wiki/Benignant http://io.wiktionary.org/wiki/Seenessel
Not much difference there, except that en: has a part of speech and io: has a language index category...
nl: doesn't fare much better, returning http://nl.wiktionary.org/wiki/yei which manages language, category, part of speech, and definition... But the word is spelled entirely wrong (according to [1] and en: it should probably be ေရ, though since Google search ignores Burmese text [!!] I can't really confirm that)
If such a thing happens in a little Wiktionary, only a very few will notice it. If it happens in UW, chances are a lot better somebody who knows what is wrong about it, will see it, since all our eyeballs will be focused on one version of the data. And then it can be fixed once and for all. If something gets fixed now in one Wiktionary, it is very well possible it is still entered wrongly in many others.
Will the UW have any way to note that information was added by a non-native speaker?
IMO the more effort put in (can't really say it was _duplicated_, as outside of very specialized technical terms and the communalized SAE semantics, just because something translates an English word doesn't mean it's the best translation of the French, Greek, or Chinese word that also translates the English..., and at the very least that has to be checked), the more chances we have to find discrepancies and make a better dictionary by checking them against each other.
The problem of the translations is a very real one and it's not going to be easy to solve it. That doesn't mean we should just give up though. It means we should look for a viable solution. Words across different languages can be linked to each other. For each language pair a link is needed. This would solve the problem nicely. It's not how I set up my implementation (my implementation is not what is being used for UW, so it's pretty irrelevant), but maybe it's a good way to do things in UW.
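A rough sketch of the per-language-pair linking idea, assuming each term is identified by a (language, word) pair and that a translation exists only where someone has explicitly linked two terms. `PairLinks` is a hypothetical name, and this is not a description of the UW implementation.

```python
from typing import FrozenSet, List, Set, Tuple

Term = Tuple[str, str]  # (language code, word)


class PairLinks:
    """Stores explicit, checked links between terms of two languages."""

    def __init__(self) -> None:
        self._links: Set[FrozenSet[Term]] = set()

    def link(self, a: Term, b: Term) -> None:
        """Record that `a` and `b` translate each other (symmetric)."""
        self._links.add(frozenset((a, b)))

    def translations(self, term: Term, target_language: str) -> List[str]:
        """Return only directly linked words; nothing is inferred via a third language."""
        found = []
        for pair in self._links:
            if term in pair and len(pair) == 2:
                (other,) = pair - {term}
                if other[0] == target_language:
                    found.append(other[1])
        return sorted(found)
```

Because nothing is inferred, linking English "mother" to French "mère" and to Dutch "moeder" does not by itself make "mère" a translation of "moeder"; that pair needs its own link, which is the extra checking step asked for above.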
Jo
Jimmy Wales wrote:
Ray Saintonge wrote:
Not really, since the existing wiktionaries would continue as they have all along.
I am not agreeing or disagreeing or taking any position on this at all. Please consider my questions about this to be merely data gathering. :-)
Why should we have such a formal break? Why shouldn't we instead transition from one to the other in a seamless way? I worry a lot about duplication of effort, etc.
Are there objections from old-school wiktionarians to the ultimate wiktionary plans? Is it possible that we could address these objections through code?
Hi Jimbo et al,
What GerardM proposes as the ultimate Wiktionary is what I have been considering doing all along: a dictionary in a relational database. It's a pity I can't seem to find more time to contribute to its genesis. What it would mean is basically that we would all be working on the same set of data, instead of having each language on its balkanized Wiktionary. People who speak French get a French interface to the data. People who speak Swahili get an interface in Swahili. Part of the interface could be to show that content exists in other languages for the description of a word, for an etymology, etc. It is a pity that there is an enormous duplication of effort and possibility of errors. An error may get corrected in one Wiktionary, but it isn't in the other Wiktionaries. The UW wants to solve these problems. Also if somebody in the Dutch Wiktionary knows the translation of a word to Frisian, this knowledge doesn't propagate to the other language Wiktionaries. Not in an efficient way anyway. With the UW if somebody adds a translation, this translation becomes available to everybody.
Of course, new problems will crop up. For instance, translations are not transitive for all but the simplest of terms and concepts. We need to find a solution to that problem. I was considering taking a step back and grouping meanings into one table and concepts in another to try to solve the problem. The issue with that is that it adds a layer of abstraction which might obscure things a bit. (It would become possible to create a 'meaning' where one already existed, thus resulting in two entries for the same 'meaning'.) Such things have to be discovered and resolved. I'm sure we will find a way to accomplish that the Wiki way.
One would need quite a few layers of information:
words or lemmas with a certain spelling
- language (this is the only one that has to be defined)
- part of speech
- many more properties that are optional, such as whether it is normally capitalized, whether it's a conjugated/flexed form, singular/plural, etc.
These words can be grouped to form expressions (and an expression can also be only one word)
- bathroom could be one entry
- salle de bains another
This is the 'dictionary' layer, where the words/expressions can be described, etymologies can be added etc. This can be done in several languages. The user gets to see the language(s) he wants to see.
The next layer points to these expressions and combines all the ones that have the same meaning.
The way I conceived it, I added yet another layer 'concepts'. Here I grouped the meaning 'mum' and 'mother'. They are the same concept, but the meaning is different, meaning the translations have to be different (maman, mama, mamá), (mère, moeder, madre).
I used some more lookup tables and tables to create many to many relations which I left out of the picture here. The concepts can be illustrated with graphical material, sounds and videos and the pronunciations can be added with something like IPA and point to actual sound files so one can actually listen to them.
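The layers described above could be sketched roughly as follows, using SQLite from Python. This is only an illustration under stated assumptions: the table and column names are invented here (the original design used Esperanto table names and more lookup tables), the "informal"/"neutral" glosses are made up, and the language codes assigned to the sample words are a guess at what was meant.

```python
import sqlite3

SCHEMA = """
CREATE TABLE word (
    id INTEGER PRIMARY KEY,
    spelling TEXT NOT NULL,
    language TEXT NOT NULL,      -- the only property that must be defined
    part_of_speech TEXT,         -- optional, like the other properties
    UNIQUE (spelling, language)
);
CREATE TABLE expression (        -- one or more words: the 'dictionary' layer
    id INTEGER PRIMARY KEY,
    language TEXT NOT NULL,
    text TEXT NOT NULL,
    UNIQUE (language, text)
);
CREATE TABLE expression_word (
    expression_id INTEGER NOT NULL REFERENCES expression(id),
    word_id INTEGER NOT NULL REFERENCES word(id),
    position INTEGER NOT NULL
);
CREATE TABLE concept (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE meaning (           -- groups expressions that mean the same
    id INTEGER PRIMARY KEY,
    concept_id INTEGER NOT NULL REFERENCES concept(id),
    gloss TEXT NOT NULL
);
CREATE TABLE meaning_expression (
    meaning_id INTEGER NOT NULL REFERENCES meaning(id),
    expression_id INTEGER NOT NULL REFERENCES expression(id)
);
"""


def add_expression(conn, language, text):
    """Insert an expression and link it to its component words, in order."""
    expression_id = conn.execute(
        "INSERT INTO expression (language, text) VALUES (?, ?)",
        (language, text)).lastrowid
    for position, spelling in enumerate(text.split()):
        conn.execute("INSERT OR IGNORE INTO word (spelling, language) VALUES (?, ?)",
                     (spelling, language))
        word_id = conn.execute(
            "SELECT id FROM word WHERE spelling = ? AND language = ?",
            (spelling, language)).fetchone()[0]
        conn.execute("INSERT INTO expression_word VALUES (?, ?, ?)",
                     (expression_id, word_id, position))
    return expression_id


def add_meaning(conn, concept_id, gloss, expressions):
    """Create a meaning under a concept and attach (language, text) expressions."""
    meaning_id = conn.execute(
        "INSERT INTO meaning (concept_id, gloss) VALUES (?, ?)",
        (concept_id, gloss)).lastrowid
    for language, text in expressions:
        conn.execute("INSERT INTO meaning_expression VALUES (?, ?)",
                     (meaning_id, add_expression(conn, language, text)))
    return meaning_id


def translations(conn, language, text, target_language):
    """Expressions in the target language that share a meaning with the source."""
    rows = conn.execute(
        """SELECT DISTINCT other.text
           FROM expression AS src
           JOIN meaning_expression AS a ON a.expression_id = src.id
           JOIN meaning_expression AS b ON b.meaning_id = a.meaning_id
           JOIN expression AS other ON other.id = b.expression_id
           WHERE src.language = ? AND src.text = ? AND other.language = ?
           ORDER BY other.text""",
        (language, text, target_language))
    return [row[0] for row in rows]


def build_demo():
    """One concept with two meanings, as in the mum/mother example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    concept_id = conn.execute(
        "INSERT INTO concept (label) VALUES ('female parent')").lastrowid
    add_meaning(conn, concept_id, "informal",
                [("en", "mum"), ("fr", "maman"), ("nl", "mama"), ("es", "mamá")])
    add_meaning(conn, concept_id, "neutral",
                [("en", "mother"), ("fr", "mère"), ("nl", "moeder"), ("es", "madre")])
    return conn
```

Keeping translations at the meaning level is what stops "mum" from being offered "mère"; the concept row above it is where shared material such as pictures or sounds could hang.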
I posted the design of all this before, but apparently the fact that I chose to name my tables with a neutral language like Esperanto managed to make it inaccessible to developers.
Polyglot
Jimmy Wales wrote:
Ray Saintonge wrote:
Not really, since the existing wiktionaries would continue as they have all along.
I am not agreeing or disagreeing or taking any position on this at all. Please consider my questions about this to be merely data gathering. :-)
Why should we have such a formal break? Why shouldn't we instead transition from one to the other in a seamless way? I worry a lot about duplication of effort, etc.
Are there objections from old-school wiktionarians to the ultimate wiktionary plans? Is it possible that we could address these objections through code?
I'm very sorry that I don't already know all of this, but I'm sure you can all get me up to speed quite quickly.
The following are my further comments that I have added in the Wiktionary Beer Parlour, for those who don't haunt that place. Perhaps, before we get into copyright licensing issues, we should make sure that we are not pursuing a chimera.
In all fairness Gerard's proposal for his Ultimate Wiktionary has been around for some time. According to the timetable Prototype III is now available, but I have been unable to find a link to it. A version that is fully integrated with Mediawiki 1.6 is due by the middle of June. There is all manner of talk about how all the Wiktionaries can be integrated into one big Ultimate Wiktionary but there is precious little practical progress.
At some point Gerard expressed that when the Ultimate Wiktionary was operational we would not be forced to migrate, but predicted that we would all see this new project as the best thing in on-line dictionaries ever created. He further predicted that we would all find it so marvellous that we would be inspired to move everything there without any difficulty whatsoever. I accepted that at face value, and was prepared to wait until there was something in place before wasting any more time debating it. That explains my opinionated silence on the matter. A recent comment by Jimbo on the mailing list suggested to me that he has been led to believe that there is more to this project than can be supported by facts. That made it necessary to comment.
It's wonderful that Gerard has been able to talk €5,000 out of a Dutch company to fund his proposal, but let's get real. Gerard's Ultimate Wiktionary is currently at the level of a techie's pipe dream, and we can only wonder what's in the pipe. Adapting to software that does not exist shows that UW really stands for "Ultimate Wheelspinning". We don't know how Mediawiki 1.6 will handle this by mid-June; Mediawiki 1.5 is only now in development and has not yet been released. Let's not become dependent on software that has yet to be written.
While I'm at it I should remark how effective these proposals have been at propelling the Ido Wiktionary to become the 7th largest Wiktionary with some 9,000 entries. Ido is a minor dialect of Esperanto. Of course most of the entries don't have a definition, and only show a translation of the word into one other language. I sometimes wonder if some of these people understand what a dictionary is all about. <small><small> I hope nobody is reading this on the Klingon homeworld. </small></small>
Ec
Hmmm ... I know for sure that Gerard is not the only one who believes in this project - there are language professionals, like me, who want this. There are more possibilities than you can even imagine, and I am not going to outline them here or anywhere else, as they are far too difficult to understand if people don't even understand what the UW is basically about. The initial idea was to save time and effort by creating one place to work, instead of repeating the same work on 70+ different wiktionaries (I hope the number is correct). Then, step by step, contacts with other organisations and of course people came up, and this led to further thinking, to additional features etc. And it does not end here.
Ultimate Wiktionary is real - it is a fact and it will have success.
The planning of this project on wiktionary started last year in August - for some of us even sooner, since we had this view in mind for months/years. Gerard had an idea and implemented it with the use of templates, and I had a very similar idea but did not know how to create what I wanted to create, so I just sat down and learned how to do it. There's a Farsi-speaking colleague in Canada who tried similar stuff, and there are dictionary projects around that have in part what UW will have in the end. The combination of Mediawiki + Wikidata will allow for a really astonishing project.
Of course there will be some unexpected problems, like with every project ... wasn't wiktionary born out of a problem with wikipedians who did not want lexicological data?
Problems are there to be solved - and if you want to, if you really want to, you can always find the solution. And if this should depend on me: I definitely want to find the solution. It is part of my everyday work.
Oh ... and as for the stub thingie ... if I would now like to complete a word by adding, for example, a definition, I don't know which words need one - I have to search through hundreds (or thousands) of words ... this is a waste of time. Using wikidata with a relational database in combination with mediawiki, I will be able to see which German/Italian words need a definition, and I will be able to search, for example, for all German words where the gender is missing (you just need a database query). Contributing will become much easier, take less time, become more qualified, and more.
Ido: I had a look at this project - they have started to upload things. Unfortunately first-letter capitalisation has not been switched off, and this could create problems, but I suppose they have the data as tables and upload with a bot. So the data could easily be included in UW, and it will be easy to start editing the missing parts, so the stub-time will be over very, very soon.
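As a rough illustration of the kind of database query meant here, the sketch below uses Python's built-in sqlite3 module. The table name, column names, and sample rows are all invented for the example; they are not the actual Wikidata or Ultimate Wiktionary schema, which is not described in this thread.

```python
import sqlite3

# Hypothetical schema: one row per word, with NULL marking missing data.
def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE words (lemma TEXT, language TEXT, gender TEXT, definition TEXT)"
    )
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?, ?)",
        [
            ("Haus", "de", "n", "a building for living in"),
            ("Tisch", "de", None, None),   # gender and definition missing
            ("Blume", "de", "f", None),    # definition missing
            ("casa", "it", "f", None),     # definition missing
        ],
    )
    return conn

def words_missing(conn, language, field):
    """Return the lemmas in one language whose given field is empty."""
    # The column name cannot be a bound parameter, so restrict it to known names.
    if field not in ("gender", "definition"):
        raise ValueError("unknown field: " + field)
    sql = (
        "SELECT lemma FROM words "
        "WHERE language = ? AND " + field + " IS NULL "
        "ORDER BY lemma"
    )
    return [row[0] for row in conn.execute(sql, (language,))]

if __name__ == "__main__":
    conn = build_demo_db()
    print(words_missing(conn, "de", "gender"))
    print(words_missing(conn, "it", "definition"))
```

With the data in structured form, a worklist such as "all German words without a gender" is a single query rather than a manual search through thousands of pages.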
Isn't the real reason for so many negative comments that people are worrying about all the work they have done more than twice? Isn't it true that some of us repeatedly tried to explain that Wiktionary like it is/was at this/that stage is 80% metadata and wastes lots of time of contributing members?
You see: most innovations are seen as an intrusion - this is normal - it will take time for people to accept that there is more than just a normal wiki for lexicological content.
UW is going ahead - it will work - we will make it work.
Sorry if I am maybe a bit too harsh - but starting to doubt a project now ... it is a bit late, in my opinion. Instead of doubting and looking only at the negative, I would like to kindly invite you to give constructive contributions: ideas on what to implement, questions about whether this or that is already implemented. Help to create structures and help to make sure that the work you did is not going to be wasted.
Thank you and have a great day!
Sabine
Ray Saintonge wrote:
Jimmy Wales wrote:
I think that GNU FDL is perfectly fine for Ultimate Wiktionary and there is no need to change the license. The license is perfectly compatible with the .DICT format, so there should be no problems at all.
I'm glad to hear that.
I do not understand why you think so.
Thinking about how to import data from wiktionary in the ultimate wiktionary may pose a few puzzles with respect to FDL compliance, but I don't see any significant problems. The import script should keep track of who contributed to a chunk of data and take note of that fact. The history may be a little more problematic, and I think we will want to get advice on exactly how to do it.
Importing histories into Gerard's project will not remove them from the existing Wiktionaries. Simply putting a dated link to the source Wiktionary should do it. That would only become a problem if the existing Wiktionary deletes the article or otherwise makes it unworkable. In the existing transwiki technique article histories are moved with the term in question. This gives a nice list of who edited and when, but records of what changes these people made are unavailable. In some cases, particularly as regards very stubby articles, it has proved more practical to discard the transwikied material and its history completely, and start a whole new article.
To what wiktionary would you attribute information, and on what level? When you interwiki, you expect that old-style wiktionaries will continue to exist. I expect that several wiktionaries, if not all, will eventually be discontinued.
From my point of view there is nothing wrong with very stubby information. The only criterion is that the information is correct.
But changing the license to something else would require throwing away all existing work in wiktionary, which seems quite unwise to me.
Not really, since the existing wiktionaries would continue as they have all along.
Ec
Hoi, When a Wiktionary has its data imported into Ultimate Wiktionary and when its users have migrated to the Ultimate Wiktionary as well, why should they continue as they did before? It does not make sense to suggest that they would. I will not work on any Wiktionary if working on UW proves to be more efficient. Given the amount of work that I did on the nl.wiktionary, it cannot continue as it did before.
The idea that Wiktionary is similar to Wikipedia is also something that can be disputed. It is more of a lemma than an article. Wiktionary is in many respects like a list, the individual words cannot be copyrighted. The individual translations cannot be copyrighted. This is unlike Wikipedia. So if anything stubby lemmas in Wiktionary are not a problem. There is also the consideration that the content can come from many sources. It is therefore very difficult to state in the Ultimate Wiktionary who contributed to a word because it can originate from many sources. When you add for instance that a noun has a particular gender, you cannot claim copyright because that fact is in the public domain.
The GNU-FDL is not intended for data like wiktionary, this license was intended for manuals and it is not really appropriate for data like dictionaries. There are public domain dictionaries that will give you 90% of what you need to have in a modern dictionary.
When changing the license for Ultimate Wiktionary is a given, there are several ways of dealing with complaints from people who have contributed to a wiktionary.
* You can import the same/similar data from several sources. So when a wiktionary editor complains, he is removed as an editor and another resource is attributed as the source.
* This division is particularly easy for language-independent stuff like translations, word types, genders etc. Meanings and etymologies are more personal.
Given that only contributors can complain, and given that changing the license would be done to make the UW more useful, I doubt there will be many people who will complain. I also think that the community will not appreciate it when people complain, as the reason for changing the license would be to make it easier to fulfill the goals of the Wikimedia Foundation.
Thanks, GerardM
Gerard Meijssen gerard.meijssen@gmail.com wrote:
* This division is particularly easy for language-independent stuff like translations, word types, genders etc. Meanings and etymologies are more personal.
Even most of the etymology can be done automatically. I have been doing that on la: by making the etymologies lists, as you might have seen at http://frath.net/stuff/translatio.gif ('protogermanic' and 'proto-indo-european' are not translated though... reconstructed languages don't have ISO codes).
Certainly annotations to the etymologies will be "more personal", but that should go equally for annotations to all other information -- assuming, hopefully, that such are allowed.
*Muke!
Gerard Meijssen wrote:
Ray Saintonge wrote:
Jimmy Wales wrote:
Thinking about how to import data from wiktionary in the ultimate wiktionary may pose a few puzzles with respect to FDL compliance, but I don't see any significant problems. The import script should keep track of who contributed to a chunk of data and take note of that fact. The history may be a little more problematic, and I think we will want to get advice on exactly how to do it.
Importing histories into Gerard's project will not remove them from the existing Wiktionaries. Simply putting a dated link to the source Wiktionary should do it. That would only become a problem if the existing Wiktionary deletes the article or otherwise makes it unworkable. In the existing transwiki technique article histories are moved with the term in question. This gives a nice list of who edited and when, but records of what changes these people made are unavailable. In some cases, particularly as regards very stubby articles, it has proved more practical to discard the transwikied material and its history completely, and start a whole new article.
To what wiktionary would you attribute information, and on what level? When you interwiki, you expect that old-style wiktionaries will continue to exist. I expect that several wiktionaries, if not all, will eventually be discontinued.
It would be attributed to the Wiktionary that supplied the information. I don't know what you mean by "level". The rest of it is a question of differing visions.
From my point of view there is nothing wrong with very stubby information. The only criterion is that the information is correct.
I never expressed any opposition to stubby information. I only said that it was easier to rewrite some of those stubs.
But changing the license to something else would require throwing away all existing work in wiktionary, which seems quite unwise to me.
Not really, since the existing wiktionaries would continue as they have all along.
Ec
Hoi, When a Wiktionary has its data imported into Ultimate Wiktionary and when its users have migrated to the Ultimate Wiktionary as well, why should they continue as they did before? It does not make sense to suggest that they would. I will not work on any Wiktionary if working on UW proves to be more efficient. Given the amount of work that I did on the nl.wiktionary, it cannot continue as it did before.
The amount of work that you did? So is this a personal thing? There's a lot more to life than efficiency.
The idea that Wiktionary is similar to Wikipedia is also something that can be disputed. It is more of a lemma than an article. Wiktionary is in many respects like a list, the individual words cannot be copyrighted. The individual translations cannot be copyrighted. This is unlike Wikipedia. So if anything stubby lemmas in Wiktionary are not a problem. There is also the consideration that the content can come from many sources. It is therefore very difficult to state in the Ultimate Wiktionary who contributed to a word because it can originate from many sources. When you add for instance that a noun has a particular gender, you cannot claim copyright because that fact is in the public domain.
Any proposition can be disputed. One way to look at a dictionary is as an encyclopedia of and about words. Lemma? There is certainly more to a Wiktionary article than a list of headings or an intermediate theorem. To say that Wiktionary is nothing more than a list is to completely undervalue the work that has gone into it. Of course individual words cannot be copyrighted, and the individual translations of words cannot be copyrighted, but the way that definitions are expressed can be, and so can detailed explanations about how two words differ, or how to distinguish between false friends. Content can indeed come from many sources, and all of those sources do need to be recognized. Making note of the gender is a trivial exercise. Documenting who made that alteration is not a matter of copyright, but of recognizing that some person made the effort to put it there. If people make up sentences to illustrate a word, those are copyrighted.
The GNU-FDL is not intended for data like wiktionary, this license was intended for manuals and it is not really appropriate for data like dictionaries. There are public domain dictionaries that will give you 90% of what you need to have in a modern dictionary.
And the other 10%? I accept that there are probably better licences for us. Being FDL does not override public domain when it's applicable.
When changing the license for Ultimate Wiktionary is a given, there are several ways of dealing with complaints from people who have contributed to a wiktionary.
* You can import the same/similar data from several sources. So when a wiktionary editor complains, he is removed as an editor and another resource is attributed as the source.
* This division is particularly easy for language-independent stuff like translations, word types, genders etc. Meanings and etymologies are more personal.
So you would remove all the efforts of people who don't agree with your system. That's not nice at all.
Given that only contributors can complain, and given that changing the license would be done to make the UW more useful, I doubt there will be many people who will complain. I also think that the community will not appreciate it when people complain, as the reason for changing the license would be to make it easier to fulfill the goals of the Wikimedia Foundation.
To claim that you have the key to fulfilling the goals of the Wikimedia Foundation seems a little inflated. Openness is better served by having software that minimizes the difficulties for editors.
Ec
wiktionary-l@lists.wikimedia.org