Hoi,
So far all Wiktionary content has been licensed under a GNU-FDL license. With the Ultimate Wiktionary, new functionality becomes a real possibility. One example is providing information using the .dict format that has been described in RFC 2229. I learned from Hippietrail that .dict can also be used to have a local dictionary on your PC. At this moment almost every article has multiple authors, so it is not realistic to require the full history with every article as the GNU-FDL does. To grow the relevance of the Ultimate Wiktionary we want to expand the ways in which it can be used. I learned from Erik Moeller that we could regularly create an Ultimate Wiktionary export in the .dict format and have it distributed with Linux distributions, for instance, or via BitTorrent. Again the license would be an issue: it is not feasible to export the complete history with every word.
I am not an expert on licenses. It is not really my cup of tea. As far as I am concerned, the Ultimate Wiktionary content should remain Free; therefore the license needs a viral aspect: the data should stay free. I really appreciate the history and therefore I do want to keep the author information within UW. But I also want to expand the use of what we are working on, so I do not mind if the history does not travel with the data when it finds this other use.
Changing the license within a running project is difficult. As Ultimate Wiktionary will be a new database, it is best to publish its content from the start under a license that enables the expanded use that is possible with the new technology. Exporting TO the Ultimate Wiktionary will be problematic in that it will replace many of the existing Wiktionaries. It is not feasible to import the Wiktionary content including the full history. It will be hard but possible to parse the current information and enter this into UW. It is possible to mention the persons that worked on an article on the talk page. Given the aims of the Wikimedia projects, I do not think that from a moral point of view there should be a problem with converting Wiktionary data to UW and changing the license in the process. It will be impossible to convert all the data to UW and/or maintain all the history information, as only the data that can be parsed can be entered into UW in the first place. Also, there will be a large amount of manual work to make this conversion possible.
We can also convert the data to the UW, change the license, recognise past efforts by publishing history details on the talk page and wait for people to object. An objection would result in the removal of the work they contributed to. This would be a pragmatic way of coping with issues.
Basically I have two questions:
*What license would be best that is FREE and allows for the expanded use of the UW data?
*Do we need to have the consent of every editor before we can export to UW, or is UW sufficiently different from Wiktionary to make it an original work in its own right, or do we need this only when we change the license?
Thanks, GerardM
On 19/05/05, Gerard Meijssen gerard.meijssen@gmail.com wrote:
*What license would be best that is FREE and allows for the expanded use of the UW data
The best place to look at when working this out is probably the Creative Commons [http://creativecommons.org/] - their mission [or one part of it] is to codify a whole set of licences and present them in a user-friendly way, so you can take your pick.
The closest equivalent to the GFDL is "Attribution-ShareAlike" (aka "cc-by-sa"); see http://creativecommons.org/licenses/by-sa/2.0/. How well this would deal with the massive number of authors involved in the whole database, I'm not sure, but I get the general impression that it's slightly more flexible than the GFDL.
There used to also be an option to not require attribution at all, but this was "phased out" "for simplicity" when they released "version 2.0" licenses (see http://creativecommons.org/weblog/entry/4216). Broadly, "cc-sa 1.0" [http://creativecommons.org/licenses/sa/1.0/] says you can do what you like with the work as long as you let everyone else do what they like with the result.
*Do we need to have the consent of every editor before we can export to UW or is UW sufficiently different from Wiktionary to make it an original work in its own right or do we need this only when we change the license?
Massive disclaimer: I am no kind of expert, and someone may well say I'm completely wrong on this; feel free to take their word over mine as soon as that happens.
That said, I will say that I understand relicensing to be, legally, very awkward indeed. Because the GFDL has exactly the kind of "copyleft"/"share-alike"/"viral" stipulation - you have to keep the content under similar terms however you manipulate it - anything which uses the existing data *has to be* "compatible" with the GFDL. Basically, all the people who've contributed to existing Wiktionaries own the copyright on their contributions, and have given explicit permission to license them under the GFDL; they have *not* given permission for anyone to use them under a *different* license.
Your idea of "almost complying" and offering people an avenue of complaint is indeed a nice compromise - and as long as the way the data is used is "in the right spirit", I doubt anyone but trouble-makers *would* complain. [In fact, this is more-or-less what happens at the moment with content that's moved or copied between languages or projects; although, they're only moved, so far, between projects with the same license] What the *legal* implications are, I'm not sure - but then I think I'm right in saying that even the tremendously popular GPL has never actually been tested in court, so make of that what you will...
Switching to a license which didn't require attribution at all probably isn't at all possible though, because contributors haven't even broadly given "us" (or anyone) the right to use their contributions without giving them credit. So as far as I can see the best one can hope for is a simplified way of including the same information (either by a loose reading of the GFDL, or a simpler licence in similar spirit, which is the same thing).
One final comment - it's often mentioned in discussions on this topic that the Free Software Foundation and Creative Commons should be, and are, working on making their licences (GFDL and cc-by-sa) legally "compatible", so that works published under one can be used in works published under the other. I've not heard of any concrete progress on this, though, beyond "productive discussions", so don't hold your breath, as the saying goes.
A solution that Jimbo suggested for a similar problem on wikinews was to require not only that all contributions be licensed under the gfdl/cc/whatever but also a proviso that by editing you agree to allow wikimedia to put your contributions under any license they want. So it will always be gfdl/cc for ever no matter what wikimedia does or says, but it can also be relicensed under a different license so that (for example) it could be easily distributed in the .dict format.
btw, every time i look at wiktionary i just get more and more excited about its possibilities. It is such a potentially awesome tool for linguists, teachers, students of a second language, etc.
peace out, -[[User:The bellman]]
It's a nice idea... but contract law would not allow it. In short, you can't agree in advance to accept anything...
But there are solutions for a massive licence change. For example, I suggest beginning by offering dual licensing for new articles and edits. With a bit of time, more and more contributors would agree to change the license of their content. With good tagging, you could certainly keep track of licence changes and wait for the moment when there are enough dual licenses to begin removing the content that is still under the old license only.
Also it would be very important for wikipedia to provide its own license and to allow it to evolve. As a policy matter, I don't believe it's really safe to leave your legal needs in the hands of others like we did with the GNU/FDL (and it would be the same problem with CC).
Jean-Baptiste Soufron CNRS-CERSA Paris 2 http://soufron.free.fr
I think that there is no need to avoid GNU FDL terms. Just put gzipped/bzip2ed history inside of package and in all of articles can reference that contribution history can be found in that file.
foundation-l mailing list foundation-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/foundation-l
(To correct myself...)
I think that there is no need to avoid the GNU FDL terms. Just put a gzipped/bzip2ed history inside the package, and in every article put a reference (saying that the contribution history can be found in that file).
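The packaging idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the archive layout, the `history.json.gz` file name and the helper names `build_package` and `read_history` are all hypothetical, and nothing here is the real .dict layout or a statement of what the GNU FDL requires.

```python
import gzip
import io
import json
import tarfile

HISTORY_NAME = "history.json.gz"  # hypothetical file name inside the package


def build_package(articles, history):
    """Return the bytes of a tar archive holding articles plus one gzipped history.

    articles: dict mapping title -> article text
    history:  dict mapping title -> list of contributor names
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        # Every article gets a one-line pointer to the shared history file.
        for title, text in sorted(articles.items()):
            body = f"{text}\n\nContribution history: see {HISTORY_NAME}\n".encode("utf-8")
            info = tarfile.TarInfo(name=f"articles/{title}.txt")
            info.size = len(body)
            archive.addfile(info, io.BytesIO(body))

        # The full history is stored once, compressed, for the whole package.
        packed = gzip.compress(json.dumps(history, ensure_ascii=False).encode("utf-8"))
        info = tarfile.TarInfo(name=HISTORY_NAME)
        info.size = len(packed)
        archive.addfile(info, io.BytesIO(packed))
    return buffer.getvalue()


def read_history(package_bytes):
    """Read the gzipped history back out of a package built by build_package."""
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r") as archive:
        packed = archive.extractfile(HISTORY_NAME).read()
    return json.loads(gzip.decompress(packed).decode("utf-8"))
```

The point of the sketch is only that the history is shipped once per package rather than once per article; whether that satisfies the license is the legal question discussed in this thread.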
Hoi, Apparently you do not understand that there are several issues that make the GNU-FDL impractical.
*Wiktionary data cannot be imported properly into Ultimate Wiktionary. UW has no room for gzipped or bzipped history. It is a server side database and nobody is going to see the information in this way.
*Ultimate Wiktionary will import data from many Wiktionaries; the first one could be the nl:wiktionary. Many articles have been copied to and from the it:wiktionary. Suppose an article is shared and it arrived first from the nl:wikipedia, so that one rules... right? Now what history should we have with the article? From a GNU-FDL point of view it is unforeseen, crazy.
*When we keep all these histories, who can say "it is my work, I contributed to it"?
*When we export content to the .dict or RFC 2229 format, this is a subset of the data that we have on a word, a concept. We have the UW history and all these Wiktionary histories. Histories for each word. Histories for possibly a file with a few fields like: "Word" "Description" "Translation" "Original source". The amount of baggage that we should carry according to the GNU-FDL is unforeseen and crazy. It just does not make sense. It is also data that has no structure. Who will ever look at it?
My conclusion is that the current GNU-FDL does not function for atomic information like we will have in Ultimate Wiktionary. When it prevents the implementation of new uses for the data that we have, it becomes a hindrance. The goal of the Wikimedia Foundation is to make Free information available. When a license like the GNU-FDL only allows for server side information that has a static structure, I am sure that even Richard Stallman will find the arguments to amend the GNU-FDL compelling.
One crucial thing in all this is that Free information should stay Free and be accessible. The current Wiktionary data is as closed as any proprietary data collection. This is because of its lack of structure. It cannot be used for anything but server side information. Ultimate Wiktionary intends to combine the strength of the information that we have in all our wiktionaries; it will be structured. It does allow accessibility and new innovative uses. By being Free, accessible and innovative, we will gain a much wider public; these will not only be users of our data but also providers of data. This is what we aim for.
In the current nl:wiktionary we have people and organisations like FrankC and www.ziekenhuis.nl who contributed big time to the content of Wiktionary. We do need to recognise their contributions. They donated important bodies of work. We also have people like MARCEL and S.V.E.T who added content on a regular basis; it is important that we recognise their hard work and their contributions. They make and made it the success it is. So if anything, we should find a way to honour the members of our Wiktionary community as we move forward to an Ultimate Wiktionary.
Thanks, GerardM
Hello,
changing the license of a running project is difficult, but not that much. Just think that it will take some time but it will avoid all the present problems we have on WP.
I agree. I would, however, caution against moving too quickly. First, as Jean-Baptiste says, it may be desirable to have a license which is under our own control to develop (put it on a wiki ;-). Second, the license which is the most suitable for wikis, CC-WIKI http://creativecommons.org/drafts/wiki_0.5 is currently still a draft.
One key advantage of CC-WIKI over CC-BY-SA is that it does not require attribution to any particular person, but to the wiki community (a designated entity). I'm not sure how compatible this is with EU moral rights law, though. CC-BY-SA/CC-BY, on the other hand, require attribution to the "original author" only. This, too, might be a problem with moral rights, and it's certainly not very wiki-like to just attribute the first person making an edit.
I believe that Jimmy is in talks with the Creative Commons people about CC-WIKI. There have also been some attempts to make CC-BY-SA and GFDL compatible to one another. The latter would be desirable for Wikipedia.
For our other projects, I think the most reasonable course of action is to try to find some agreement between Creative Commons and Wikimedia that lets us steer the development of CC-WIKI, but CC would provide the legal review to bring it in line with national laws. The license would still be a Creative Commons license.
After we have a suitable license, we can make an effort to get agreement from the signed in Wiktionary contributors to dual-license their content.
All best,
Erik
I agree. I would, however, caution against moving too quickly. First, as Jean-Baptiste says, it may be desirable to have a license which is under our own control to develop (put it on a wiki ;-). Second, the license which is the most suitable for wikis, CC-WIKI http://creativecommons.org/drafts/wiki_0.5 is currently still a draft.
One key advantage of CC-WIKI over CC-BY-SA is that it does not require attribution to any particular person, but to the wiki community (a designated entity). I'm not sure how compatible this is with EU moral rights law, though.
Well it is not compatible with moral rights... and it is not compatible with patrimonial rights...
CC-BY-SA/CC-BY, on the other hand, require attribution to the "original author" only. This, too, might be a problem with moral rights, and it's certainly not very wiki-like to just attribute the first person making an edit.
... It depends...
I believe that Jimmy is in talks with the Creative Commons people about CC-WIKI. There have also been some attempts to make CC-BY-SA and GFDL compatible to one another. The latter would be desirable for Wikipedia
As one of the translators of CC for France, I would be happy to be put in touch... Jimmy?
I'm not convinced relicensing an existing project would be at all easy, and I think there are more fundamental issues than what the ideal license would be. I see 4 options:
1) work out a way of distributing the content in different ways but still adhering precisely to the GFDL. For instance, am I right in thinking that the GFDL doesn't actually require the *history*, only a list of *authors*? If so, the list required becomes slightly less unwieldy, as it is simply a long list of names (and/or pseudonyms); there's not even any need to worry about who contributed to which parts, or which imported version takes "precedence" - just shove all the names in one big list and say "these are the authors".
2) find a way of interpreting the GFDL that allows us to relax some of the constraints - in other words, a loop-hole in the copyleft provision, which says that any derived work has to have all the same constraints as the GFDL has. In this category comes an appropriately compatible new version of the license, or some cunningly compatible-but-different license developed by the Creative Commons. This might allow the way the attribution etc are presented to be different, but it is unlikely to change what information has to be included (i.e. I don't see that you could ever legally re-distribute a GFDL text with no author information at all)
3) stick to the *spirit* of the GFDL, but don't enforce it fully and hope no contributor does either; this is more-or-less what Wikipedia's been doing for years, letting people run mirrors, copy information, etc, with minimal attribution in the form of linking back to the original version. Similarly, "transwiki-ing" content between projects pays lip-service to giving attribution, but almost certainly doesn't comply with the full license. Legally dodgy, but morally excusable if done right.
4) convince everybody who has ever contributed to the projects in question to re-license/dual license their contributions. Or, come up with some way of only importing content that has never been touched by someone who hasn't licensed their work in this way (you could include a version of a page from *before* a non-dual-licenser edited, but every version after they have done so would be a "derivative work" of their version); how much information can be salvaged in such an operation depends on how reachable contributors - particularly those who made the *earliest* edits to content - are, and how successful the campaign to get their agreement is.
Of course, I may be completely wrong...
-- Rowan Collins BSc [IMSoP]
Well, all of this is exactly what I am suggesting...
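Two of the options listed above are mechanical enough to sketch in code: option 1 (one merged list of authors) and the second half of option 4 (import only the newest version that predates any edit by someone who has not agreed to the new licence). The sketch below assumes a revision is just an `(author, text)` pair stored oldest first; the function names and data shape are hypothetical and not taken from MediaWiki or from the license text.

```python
def merged_author_list(histories):
    """Flatten several page histories into one de-duplicated, sorted author list."""
    authors = set()
    for revisions in histories:
        for author, _text in revisions:
            authors.add(author)
    return sorted(authors)


def last_importable_revision(revisions, consenting):
    """Return the text of the newest revision made before any non-consenting edit.

    revisions:  list of (author, text) pairs, oldest first
    consenting: set of authors who agreed to the new licence
    Returns None when even the first revision is by a non-consenting author.
    """
    importable = None
    for author, text in revisions:
        if author not in consenting:
            # Everything from here on derives from this author's version.
            break
        importable = text
    return importable
```

For example, with a history edited by alice, then bob, then carol, and only alice and carol consenting, `last_importable_revision` returns alice's first version: carol's later edit builds on bob's and so is not importable under this rule.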
Just taking this off-topic a little. I just read RFC2229 (DICT), and it states that it uses UTF-8. I thought there were various problems with using UTF-8, regarding asian languages, but i could be wrong... btw, where are the discussion pages and what not for ultimate wiki.
I am so very excited about the possibilities.
paz y amor, -rjs
Robin Shannon wrote:
Just taking this off-topic a little. I just read RFC2229 (DICT), and it states that it uses UTF-8. I thought there were various problems with using UTF-8, regarding asian languages, but i could be wrong...
Such as...?
We're already using UTF-8 for everything except a few of the older European-language Wikipedias which are on an 8-bit ISO 8859 encoding, and those will be finally converted when we upgrade to 1.5.
While UTF-8 is somewhat less space efficient in that range than some alternatives, most alternatives are less convenient for many purposes. Its coverage is equal to any other Unicode data encoding, and far easier to work with for multilingual text than anything that's not Unicode.
-- brion vibber (brion @ pobox.com)
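The space trade-off mentioned above can be checked with a few lines of Python. This is a minimal sketch: the sample strings and the choice of Shift_JIS as the "alternative" legacy encoding are assumptions made for illustration, and `encoded_sizes` is a hypothetical helper name.

```python
def encoded_sizes(text, encodings=("utf-8", "utf-16-le", "shift_jis")):
    """Return the number of bytes the text occupies in each named encoding."""
    return {name: len(text.encode(name)) for name in encodings}


# Three Japanese characters: 3 bytes each in UTF-8, 2 bytes each elsewhere.
print(encoded_sizes("日本語"))  # {'utf-8': 9, 'utf-16-le': 6, 'shift_jis': 6}

# Plain ASCII: 1 byte each in UTF-8, 2 bytes each in UTF-16.
print(encoded_sizes("dict"))  # {'utf-8': 4, 'utf-16-le': 8, 'shift_jis': 4}
```

So UTF-8 costs about half as much again for text in that range, while staying compact for ASCII and covering all of Unicode.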
Ok, having had a look on wikipedia, and not finding anything, i withdraw this claim, and claim amnesia. Sorry.
paz y amor, -rjs
Brion Vibber wrote:
We're already using UTF-8 for everything except a few of the older European-language Wikipedias which are on an 8-bit ISO 8859 encoding.
Which ones of those are using something other than ISO-8859-1?
Timwi wrote:
Brion Vibber wrote:
We're already using UTF-8 for everything except a few of the older European-language Wikipedias which are on an 8-bit ISO 8859 encoding.
Which ones of those are using something other than ISO-8859-1?
None. (Or, all depending on your point of view -- many web clients will silently submit data in Windows Code Page 1252 which is a slight superset of ISO 8859-1.)
-- brion vibber (brion @ pobox.com)
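The relationship between the two encodings can be illustrated with Python's built-in codecs. This is a sketch, assuming Python's `cp1252` and `latin-1` codecs as stand-ins for what web clients and servers actually do; `differing_bytes` is a hypothetical helper name.

```python
def differing_bytes():
    """Return the byte values that cp1252 and ISO 8859-1 decode differently."""
    differing = []
    for value in range(256):
        raw = bytes([value])
        latin1 = raw.decode("latin-1")  # defined for every byte value
        try:
            cp1252 = raw.decode("cp1252")
        except UnicodeDecodeError:
            cp1252 = None  # a few cp1252 positions are unassigned in Python's codec
        if cp1252 != latin1:
            differing.append(value)
    return differing


diff = differing_bytes()
# Every difference falls in 0x80-0x9F, where ISO 8859-1 has control codes
# and Windows-1252 puts printable characters such as the euro sign.
print(all(0x80 <= value <= 0x9F for value in diff))
print(repr(b"\x80".decode("cp1252")))
```

Outside that range the two codecs agree, which is why text submitted as Windows-1252 usually looks fine on an ISO 8859-1 page until a curly quote or euro sign turns up.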