Hi,
Mandrakesoft, the company that created and sells the Mandrake Linux distribution, is interested in distributing a DVD with English and French versions of Wikipedia. This DVD will be sold on their web site and included with the next distribution, due next April.
Mandrakesoft will take legal responsibility for this publication and is ready to donate some money to the Wikimedia Foundation. The amount is still to be decided.
Mandrakesoft wants us to provide them with a master DVD, and would like to complete this first edition by Christmas.
As you may have noticed, a mention of this was included in the press release and the newsletter with the authorization of Mandrakesoft, who will also publish a press release about this project.
The summary below is also available at http://meta.wikimedia.org/wiki/Wikimedia_and_Mandrakesoft
== Points fixed so far ==
* It will first be sold on the Mandrake web site, then included in the next version of the distribution.
* It will include only the current version of the English and French distribution. Mandrakesoft publishes a French version sold in French speaking countries and an English (international) version sold elsewhere in the world. The English Wikipedia will be sold with the international version of Mandrake Linux.
* Mandrakesoft asks that the Wikimedia Foundation provide them with a master DVD.
* Mandrakesoft will take the legal responsibility for this publication.
* Fair use images should be removed, as the publication has to comply with worldwide copyright standards, not only US ones. Images without proper licensing information also have to be removed.
== Questions that need answering ==
* Do we include only complete articles or the whole of Wikipedia including stubs?
* How do we package it? Several possibilities, see the page on meta.
== What you can do ==
So we need some help to complete this project.
* Work is needed to provide proper licensing information on all images in the English Wikipedia.
* Help packaging. Help with technical knowledge is needed here. Med and Hashar, among others, are already working on this.
Thanks,
Yann
Yann-
Hi,
Mandrakesoft, the company that created and sells the Mandrake Linux distribution, is interested in distributing a DVD with English and French versions of Wikipedia. This DVD will be sold on their web site and included with the next distribution, due next April.
As a responsible organization, we should make Mandrakesoft aware of the fact that no systematic vetting of all articles for copyright violations has taken place yet (at least on en:). If they want to take the responsibility for some guy inserting chapters from a book, or the text of a paper, or a magazine article, into Wikipedia, then that's OK, but this *is* a substantial risk, because changing the text on thousands of distributed DVDs is obviously a lot harder than taking down some bad revisions from our site.
If Mandrakesoft ends up getting into trouble for this, I would like us to be able to publicly say "Shit happens, but we told you so" when this hits Slashdot or the New York Times.
Unfortunately it's a little too easy - and therefore tempting - to create physical media distributions. We *really* need a working peer review mechanism in place before we go into that business. Besides fact-checking, we need a process where there are people who check the text against subscriber-only electronic archives, offline sources etc. for copyright violations, for example. This should be less difficult than it sounds if an expert in the field is checking the article anyway - those people usually have easy access to material in their field.
Even basic Google searches are often not done. In terms of automated scanning, we should at least cover Google, groups.google and the Amazon.com "search inside the book" feature.
Am I the only one who is worried about this?
Regards,
Erik
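As a rough illustration of the automated scanning suggested above, here is a minimal Python sketch that picks a few long, distinctive sentences from an article and turns them into exact-phrase queries that a reviewer (or a script) could paste into a search engine. The function name `phrase_queries` and its parameters are hypothetical, the sentence splitting is deliberately naive, and no search service is actually called.

```python
import re

def phrase_queries(article_text, max_queries=3, min_words=8):
    """Return the longest sentences as exact-phrase (quoted) search queries."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", article_text.strip())
    # Keep only sentences long enough to be reasonably distinctive.
    candidates = [s for s in sentences if len(s.split()) >= min_words]
    # Longest sentences first; the sort is stable, so ties keep article order.
    candidates.sort(key=lambda s: len(s.split()), reverse=True)
    # Wrap each sentence in double quotes so it is searched as a phrase.
    return ['"' + s.rstrip(".!?") + '"' for s in candidates[:max_queries]]
```

A hit on a quoted phrase would only be a hint that a human should compare the two texts; material in offline or subscriber-only sources would still be missed.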
Erik Moeller wrote:
Am I the only one who is worried about this?
Probably not--I think a good first step would be to contact someone appropriate at MandrakeSoft and see if they really know what they're getting into. When they say they're willing to take the legal risk, do they know that there is a non-negligible chance that there are copyrighted materials in there somewhere? Or are they only thinking about libel/etc.? It's possible they're familiar with Wikipedia and already know all this, but someone should find out.
Besides that though, I don't see how we could possibly have any sort of peer-reviewed or even reasonably sifted version of Wikipedia available in time for a Christmas distribution, especially given the code infrastructure isn't in place yet for it to start. When they say they want a master DVD, do they mean some sort of reasonably vetted one, or are they waiting for us to stick a current snapshot on a DVD?
-Mark
Delirium wrote:
Erik Moeller wrote:
Am I the only one who is worried about this?
Probably not--I think a good first step would be to contact someone appropriate at MandrakeSoft and see if they really know what they're getting into. When they say they're willing to take the legal risk, do they know that there is a non-negligible chance that there are copyrighted materials in there somewhere? Or are they only thinking about libel/etc.? It's possible they're familiar with Wikipedia and already know all this, but someone should find out.
The discussion with Mandrakesoft has been ongoing for perhaps 2 months.
Yann is the primary contact with them, and he has explained all of that to them. The board has been copied on most of the mails, and Angela is the primary contact for this whole topic.
So, possibly, the first good step has already been taken, and it is not necessary to find someone to check whether they already know about this.
If Yann is announcing this to you, it is because the deal is now public, not because the deal was just suggested yesterday. The French Wikipedia has been labelling all its pictures over the summer with that in mind, and it was strongly suggested that en do the same.
ant
The French Wikipedia has been labelling all its pictures over the summer with that in mind, and it was strongly suggested that en do the same.
See for instance http://www.wikipedia.org/wiki/Wikipedia:Image tagging
Delirium wrote:
Probably not--I think a good first step would be to contact someone appropriate at MandrakeSoft and see if they really know what they're getting into.
You probably aren't aware that formal talks have been going on for 2 months now. This wasn't just suggested yesterday, there has been substantial communication.
--Jimbo
Mandrakesoft are aware of our editing processes. They know what they are getting, and that the content has not been verified. They have agreed to take legal responsibility for this. Basically, the only change we are making is to remove images tagged as fair use, unknown, unverified and other unsuitable ones. Since the image tagging efforts on the English Wikipedia are going fairly slowly, there may be a large number of untagged images that would also need to be removed.
The arguments over whether a Wikipedia DVD is going to be useful to people aren't really for us to decide. Mandrake obviously think people are interested in this, and if they turn out not to be because they can read it online, then we haven't lost anything. The chances are though that this will significantly increase the exposure of Wikipedia to a wider audience. Hopefully many of them will access it online, and even become editors themselves, but I don't think that detracts from the appeal of having a DVD published.
This DVD production is in no way meant to deter people from the validation processes that are being proposed. Obviously these will be hugely beneficial for future distributions. However, it's also going to take a very long time before the product of such processes is ready, so distributing a non-validated version in advance of that is beneficial.
No one is claiming this distribution is perfect, but as a snapshot of Wikipedia I feel it is valuable. It's great advertising for us, it's great as an early trial of distributing our content offline, and it's great for raising awareness of the need to tag images properly.
I strongly encourage others to help with the tagging drive as there are still many untagged images that cannot be distributed at this stage. I'd also like to thank the following people for taking part in the recent drive to tag images. This is based on the recent changes to the lists of untagged images, and the list of participants at [[Wikipedia:Image copyright tags]], so I apologize in advance to anyone I've missed out: Yann, Jdforrester, Eugene van der Pijll, Tom-, Diberri, Rich Farmbrough, Gamaliel, Stan Shebs, Lupin, Sj, Blankfaze, Chmod007, GeneralPatton, Frecklefoot, Sunborn, Morven, Ævar Arnfjörð Bjarmason, Trilobite, Poccil, Morwen, Secretlondon, Anthony DiPierro, Imran, Maximus Rex, Flockmeal, Guanaco, and Frazzydee. Thanks also to Looxix for creating the lists of untagged images and everyone who has been doing the same task on the French Wikipedia.
Please help with the tagging at http://en.wikipedia.org/wiki/User:Yann/Untagged_Images and see http://meta.wikimedia.org/wiki/Wikimedia_and_Mandrakesoft for further information.
Angela.
Are we talking about a formal written contract or about an oral agreement? Needless to say, I strongly encourage a formal distribution deal, since the transfer of responsibility carries considerable potential for financial trouble (suing the Wikimedia Foundation is not attractive because it is a Foundation with no significant financial backing; this is not at all the case for MandrakeSoft ...). If needed, I'm willing to review such a written contract.
villy
----- Original Message ----- From: "Angela" beesley@gmail.com To: "Wikimedia Foundation Mailing List" foundation-l@wikimedia.org Sent: Thursday, September 23, 2004 1:39 PM Subject: Re: [Foundation-l] Partnership with Mandrakesoft
<snip>
On Thursday 23 September 2004 20:13, Jean-Christophe Chazalette wrote:
Are we talking about a formal written contract or about an oral agreement? Needless to say, I strongly encourage a formal distribution deal, since the transfer of responsibility carries considerable potential for financial trouble (suing the Wikimedia Foundation is not attractive because it is a Foundation with no significant financial backing; this is not at all the case for MandrakeSoft ...). If needed, I'm willing to review such a written contract.
villy
Thanks, Villy. I will need your help with drafting a notice.
Yann
Angela-
Mandrakesoft are aware of our editing processes. They know what they are getting, and that the content has not been verified. They have agreed to take legal responsibility for this.
Well, we obviously can't stop them from doing it. As long as we've publicly and privately disclaimed liability, I think we're reasonably safe from a legal standpoint. But please do not underestimate the legal mess they could get into. It's not just image tagging. It's quite likely that there are at least a few hundred undetected copyvios on en: and fr:, from books, unindexed websites, closed electronic archives, newspaper articles, magazines, journals, and so forth, and probably even quite a few Google-indexed pages or articles which contain small fragments thereof.
This is not because people are malicious but because many people simply have no concept of copyright. It is a stupid idea to begin with, so it takes quite a while for people to grasp the notion that certain sequences of words can be owned. Even with people understanding the concept of IP in one context (music, movies) they often have difficulties translating it into other contexts (texts, images, recipes ..)
Therefore, this:
It's great advertising for us, it's great as an early trial of distributing our content offline, and it's great for raising awareness of the need to tag images properly.
seems a little too enthusiastic. "Raising awareness" is not always a good thing if it leads to lawsuits and headlines like "Wiki-fiddlers steal from many sources". Also, we are *aware* of the need to tag images properly. May I suggest that the board promote these two things to get the process into motion:
1) make open list of untagged images and announce properly on community portal and the like that all of these images which are not tagged by date N will be hidden
2) fix stupid upload form
(They should happen at the same time.)
Regards,
Erik
--- Erik Moeller erik_moeller@gmx.de wrote:
1) make open list of untagged images and announce properly on community portal and the like that all of these images which are not tagged by date N will be hidden
2) fix stupid upload form
(They should happen at the same time.)
I'm in 100% agreement with Erik on these points. I also do *not* want to see my favorite Linux distro get in trouble over this.
</aol>
But would it be possible to automatically exclude untagged images when the date arrives? Or would humans have to comment each out in wiki text? My understanding of the category system is that it is not yet up to the task of doing this type of thing. But things change so fast around here, I'm probably wrong on that point. If so, please explain.
-- mav
Yes, this has certainly been discussed for a long time... Most recently, Sansculotte produced a nicely detailed set of pages based on an older mockup of Erik's for a new upload form. The important text for those pages has been translated already; but that's the easy part compared to actually implementing it.
http://www.ru-info.de/upload_neu.htm http://meta.wikimedia.org/wiki/Translation_requests/New_upload_form
--Sj
On Fri, 24 Sep 2004 10:17:22 +0630, Eric Pöhlsen eric@eric-poehlsen.de wrote:
- fix stupid upload form
(They should happen at the same time.)
I do not post often on this list, but just want to mention that discussions about a better upload form have been running for ages and nothing has happened yet :(
May I STRONGLY suggest that we include ALL the other deprecated image tags as well?
Rationale: Migrating from the current technical possibility not to include ANY tag/description while uploading directly to the limited choice of tags as per the below mockup link would be too big of a change for many users, who will feel--wrong as they might be--that their images "have to be included". This will lead to many folks picking "unknown/don't know" (unbekannt/weiss nicht) -- and practically having no license tags is ACUTELY worse than having deprecated license tags. We should have MORE metadata, not less of it. Having "deprecated license"-images in there (marked as deprecated) is A LOT better because we can always deal with these pics and/or that entire category later and we'll at least know where things stand as regards these pics. We WON'T know that in case of "unknown/don't know" -- these would just be a wild heap of wildly unknown stuff, making it acutely harder to deal with in the future.
On 24 Sep 2004, at 22:34, Sj wrote:
<snip>
new upload form. The important text for those pages has been translated already; but that's the easy part compared to actually implementing it.
http://www.ru-info.de/upload_neu.htm http://meta.wikimedia.org/wiki/Translation_requests/New_upload_form
--Sj
On Fri, 24 Sep 2004 10:17:22 +0630, Eric Pöhlsen eric@eric-poehlsen.de wrote:
- fix stupid upload form
I don't think I understand your point since there is absolutely no reason for people to be uploading new images with deprecated licence tags. "Unknown" is not an option. If you don't know the licence, simply do not upload it. Such images will be deleted via the new process at http://en.wikipedia.org/wiki/Wikipedia:Possibly_unfree_images
The new upload form will not apply to existing images. Those are being dealt with via the tagging drive at http://en.wikipedia.org/wiki/User:Yann/Untagged_Images
Angela.
On Fri, 24 Sep 2004 23:13:57 +0200, Jens Ropers ropers@ropersonline.com wrote:
May I STRONGLY suggest that we include ALL the other deprecated image tags as well?
Rationale: Migrating from the current technical possibility not to include ANY tag/description while uploading directly to the limited choice of tags as per the below mockup link would be too big of a change for many users, who will feel--wrong as they might be--that their images "have to be included". This will lead to many folks picking "unknown/don't know" (unbekannt/weiss nicht) -- and practically having no license tags is ACUTELY worse than having deprecated license tags. We should have MORE metadata, not less of it. Having "deprecated license"-images in there (marked as deprecated) is A LOT better because we can always deal with these pics and/or that entire category later and we'll at least know where things stand as regards these pics. We WON'T know that in case of "unknown/don't know" -- these would just be a wild heap of wildly unknown stuff, making it acutely harder to deal with in the future.
Brion wrote:
Wikipedia is a very valuable resource, but it's a *dynamic* one.
It's not dynamic for the majority of users. Most users of the site will simply read the article, and never edit it, regardless of how wrong it might be. Therefore, for everyone other than the editors of the site, a snapshot on DVD is no worse than the snapshot they see of the online Wikipedia.
There's a *lot* of crud in general. There will be mistakes. There will be falsehoods. There will be 'FUCKFUCKFUCK' vandalism.
That's covered in the disclaimers. Not quite in those words though. ;)
And in six months when they go to press, the Wikipedia on the web will be much improved
I don't see how this is an argument against a DVD version. The live website will always be better, and we will prominently link to it from any static version, but if the site is acceptable enough to allow the public to see it, why is a DVD not?
Wikipedia is a useful resource, despite its shortcomings. It's not just some draft awaiting approval before publication. It is already being published, even if only online. The validation processes will improve Wikipedia, but they should not force the current version to be seen as some second-class useless collection of articles that can't be distributed. Wikipedia is good enough to distribute now and the possibility of a better version in 6 months does not negate that.
Erik wrote:
Well, we obviously can't stop them from doing it. As long as we've publicly and privately disclaimed liability, I think we're reasonably safe from a legal position.
Villy is working on a formal contract with them, so the assurances they have given us informally will be put into writing. We can never guarantee our content will be entirely free from copyright violations and no amount of Google checking will solve that. The publishers will obviously need disclaimers, insurance, and an easy way of correcting this for future distributions.
- make open list of untagged images and announce properly on community portal and the like that all of these images which are not tagged by date N will be hidden
There is a list at http://en.wikipedia.org/wiki/User:Yann/Untagged_Images which has been advertised many times on the village pump, goings on, the mailing list and in the IRC channel topic. It's now on the portal as well. I don't think threatening to hide them will help since there is already the threat to delete them at http://en.wikipedia.org/wiki/Wikipedia:Possibly_unfree_images
- fix stupid upload form
I strongly agree. Even the one currently on the test wiki at http://test.wikipedia.org/wiki/Special:Upload is better than the current one. Is there any reason the one at http://meta.wikimedia.org/wiki/Image:Uploadform1.png can not be used? If it's nowhere near completion, perhaps the developer committee could consider putting a bounty on it?
mav wrote:
But would it be possible to automatically exclude untagged images
They will be excluded automatically from the DVD edition. I don't yet know how the captions will be hidden from articles in cases where the image doesn't exist.
Angela.
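For readers wondering what automatic exclusion could look like, here is a minimal Python sketch that decides from an image description page whether the image may go on the DVD. It is not the tooling actually used for the DVD. The tag names in `BLOCKED_TAGS` and the function names are assumptions made for illustration; only {{FDL}} and {{PD}} are mentioned in this thread, and the real template names on each wiki may differ.

```python
import re

# Hypothetical tag sets: only FDL and PD are named in this thread,
# the blocked names are placeholders for fair use / unknown / unverified tags.
FREE_TAGS = {"fdl", "pd"}
BLOCKED_TAGS = {"fairuse", "unknown", "unverified"}

# Matches {{Name}} or {{Name|params}} and captures the template name.
TEMPLATE_RE = re.compile(r"\{\{\s*([^|}]+?)\s*(?:\|[^}]*)?\}\}")

def image_tags(description):
    """Return the lower-cased template names found in a description page."""
    return {m.group(1).strip().lower() for m in TEMPLATE_RE.finditer(description)}

def is_distributable(description):
    """True only if a free tag is present and no blocked tag is present."""
    tags = image_tags(description)
    if tags & BLOCKED_TAGS:
        return False
    # Untagged images have no free tag, so they are excluded as well.
    return bool(tags & FREE_TAGS)

def select_images(descriptions):
    """Given {image name: description text}, return the sorted names to keep."""
    return sorted(name for name, text in descriptions.items()
                  if is_distributable(text))
```

Under this sketch an untagged image is treated exactly like an unsuitable one, which matches the intent stated above; hiding the leftover captions in articles would need a separate pass over the article text.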
On Sep 24, 2004, at 4:25 PM, Angela wrote:
- fix stupid upload form
I strongly agree. Even the one currently on the test wiki at http://test.wikipedia.org/wiki/Special:Upload is better than the current one. Is there any reason the one at http://meta.wikimedia.org/wiki/Image:Uploadform1.png can not be used?
It's just a mock-up. To the best of my knowledge no such form has been coded.
If it's nowhere near completion, perhaps the developer committee could consider putting a bounty on it?
As far as I know this is part of Erik's Commons proposal that's remained simply a proposal, with no code written. Erik, where are you on this?
-- brion vibber (brion @ pobox.com)
Brion-
As far as I know this is part of Erik's Commons proposal that's remained simply a proposal, with no code written. Erik, where are you on this?
My final deadline for my book is Sep. 30, after which point I have more time for Wikimedia issues. The Commons code is one of the things I'd like to start working on then. However, I'd have no problem working on the basis of a slightly improved standard upload form, e.g. one using a <select> to provide a list of licenses and generating {{FDL}}, {{PD}} etc. based on that. That would already be a great step forward.
Regards,
Erik
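The "slightly improved standard upload form" idea can be sketched in a few lines. The following Python example is only an illustration and not MediaWiki code: it renders a license drop-down (an HTML `<select>` element) and appends the matching template to the upload description. The option labels, the field name `license` and both function names are assumptions; only the {{FDL}} and {{PD}} templates come from the message above.

```python
from html import escape

# Label shown in the form -> template appended to the description page.
# The labels are hypothetical; the templates are the ones named in the thread.
LICENSE_TEMPLATES = {
    "GNU Free Documentation License": "{{FDL}}",
    "Public domain": "{{PD}}",
}

def render_license_select(name="license"):
    """Return HTML for a drop-down listing the known licenses."""
    options = "\n".join(
        f'  <option value="{escape(label)}">{escape(label)}</option>'
        for label in LICENSE_TEMPLATES
    )
    return f'<select name="{escape(name)}">\n{options}\n</select>'

def build_description(summary, license_choice):
    """Append the template for the chosen license to the upload summary."""
    try:
        tag = LICENSE_TEMPLATES[license_choice]
    except KeyError:
        # No "unknown" option: an upload without a known license is rejected.
        raise ValueError(f"unknown or missing license: {license_choice!r}")
    return f"{summary.strip()}\n\n{tag}"
```

Because the form offers no "unknown" choice, every new upload would carry a machine-readable tag from the start, which is what makes later automatic filtering feasible.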
On Sep 25, 2004, at 1:29 AM, Erik Moeller wrote:
Brion-
As far as I know this is part of Erik's Commons proposal that's remained simply a proposal, with no code written. Erik, where are you on this?
My final deadline for my book is Sep. 30, after which point I have more time for Wikimedia issues. The Commons code is one of the things I'd like to start working on then. However, I'd have no problem working on the basis of a slightly improved standard upload form, e.g. one using a <select> to provide a list of licenses and generating {{FDL}}, {{PD}} etc. based on that. That would already be a great step forward.
Spiffy, thanks.
-- brion vibber (brion @ pobox.com)
Angela wrote:
Brion wrote:
There's a *lot* of crud in general. There will be mistakes. There will be falsehoods. There will be 'FUCKFUCKFUCK' vandalism.
That's covered in the disclaimers. Not quite in those words though. ;)
A CD edition of the German Wikipedia is currently being produced. Before we gave the data away, I did a full-text search on several terms like this - I didn't get any results ;-)
greetings, elian
Elisabeth Bauer wrote:
Angela wrote:
Brion wrote:
There's a *lot* of crud in general. There will be mistakes. There will be falsehoods. There will be 'FUCKFUCKFUCK' vandalism.
That's covered in the disclaimers. Not quite in those words though. ;)
A CD edition of the German Wikipedia is currently being produced. Before we gave the data away, I did a full-text search on several terms like this - I didn't get any results ;-)
Is the search function currently picking up the offending term when it is buried in a would-be 12-letter word as in the example?
Ec
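The distinction behind this question can be shown with a small Python sketch comparing a whole-word search with a plain substring search. It says nothing about how the search used for the German CD actually behaved; the function names are hypothetical and the example uses a harmless placeholder word.

```python
import re

def find_whole_word(term, text):
    """Offsets where term appears as a separate word (case-insensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return [m.start() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]

def find_substring(term, text):
    """Offsets where term appears anywhere, even inside a longer word."""
    pattern = re.escape(term)
    return [m.start() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]
```

With the text "A spam message. SPAMSPAMSPAM everywhere.", the whole-word search for "spam" reports only the first occurrence, while the substring search also reports the three repetitions inside the run-together word. A word-indexed full-text search would behave like the first function and could miss vandalism of that form.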
On Sep 22, 2004, at 2:46 PM, Yann Forget wrote:
Mandrakesoft wants us to provide them with a master DVD, and would like to complete this first edition by Christmas.
I have to warn that this schedule sounds insanely optimistic. Somebody would need to check and lock off for publishing several thousand articles each day in order to meet this deadline.
-- brion vibber (brion @ pobox.com)
Yann Forget wrote:
Hi,
Mandrakesoft, the company that created and sells the Mandrake Linux distribution, is interested in distributing a DVD with English and French versions of Wikipedia. This DVD will be sold on their web site and included with the next distribution, due next April.
That's all very interesting, but did you have to tell me about it FIVE TIMES???
Please follow up to wikipedia-l.
-- Tim Starling