Instructions: http://www.soschildrensvillages.org.uk/charity-news/2008-wikipedia-for-schoo...
Torrent link: http://www.soschildrensvillages.org.uk/schools-wikipedia-full-20081023.tar.g...
Size of .tar.gz is 3.1 GB.
- d.
David Gerard wrote:
Instructions: http://www.soschildrensvillages.org.uk/charity-news/2008-wikipedia-for-schoo...
Torrent link: http://www.soschildrensvillages.org.uk/schools-wikipedia-full-20081023.tar.g...
Size of .tar.gz is 3.1 GB.
Good idea but bad implementation - they're not compliant with the GFDL as far as I can tell. The articles have no author lists or backlinks to the edit histories on Wikipedia. Not sure if backlinks would actually be sufficient for a mirror intended for offline browsing, for that matter.
I've put them on the list over at http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks/Abc#2008.2F9_Wikipedia_Selection_for_schools but I haven't the time or energy to do proper follow up this weekend, anyone else want to take a stab?
2008/10/24 Bryan Derksen bryan.derksen@shaw.ca:
Good idea but bad implementation - they're not compliant with the GFDL as far as I can tell. The articles have no author lists or backlinks to the edit histories on Wikipedia. Not sure if backlinks would actually be sufficient for a mirror intended for offline browsing, for that matter.
I've put them on the list over at http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks/Abc#2008.2F9_Wikipedia_Selection_for_schools but I haven't the time or energy to do proper follow up this weekend, anyone else want to take a stab?
Someone mentioned that in a new thread as well. I agree, it doesn't look compliant. I never thought to check since it was being organised by a Wikipedia Admin... That admin reads this mailing list, so hopefully we'll get some comment soon.
On Fri, Oct 24, 2008 at 2:29 AM, Thomas Dalton thomas.dalton@gmail.com wrote:
Someone mentioned that in a new thread as well. I agree, it doesn't look compliant. I never thought to check since it was being organised by a Wikipedia Admin... That admin reads this mailing list, so hopefully we'll get some comment soon.
That was me on Foundation-l, after Danny Wool mentioned it on his blog.
Another problem I found is that no one has apparently vetted the images. For example:
http://schools-wikipedia.org/images/103/10307.jpg.htm
From that same F-35 Lightning article, images that were actually deleted as non-free images. How are we distributing this?
http://commons.wikimedia.org/wiki/Commons:Deletion_requests/JSF_Images
We have an image in the schools encyclopedia with a big bold "this is up for deletion" notice, and that image in fact was deleted from Commons for being a copyvio violation. Back in AUGUST. So we're distributing copyright violations.
- Joe
Thanks for the images remark.
The script looks for and recognises licence types but this one did not have the public domain flag removed from it and the deletion notice was atypical.
We can catch these easily and quickly, shout if you see others. There were two out of 20000 last year.
Andrew
WikiEN-l mailing list WikiEN-l@lists.wikimedia.org To unsubscribe from this mailing list, visit: https://lists.wikimedia.org/mailman/listinfo/wikien-l
Joe Szilagyi wrote:
We have an image in the schools encyclopedia with a big bold "this is up for deletion" notice, and that image in fact was deleted from Commons for being a copyvio violation. Back in AUGUST. So we're distributing copyright violations.
Well, _we_ aren't distributing copyright violations, since we deleted the image and are no longer serving it from Wikipedia servers. SOS Children is distributing copyright violations, and for that particular image it's their problem to deal with.
I'm more concerned about the fact that they're not following the GFDL's terms in anywhere near to a serious manner. That means that they're violating _my_ copyright, and the copyrights of every other Wikipedian who's contributed to the articles they're mirroring. From what Andrew Cates has been posting on the matter (assuming he's acting in an official capacity of some sort) they don't seem to be grasping the significance of this. I don't want this to become an adversarial situation since they're doing a good thing with Wikipedia's material that I fully support in principle, but how can we convince them that this really needs to be done? Since they're already making deals with the Wikimedia Foundation for the logo and such, who's the contact we can talk to on our side?
2008/10/25 Bryan Derksen bryan.derksen@shaw.ca:
Joe Szilagyi wrote:
We have an image in the schools encyclopedia with a big bold "this is up for deletion" notice, and that image in fact was deleted from Commons for being a copyvio violation. Back in AUGUST. So we're distributing copyright violations.
Well, _we_ aren't distributing copyright violations, since we deleted the image and are no longer serving it from Wikipedia servers. SOS Children is distributing copyright violations, and for that particular image it's their problem to deal with.
I'm more concerned about the fact that they're not following the GFDL's terms in anywhere near to a serious manner. That means that they're violating _my_ copyright, and the copyrights of every other Wikipedian who's contributed to the articles they're mirroring. From what Andrew Cates has been posting on the matter (assuming he's acting in an official capacity of some sort) they don't seem to be grasping the significance of this. I don't want this to become an adversarial situation since they're doing a good thing with Wikipedia's material that I fully support in principle, but how can we convince them that this really needs to be done? Since they're already making deals with the Wikimedia Foundation for the logo and such, who's the contact we can talk to on our side?
I agree something needs to be done. Erik has already made a statement on the subject, so if you want to talk to someone at the foundation, he's probably the best choice. Alternatively, you could go straight to the top and talk to Sue, I've always found her most helpful and she'll pass you on to the appropriate person if necessary.
2008/10/25 Thomas Dalton thomas.dalton@gmail.com:
I agree something needs to be done. Erik has already made a statement on the subject, so if you want to talk to someone at the foundation, he's probably the best choice. Alternatively, you could go straight to the top and talk to Sue, I've always found her most helpful and she'll pass you on to the appropriate person if necessary.
I agree that the current attribution level is insufficient. I think it's important to recognize that they are trying to follow the GFDL in doing something we all want, which is to bring educational content to people who would otherwise not be able to access it. It's equally important to recognize that even parsing the meaning of the GFDL in the context of Wikipedia is not a trivial task, and people have provided various interpretations.
For example, by a literal interpretation, if we consider the version history the "history section" that's referenced in the GFDL, any derivative would not only have to include author names, but everything that's currently recorded in the history section for each article, which in some cases can be much more information than the article itself. See the actual text of the GFDL:
http://www.gnu.org/copyleft/fdl.html
[begin quote]
I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.
[end quote]
Attribution for articles with thousands of associated usernames is not trivial and potentially very burdensome. This is equally an issue for electronic versions as for print, as the size of the actual educational content that organizations like SOS Children want to include already far exceeds the media on which they are able to transport it. Storing the entire version history of the article on [[France]] means transporting the full meta-information about 9317 changes.
The gentleman's agreement has always been that linking to the history as part of the attribution is the minimum needed to be compliant at least for online copies. Instead of direct links, for whatever reason, SOS Children has chosen to have a general comment amounting to "Go to Wikipedia for the article histories" both to each page & to the general copyright text. Now, stepping back for a minute: In terms of the likelihood of someone actually visiting the article histories to determine attribution, I'm not convinced that there's a world of difference between those two options, especially for an offline version. At the same time I agree that it's important to establish and stay within minimal baseline attribution & history standards.
I've emailed Andrew Cates and asked to do what's possible to transition the product (both website & DVD) to direct website references to the history for each article. There may be limited flexibility on their part for the existing DVD copies, but we can at least gradually try to move the project to an acceptable level of compliance.
Erik Moeller wrote:
Attribution for articles with thousands of associated usernames is not trivial and potentially very burdensome.
Perhaps we could make this easier by doing the work ourselves? I imagine it'd be pretty straightforward to create a plugin for MediaWiki that would collect a list of all the usernames that had edited an article and throw away the duplicates and anonymous IPs. Maybe even add some way to manually make note of content from other articles merging in.
I've emailed Andrew Cates and asked to do what's possible to transition the product (both website & DVD) to direct website references to the history for each article. There may be limited flexibility on their part for the existing DVD copies, but we can at least gradually try to move the project to an acceptable level of compliance.
Thanks. I do support what they're doing, and don't want to be a jerk about hounding a charity, but with Wikipedia's logo prominently displayed on this one I want to make sure nobody thinks we're putting a stamp of approval on this level of GFDL compliance.
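The username-collection idea suggested above (gather every username that edited an article, then drop duplicates and anonymous IPs) can be sketched roughly as follows. This is only an illustration in Python, not an actual MediaWiki plugin: the function names and the sample history are hypothetical, and it assumes the revision usernames have already been exported as a plain list in edit order. It also assumes that an anonymous edit is recorded under an IPv4 or IPv6 address.

```python
import ipaddress


def is_anonymous(user: str) -> bool:
    """Assumption: anonymous editors are recorded as a bare IP address."""
    try:
        ipaddress.ip_address(user)
        return True
    except ValueError:
        return False


def contributor_list(revision_users):
    """Return registered usernames in first-edit order, without duplicates."""
    seen = set()
    authors = []
    for user in revision_users:
        if is_anonymous(user) or user in seen:
            continue
        seen.add(user)
        authors.append(user)
    return authors


# Hypothetical edit history for one article, oldest edit first.
history = ["Alice", "192.0.2.17", "Bob", "Alice", "2001:db8::1", "Carol", "Bob"]
print(contributor_list(history))  # ['Alice', 'Bob', 'Carol']
```

Content merged in from other articles would still need to be noted by hand, since it does not show up in the target article's own revision list.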
Ok guys. I am on this mailing list now and we are of course not wishing to be prominent jerks either, and we are looking at what else could be done. We do think we are following the license but we also agree we should be following the wishes of the community as far as they exist (and they ARE a moving target).
And as I have also pointed out this way of interpreting license compliance has been there for several years and is the same one as used by the "official" 0.5 release and the 0.7 project as well as two versions of this project. Currently I am not sure there is any plan to change for 1.0 but that is not my problem. This is the first time there has been serious complaint about our interpretation of license: previously there was some request for information. Our method on images is more attributory and harder work than those projects. Isn't it better that you have a chance of finding copyvio images now the image pages are there?
Anyway once the conversation switches to the tone Erik is using I think/hope we can sort everything out fairly quickly.
The easiest thing is to give the exact url on both the DVD and the online version for the exact version in the article history from which we took the material. Everything is set up on a database in a way which makes this easy to do (it means making a few script changes and then running an overnight job). There are all sorts of reasons in terms of transparency on NPOV etc for doing this and this has moved from somewhere on the "sometime soon" list to "next priority" which basically means as soon as we all get back from half term at start Nov. We interpreted some comments from Erik on censorship as meaning we really should do this anyway.
The website and future downloads will go over immediately this is done. We are just discussing internally if this is worth delaying the next burn of DVDs (we only do 500 at a time, and we already have the copyvio images out of the master for that).
On other things (like whether we should give a list as prominent authors or some prominent authors etc on the DVD as well and whether we should do this per article or for the whole DVD), broadly we will think about the code practicality and try to produce something. We are looking at including another top page on "authors and licenses". That takes more work. Database tools aren't a fast shoo-in either.
Andrew
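As an illustration of the "exact url for the exact version" approach described in the message above, here is a minimal Python sketch that builds a link to one specific revision and a link to the full history page. It relies on MediaWiki's `index.php?title=...&oldid=...` and `action=history` URL forms; the base URL, function names, and revision ID are assumptions made for the example and are not taken from the SOS Children scripts.

```python
from urllib.parse import urlencode

# Assumed base URL for English Wikipedia's index.php entry point.
BASE = "https://en.wikipedia.org/w/index.php"


def revision_permalink(title: str, rev_id: int, base: str = BASE) -> str:
    """Link to the exact revision the offline copy was taken from."""
    query = urlencode({"title": title.replace(" ", "_"), "oldid": rev_id})
    return f"{base}?{query}"


def history_link(title: str, base: str = BASE) -> str:
    """Link to the full edit history of the article."""
    query = urlencode({"title": title.replace(" ", "_"), "action": "history"})
    return f"{base}?{query}"


# 123456 is a made-up revision ID used only for illustration.
print(revision_permalink("France", 123456))
print(history_link("F-35 Lightning II"))
```

Storing the revision ID alongside each article in the database would let both links be generated in the same batch job that builds the pages.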
Bryan Derksen wrote:
Thanks. I do support what they're doing, and don't want to be a jerk about hounding a charity, but with Wikipedia's logo prominently displayed on this one I want to make sure nobody thinks we're putting a stamp of approval on this level of GFDL compliance.
While I'm on the subject, I've just discovered that http://www.wikipediaondvd.com/site.php is also noncompliant in the same general way (every article appears to link back to [[India]] as its source, regardless of content - see http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks/Vwxyz#Wikipedia_On_...). Since this site also uses the Wikipedia logo it should also be looked at by the Foundation about this.
2008/10/26 Bryan Derksen bryan.derksen@shaw.ca:
Bryan Derksen wrote:
Thanks. I do support what they're doing, and don't want to be a jerk about hounding a charity, but with Wikipedia's logo prominently displayed on this one I want to make sure nobody thinks we're putting a stamp of approval on this level of GFDL compliance.
While I'm on the subject, I've just discovered that http://www.wikipediaondvd.com/site.php is also noncompliant in the same general way (every article appears to link back to [[India]] as its source, regardless of content - see http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks/Vwxyz#Wikipedia_On_...). Since this site also uses the Wikipedia logo it should also be looked at by the Foundation about this.
Sigh. Volunteers to set up a series of GFDL compliance awards? Like the HTML compliance things.
2008/10/26 geni geniice@gmail.com:
Sigh. Volunteers to set up a series of GFDL compliance awards? Like the HTML compliance things.
We would need to work out what the GFDL actually requires for that, though...
There is also a question about whether this is a job for volunteers or say the legal counsel of WMF (who in the past has been out of sync with the community). The community may want lots of things which the license does not entitle them to ask for.
In respect of wikipediaondvd and schools-wikipedia though can I suggest this comes off the mailing lists onto the Project pages where the relevant people can give it attention.
Also someone somewhere particularly needs to look at offline attributions which is a multilingual problem.
Andrew
Thomas Dalton wrote:
We would need to work out what the GFDL actually requires for that, though...
How about setting a set of criteria, such as "links to Wikipedia's article history", "links to on-site copy of GFDL", "stores an on-site author list", etc., and give them a numeric rating without a clear-cut compliance/non-compliance point? That way we could at least give a relative rating to various sites, and Wikimedia could have standards to look at when deciding whether to grant approval to use the logo.
Ideally, GFDL version 2 will come out and it will remedy these flaws and give us CC compatibility and all manner of wonderful things. I recall the Wikimedia Foundation's been working with the Free Software Foundation on this subject, does anyone know the latest news on that?
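The criteria-based rating proposed above could be sketched along these lines. This is a hypothetical Python illustration only: the three criteria are the ones named in the message, but the identifiers and the equal weighting are assumptions, and the score deliberately has no pass/fail threshold.

```python
# Criteria named in the proposal; equal weights are an assumption.
CRITERIA = {
    "links_to_article_history": 1,
    "links_to_onsite_gfdl": 1,
    "stores_onsite_author_list": 1,
}


def attribution_score(site_features) -> int:
    """Sum the weights of the criteria a mirror satisfies (relative rating only)."""
    return sum(weight for name, weight in CRITERIA.items() if name in site_features)


# Hypothetical mirror that links to the GFDL text and to article histories.
example_mirror = {"links_to_onsite_gfdl", "links_to_article_history"}
print(attribution_score(example_mirror), "out of", sum(CRITERIA.values()))
```

Because the result is a relative number rather than a verdict, mirrors can be ranked against each other without anyone first having to settle what the GFDL strictly requires.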
There are some pages on degree of compliance on en, at http://en.wikipedia.org/wiki/Wikipedia_talk:GFDL_Compliance. The problem with them is they are the most concentrated piece of Original Research I have ever seen on WP. Does anyone have any kind of well sourced opinion on the way that the licenses apply to Wikipedia? I have posted a request online.
Personally I agree the community should set its standards on attribution and community projects, especially logo-ed ones should be held to those standards but at present a group of well meaning individuals have worked out their own interpretations and started listing websites as "high" "medium" or "low" GFDL compliance, making significant allegations just based on a nested set of "probablies" in the license interpretation. Unless the sources and references are hidden somewhere out of sight? Otherwise all these pages should be changed to degree of compliance to community wishes and GFDL should be removed from the article names.
Andrew
----- "Erik Moeller" erik@wikimedia.org wrote:
For example, by a literal interpretation, if we consider the version history the "history section" that's referenced in the GFDL, any derivative would not only have to include author names, but everything that's currently recorded in the history section for each article, which in some cases can be much more information than the article itself. See the actual text of the GFDL:
http://www.gnu.org/copyleft/fdl.html
[begin quote]
I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.
[end quote]
Attribution for articles with thousands of associated usernames is not trivial and potentially very burdensome. This is equally an issue for electronic versions as for print, as the size of the actual educational content that organizations like SOS Children want to include already far exceeds the media on which they are able to transport it. Storing the entire version history of the article on [[France]] means transporting the full meta-information about 9317 changes.
What happens if they haven't modified the document at all? Do they still have to create the History section if it didn't exist before? How many display-level changes constitute a modification to the document? Is putting a new template on MediaWiki a modification to the underlying documents? Is it unlawful to make up a template that doesn't have links back? What does this mean for plaintext print templates? Does someone have to print out 2000 pages of history along with the actual text in order to lawfully have a copy of the Wikipedia [[France]] article? What does this mean about the eventuality of any paper-based reproduction of Wikipedia? GFDL doesn't seem to fit with the knowledge wiki model IMO as it is inflexible and based on arcane production cycles. Is it unlawful for someone not to put an edit comment on when they edit a Wikipedia article, for instance, as they are not fulfilling their GFDL obligations with describing their new version?
Cheers,
Peter
Good points here. The way I see it the Document referred to in the GFDL cannot be an individual Wikipedia article. It has to be the whole of Wikipedia. If the Document were an individual article then Wikipedia would be in breach of its own license. Every time people copy text between articles then they would create a Modified Version under the GFDL. They mostly do not comply with GFDL section 4 under these circumstances on a number of points.
So the only sensible interpretations are the whole of English Wikipedia or the whole of Wikipedia as the GFDL Document. This has the following implications for GFDL compliance:
- only need to give network location of Wikipedia, not individual articles
- only need to give five principal authors of Wikipedia, not of individual articles
- no real section Entitled "History", so no requirement to copy that
This makes a lot of sense to me as:
- I don't believe adding five principal authors of individual articles adds a lot of value. In any case this would lead to a lot of disputes and maybe gaming of the system if ever a system of defining principal authors was introduced.
- the network location of articles is problematic in the general case of renames
- the network location of articles (and thus edit histories and authors) is generally easily found from the article title and the Wikipedia URL
- the spirit of GFDL history doesn't seem likely to me to intend every single minor edit
- often copies (like Schools Wikipedia) would not want to include vandalism edit histories
Duncan