No idea how we're going to unravel this one.
We have received, over a period of months, numerous reports to OTRS regarding copyvios from the www.marvunapp.com web site. That site publishes character summaries for scores of comic book characters ranging from the well-known to those obscure figures that make only cameo appearances. The summaries are written by a fairly small number of contributors to the site and are generally not edited once they are posted.
Their site content is making its way to Wikipedia. There are three problems:
a) Cut and paste copying of text
b) Paraphrasing of text
c) Unattributed copying of images the site has scanned from comic books
(a) and (b) are copyright violations. The problem goes back at least six months and is ongoing.
(c) while not strictly speaking a copyvio since marvunapp.com is merely making scans of flat art, is upsetting the site operators due to the volume of material involved and the lack of attribution. If left unaddressed (c) is likely to make them more pedantic about their rights regarding (a) and (b). So far they've been pretty understanding and supportive, considering.
Right now we don't know the extent of the problem because we've only investigated cases that the marvunapp.com people have reported, which cover perhaps a couple dozen pages altogether. I've investigated a couple myself and it is very time consuming, since often there is just a paragraph or two in the middle of a longer Wikipedia article that's a copyvio, and the use of paraphrasing makes automated tools unhelpful. Addressing these in a meaningful way is difficult because ongoing edits after the copyvio lead to a badly polluted page history that no one wants to delete, and I've ended up having my removals reverted over the course of time by people who show up two weeks later and don't realize when and why the text was removed, so they re-add it from the history.
The copyvios have been added by more than one user, probably because unrelated people trying to expand our comic coverage have found marvunapp.com so helpful.
I've asked for some help on WP:AN/I and so far no real takers although there at least has been support for my block of one of the perps.
Here's what I think needs to be done:
* The pages marvunapp.com has reported to us need to be reviewed by experienced editors and any copyvios found and dealt with at least in the current version of the page.
* Ideally, the history should be purged. In cases where this isn't practical, notes should be left on the talk page so that people don't erroneously assume they can re-integrate content from the history.
* The perpetrators should be re-educated about how big of a deal this is and how much work it is to clean up.
* Pages where marvunapp.com was used as a reference should include marvunapp.com in the list of references.
* Images taken from the marvunapp.com should be properly attributed even if copyright doesn't apply.
* If we ever get everything else done, we would want to try to be a little more vigilant in going through the comic articles ourselves rather than waiting for the marvunapp.com people to report them.
Volunteers for all this drudgery should be IMO selected from a pool of people who still believe that the main deciding factor in inclusion/deletion debates should be the cost of hard disk space.
The Uninvited Co., Inc. (a Delaware corporation)
The Uninvited Co., Inc wrote:
No idea how we're going to unravel this one.
We have received, over a period of months, numerous reports to OTRS regarding copyvios from the www.marvunapp.com web site. That site publishes character summaries for scores of comic book characters ranging from the well-known to those obscure figures that make only cameo appearances. The summaries are written by a fairly small number of contributors to the site and are generally not edited once they are posted.
Their site content is making its way to Wikipedia. There are three problems: a) Cut and paste copying of text b) Paraphrasing of text c) Unattributed copying of images the site has scanned from comic books
(a) and (b) are copyright violations. The problem goes back at least six months and is ongoing.
(c) while not strictly speaking a copyvio since marvunapp.com is merely making scans of flat art, is upsetting the site operators due to the volume of material involved and the lack of attribution. If left unaddressed (c) is likely to make them more pedantic about their rights regarding (a) and (b). So far they've been pretty understanding and supportive, considering.
Just for clarification: (c) is a copyvio too. "flat art" is under copyright too. Their site presumably has no permission to host these pictures either though? Unless they are claiming fair use? (b) on the other hand is not a copyvio in general, unless it is really (a).
Justin
Paraphrasing is okay. (Otherwise, writing an article about an oft cited fact can be impossible.) Just changing a few words isn't. The difference can be hard to spot.
Mgm
On 10/19/06, Justin Cormack justin@specialbusservice.com wrote:
The Uninvited Co., Inc wrote:
No idea how we're going to unravel this one.
We have received, over a period of months, numerous reports to OTRS regarding copyvios from the www.marvunapp.com web site. That site publishes character summaries for scores of comic book characters ranging from the well-known to those obscure figures that make only cameo appearances. The summaries are written by a fairly small number of contributors to the site and are generally not edited once they are posted.
Their site content is making its way to Wikipedia. There are three problems: a) Cut and paste copying of text b) Paraphrasing of text c) Unattributed copying of images the site has scanned from comic books
(a) and (b) are copyright violations. The problem goes back at least six months and is ongoing.
(c) while not strictly speaking a copyvio since marvunapp.com is merely making scans of flat art, is upsetting the site operators due to the volume of material involved and the lack of attribution. If left unaddressed (c) is likely to make them more pedantic about their rights regarding (a) and (b). So far they've been pretty understanding and supportive, considering.
Just for clarification: (c) is a copyvio too. "flat art" is under copyright too. Their site presumably has no permission to host these pictures either though? Unless they are claiming fair use? (b) on the other hand is not a copyvio in general, unless it is really (a).
Justin
WikiEN-l mailing list WikiEN-l@Wikipedia.org To unsubscribe from this mailing list, visit: http://mail.wikipedia.org/mailman/listinfo/wikien-l
Justin Cormack wrote:
c) Unattributed copying of images the site has scanned from comic books
(c) while not strictly speaking a copyvio since marvunapp.com is merely making scans of flat art, is upsetting the site operators due to the volume of material involved and the lack of attribution. If left unaddressed (c) is likely to make them more pedantic about their rights regarding (a) and (b). So far they've been pretty understanding and supportive, considering.
Just for clarification: (c) is a copyvio too. "flat art" is under copyright too. Their site presumably has no permission to host these pictures either though? Unless they are claiming fair use?
If it's being used in a valid fair use manner then it's not a copyvio, for them or for us. But even if it is a copyvio it's not marvunapp.com's copyright that's being violated. Simply scanning an image doesn't appear to qualify as creating a derivative work under the description at [[derivative work]], so they hold no copyright over the scans and only the original comics' copyright applies. As far as I can see they've got no basis for complaint about (c): since we're doing basically the same thing, any situation where we're not using an image "fairly" is probably one where they're just as much in violation.
All that being said, the sources for scans like these really should be clearly attributed on image pages in accordance with our own policies.
Justin Cormack wrote:
The Uninvited Co., Inc wrote:
<snip>
Their site content is making its way to Wikipedia. There are three problems: a) Cut and paste copying of text b) Paraphrasing of text c) Unattributed copying of images the site has scanned from comic books
(a) and (b) are copyright violations. The problem goes back at least six months and is ongoing.
(c) while not strictly speaking a copyvio since marvunapp.com is merely making scans of flat art, is upsetting the site operators due to the volume of material involved and the lack of attribution. If left unaddressed (c) is likely to make them more pedantic about their rights regarding (a) and (b). So far they've been pretty understanding and supportive, considering.
Just for clarification: (c) is a copyvio too. "flat art" is under copyright too. Their site presumably has no permission to host these pictures either though? Unless they are claiming fair use? (b) on the other hand is not a copyvio in general, unless it is really (a).
I'll bite.
a) is an infringement of the copyright of the site's authors and is rude.
b) is plagiarism and is rude.
c) is plagiarism *and* copyright infringement of the comic book authors/artists, the latter of which we believe is justifiable under the mysterious "fair use" and "fair dealing" clauses present in many English-speaking countries (especially the US). It's also rude.
In short:
1. DON'T INFRINGE ON COPYRIGHT. "Fair use" images should be the exception to the rule and should NOT be used when a Free alternative is available.
2. Don't use a source without attribution. If we can do it for NASA and the US Government who so graciously put their works into the Public Domain, we can at least acknowledge the hard work of people who have carefully scanned (often rare) comic books and made those images available online, and researched and written detailed summaries on said comics.
On 10/19/06, Alphax (Wikipedia email) alphasigmax@gmail.com wrote:
I'll bite.
Me too!
a) is an infringement of the copyright of the site's authors and is rude.
Very true, on both accounts.
b) is plagiarism and is rude.
Not if you provide a source it ain't. If you provide a reference for the information it is simply another piece of the encyclopedia. Hell, this is what we do! We summarize and paraphrase other sources and put it in neat little packets called "articles". If this really was just paraphrasing (and not, as Mgm points out, just changing a few words) I have no problem with it. Source it, and it's not only ok, it is standard practice.
c) is plagiarism *and* copyright infringement of the comic book authors/artists, the latter of which we believe is justifiable under the mysterious "fair use" and "fair dealing" clauses present in many English-speaking countries (especially the US). It's also rude.
Again, it is only rude if we don't tell people where we got the images. But let's not forget: this website has no claim whatsoever on the copyrights. We don't have to give them anything, this is all Marvel's IP. I agree that we should give them credit because it's the nice thing to do, but let's not forget that if it is genuine fair use and we want the images, they have no right to complain.
In short:
- DON'T INFRINGE ON COPYRIGHT. "Fair use" images should be the exception to the rule and should NOT be used when a Free alternative is available.
Yeah, how many free images of copyrighted superheroes do you think there are? I can tell you: there ain't any! This is exactly why fair use exists, this exact purpose! It would be insane (insane I tell you!) not to take advantage of that glorious right.
- Don't use a source without attribution. If we can do it for NASA and the US Government who so graciously put their works into the Public Domain, we can at least acknowledge the hard work of people who have carefully scanned (often rare) comic books and made those images available online, and researched and written detailed summaries on said comics.
Agree completely.
--Oskar
Oskar Sigvardsson wrote:
On 10/19/06, Alphax (Wikipedia email) alphasigmax@gmail.com wrote:
I'll bite.
Me too!
a) is an infringement of the copyright of the site's authors and is rude.
Very true, on both accounts.
b) is plagiarism and is rude.
Not if you provide a source it ain't. If you provide a reference for the information it is simply another piece of the encyclopedia. Hell, this is what we do! We summarize and paraphrase other sources and put it in neat little packets called "articles". If this really was just paraphrasing (and not, as Mgm points out, just changing a few words) I have no problem with it. Source it, and it's not only ok, it is standard practice.
<snip>
I read it as "paraphrasing without attribution of source". If there had been paraphrasing /with/ attribution then I don't see on what grounds they could complain.
On 10/19/06, The Uninvited Co., Inc uninvited@nerstrand.net wrote:
No idea how we're going to unravel this one.
I've contacted WikiProject Comics. They are the people most likely to be able to do something about this.
geni wrote:
On 10/19/06, The Uninvited Co., Inc uninvited@nerstrand.net wrote:
No idea how we're going to unravel this one.
I've contacted WikiProject Comics. They are the people most likely to be able to do something about this.
Based on the list provided, there isn't much of an issue. The downloading of images from there and uploading them to Wikipedia, as far as I can see ain't a copyright issue, since it's Marvel who holds the copyright, not the site in question. I've added source info to all images on the list and deleted two pages from the list as copyvio, but to my eye that's the extent of the problem. I appreciate this sitemaster is a little miffed, but what we are doing to him is no different to what we're doing to any other fan-site on the internet. That's my take on it at the minute, anyway.
On Oct 18, 2006, at 11:27 PM, The Uninvited Co., Inc wrote:
No idea how we're going to unravel this one.
Nuke all of the articles on Marvel comics characters.
Restart them all from scratch while clearly paying attention to our policies about fictional characters. Aggressively eliminate extended fictional biographies from all of them, focusing the articles on publication history, media appearances, creation histories, iconic versions, and other information that pertains to the real world.
-Phil
On Thu, 19 Oct 2006 14:39:42 -0400, Phil Sandifer Snowspinner@gmail.com wrote:
Restart them all from scratch while clearly paying attention to our policies about fictional characters. Aggressively eliminate extended fictional biographies from all of them, focusing the articles on publication history, media appearances, creation histories, iconic versions, and other information that pertains to the real world.
At last! A crusade I can join with a clear conscience!
Guy (JzG)
Phil Sandifer wrote:
On Oct 18, 2006, at 11:27 PM, The Uninvited Co., Inc wrote:
No idea how we're going to unravel this one.
Nuke all of the articles on Marvel comics characters.
Wow. Your proposal is to delete thousands and thousands of unrelated articles, many of them quite large and with edit histories going back to the early days of Wikipedia, and start them all over from scratch. But Steve Block is able to "unravel" the situation by adding source info to all images on the list and deleting just two pages as copyvio. No idea how large the list is but if one guy was able to go through it all doing that it couldn't have been all that long.
The term "ludicrous overkill" comes to mind. I think you've got another agenda here. :)
On Oct 19, 2006, at 8:40 PM, Bryan Derksen wrote:
The term "ludicrous overkill" comes to mind. I think you've got another agenda here. :)
Damn straight.
Our articles on fictional characters are cesspools of fancruft, original research, and outright idiocy. This should be a clear call to the fact that we do not have sufficient control over these articles. We need to demonstrate our lack of tolerance with a zeal previously known only to BLP.
-Phil
Phil Sandifer wrote:
Our articles on fictional characters are cesspools of fancruft, original research, and outright idiocy. This should be a clear call to the fact that we do not have sufficient control over these articles.
In this incident we had two copyvio articles and a bunch of unattributed images. It's not at all clear how you're progressing from that to the conclusion that all of Wikipedia's fictional character articles are so out-of-control that they need to be "nuked."
We need to demonstrate our lack of tolerance with a zeal previously known only to BLP.
Wikimedia Foundation could get sued for having "fancruft" in articles about fictional characters? Or having fancruft could somehow ruin the lives of fictional characters? Those are the reasons for BLP but I don't see any similar level of importance for fictional character articles.
Sorry, this is still a ludicrous overreaction as far as I can tell.
On 20/10/06, Bryan Derksen bryan.derksen@shaw.ca wrote:
Phil Sandifer wrote:
Our articles on fictional characters are cesspools of fancruft, original research, and outright idiocy. This should be a clear call to the fact that we do not have sufficient control over these articles.
In this incident we had two copyvio articles and a bunch of unattributed images. It's not at all clear how you're progressing from that to the conclusion that all of Wikipedia's fictional character articles are so out-of-control that they need to be "nuked."
I think Phil is speaking from having read numerous fictional character articles, not implying that his feelings stem from this incident. (I happen to agree with him; the amount of blithering fancruft is astonishing.)
We need to demonstrate our lack of tolerance with a zeal previously known only to BLP.
Wikimedia Foundation could get sued for having "fancruft" in articles about fictional characters? Or having fancruft could somehow ruin the lives of fictional characters? Those are the reasons for BLP but I don't see any similar level of importance for fictional character articles.
Again, I think you're misreading Phil's point, namely that we should be as ruthless at eliminating fancruft as we are at BLP issues, not that fancruft is causing the same problems as BLP issues.
Earle Martin wrote:
On 20/10/06, Bryan Derksen bryan.derksen@shaw.ca wrote:
In this incident we had two copyvio articles and a bunch of unattributed images. It's not at all clear how you're progressing from that to the conclusion that all of Wikipedia's fictional character articles are so out-of-control that they need to be "nuked."
I think Phil is speaking from having read numerous fictional character articles, not implying that his feelings stem from this incident. (I happen to agree with him; the amount of blithering fancruft is astonishing.)
Yet he's using this incident as his "excuse" for it. If this incident itself had nothing specifically to do with his proposal and he was just tossing in an old dream of his, why was it initially focused on just Marvel comics characters?
We need to demonstrate our lack of tolerance with a zeal previously known only to BLP.
Wikimedia Foundation could get sued for having "fancruft" in articles about fictional characters? Or having fancruft could somehow ruin the lives of fictional characters? Those are the reasons for BLP but I don't see any similar level of importance for fictional character articles.
Again, I think you're misreading Phil's point, namely that we should be as ruthless at eliminating fancruft as we are at BLP issues, not that fancruft is causing the same problems as BLP issues.
No, I don't think that I am. There's reasons why BLP is "ruthless" with regard to biographies of living people and those reasons are completely inapplicable to biographies of fictional characters. If you want to propose being equally "ruthless" for fictional characters I want to see a reason that's just as strong.
Also, the definition of "fancruft" is far less clear than the definition of "libel." I'd want to see something solid and widely accepted for that as well.
On Fri, 20 Oct 2006 10:22:35 -0600, Bryan Derksen bryan.derksen@shaw.ca wrote:
If you want to propose being equally "ruthless" for fictional characters I want to see a reason that's just as strong.
Sure. They are the number one source of original research, banned by policy, and these articles have misled countless editors into believing that (a) "I read it on teh internets" is adequate sourcing and (b) we should tailor policy on verifiability, original research and neutrality to ensure that these articles are not impacted.
I'd say that was a pretty bad - and pretty big - problem. BLP tends to affect one individual badly but be easily fixed; this affects a big chunk of the project in a way which is making it actively worse because any improvement to any article is resisted by reference to similar cruft in another one.
Guy (JzG)
Guy Chapman aka JzG wrote:
On Fri, 20 Oct 2006 10:22:35 -0600, Bryan Derksen bryan.derksen@shaw.ca wrote:
If you want to propose being equally "ruthless" for fictional characters I want to see a reason that's just as strong.
Sure. They are the number one source of original research, banned by policy, and these articles have misled countless editors into believing that (a) "I read it on teh internets" is adequate sourcing and (b) we should tailor policy on verifiability, original research and neutrality to ensure that these articles are not impacted.
All of which is already covered by the same policies in both fictional character and living-person-biography articles. The only thing that BLP does is make the enforcement of our original research and verification policies more _urgent_ on living-person-biography articles; it doesn't put any additional constraints on the material that can be found within.
The articles are held to the same content standards already, it's the urgency in applying those standards that I'm questioning here. Why is it just as urgent to deal with OR on fictional character articles as it is to deal with OR on living person articles?
On Fri, 20 Oct 2006 18:00:58 -0600, Bryan Derksen bryan.derksen@shaw.ca wrote:
If you want to propose being equally "ruthless" for fictional characters I want to see a reason that's just as strong.
Sure. They are the number one source of original research, banned by policy, and these articles have misled countless editors into believing that (a) "I read it on teh internets" is adequate sourcing and (b) we should tailor policy on verifiability, original research and neutrality to ensure that these articles are not impacted.
All of which is already covered by the same policies in both fictional character and living-person-biography articles. The only thing that BLP does is make the enforcement of our original research and verification policies more _urgent_ on living-person-biography articles; it doesn't put any additional constraints on the material that can be found within.
If you look into the debates on these policies you will see that for popular culture articles there is a large body of opinion which holds that if there are no reliable sources for an article then we should allow, by policy, the use of whatever sources we can find, reliable or not.
The articles are held to the same content standards already, it's the urgency in applying those standards that I'm questioning here. Why is it just as urgent to deal with OR on fictional character articles as it is to deal with OR on living person articles?
I disagree that they are held to the same standards. I can find you a treeware biography of Leonard Cheshire, I cannot find a treeware biography of Garfield. I know which is the more significant, by any rational definition. Guess which gets the most coverage?
Guy (JzG)
On 10/21/06, Guy Chapman aka JzG guy.chapman@spamcop.net wrote:
On Fri, 20 Oct 2006 18:00:58 -0600, Bryan Derksen bryan.derksen@shaw.ca wrote:
If you look into the debates on these policies you will see that for popular culture articles there is a large body of opinion which holds that if there are no reliable sources for an article then we should allow, by policy, the use of whatever sources we can find, reliable or not.
I disagree with this characterisation of the argument. I'm not going to argue that some people might put it like that or think of it like that, but many Wikipedians hold, instead, that characterising sources as 'reliable' or 'unreliable' based only on their means of publication, for instance, is poor policy, and we should not have blanket policy defining this.
I disagree that they are held to the same standards. I can find you a treeware biography of Leonard Cheshire, I cannot find a treeware biography of Garfield. I know which is the more significant, by any rational definition. Guess which gets the most coverage?
More coverage is not anything to do with standards, IMO; simply that more people feel qualified to write about Garfield than about Leonard Cheshire. It's a manifestation of systemic bias, not a case of intentional bias. I suspect if a Leonard Cheshire fanboy, if such existed, spewed huge amounts of trivia and unsourced crap onto that article it'd probably take just as long for it to be sorted out as on a popular culture article, since AFAIR Cheshire is dead and thus not covered by BLP.
In my opinion, that we have extensive articles on 'trivial' subjects does not mean that we are disparaging 'serious' subjects.
-Matt
Phil Sandifer wrote:
On Oct 19, 2006, at 8:40 PM, Bryan Derksen wrote:
The term "ludicrous overkill" comes to mind. I think you've got another agenda here. :)
Damn straight.
Our articles on fictional characters are cesspools of fancruft, original research, and outright idiocy. This should be a clear call to the fact that we do not have sufficient control over these articles. We need to demonstrate our lack of tolerance with a zeal previously known only to BLP.
I want to point out I am in broad agreement with Phil here, but doubt we'd get the consensus to do it. That said, there is a consensus forming along that exact opinion, that it should be done but we'd never be able to do it. Although the copyright violations on the list provided were very few, there might be many more in the edit history of the contributor in question.
Have we looked through the history of all the copyvios to see if it's the same guy doing this? If so, we can just block him and revert, and it's not too bad. If not, then we really do have a serious problem and we're going to have to go through a lot of pages of comic book characters in the near future. I'll help out with the cleanup if needed, and I promise not to AfD anything.
On 10/18/06, The Uninvited Co., Inc uninvited@nerstrand.net wrote:
No idea how we're going to unravel this one.
We have received, over a period of months, numerous reports to OTRS regarding copyvios from the www.marvunapp.com web site. That site publishes character summaries for scores of comic book characters ranging from the well-known to those obscure figures that make only cameo appearances. The summaries are written by a fairly small number of contributors to the site and are generally not edited once they are posted.
Their site content is making its way to Wikipedia. There are three problems: a) Cut and paste copying of text b) Paraphrasing of text c) Unattributed copying of images the site has scanned from comic books
(a) and (b) are copyright violations. The problem goes back at least six months and is ongoing.
(c) while not strictly speaking a copyvio since marvunapp.com is merely making scans of flat art, is upsetting the site operators due to the volume of material involved and the lack of attribution. If left unaddressed (c) is likely to make them more pedantic about their rights regarding (a) and (b). So far they've been pretty understanding and supportive, considering.
Right now we don't know the extent of the problem because we've only investigated cases that the marvunapp.com people have reported, which cover perhaps a couple dozen pages altogether. I've investigated a couple myself and it is very time consuming, since often there is just a paragraph or two in the middle of a longer Wikipedia article that's a copyvio, and the use of paraphrasing makes automated tools unhelpful. Addressing these in a meaningful way is difficult because ongoing edits after the copyvio lead to a badly polluted page history that no one wants to delete, and I've ended up having my removals reverted over the course of time by people who show up two weeks later and don't realize when and why the text was removed, so they re-add it from the history.
The copyvios have been added by more than one user, probably because unrelated people trying to expand our comic coverage have found marvunapp.com so helpful.
I've asked for some help on WP:AN/I and so far no real takers although there at least has been support for my block of one of the perps.
Here's what I think needs to be done:
- The pages marvunapp.com has reported to us need to be reviewed by experienced editors and any copyvios found and dealt with at least in the current version of the page.
- Ideally, the history should be purged. In cases where this isn't practical, notes should be left on the talk page so that people don't erroneously assume they can re-integrate content from the history.
- The perpetrators should be re-educated about how big of a deal this is and how much work it is to clean up.
- Pages where marvunapp.com was used as a reference should include marvunapp.com in the list of references.
- Images taken from the marvunapp.com should be properly attributed even if copyright doesn't apply.
- If we ever get everything else done, we would want to try to be a little more vigilant in going through the comic articles ourselves rather than waiting for the marvunapp.com people to report them.
Volunteers for all this drudgery should be IMO selected from a pool of people who still believe that the main deciding factor in inclusion/deletion debates should be the cost of hard disk space.
The Uninvited Co., Inc. (a Delaware corporation)