The GFDL has specific attribution requirements that were designed for software manuals. What's appropriate attribution for a wiki, where a page can have thousands of authors, and a collection of pages is very likely to? I would like to start a broad initial discussion on this topic; it's likely that the issue will need to be raised more specifically in the context of possible modifications to the GFDL or a migration to CC-BY-SA.
The relevant GFDL clause states: "List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement."
Most people have chosen to ignore the "principal authors" requirement and to try to attribute every author instead because there's no obvious way to determine who the principal authors are. I remember a few years back that Anthony tried a completely different approach, where he created a full copy of Wikipedia (under the assumption that it's a single GFDL work) and attributed it to five people on the frontpage. Anthony, please correct me if my recollection is incorrect.
The community process that has developed with regard to GFDL compliance on the web has generally tacitly favored a link to the article and to its history as proper credit. But, for printed books, publishers have generally wanted to be more in compliance with the letter of the license. So, the Bertelsmann "Wikipedia in one volume" includes a looong list of authors in a very tiny font.
Is that practical? How about Wikipedia articles on passenger information systems (screens on subways, airplanes)? How about small booklets where there isn't a lot of room for licensing information? Should a good license for wikis make a distinction between print and online uses?
I haven't heard anyone argue strongly for full inclusion of the _license text_. But I'd like to hear opinions on the inclusion of username lists.
My personal preference would be a system where we have a special "credits" URL for each article, something like
http://en.wikipedia.org/credits/World_War_II
which would list authors and also provide full licensing information for all media files. If we had a specific collection of articles, the system could support this using collection IDs:
http://en.wikipedia.org/collection_credits/Bertelsmann_One_Volume_Encycloped...
(These URLs are completely made up and have no basis in reality.)
The advantage that I see of such an approach is that it would allow us to standardize and continually refine the way we display authorship information, and benefit the free sharing of content with a very lightweight process. The disadvantage (if it is perceived as such) is that if we would officially recommend such attribution in printed books, individual contributors would be less likely to see their username in print. But we might see more print uses because it would make the attribution more manageable.
It's also conceivable to require full author attribution for printed collections of a certain length or printed in certain quantity. (The GFDL has "in quantity" rules, but they do not seem to apply in any way to the authorship information.)
Aside from what the legal implications of any given approach are, the first question I think that needs to be answered is what's desirable. Thoughts?
2008/10/20 Erik Moeller erik@wikimedia.org:
The GFDL has specific attribution requirements that were designed for software manuals. What's appropriate attribution for a wiki, where a page can have thousands of authors, and a collection of pages is very likely to? I would like to start a broad initial discussion on this topic; it's likely that the issue will need to be raised more specifically in the context of possible modifications to the GFDL or a migration to CC-BY-SA.
Excellent question, I think this is going to be a very interesting, if long, discussion.
The relevant GFDL clause states: "List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement."
Most people have chosen to ignore the "principal authors" requirement and to try to attribute every author instead because there's no obvious way to determine who the principal authors are. I remember a few years back that Anthony tried a completely different approach, where he created a full copy of Wikipedia (under the assumption that it's a single GFDL work) and attributed it to five people on the frontpage. Anthony, please correct me if my recollection is incorrect.
That's one of several dozen ways of interpreting the GFDL, how about we ignore how things are done now and just look at how things should be done in the future?
The community process that has developed with regard to GFDL compliance on the web has generally tacitly favored a link to the article and to its history as proper credit. But, for printed books, publishers have generally wanted to be more in compliance with the letter of the license. So, the Bertelsmann "Wikipedia in one volume" includes a looong list of authors in a very tiny font.
Is that practical? How about Wikipedia articles on passenger information systems (screens on subways, airplanes)? How about small booklets where there isn't a lot of room for licensing information? Should a good license for wikis make a distinction between print and online uses?
Online you can link to an off-site credits page which you can't do in print or in off-line electronic versions, so I think a distinction is a good idea. We want people to reuse our content as much as possible which means we should make reusing it as easy as possible. Including an appropriate link is far easier than attributing contributors yourself. In print, that isn't possible, so they'll have to include the names directly (a printed URL is pretty useless).
I haven't heard anyone argue strongly for full inclusion of the _license text_. But I'd like to hear opinions on the inclusion of username lists.
Again, online you can link to an off-site copy of the license, in print you can't, so I would support including the license text in printed copies of large amounts of content (for a yet to be determined definition of "large"). Smaller amounts of printed content should include a much shorter summary of the license since that's all that is practical.
My personal preference would be a system where we have a special "credits" URL for each article, something like
Isn't that basically what we already have with the history page (possibly reformatted a bit)? I think we should certainly keep history pages.
Aside from what the legal implications of any given approach are, the first question I think that needs to be answered is what's desirable. Thoughts?
The legal implications do certainly need to be considered, however. Moral rights to attribution may well get in the way. Mike Godwin can advise on US law, but someone needs to make official contact with lawyers in other jurisdictions and get advice. Our content needs to be reusable in any jurisdiction (to the extent possible; it's conceivable that some jurisdictions will have laws that are completely incompatible with our goals and we'll have to give them up as a lost cause [a local chapter could lobby for a change in the law, of course]). This mailing list is not the place for a detailed discussion of the law, but that discussion does need to take place (between WMF, CC, FSF and lots and lots of lawyers from all over the world). This will probably cost a lot of money, since you'll be lucky to find people willing to work pro bono in every significant jurisdiction, but it is essential.
Norwegian law says principal authors should be attributed, and I believe it's the correct thing to do. Saying that we can't identify those authors today is not a good reason to skip it. Most of the articles I've been involved in writing have had very few principal authors, most of them only one or two.
In Norwegian law the principal authors can choose what to do with the article, even relicense it, without asking any of the other writers.
It would be interesting to compile some statistics on how many principal authors Wikipedia articles have. I think the numbers are pretty small, even for articles that have grown very large.
John
It would be interesting to compile some statistics on how many principal authors Wikipedia articles have. I think the numbers are pretty small, even for articles that have grown very large.
That would, indeed, be interesting, but it would require a definition of "principal author".
John at Darkstar wrote:
Norwegian law says principal authors should be attributed, and I believe it's the correct thing to do. Saying that we can't identify those authors today is not a good reason to skip it. Most of the articles I've been involved in writing have had very few principal authors, most of them only one or two.
In Norwegian law the principal authors can choose what to do with the article, even relicense it, without asking any of the other writers.
It would be interesting to compile some statistics on how many principal authors Wikipedia articles have. I think the numbers are pretty small, even for articles that have grown very large.
John
In Finnish moral rights law, the right to be identified as the author of one's work is inalienable and absolute, and cannot be voided even through a contractual transaction.
Yours,
Jussi-Ville Heiskanen
--- On Mon, 10/20/08, Jussi-Ville Heiskanen cimonavaro@gmail.com wrote:
From: Jussi-Ville Heiskanen cimonavaro@gmail.com
Subject: Re: [Foundation-l] What's appropriate attribution?
To: "Wikimedia Foundation Mailing List" foundation-l@lists.wikimedia.org
Date: Monday, October 20, 2008, 4:21 PM
John at Darkstar wrote:
Norwegian law says principal authors should be attributed, and I believe it's the correct thing to do. Saying that we can't identify those authors today is not a good reason to skip it. Most of the articles I've been involved in writing have had very few principal authors, most of them only one or two.
In Norwegian law the principal authors can choose what to do with the article, even relicense it, without asking any of the other writers.
It would be interesting to compile some statistics on how many principal authors Wikipedia articles have. I think the numbers are pretty small, even for articles that have grown very large.
John
In Finnish moral rights law, the right to be identified as the author of one's work is inalienable and absolute, and cannot be voided even through a contractual transaction.
I don't believe the right to be identified as an author is necessarily the same discussion as the attribution appropriate for various formats. Publishing a work without any explicit attribution to an author != voiding that author's right to be identified as an author of the work.
Birgitte SB
On Mon, Oct 20, 2008 at 2:36 PM, Birgitte SB birgitte_sb@yahoo.com wrote:
I don't believe the right to be identified as an author is necessarily the same discussion as the attribution appropriate for various formats. Publishing a work without any explicit attribution to an author != voiding that author's right to be identified as an author of the work.
Birgitte SB
Agreed. I don't think anyone is suggesting that *wikipedia itself* is doing away with the kind of attribution we currently provide; the question is what standard reusers of content should be held to. Arguably, the simpler the standard the more likely people are to adhere to it. A minimum standard also wouldn't prevent anyone from going above and beyond and crediting the entire list of authors, say, if they wanted to.
-- phoebe
Birgitte SB wrote:
I don't believe the right to be identified as an author is necessarily the same discussion as the attribution appropriate for various formats. Publishing a work without any explicit attribution to an author != voiding that author's right to be identified as an author of the work.
As a matter of fact, I don't believe this is accurate.
As I understand it, the "paternity right" in the moral rights section of Finnish law implies that publication without attribution can happen with the explicit permission of the author, but the author can rescind that permission at any time.
Yours,
Jussi-Ville Heiskanen
On Mon, Oct 20, 2008 at 6:16 PM, Jussi-Ville Heiskanen cimonavaro@gmail.com wrote:
As a matter of fact, I don't believe this is accurate.
As I understand it, the "paternity right" in the moral rights section of Finnish law implies that publication without attribution can happen with the explicit permission of the author, but the author can rescind that permission at any time.
[snip]
That behaviour is, as I understand it, typical of moral rights (in places which acknowledge them). The notion is that you can't contract away attribution (or other 'moral rights') any more than you can sell yourself into slavery, because attribution (like freedom) is a moral right and not an economic right.
Some moral rights implementations are potentially very harmful to free content as we know it: You wouldn't want to be forced to remove an improved version of a document simply because a sour original author has decided he dislikes you and that your enhancements are prejudicial to his reputation. But attribution is not an example of a problematic right, for the most part.
I think there is a second, interrelated issue: There is a notion in some places that some nearly invisible and almost always unread "terms of service" can represent an agreement to abandon your right of attribution. I think this is bogus even in places where attribution is 'only' an economic right. However attribution is handled, the principle of least surprise should always be heeded.
Gregory Maxwell wrote:
Some moral rights implementations are potentially very harmful to free content as we know it: You wouldn't want to be forced to remove an improved version of a document simply because a sour original author has decided he dislikes you and that your enhancements are prejudicial to his reputation. But attribution is not an example of a problematic right, for the most part.
What appears to be a big distinction between moral rights laws in European countries and in English speaking countries is the burden of proving that a change is indeed prejudicial to one's reputation. In a common law country the presumption of innocence implies that prejudice must be proved.
Ec
As I understand it, the "paternity right" in the moral rights section of Finnish law implies that publication without attribution can happen with the explicit permission of the author, but the author can rescind that permission at any time.
Yours,
Jussi-Ville Heiskanen
In Norway a few news wire companies do not attribute their journalists, and it seems to be legal, but at least one person claims this is because the journalists are employed by the company and therefore transfer their rights to it. I can't see how a license could make the same thing happen, but this would work somehow.
Note also that in Norway a principal author can make decisions against the coauthors' will.
John
John at Darkstar wrote:
As I understand it, the "paternity right" in the moral rights section of Finnish law implies that publication without attribution can happen with the explicit permission of the author, but the author can rescind that permission at any time.
Yours,
Jussi-Ville Heiskanen
In Norway a few news wire companies do not attribute their journalists, and it seems to be legal, but at least one person claims this is because the journalists are employed by the company and therefore transfer their rights to it. I can't see how a license could make the same thing happen, but this would work somehow.
Note also that in Norway a principal author can make decisions against the coauthors' will.
John
There is recent case law on this point in Finland, where it was found that being the director of a movie did not make it legal to make significant alterations to the movie manuscript during shooting, against the wishes of the scriptwriter. [1] I understand that would be a shocking result in Hollywood.
[1] "Riisuttu Mies" (the movie was given a stiff fine, and theatrical distribution was forbidden, though the scriptwriter has allowed television showings and video distribution)
Yours,
Jussi-Ville Heiskanen
I don't believe the right to be identified as an author is necessarily the same discussion as the attribution appropriate for various formats. Publishing a work without any explicit attribution to an author != voiding that author's right to be identified as an author of the work.
Birgitte SB
Can you give an example? John
On Mon, Oct 20, 2008 at 4:30 PM, John at Darkstar vacuum@jeb.no wrote:
In Norwegian law the principal authors can choose what to do with the article, even relicense it, without asking any of the other writers.
That's true under US law too, if the work is treated as a work of joint authorship.
On Mon, Oct 20, 2008 at 12:46 PM, Erik Moeller erik@wikimedia.org wrote:
The GFDL has specific attribution requirements that were designed for software manuals. What's appropriate attribution for a wiki, where a page can have thousands of authors, and a collection of pages is very likely to? I would like to start a broad initial discussion on this topic; it's likely that the issue will need to be raised more specifically in the context of possible modifications to the GFDL or a migration to CC-BY-SA.
<snip>
I thought about this a fair amount when putting together "How Wikipedia Works." We opted there for using the first five authors as determined by this script: http://vs.aka-online.de/cgi-bin/wppagehiststat.pl precisely to avoid the pages-of-tiny-print problem (though there is a certain satisfaction in seeing one's own name in print even if you only copyedited an article once.) (see our full credits at: http://howwikipediaworks.com/ape.html).
Of course, first-five doesn't solve much of anything in terms of true attribution; there were certain cases where I knew those names were people who had primarily reverted vandalism rather than the people who had come up with the bulk of the ideas in the text (this is especially true for policies, which often started with sweeping essays written by an individual who was bringing together thoughts and practice back in 2003 or 2004). In a few important cases, the early history is lost to the ages (and disk failure), and it's only through anecdote and deduction that you'll figure out how, say, Larry Sanger contributed to NPOV. I stuck to this algorithm anyway for the sake of consistency. In practice, though, I think listing individual authors of any particular article, whether you list only a few or all of them, invariably overvalues some people's contributions, undervalues others, and totally ignores anonymous contribs. It also doesn't do much for preserving everyone's copyright claims, since so many people are completely pseudonymous.
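As a rough illustration of how much a "first five" rule depends on its definition, here is a minimal Python sketch of one possible ranking: total bytes added per user, with reverts skipped. The `Revision` fields and the `principal_authors` function are hypothetical names made up for this example. This is not how the wppagehiststat script works, and bytes added is only one of many possible measures of a contribution.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, NamedTuple


class Revision(NamedTuple):
    """One entry of a page history (hypothetical shape, for illustration only)."""
    user: str          # username, or an IP address for anonymous edits
    bytes_added: int   # size change of this revision (negative if text was removed)
    is_revert: bool    # True if the edit only restored an earlier version


def principal_authors(revisions: Iterable[Revision], limit: int = 5) -> list[str]:
    """Return up to `limit` users ranked by total bytes added.

    Reverts and removals are skipped so that vandalism patrollers do not
    outrank the people who wrote the text. This is an assumed definition
    of "principal author", not one taken from the GFDL or from any tool.
    """
    totals: Counter[str] = Counter()
    for rev in revisions:
        if rev.is_revert or rev.bytes_added <= 0:
            continue
        totals[rev.user] += rev.bytes_added
    # Sort by contribution size, then by name so that ties are deterministic.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [user for user, _ in ranked[:limit]]
```

Even this simple rule has obvious gaps: it cannot see history that has been lost, and it treats a large paste the same as carefully written prose.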
So, stepping away from what the GFDL & CC currently specify, I think that moving to a corporate model of citing authors makes sense. When you contribute to Wikipedia, you're contributing to specific, discrete pages. So what about using a page-level model citation like:
Credit: Contributors to "Foobar article." From Wikipedia, The Free Encyclopedia. Accessed July 17, 2012. permanent URL here. List of contributors: http://en.wikipedia.org/wiki/Foobar/history.
And using either a perma-link to the history that's tied to the date of the perma-link used, or some other kind of stable history/credits link like Erik proposes? We keep this data and intend to keep it for the future, presumably, so offering up a link to it seems reasonable as long as the page-site combination is adequately referenced.
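To make the page-level credit line concrete, here is a small Python sketch that assembles it from its parts. The `page_credit` function and its parameter names are assumptions for illustration; the wording of the line follows the example above, and the permalink used in the usage note is a placeholder, not a real address.

```python
from datetime import date


def page_credit(title: str, permalink: str, history_url: str, accessed: date) -> str:
    """Build a one-line, page-level credit in the format sketched above."""
    # Format the date by hand for the day so it has no leading zero.
    accessed_text = f"{accessed:%B} {accessed.day}, {accessed.year}"
    return (
        f'Credit: Contributors to "{title}." '
        f"From Wikipedia, The Free Encyclopedia. "
        f"Accessed {accessed_text}. {permalink}. "
        f"List of contributors: {history_url}."
    )
```

For example, calling `page_credit("Foobar article", "https://example.org/permalink", "http://en.wikipedia.org/wiki/Foobar/history", date(2012, 7, 17))` yields a single line that a reuser could print under the article.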
-- Phoebe
On Mon, Oct 20, 2008 at 9:46 PM, Erik Moeller erik@wikimedia.org wrote:
It's also conceivable to require full author attribution for printed collections of a certain length or printed in certain quantity. (The GFDL has "in quantity" rules, but they do not seem to apply in any way to the authorship information.)
This approach seems reasonable to me. However, it has to be defined well. If someone, let's say, prints the whole of Wikipedia in English, I don't see why they shouldn't print one more (or 10 more) books with the list of authors. On the other hand, it is true that it is not reasonable to demand printing the authors on a flier.
I've got one other, a very general idea about the solution. Here is the sketch:
- Lists of the authors of particular articles should be printed periodically: yearly, or once every two or three years. Of course, we should find some automatic way of gathering such data. (Maybe via some specific user boxes.)
- Any printed book may refer to such a periodical as the source of its list of authors.
- Strictly speaking, this means that Wikipedia content may be used in this way only from the dumps that were the sources for the printed list of authors. Anyone using newer versions of articles should list the authors who contributed in the meantime. Generally, I think this approach is reasonable because, for most topics, it is no longer necessary to use the newest version of an article to make a book. Otherwise, if someone really wants to be up to date on current events, they should spend some more time finding the rest of the authors. Of course, we should make free software tools for doing that.
- This is also a good fundraising opportunity. If companies that want to print books based on Wikipedia content want such printed lists (and additions) once per month, then they should give money to WMF to produce them. If they want "the frozen version" of Wikipedia for that time, they should give money for servers; and so on.
But it is not just related to Wikimedia. If Wikimedia introduces such an approach, supported by the license, it may be a good source of funding for similar projects that keep bibliographical data consistent. That is, in the end, a very important issue in building a valid scientific resource.
And, of course, this is not about fliers, and it is not about full encyclopedias. It is for the uses in between. Defining where the boundaries lie is the task, and there may be a lot to discuss.
Hoi, Think of the trees. Consider the enormous waste of paper and ink needed to do justice to all the people who contribute. Consider where this data is mostly used. I would argue that a clear reference to the sources used and a link to, for instance, the history pages should more than suffice.
Our aim is to provide information. Our aim is to provide the freedom to continue on top of what came before. This link to a history and the continuation of a work under the same license is what is really important. The rest is effectively an ambiguous pumping of egos. Ambiguous because of the lack of clarity about whose ego to pump. Thanks, Gerard
On Mon, Oct 20, 2008 at 9:46 PM, Erik Moeller erik@wikimedia.org wrote:
Aside from what the legal implications of any given approach are, the first question I think that needs to be answered is what's desirable. Thoughts? -- Erik Möller Deputy Director, Wikimedia Foundation
On Mon, Oct 20, 2008 at 3:46 PM, Erik Moeller erik@wikimedia.org wrote: [snip]
What's appropriate attribution for a wiki, where a page can have thousands of authors, and a collection of pages is very likely to?
Wikipedia articles seldom have more than a few authors. Many have no more than a single copyright-bearing author of the page text (plus a few additional authors for illustrations).
Collections obviously have more authors in total, but I don't think the situation is very different from a traditional dead tree encyclopaedia, which typically has one or a few authors per article and a great many authors in total.
One difference is that traditional encyclopaedias usually compensate their authors financially while authors on Wikipedia receive only positive Karma from furthering the social mission and the reputational boost that comes from their good work having tractable attribution that links their work back to them.
My long standing recommendation for free content licensing in the context of collaborative works is well embodied in this recommendation:
http://meta.wikimedia.org/wiki/GFDL_suggestions#Proposed_attribution_text
What shouldn't be done is to create rules which give special privileges to some parties by virtue of running a website (remember, websites are not communities; the community may entirely leave, and they should not have to attribute their initial webhost for all eternity), or to fail to ensure that there is a way to trace creations back to their creators.
2008/10/20 Gregory Maxwell gmaxwell@gmail.com:
My long standing recommendation for free content licensing in the context of collaborative works is well embodied in this recommendation:
http://meta.wikimedia.org/wiki/GFDL_suggestions#Proposed_attribution_text
I think this is a very good proposal. I like the proposed modification to only require the five principal authors to be attributed if they are provided to begin with: It should not be the obligation of re-users to determine who the principal authors are.
This language would, arguably, do away with the 4pt author name listings in some publications -- which some people may consider to be a bad thing.
On Mon, Oct 20, 2008 at 3:46 PM, Erik Moeller erik@wikimedia.org wrote:
The relevant GFDL clause states: "List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement."
There's another relevant clause: "Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence."
Now, when I first read that I interpreted "authors" to mean all authors, but I've heard someone else interpret it to mean "authors...as given on its Title Page", which in the case of Wikipedia articles, would be no one.
Most people have chosen to ignore the "principal authors" requirement and to try to attribute every author instead because there's no obvious way to determine who the principal authors are. I remember a few years back that Anthony tried a completely different approach, where he created a full copy of Wikipedia (under the assumption that it's a single GFDL work) and attributed it to five people on the frontpage. Anthony, please correct me if my recollection is incorrect.
That's extremely misleading and/or incorrect. I listed 5 authors on the title page (http://web.archive.org/web/20050202210758/http://mcfly.org/), but I listed *all* the authors on a page which I linked from a page entitled "GFDL History" ( http://web.archive.org/web/20050217045214/en.mcfly.org/GFDL_History, which unfortunately does not contain the linked page, probably because it was so huge). Furthermore, I did not base my use on the assumption that Wikipedia is a single GFDL work. Rather, I based my use on the assumption that *either* Wikipedia is a single GFDL work *or* that it could be merged into a single work under section 5 "Combining Documents".
Also, I would like to point out that the GFDL does not say to list *the* five principal authors, it says to list "five *of* the principal authors".
On Mon, Oct 20, 2008 at 10:35 PM, Anthony wikimail@inbox.org wrote:
On Mon, Oct 20, 2008 at 3:46 PM, Erik Moeller erik@wikimedia.org wrote:
Most people have chosen to ignore the "principal authors" requirement and to try to attribute every author instead because there's no obvious way to determine who the principal authors are. I remember a few years back that Anthony tried a completely different approach, where he created a full copy of Wikipedia (under the assumption that it's a single GFDL work) and attributed it to five people on the frontpage. Anthony, please correct me if my recollection is incorrect.
That's extremely misleading and/or incorrect. I listed 5 authors on the title page (http://web.archive.org/web/20050202210758/http://mcfly.org/), but I listed *all* the authors on a page which I linked from a page entitled "GFDL History" ( http://web.archive.org/web/20050217045214/en.mcfly.org/GFDL_History, which unfortunately does not contain the linked page, probably because it was so huge).
Ah, here it is: http://web.archive.org/web/20071009040722/en.mcfly.org/Wikipedia+contributor..., which was linked from *both* the title page and the GFDL History page.
An explanation of how I complied with the GFDL is at http://web.archive.org/web/20071008202154/en.mcfly.org/McFly_copyrights
I wonder whether it could be easier to solve the attribution problem within the GFDL, perhaps by adjusting the license so that a clearly identifiable source for the license is sufficient. That would solve most of the problems with today's use of the GFDL.
The attribution problem isn't that difficult to solve. It is possible to identify who wrote which parts of an article. It is also possible to say something about which parts clearly constitute _content_ and which parts are not content but purely factual statements, that is, templates. The authors who are involved in producing content are those who belong in the category of principal authors, and among them some will be identifiable as truly principal authors.
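As a rough illustration of the claim that one can identify who wrote which parts of an article, here is a minimal sketch. It is not the method any existing tool uses. It assumes the revision history is available as a list of (author, text) pairs, oldest first, and it credits each word of the latest revision to the author whose edit introduced it, using word-level diffs from Python's difflib. The function names and the "most surviving words" ranking are assumptions made for this example; real histories with reverts, page moves and copy-paste would need more care.

```python
from collections import Counter
from difflib import SequenceMatcher


def attribute_words(revisions):
    """Credit each word of the latest revision to the author who added it.

    `revisions` is a list of (author, text) pairs, oldest first
    (a hypothetical input format chosen for this sketch).
    """
    words, owners = [], []
    for author, text in revisions:
        new_words = text.split()
        matcher = SequenceMatcher(a=words, b=new_words, autojunk=False)
        new_owners = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged words keep their earlier author.
                new_owners.extend(owners[i1:i2])
            elif tag in ("replace", "insert"):
                # New or rewritten words are credited to this editor.
                new_owners.extend([author] * (j2 - j1))
            # "delete": the removed words simply do not survive.
        words, owners = new_words, new_owners
    return list(zip(words, owners))


def principal_authors(revisions, limit=5):
    """Rank authors by how many of their words survive in the latest text."""
    counts = Counter(owner for _, owner in attribute_words(revisions))
    return [author for author, _ in counts.most_common(limit)]


history = [
    ("Alice", "the quick brown fox"),
    ("Bob", "the quick brown fox jumps over the lazy dog"),
    ("Carol", "the slow brown fox jumps over the lazy dog"),
]
print(principal_authors(history, limit=2))
```

Counting surviving words is only one possible metric; it says nothing about which words are _content_ and which are template boilerplate, which would need a separate filter.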
The main problem with the GFDL is how to clearly identify the work as licensed under GFDL. Today this leads to the printing of the whole license text, but the only thing necessary is identification of the license in a clearly visible manner.
Rethink the whole problem: what's necessary is to be able to identify a work and, as part of this, to identify the license and other data. Perhaps something like an ISBN for any authored work, and then some kind of magic site that can act as a broker between those who need additional information and those who deliver such information. This could be a step further than today, identifying not only which license a particular work uses, but also the licensing of previous versions and how the work relates to other parts of a collection of works.
John
On Mon, Oct 20, 2008 at 11:44 PM, John at Darkstar vacuum@jeb.no wrote: [snip]
The main problem with the GFDL is how to clearly identify the work as licensed under GFDL. Today this leads to the printing of the whole license text, but the only thing necessary is identification of the license in a clearly visible manner.
One of the proposed FSF GFDL revision drafts had a size threshold for triggering the requirement to reproduce the license text. (http://gplv3.fsf.org/comments/gfdl-draft-1.html; 6a)
Informing people of their rights is important, since if they don't know them they might as well not have them. If the license text is well written, or at least has a good preamble, then including it can go a long way to further understanding of Free content (it's not just no cost!). Unfortunately the FDL isn't clear and doesn't have a clear preamble. But the GPLv3 very much is and does, so it can be done.
Including a single license copy squeezed onto a page along with 1000 pages of information is pretty non-burdensome, even in printed form. Certainly no worse than all the other random overhead pages a book typically contains.
The thresholds in the proposed draft are probably too low to remove this burden (it was something like 20k words or 10 pages), but it's an indication that the general approach may be acceptable to the drafters.
Rethink the whole problem: what's necessary is to be able to identify a work and, as part of this, to identify the license and other data. Perhaps something like an ISBN for any authored work, and then some kind of magic site that can act as a broker between those who need additional information and those who deliver such information. This could be a step further than today, identifying not only which license a particular work uses, but also the licensing of previous versions and how the work relates to other parts of a collection of works.
Hm. Well it would have to be Universal, and its purpose is Locating Resources, so we could call it a ULR! This seems somehow familiar. ;)
More seriously, a clearing house would be interesting and very useful. But I think in terms of providing licensing information it still makes sense to always tag along: "year, basic attribution; license" if nothing else. A clearing house identifier would be a bonus.
The other *must solve* issue is the gratuitous incompatibility with similar but different licenses: You can't create a new work that is derived from both third-party FDL content and third-party CC-By-SA content while strictly conforming with the licenses. (Many people would call this the most significant problem with the FDL today, though it's also true of all other existing copyleft free content licenses.)
I think that almost everyone agrees that you ought to be able to do this (the most negative thing I've seen said about it is that you ought to respect the most restrictive of the combined terms in this case), and there are a number of ways to address this. My preferred way is to just have the licenses explicitly enumerate compatible licenses and the rules for combined works. GPLv3 addressed the compatibility question in a different way, but it was addressed successfully there, so again it has been proven that it can be done.
Gregory Maxwell wrote:
The other *must solve* issue is the gratuitous incompatibility with similar but different licenses: You can't create a new work that is derived from both third-party FDL content and third-party CC-By-SA content while strictly conforming with the licenses. (Many people would call this the most significant problem with the FDL today, though it's also true of all other existing copyleft free content licenses.)
I think that almost everyone agrees that you ought to be able to do this (the most negative thing I've seen said about it is that you ought to respect the most restrictive of the combined terms in this case), and there are a number of ways to address this. My preferred way is to just have the licenses explicitly enumerate compatible licenses and the rules for combined works. GPLv3 addressed the compatibility question in a different way, but it was addressed successfully there, so again it has been proven that it can be done.
As I understand it (correct me if I am wrong), one of the salient problems with "close but no cigar" license compatibility is that a license either *is* "viral", or it *is not*. And getting by that is near impossible in a way that is coherent.
Yours,
Jussi-Ville Heiskanen
On Tue, Oct 21, 2008 at 12:26 AM, Jussi-Ville Heiskanen cimonavaro@gmail.com wrote: [snip]
As I understand it (correct me if I am wrong), one of the salient problems with "close but no cigar" license compatibility is that a license either *is* "viral", or it *is not*. And getting by that is near impossible in a way that is coherent.
Nope.
Basically you can't be compatible and not expose yourself to some weaknesses in the other license: you're exposed to the risk that the other is too permissive if you follow an "allow any act permitted by either" combination, or too restrictive if you follow an "allow only acts permitted by both" one, or some variant depending on how the combination permission is constructed. If both licenses are copyleft (what you're calling viral) then you may end up in a case where further downstream works must be under the combined licenses, unless that situation is specifically avoided in *both* copyleft licenses.
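The two combination rules just described can be pictured with sets of permitted acts. This is only a toy model: the two licenses and the act names below are hypothetical placeholders, and real license terms are not reducible to sets.

```python
def combine(acts_a, acts_b, rule):
    """Return the acts permitted for a combined work under the given rule."""
    if rule == "either":
        # "Allow any act permitted by either": exposes each side to the
        # other license being too permissive.
        return acts_a | acts_b
    if rule == "both":
        # "Allow only acts permitted by both": exposes each side to the
        # other license being too restrictive.
        return acts_a & acts_b
    raise ValueError(f"unknown rule: {rule!r}")


# Hypothetical licenses, invented for illustration only.
LICENSE_A = frozenset({"copy", "modify", "omit_license_text"})
LICENSE_B = frozenset({"copy", "modify", "link_to_author_list"})

print(sorted(combine(LICENSE_A, LICENSE_B, "either")))
print(sorted(combine(LICENSE_A, LICENSE_B, "both")))
```

Under "either" the combined work inherits every permission from both sides; under "both" it keeps only the common ones, which is the "most restrictive of the combined terms" position.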
...But if you consider compatibility to be important (and I think everyone can agree that it's at least somewhat important some of the time) then your only other alternatives are getting both works dual licensed or both re-licensed under a single license. Neither of which should be better than the controlled exposure.
You don't have to take my word for it; there is an existence proof: GPLv3 accomplishes license compatibility with other licenses, not merely with licenses which allow covered works to be simply re-licensed as GPLv3.
See: http://www.gnu.org/licenses/gpl-faq.html#WhatDoesCompatMean
http://www.gnu.org/copyleft/gpl.html Section 7
(actually, the AGPL compatibility in Section 13 is basically the type of compatibility I prefer: Explicit bidirectional compatibility with defined terms)
I've asked about this some time back, and the answer was that Wikipedia is a collection of independent works, meaning each one of them has to list the principal authors of that work. The collection as such is a database and may or may not be a work in itself.
Also, a failure to state the principal authors does not release any later work from giving due attribution. The attribution is a property of the work itself and not for some random copy of the work, that is each copy has to give due respect to the authors of the work not the authors of the previous copy.
John
On Mon, Oct 20, 2008 at 11:11 PM, John at Darkstar vacuum@jeb.no wrote:
I've asked about this some time back, and the answer was that Wikipedia is a collection of independent works, meaning each one of them has to list the principal authors of that work. The collection as such is a database and may or may not be a work in itself.
1) Who told you that? 2) Can the names be combined into a single list? I don't see why not.
Also, a failure to state the principal authors does not release any later work from giving due attribution. The attribution is a property of the work itself and not for some random copy of the work, that is each copy has to give due respect to the authors of the work not the authors of the previous copy.
Absolutely agreed. My longstanding interpretation of the GFDL was that attribution of all (non de minimis) authors was required, in the section Entitled History. Considering moral rights laws and the ethical principles behind them, I still believe this is the correct interpretation, and that the phrase "as given on its Title page" should be interpreted to apply only to "publisher of the Document".
Absolutely agreed. My longstanding interpretation of the GFDL was that attribution of all (non de minimis) authors was required, in the section Entitled History. Considering moral rights laws and the ethical principles behind them, I still believe this is the correct interpretation, and that the phrase "as given on its Title page" should be interpreted to apply only to "publisher of the Document".
If memory serves (it's been a while since I read the license properly), the "5 principal authors" thing is for re-use, the "preserve the section entitled history" thing is for modifications. The two are different uses of the license. If you're just using the content as is it's far easier than if you're modifying it.
On Tue, Oct 21, 2008 at 8:38 AM, Thomas Dalton thomas.dalton@gmail.com wrote:
Absolutely agreed. My longstanding interpretation of the GFDL was that attribution of all (non de minimis) authors was required, in the section Entitled History. Considering moral rights laws and the ethical principles behind them, I still believe this is the correct interpretation, and that the phrase "as given on its Title page" should be interpreted to apply only to "publisher of the Document".
If memory serves (it's been a while since I read the license properly), the "5 principal authors" thing is for re-use, the "preserve the section entitled history" thing is for modifications. The two are different uses of the license. If you're just using the content as is it's far easier than if you're modifying it.
Nope, they're both for modifications. If you're just making a verbatim copy, you preserve any attribution in the original as a natural part of not making any modifications. Of course, in the case of Wikipedia, the original isn't properly attributed in the first place.
Anthony wrote:
On Mon, Oct 20, 2008 at 11:11 PM, John at Darkstar vacuum@jeb.no wrote:
I've asked about this some time back, and the answer was that Wikipedia is a collection of independent works, meaning each one of them has to list the principal authors of that work. The collection as such is a database and may or may not be a work in itself.
1) Who told you that? 2) Can the names be combined into a single list? I don't see why not.
*is* should be *can be* in the first sentence.
The person (the actual name is insignificant) said that such a collection is an independent work and should be attributed as such, together with attribution for each contained work. It is also possible to interpret it as a database, a sort of special notion in "Åndsverksloven" (http://www.lovdata.no/all/hl-19610512-002.html#43), the Norwegian law about artistic works or intellectual property or something like that, which gives the database protection as if it were a work of art. It is not obvious which one is most suitable for Wikipedia. An interpretation as a database seems more in line with WMF being an ISP.
I don't think it really says anything about attribution for the content of the database, but § 6, which does not apply to a database, says: "Er det to eller flere opphavsmenn til et åndsverk uten at de enkeltes ytelser kan skilles ut som særskilte verk, erverver de opphavsrett til verket i fellesskap." In English: if there are two or more creators of a work and their individual contributions cannot be singled out as independent works, they jointly acquire the "copyright" in the work. I use quotation marks as opphavsrett is not the same as copyright, but it's close enough. Note that the articles in Wikipedia are clearly independent works that can be singled out, which means they should be attributed individually.
Attribution can be organized in any appropriate way as long as it is according to "good practice". § 3 of the same law says: "The rights after the first and second paragraphs can not be released, unless the use of the work in question is limited after the nature and scope." That is, a license that does not require attribution can't be used in such a manner; you may use it, but you will still have to attribute the authors. It is, however, possible to say that a limited use can be done without attribution, let's say someone printing out a single hardcopy.
Also, a failure to state the principal authors does not release any later work from giving due attribution. The attribution is a property of the work itself and not for some random copy of the work, that is each copy has to give due respect to the authors of the work not the authors of the previous copy.
Absolutely agreed. My longstanding interpretation of the GFDL was that attribution of all (non de minimis) authors was required, in the section Entitled History. Considering moral rights laws and the ethical principles behind them, I still believe this is the correct interpretation, and that the phrase "as given on its Title page" should be interpreted to apply only to "publisher of the Document".
My guess is that a history link should exist if appropriate, if necessary at the original publisher/ISP/whatever (WM site), but principal authors should be attributed anyhow in copies.
John
On Tue, Oct 21, 2008 at 4:40 AM, Anthony wikimail@inbox.org wrote: ....
Absolutely agreed. My longstanding interpretation of the GFDL was that attribution of all (non de minimis) authors was required, in the section Entitled History. Considering moral rights laws and the ethical principles behind them, I still believe this is the correct interpretation, and that the phrase "as given on its Title page" should be interpreted to apply only to "publisher of the Document".
I actually based my only-citing-five-authors-per-article tactic on advice from Eben Moglen, who as I understood it, felt that as long as our metric was consistent and we linked back to the history on Wikipedia, citing all the authors of every article in our print version was not necessary.
In general, I think part of the trouble with the GFDL as it stands is that very different interpretations are not only possible but likely among people who have spent a good deal of time thinking about and studying it. The intention is clear -- provide appropriate attribution to the people who wrote the thing you're trying to cite -- but the implementation is entirely murky. Pity the random person who tries to reuse content and has to figure out the license... -- phoebe
Hoi, I find it interesting to see how this thread is being woven. If I read Erik correctly, he is asking us what appropriate attribution is. He is asking for any and all observations. What I find is a thread about existing legalities.
When we observe the current practice, you find that people attribute by referring to Wikipedia. This is an effective way of providing access to any and all the people who have contributed to what has been used. When you read the byzantine requirements under the different licenses, you have to be a lawyer to understand them properly and there is no tooling to help you define such things as "principal author" or the five most significant authors.
If all we can do is discuss how things are currently legal, then we are not looking for something that works practically. It is for practical reasons that I wonder about the number of trees that have to be felled to attribute. Certainly when you have a print of all the Wikipedia articles on the popes of Rome and all the Christian saints and martyrs, you have a long list of articles that may all need their own attribution. When you approach these articles as a single work, you do no justice to the individual article and its authors.
Really, why are we not talking about how this is to WORK for the people that will use our data.. Please remember that this is what we do it for. Thanks, GerardM
On Mon, Oct 20, 2008 at 9:46 PM, Erik Moeller erik@wikimedia.org wrote:
The GFDL has specific attribution requirements that were designed for software manuals. What's appropriate attribution for a wiki, where a page can have thousands of authors, and a collection of pages is very likely to? I would like to start a broad initial discussion on this topic; it's likely that the issue will need to be raised more specifically in the context of possible modifications to the GFDL or a migration to CC-BY-SA.
The relevant GFDL clause states: "List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement."
Most people have chosen to ignore the "principal authors" requirement and to try to attribute every author instead because there's no obvious way to determine who the principal authors are. I remember a few years back that Anthony tried a completely different approach, where he created a full copy of Wikipedia (under the assumption that it's a single GFDL work) and attributed it to five people on the frontpage. Anthony, please correct me if my recollection is incorrect.
The community process that has developed with regard to GFDL compliance on the web has generally tacitly favored a link to the article and to its history as proper credit. But, for printed books, publishers have generally wanted to be more in compliance with the letter of the license. So, the Bertelsmann "Wikipedia in one volume" includes a looong list of authors in a very tiny font.
Is that practical? How about Wikipedia articles on passenger information systems (screens on subways, airplanes)? How about small booklets where there isn't a lot of room for licensing information? Should a good license for wikis make a distinction between print and online uses?
I haven't heard anyone argue strongly for full inclusion of the _license text_. But I'd like to hear opinions on the inclusion of username lists.
My personal preference would be a system where we have a special "credits" URL for each article, something like
http://en.wikipedia.org/credits/World_War_II
which would list authors and also provide full licensing information for all media files. If we had a specific collection of articles, the system could support this using collection IDs:
http://en.wikipedia.org/collection_credits/Bertelsmann_One_Volume_Encycloped...
(These URLs are completely made up and have no basis in reality.)
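To make the idea concrete, here is a minimal sketch of how a re-user might build such credit links from an article title or a collection ID. As stated above, these URL schemes do not exist; the path prefixes are copied from the made-up examples, and the function names and the underscore convention for spaces are assumptions of this sketch.

```python
from urllib.parse import quote

# Host taken from the example URLs; the routes themselves are hypothetical.
BASE = "http://en.wikipedia.org"


def _slug(name):
    """Replace spaces with underscores and percent-encode the rest."""
    return quote(name.strip().replace(" ", "_"), safe="")


def credits_url(article_title):
    """Build the (hypothetical) per-article credits URL."""
    return f"{BASE}/credits/{_slug(article_title)}"


def collection_credits_url(collection_id):
    """Build the (hypothetical) per-collection credits URL."""
    return f"{BASE}/collection_credits/{_slug(collection_id)}"


print(credits_url("World War II"))
```

A printed book could then carry one short line per article, or a single collection link, instead of a full author list.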
The advantage that I see of such an approach is that it would allow us to standardize and continually refine the way we display authorship information, and benefit the free sharing of content with a very lightweight process. The disadvantage (if it is perceived as such) is that if we would officially recommend such attribution in printed books, individual contributors would be less likely to see their username in print. But we might see more print uses because it would make the attribution more manageable.
It's also conceivable to require full author attribution for printed collections of a certain length or printed in certain quantity. (The GFDL has "in quantity" rules, but they do not seem to apply in any way to the authorship information.)
Aside from what the legal implications of any given approach are, the first question I think that needs to be answered is what's desirable. Thoughts? -- Erik Möller Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
2008/10/21 Gerard Meijssen gerard.meijssen@gmail.com:
Really, why are we not talking about how this is to WORK for the people that will use our data.. Please remember that this is what we do it for. Thanks, GerardM
The problem is that there is a rather varied set of scenarios in which different standards are likely to be popular:
==Text==
Text from Wikipedia can have a very large number of authors, and in many cases the work is a derivative of the work of every single author. So let's look at the various uses for Wikipedia text.
*Reproduction of a single article. In this case having to include an author list longer than the article is a real problem, so people may advocate being allowed to include a straight URL where the author list can be found.
*Reproduction of a collection of articles as a book (say a book on WW2 British submarines). In this case including a complete author list, while potentially rather uninformative, would certainly be possible. A URL would likely be regarded as a poor replacement.
*Reproduction of an article in a non-GFDL environment (say a single article in a magazine). For a normal article a complete author list would be possible, but this would tend to break down for an article such as WW2.
*Use in a PowerPoint presentation. Doesn't really matter. Whatever requirements you put in place, people just jam in a couple of slides at the end with the stuff on them and rapidly shuffle past them.
*Recorded to vorbis/tape/mp3/45/whatever. Not too much of a problem (with the possible exception of the 45). There are various bits of text-reading software around that could read through the complete author list, although most people would stop listening there.
*Recorded for conventional radio. Serious problem here. No one is going to want to waste airtime reading out too long a list of credits; at the same time, things like http://en.wikipedia.org/w/index.php?title=France&action=history would be rather hard to read out on air.
*Use in a computer game. As long as credit in the credit file is accepted, not a problem. In-game credit is a bit of a headache.
==Photos==
Photos tend to have fewer authors but tend to be more frequently deployed in situations where space is at a premium.
*Postcard. As long as putting credit on the back is accepted, not a problem.
*Jigsaw. As long as putting the credit on a separate object (in this case the box) is accepted, not a problem.
*Use in a computer game. As long as credit in the credit file is accepted, not a problem. In-game credit may be possible via the standard watermark method.
*Use on a T-shirt. There would be space but I have no idea where the credit should be put.
*Tattoos. I'm not aware of any copyright cases over tattoos.
This is just a start and I haven't yet covered other forms of media (video, sound, sculpture, etc.). I could have a shot if anyone is interested.
On Tue, Oct 21, 2008 at 11:57 AM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, I find it interesting to see how this thread is being woven. If I read Erik correctly, he is asking us what appropriate attribution is. He is asking for any and all observations. What I find is a thread about existing legalities.
The appropriate attribution is certainly no less than what was promised in the first place.
When we observe the current practice, you find that people attribute by referring to Wikipedia.
Copyright violations on the Internet are rampant. So what?
On Tue, Oct 21, 2008 at 1:08 PM, Anthony wikimail@inbox.org wrote:
Copyright violations on the Internet are rampant. So what?
Another point about attribution which we need to be mindful of is the proposed Orphan Works law in the US which is getting closer and closer to passing: http://www.govtrack.us/congress/billtext.xpd?bill=s110-2913
I expect it to make attentive copyright holders far more aggressive. We can expect to get perfunctory complaints about many of our valid fair use images just as we already receive from trademark holders.
In particular we can expect copyright holders to become more aggressive with respect to attribution because unattributed (or incorrectly attributed) copies floating around on the Internet will cause an effective loss (if only temporary) of copyright protection.
(The orphan works act as currently drafted also has other risks outside the current topic of discussion, such as the prohibition against injunctive relief and the pure monetary damages focus, which may wedge copyleft enforcement, at least for works claimed to be orphaned.)
Gerard Meijssen wrote:
Hoi, I find it interesting to see how this thread is being woven. If I read Erik correctly, he is asking us what appropriate attribution is. He is asking for any and all observations. What I find is a thread about existing legalities.
You are not wholly accurate. There is discussion about laws which refers to the *fact* of law: that it is impossible to rely on some copyleft "wishes" (which aren't really provisions in those jurisdictions, no matter how one might click one's red shoes together at the heels) when there are stricter moral rights in play in those jurisdictions. Personally I find it entirely appropriate in terms of attribution that we don't present unnecessary problems to downstream users.
The downstream users have to deal with legal facts on the ground.
It would certainly be an evil trifecta for us to ignore certain facts of morality. We shouldn't do it because we *want* people to safely re-use our content. We shouldn't do it because we respect the fact that there is a genuinely good reason why attribution is *the right thing* to do. And thirdly, we shouldn't do it because generous attribution is a genuine incentive and an argument in favour of Wikipedia, in comparison to many other compendia, which only list the whole list of contributors, without specifying to which articles in their work they have added wordage (EB Micropaedia being a case in point).
When we observe the current practice, you find that people attribute by referring to Wikipedia. This is an effective way of providing access to any and all the people who have contributed to what has been used. When you read the byzantine requirements under the different licenses, you have to be a lawyer to understand them properly and there is no tooling to help you define such things as "principal author" or the five most significant authors.
This is a nice fiction, but not true, if a downstream user, which I think is the focus in ultimo, is going for a fixed published media. Linking is good, if what you have is on the internets, but if not, not so hot. You are not provided access, if you can't follow the link.
If all we can do is discuss how things are currently legal, then we are not looking for something that works practically. It is for practical reasons that I wonder about the number of trees that have to be felled to attribute. Certainly when you have a print of all the Wikipedia articles on the popes of Rome and all the Christian saints and martyrs, you have a long list of articles that may all need their own attribution. When you approach these articles as a single work, you do no justice to the individual article and its authors.
Now, this I find quite silly on several grounds. First you mentioned linking to history, and now you shift ground and talk about felling trees. If you are felling trees, you can't link to the history, to save your argument.
One might quite more legitimately worry about the amount of trees that have to be felled to include citations, references etc. In short, this argument is very poor indeed.
Do not be blinded by the fact that moral rights are recognized in law, from the fact, that they are recognized as such, because of a strong ethical foundation. Slavery not being nice is not just a fact of law, there are strands and roots deepset into general philosophy and ethology.
Really, why are we not talking about how this is to WORK for the people that will use our data? Please remember that this is what we do it for.
I think this is a case where asking the doctor to heal themselves is not totally amiss...
Yours,
Jussi-Ville Heiskanen
Let me make a radical suggestion. One that, for the moment, ignores all those overbearing legal questions.
Why not assume that the appropriate amount of attribution for a Wikipedia article is essentially the amount that it has now?
When you look at a Wikipedia article there is no list of authors (principal or otherwise). There is simply a link to "history", a statement at the bottom of the page saying that the content is under the GFDL, and a link to the GFDL. On the Wikipedia page itself, that is essentially the full extent of the licensing and attribution.
I assume that practically all Wikipedia contributors are comfortable with receiving this very low level of attribution for Wikipedia articles.
So, by extension, perhaps the goal should be finding a way to codify this scheme in a way that works both for us and for reusers. Namely, making the requirements for redistribution of Wikipedia content to simply be:
1) A link or reference to the article's history
2) A statement acknowledging the free content license
3) A link or reference to the text of that license
That's very simple and practical. One can add some details regarding new versions and modifications, but even there I think you accomplish more by keeping it simple.
Now I suspect there are about three dozen reasons why defining attribution as simply a link to the history page is legally impossible and incompatible with the GFDL. But even so, doesn't it make some sense to start with "How are Wikipedia articles being used?" and work backwards to construct the licensing scheme that best resembles actual practice while still being legally rigorous? Wikipedia authors don't seem to want or expect prominent and overt acknowledgements when writing articles, so why should our licensing scheme require reusers to add more overt statements than even we ourselves have?
-Robert Rohde
On Mon, Oct 20, 2008 at 12:46 PM, Erik Moeller erik@wikimedia.org wrote:
The GFDL has specific attribution requirements that were designed for software manuals. What's appropriate attribution for a wiki, where a page can have thousands of authors, and a collection of pages is very likely to? I would like to start a broad initial discussion on this topic; it's likely that the issue will need to be raised more specifically in the context of possible modifications to the GFDL or a migration to CC-BY-SA.
The relevant GFDL clause states: "List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement."
Most people have chosen to ignore the "principal authors" requirement and to try to attribute every author instead because there's no obvious way to determine who the principal authors are. I remember a few years back that Anthony tried a completely different approach, where he created a full copy of Wikipedia (under the assumption that it's a single GFDL work) and attributed it to five people on the frontpage. Anthony, please correct me if my recollection is incorrect.
The community process that has developed with regard to GFDL compliance on the web has generally tacitly favored a link to the article and to its history as proper credit. But, for printed books, publishers have generally wanted to be more in compliance with the letter of the license. So, the Bertelsmann "Wikipedia in one volume" includes a looong list of authors in a very tiny font.
Is that practical? How about Wikipedia articles on passenger information systems (screens on subways, airplanes)? How about small booklets where there isn't a lot of room for licensing information? Should a good license for wikis make a distinction between print and online uses?
I haven't heard anyone argue strongly for full inclusion of the _license text_. But I'd like to hear opinions on the inclusion of username lists.
My personal preference would be a system where we have a special "credits" URL for each article, something like
http://en.wikipedia.org/credits/World_War_II
which would list authors and also provide full licensing information for all media files. If we had a specific collection of articles, the system could support this using collection IDs:
http://en.wikipedia.org/collection_credits/Bertelsmann_One_Volume_Encycloped...
(These URLs are completely made up and have no basis in reality.)
The advantage that I see of such an approach is that it would allow us to standardize and continually refine the way we display authorship information, and benefit the free sharing of content with a very lightweight process. The disadvantage (if it is perceived as such) is that if we would officially recommend such attribution in printed books, individual contributors would be less likely to see their username in print. But we might see more print uses because it would make the attribution more manageable.
It's also conceivable to require full author attribution for printed collections of a certain length or printed in certain quantity. (The GFDL has "in quantity" rules, but they do not seem to apply in any way to the authorship information.)
Aside from what the legal implications of any given approach are, the first question I think that needs to be answered is what's desirable. Thoughts?
-- Erik Möller
Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Hoi, The question is not what is compatible with the GFDL or CC-by-sa, the question is what is appropriate. Those lead to different answers. I like your approach to compare how things work in the real world and what is stated in a license.
In the end it is about having a license that will work and that can be enforced because it makes sense for our users. Thanks, GerardM
On Tue, Oct 21, 2008 at 6:52 PM, Robert Rohde rarohde@gmail.com wrote:
Let me make a radical suggestion. One that, for the moment, ignores all those overbearing legal questions.
Why not assume that the appropriate amount of attribution for a Wikipedia article is essentially the amount that it has now?
When you look at a Wikipedia article there is no list of authors (principal or otherwise). There is simply a link to "history", a statement at the bottom of the page saying that the content is under the GFDL, and a link to the GFDL. On the Wikipedia page itself, that is essentially the full extent of the licensing and attribution.
I assume that practically all Wikipedia contributors are comfortable with receiving this very low level of attribution for Wikipedia articles.
So, by extension, perhaps the goal should be finding a way to codify this scheme in a way that works both for us and for reusers. Namely, making the requirements for redistribution of Wikipedia content to simply be:
- A link or reference to the article's history
- A statement acknowledging the free content license
- A link or reference to the text of that license
That's very simple and practical. One can add some details regarding new versions and modifications, but even there I think you accomplish more by keeping it simple.
Now I suspect there are about three dozen reasons why defining attribution as simply a link to the history page is legally impossible and incompatible with the GFDL. But even so, doesn't it make some sense to start with "How are Wikipedia articles being used?" and work backwards to construct the licensing scheme that best resembles actual practice while still being legally rigorous? Wikipedia authors don't seem to want or expect prominent and overt acknowledgements when writing articles, so why should our licensing scheme require reusers to add more overt statements than even we ourselves have?
-Robert Rohde
On Mon, Oct 20, 2008 at 12:46 PM, Erik Moeller erik@wikimedia.org wrote:
The GFDL has specific attribution requirements that were designed for software manuals. What's appropriate attribution for a wiki, where a page can have thousands of authors, and a collection of pages is very likely to? I would like to start a broad initial discussion on this topic; it's likely that the issue will need to be raised more specifically in the context of possible modifications to the GFDL or a migration to CC-BY-SA.
The relevant GFDL clause states: "List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement."
Most people have chosen to ignore the "principal authors" requirement and to try to attribute every author instead because there's no obvious way to determine who the principal authors are. I remember a few years back that Anthony tried a completely different approach, where he created a full copy of Wikipedia (under the assumption that it's a single GFDL work) and attributed it to five people on the frontpage. Anthony, please correct me if my recollection is incorrect.
The community process that has developed with regard to GFDL compliance on the web has generally tacitly favored a link to the article and to its history as proper credit. But, for printed books, publishers have generally wanted to be more in compliance with the letter of the license. So, the Bertelsmann "Wikipedia in one volume" includes a looong list of authors in a very tiny font.
Is that practical? How about Wikipedia articles on passenger information systems (screens on subways, airplanes)? How about small booklets where there isn't a lot of room for licensing information? Should a good license for wikis make a distinction between print and online uses?
I haven't heard anyone argue strongly for full inclusion of the _license text_. But I'd like to hear opinions on the inclusion of username lists.
My personal preference would be a system where we have a special "credits" URL for each article, something like
http://en.wikipedia.org/credits/World_War_II
which would list authors and also provide full licensing information for all media files. If we had a specific collection of articles, the system could support this using collection IDs:
http://en.wikipedia.org/collection_credits/Bertelsmann_One_Volume_Encycloped...
(These URLs are completely made up and have no basis in reality.)
The advantage that I see of such an approach is that it would allow us to standardize and continually refine the way we display authorship information, and benefit the free sharing of content with a very lightweight process. The disadvantage (if it is perceived as such) is that if we would officially recommend such attribution in printed books, individual contributors would be less likely to see their username in print. But we might see more print uses because it would make the attribution more manageable.
It's also conceivable to require full author attribution for printed collections of a certain length or printed in certain quantity. (The GFDL has "in quantity" rules, but they do not seem to apply in any way to the authorship information.)
Aside from what the legal implications of any given approach are, the first question I think that needs to be answered is what's desirable. Thoughts?
-- Erik Möller
Deputy Director, Wikimedia Foundation
Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
On Tue, Oct 21, 2008 at 1:07 PM, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Hoi, The question is not what is compatible with the GFDL or CC-by-sa, the question is what is appropriate.
Appropriate for what? Are we considering starting a new project, or something?
On Tue, Oct 21, 2008 at 12:52 PM, Robert Rohde rarohde@gmail.com wrote:
Why not assume that the appropriate amount of attribution for a Wikipedia article is essentially the amount that it has now?
[snip]
This is basically what is proposed at http://meta.wikimedia.org/wiki/GFDL_suggestions but there are a few differences such as:
# Conventional named attribution be preserved in cases where it is easy and reasonable to do so.
Consider: we copy an image from RandomFreeContentPhotoHost and stick it in a Wikipedia article without the author's knowledge. Joe Publisher takes just that image and uses it in his printed book, and captions the image "RandomImage (source: Wikipedia.org; http://.../randomimage.jpg)".
This may well surprise and offend the author, and we'll have to deal with someone yelling at US saying they revoke the license, and yelling at the reusers "the license says you must provide attribution!", the mess here would be doubly compounded if in the meantime we'd deleted the image and made the publisher look like a liar.
In cases where attribution can be directly provided, we should avoid the middle-man. This will match people's expectations.
# that history requirement doesn't depend on you linking to a particular site, but to any that provides the history, which avoids making a special right for initial ISPs and webhosts
Imagine: Wikipedia turns evil and the entire community moves as a whole to NotEvilPedia™. Does it make any sense that NotEvilPedia must forever direct everyone to the evil Wikipedia forever and always simply because Wikipedia was the initial webhost for the community?
Of course not: the purpose of needing a history link is to provide the history information, not to invent a new class of content ownership for ISPs. Anyone with a complete copy of the history should be able to fulfill the role.
On Tue, Oct 21, 2008 at 10:13 AM, Gregory Maxwell gmaxwell@gmail.com wrote:
On Tue, Oct 21, 2008 at 12:52 PM, Robert Rohde rarohde@gmail.com wrote:
Why not assume that the appropriate amount of attribution for a Wikipedia article is essentially the amount that it has now?
[snip]
This is basically what is proposed at http://meta.wikimedia.org/wiki/GFDL_suggestions but there are a few differences such as:
# Conventional named attribution be preserved in cases where it is easy and reasonable to do so.
Consider: we copy an image from RandomFreeContentPhotoHost and stick it in a Wikipedia article without the author's knowledge. Joe Publisher takes just that image and uses it in his printed book, and captions the image "RandomImage (source: Wikipedia.org; http://.../randomimage.jpg)".
This may well surprise and offend the author, and we'll have to deal with someone yelling at US saying they revoke the license, and yelling at the reusers "the license says you must provide attribution!", the mess here would be doubly compounded if in the meantime we'd deleted the image and made the publisher look like a liar.
In cases where attribution can be directly provided, we should avoid the middle-man. This will match people's expectations.
# that history requirement doesn't depend on you linking to a particular site, but to any that provides the history, which avoids making a special right for initial ISPs and webhosts
Imagine: Wikipedia turns evil and the entire community moves as a whole to NotEvilPedia™. Does it make any sense that NotEvilPedia must forever direct everyone to the evil Wikipedia forever and always simply because Wikipedia was the initial webhost for the community?
Of course not: the purpose of needing a history link is to provide the history information, not to invent a new class of content ownership for ISPs. Anyone with a complete copy of the history should be able to fulfill the role.
Dude, now I really want to join NotEvilPedia. But where to host it? Sealand?
Also, agreed with both of these things, though how you determine whether someone has a full copy of the history or not seems a little dicey. We should really provide better easily-downloaded metadata for articles (such as initial creation date, etc). And I would say that in the photo example the proper credit would be both to the author & to Wikipedia as source: Randomimage. (Credit: Joe Blow. Source: Wikipedia.org, http://...randomimage.jpg, licensed under GFDL 11.16, etc.)
-- phoebe
On Tue, Oct 21, 2008 at 9:52 AM, Robert Rohde rarohde@gmail.com wrote:
Let me make a radical suggestion. One that, for the moment, ignores all those overbearing legal questions.
Why not assume that the appropriate amount of attribution for a Wikipedia article is essentially the amount that it has now?
When you look at a Wikipedia article there is no list of authors (principal or otherwise). There is simply a link to "history", a statement at the bottom of the page saying that the content is under the GFDL, and a link to the GFDL. On the Wikipedia page itself, that is essentially the full extent of the licensing and attribution.
I assume that practically all Wikipedia contributors are comfortable with receiving this very low level of attribution for Wikipedia articles.
So, by extension, perhaps the goal should be finding a way to codify this scheme in a way that works both for us and for reusers. Namely, making the requirements for redistribution of Wikipedia content to simply be:
- A link or reference to the article's history
- A statement acknowledging the free content license
- A link or reference to the text of that license
<snip>
Totally agreed with this. See my message upthread. My sample citation is missing an acknowledgment of the license; add that in and I think you'd be good to go for most purposes. I think the concept "this came from a bunch of authors on Wikipedia" makes more sense, intuitively, as a crediting device than trying to say "this came from JoeBlow9567, a particular Wikipedia contributor with bits of help from half-a-dozen other people."
As for the argument that's cropped up occasionally that most articles have only a few primary authors -- that is true for many articles but by no means all, and we need to develop a metric that will work with all cases, not just many of them. Additionally, as we go along, Wikipedia pages will simply acquire more authors, not fewer, and we need to develop a metric that will work over time. The problem I faced when citing policies that had thousands and thousands of substantial revisions is a perfect example of this.
-- phoebe
Robert Rohde wrote:
Let me make a radical suggestion. One that, for the moment, ignores all those overbearing legal questions.
Why not assume that the appropriate amount of attribution for a Wikipedia article is essentially the amount that it has now?
When you look at a Wikipedia article there is no list of authors (principal or otherwise). There is simply a link to "history", a statement at the bottom of the page saying that the content is under the GFDL, and a link to the GFDL. On the Wikipedia page itself, that is essentially the full extent of the licensing and attribution.
I assume that practically all Wikipedia contributors are comfortable with receiving this very low level of attribution for Wikipedia articles.
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement. If the authors can make this implicit release, then you have to look at whatever attribution is customary in a given context, along with any moral rights issues.
Which is why I never get particularly worked up with people's concerns about attribution. As Mike Godwin pointed out, we do seek to maintain attribution in our own way, and most people are willing to accept and work with that.
--Michael Snow
On Tue, Oct 21, 2008 at 10:44 PM, Michael Snow wikipedia@verizon.net wrote:
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement. If the authors can make this implicit release, then you have to look at whatever attribution is customary in a given context, along with any moral rights issues.
Although no matter how well that argument holds for text written directly into Wikipedia, Wikipedia has a non-trivial amount of freely licensed text copied from elsewhere, and a large amount of images from elsewhere.
So the "well they must have known because that's how we obviously do it" argument clearly does not hold in many cases.
Michael Snow wrote:
Robert Rohde wrote:
Let me make a radical suggestion. One that, for the moment, ignores all those overbearing legal questions.
Why not assume that the appropriate amount of attribution for a Wikipedia article is essentially the amount that it has now?
When you look at a Wikipedia article there is no list of authors (principal or otherwise). There is simply a link to "history", a statement at the bottom of the page saying that the content is under the GFDL, and a link to the GFDL. On the Wikipedia page itself, that is essentially the full extent of the licensing and attribution.
I assume that practically all Wikipedia contributors are comfortable with receiving this very low level of attribution for Wikipedia articles.
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement. If the authors can make this implicit release, then you have to look at whatever attribution is customary in a given context, along with any moral rights issues.
Which is why I never get particularly worked up with people's concerns about attribution. As Mike Godwin pointed out, we do seek to maintain attribution in our own way, and most people are willing to accept and work with that.
--Michael Snow
I think this is very close to precisely right. We do make a good faith effort at expansive attribution, which is the important bit.
And we do it because it is right, not because it is required by the GFDL. And as long as we do "the right thing" by our contributors, it is accurate to say that any moral rights based lawsuits while unfortunate, would both be perceived to be a nuisance effort, and easily defensible in law (and really it would serve no purpose to hash out how such cases should be handled, suffice it to say that our moral and legal standing would be firm).
It is a point of insignificant import, that it is not quite true that current practice that is accepted on wikipedia is to elide attribution but for exceptional circumstances, such as may apply to lost histories due to early disk crashes.
While for instance translations from other language wikipedias currently only link to the original language article in somewhat diverse form, in principle the concept behind this has been that this is something which will be repaired in the future, once we figure out how to properly attribute edits made in a different language.
I cannot really think of any other instances from which one might make a case for it being "accepted practice" to consider editors having released their edits without an expectation of a good faith effort at crediting them for their work.
Yours,
Jussi-Ville Heiskanen
On Tue, Oct 21, 2008 at 10:44 PM, Michael Snow wikipedia@verizon.net wrote:
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement.
For the title page, sure. But the basic practice on Wikipedia is to list the username of every single edit in the page history.
As for online sources, I think there are a lot of people upset about the practices of these "subsequent distributors", but for the most part it's just not worth it to sue them. I suppose it'd be enlightening to send a DMCA takedown notice to a few of the big names, but even that takes quite a bit of effort, and for online sources it's fairly pointless. I might have done it myself by now, except that I changed my username to the generic "Anthony", in part because for a lot of the articles I've contributed to I actually would prefer *not* to be associated as an author. Of course, I've also largely stopped contributing.
For dead-tree distributors, this is mostly untested waters. Personally I would be extremely upset if I made significant contributions (say two paragraphs or more) to a Wikipedia article which was copied into a book, and I was not attributed in the book. Printing a URL absolutely doesn't cut it, in my opinion, when it comes to a printed book. Phoebe and company may have gotten advice from Eben Moglen saying that this was A-OK, but quite frankly I think he was both ethically and legally wrong. I don't think you can draw any conclusions that this practice is an accepted one. There just aren't that many dead-tree distributors doing this. As far as I know I haven't made significant contributions to that book, though. So that's someone else's fight to fight.
I do feel like I need to speak up here, though, because the suggestion that I have waived my right to attribution is an absolutely false one.
To Anthony and Jussi-ville,
Why do you want attribution of work you have done on Wikipedia articles to be acknowledged more prominently in dead tree media than it is online?
That's the sense I get from you when you say that referencing an online publication of the history is not okay. If one looks at the Wikipedia publication, in general one has to choose to seek out the edit history and (in many cases) put effort into parsing through it before they would even notice that you had contributed significantly to the article. You seem to be suggesting that in the case of dead tree media you have an expectation that attribution be made clearer/easier to access than it is online. Is that a correct understanding of your view point? And if so why?
Personally, it feels antithetical to the principles of free content and frankly a bit unethical to demand that reusers give a more prominent acknowledgment to contributors than one receives from the primary publication, i.e. Wikipedia.
-Robert Rohde
On Wed, Oct 22, 2008 at 4:30 AM, Anthony wikimail@inbox.org wrote:
On Tue, Oct 21, 2008 at 10:44 PM, Michael Snow wikipedia@verizon.net wrote:
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement.
For the title page, sure. But the basic practice on Wikipedia is to list the username of every single edit in the page history.
As for online sources, I think there are a lot of people upset about the practices of these "subsequent distributors", but for the most part it's just not worth it to sue them. I suppose it'd be enlightening to send a DMCA takedown notice to a few of the big names, but even that takes quite a bit of effort, and for online sources it's fairly pointless. I might have done it myself by now, except that I changed my username to the generic "Anthony", in part because for a lot of the articles I've contributed to I actually would prefer *not* to be associated as an author. Of course, I've also largely stopped contributing.
For dead-tree distributors, this is mostly untested waters. Personally I would be extremely upset if I made significant contributions (say two paragraphs or more) to a Wikipedia article which was copied into a book, and I was not attributed in the book. Printing a URL absolutely doesn't cut it, in my opinion, when it comes to a printed book. Phoebe and company may have gotten advice from Eben Moglen saying that this was A-OK, but quite frankly I think he was both ethically and legally wrong. I don't think you can draw any conclusions that this practice is an accepted one. There just aren't that many dead-tree distributors doing this. As far as I know I haven't made significant contributions to that book, though. So that's someone else's fight to fight.
I do feel like I need to speak up here, though, because the suggestion that I have waived my right to attribution is an absolutely false one.
On Wed, Oct 22, 2008 at 9:43 AM, Robert Rohde rarohde@gmail.com wrote: [snip]
Why do you want attribution of work you have done on Wikipedia articles to be acknowledged more prominently in dead tree media than it is online?
[snip]
I'm not stating my opinion on Anthony's position at this time, but I do not think he is asking for additional attribution.
On Wikipedia attribution is "on the next page", it's just over on the history tab. This is analogous to including attribution at the tail of a dead-tree article, or perhaps in a separate authors index. It is exactly analogous to providing attribution in a location which is certainly not immediately accessible to the reader, and which is potentially completely inaccessible. (For practical reasons it may not be possible to provide an equivalent, as dead-tree is not an equivalent medium, but this fact doesn't make a URL the equivalent or even the nearest fit.)
I expect this discrepancy to become more obvious as tools like automatic text attribution make it easier to ignore vandalism, copy-editing, and removed changes in the article history. Addressing this concern well is important even if your position isn't the same as Anthony's.
You should probably focus more on what present law actually says than on what someone wants to believe they can do. It is more interesting to see what Norwegian law says on this matter than to try to fight against the law.
Åndsverksloven § 3. Opphavsmannen har krav på å bli navngitt slik som god skikk tilsier, så vel på eksemplar av åndsverket som når det gjøres tilgjengelig for almenheten.
"The creator of the work has a right to be named as good practice dictates, both on each copy of the work and when it is made available to the general public."
Later it says: Sin rett efter første og annet ledd kan opphavsmannen ikke fraskrive seg, med mindre den bruk av verket som det gjelder, er avgrenset efter art og omfang.
"The author cannot waive the rights under the first and second paragraphs, unless the use of the work in question is limited in nature and scope."
Proper attribution in Norwegian law can be said to be covered by a reference to the correct page on Wikipedia. Attributing Wikipedia as such is probably not completely correct, even if that is customary in newspapers in Norway. It is nevertheless the common thing to do: it has become good practice, and it is done likewise in other similar publications. A legal alternative is to credit the principal authors or some publication that has the same article.
The law says you should be attributed on each copy of the work; still, if the copy is limited in nature and scope, the attribution can be dropped. Now, is a printed copy of a single article from Wikipedia limited in such a way? My guess is that it is, and a reference to Wikipedia is sufficient. On a printed copy of the whole of Wikipedia, a reference to Wikipedia's crediting system is probably sufficient. That is, the printed copy (the book) is limited to Wikipedia, so the crediting system on Wikipedia is used to solve the attribution for this "limited nature and scope".
It's the necessity of identifying a publication that creates some of the problems, that is, pointing the reader to "Wikipedia". It would be an option if, for example, the FSF or CC had some kind of identifier for each work licensed under their license. That could then be used the same way as an ISBN. It would make it possible to credit authors and identify the work through the number, for example "(Desperados, Emanuel; ''Norway'', GFDL 0123456789)". Note that this is an identifier for some broker system, not an identification of the first publisher. It is not necessary to attribute the publisher; it is only necessary to attribute the author. Still, something like "(Wikipedia/Norway)" is sufficient if there is a description of how attribution works on Wikipedia, and again "(Wikipedia)" is probably not sufficient.
If someone outside Wikipedia reuses an article from Wikipedia, then they probably have to credit the persons involved at that point, or give some kind of pointer to the correct version on Wikipedia. They should probably describe what this kind of crediting means. They could choose to make a history page of their own, but then that page should be described similarly.
Now, if they don't want to credit Wikipedia, that is, they don't want Wikipedia to attribute the authors, then they have to attribute at least the principal authors.
In a printed "The complete Wikipedia", I believe that an identifier that says "rev 1234567890" on each article is sufficient attribution if its meaning is described somewhere easy to find, including what it means when it comes to attribution of authors.
What I would like to have is a special page that generates a list of probable principal authors, a list of major authors and a list of other authors. A single list sorted by importance of contributions would be nice, as this leaves more room for judgment by the reader. If principal authors can be detected they should go in the footer on the article pages, but only if they choose to supply their full name, because this is too important for a lot of people. This has become very visible in Norway as an old paper-based lexicon has taken up the fight against Wikipedia. If someone does not provide their full name (it is in the database but not used for the moment), it should be taken as permission to not use the name in the footer but only list their user name on the special page.
Such a special page should be able to generate such lists for previous versions, not only for the present version. I.e., The complete Wikipedia's article for Norway (revision 1234567890) is identified as "Special:Attribution/Norway,1234567890". Likewise, "Special:Attribution/Norway" is the present version. This should also be linked in the footer together with any identified principal authors. Note that those numbers are our internal revisions, not some kind of ISBN-equivalent.
A full credit of an _article_ on Wikipedia would be "Wikipedia/Norway/1234567890", a sufficient credit would be "Wikipedia/Norway", and probably an insufficient one would be "Wikipedia", given that there is a description of how the attribution works on Wikipedia and given that the use of the article(s) is limited in nature and scope.
I'm not sure how this works given the GFDL license, but it seems to be within the legal boundaries for me. The overall solution is pretty much as today but with an added focus on attribution of principal authors.
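The credit levels and the page naming proposed in this message can be sketched in a few lines of Python. This is only an illustration: the function names `credit_string` and `attribution_page` are hypothetical, and "Special:Attribution" is the page proposed above, not an existing MediaWiki feature.

```python
from typing import Optional


def credit_string(title: Optional[str] = None,
                  revision: Optional[int] = None) -> str:
    """Build a credit in the 'Wikipedia/<title>/<revision>' form.

    Hypothetical helper: the format is the one proposed in this
    message, not something Wikipedia itself defines.
    """
    parts = ["Wikipedia"]
    if title is not None:
        parts.append(title)
        # A revision number only makes sense together with a title.
        if revision is not None:
            parts.append(str(revision))
    return "/".join(parts)


def attribution_page(title: str, revision: Optional[int] = None) -> str:
    """Name of the proposed (not existing) attribution special page."""
    if revision is None:
        # Without a revision, the page refers to the present version.
        return f"Special:Attribution/{title}"
    return f"Special:Attribution/{title},{revision}"


print(credit_string("Norway", 1234567890))  # full credit: Wikipedia/Norway/1234567890
print(credit_string("Norway"))              # sufficient credit: Wikipedia/Norway
print(credit_string())                      # probably insufficient: Wikipedia
print(attribution_page("Norway", 1234567890))
```

The revision number 1234567890 is the placeholder used in the message, not a real revision.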
On Wed, Oct 22, 2008 at 9:51 AM, Gregory Maxwell gmaxwell@gmail.com wrote:
On Wed, Oct 22, 2008 at 9:43 AM, Robert Rohde rarohde@gmail.com wrote: [snip]
Why do you want attribution of work you have done on Wikipedia articles to be acknowledged more prominently in dead tree media than it is online?
[snip]
I'm not stating my opinion on Anthony's position at this time, but I do not think he is asking for additional attribution.
On Wikipedia attribution is "on the next page", it's just over on the history tab. This is analogous to including attribution at the tail of a dead-tree article, or perhaps in a separate authors index. It is exactly analogous to providing attribution in a location which is certainly not immediately accessible to the reader, and which is potentially completely inaccessible. (For practical reasons it may not be possible to provide an equivalent, as dead-tree is not an equivalent medium, but this fact doesn't make a URL the equivalent or even the nearest fit.)
Well, first of all, I never said that linking is perfectly fine with me. Depending on how the link is handled, I have various degrees of disappointment. Ideally, I think online media should directly provide a list of authors. Linking to someone else's copy of a list of authors would be next best (assuming the link remains valid and provides the list of authors at the time of the copy). Linking to the Wikipedia history page is significantly worse, but right about where I'd draw the line ethically. Linking to the Wikipedia article itself is over that line.
Printing a URL which someone can use to get the list of authors if they can manage to get a computer, get internet access, type it in, etc., is not at all acceptable, for the reasons given by Gregory above. It's also not accessible because URLs go dead, and they go dead much faster than paper disintegrates. I don't think Wikipedia will be around 20 years from now, but printed copies of Wikipedia probably will be. A mirror which relies on a link to provide attribution takes the risk that the link will go down, and when that happens they have the responsibility to provide a new link or to provide the attribution directly. Dead-tree publishers aren't going to recall all the books they've printed when a URL goes down.
I'd be willing to set a threshold on who gets direct attribution. I haven't thought about it enough to say for sure, but somewhere around 50 words is probably an acceptable threshold. That's not all that much dead-tree space to deal with. Worst case scenario, if everyone wrote exactly 50 words and had a two-word-long attribution, we're talking about around 4% overhead. Of course, the tools aren't widespread to calculate that sort of thing, if they're available at all. But if that's what the rules say, then I'm sure they will be developed. And in the meantime, publishers can choose to print all names instead of calculating which names to include.
Anthony
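The worst-case estimate in the message above is simple arithmetic, shown here as a small Python sketch. The function name and its parameters are made up for illustration; the numbers (50 words per author, a two-word credit) are the ones from the message.

```python
def attribution_overhead(words_per_author: int, words_per_credit: int) -> float:
    """Extra text, as a fraction of the article text, if every credited
    author wrote exactly `words_per_author` words and each credit
    takes `words_per_credit` words."""
    if words_per_author <= 0:
        raise ValueError("words_per_author must be positive")
    return words_per_credit / words_per_author


# Worst case from the message: everyone sits exactly at the threshold.
overhead = attribution_overhead(words_per_author=50, words_per_credit=2)
print(f"{overhead:.0%}")  # 2 / 50 = 4%
```

Authors who wrote more than the threshold only lower the ratio, which is why this is the worst case.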
On Wed, Oct 22, 2008 at 8:37 AM, Anthony wikimail@inbox.org wrote:
On Wed, Oct 22, 2008 at 9:51 AM, Gregory Maxwell gmaxwell@gmail.com wrote:
On Wed, Oct 22, 2008 at 9:43 AM, Robert Rohde rarohde@gmail.com wrote: [snip]
Why do you want attribution of work you have done on Wikipedia articles to be acknowledged more prominently in dead tree media than it is online?
[snip]
I'm not stating my opinion on Anthony's position at this time, but I do not think he is asking for additional attribution.
On Wikipedia attribution is "on the next page", it's just over on the history tab. This is analogous to including attribution at the tail of a dead-tree article, or perhaps in a separate authors index. It is exactly analogous to providing attribution in a location which is certainly not immediately accessible to the reader, and which is potentially completely inaccessible. (For practical reasons it may not be possible to provide an equivalent, as dead-tree is not an equivalent medium, but this fact doesn't make a URL the equivalent or even the nearest fit.)
Well, first of all, I never said that linking is perfectly fine with me. Depending on how the link is handled, I have various degrees of disappointment. Ideally, I think online media should directly provide a list of authors. Linking to someone else's copy of a list of authors would be next best (assuming the link remains valid and provides the list of authors at the time of the copy). Linking to the Wikipedia history page is significantly worse, but right about where I'd draw the line ethically. Linking to the Wikipedia article itself is over that line.
<snip>
As I suggested before, though less directly, unless Wikipedia directly provides a quotable list of authors, I don't see any reason to expect that other publishers should be prepared or required to create one. They could copy the entire history, though many people acknowledge that this goes over to the absurd for very long articles. Arguably providing a list of "principal" authors is a technically solvable problem for Wikipedia with appropriate tools, though as Phoebe notes there are serious questions about how one defines significant authorship given the fluid nature of wikitext and the different varieties of editing Wikipedians do.
For the long articles:
"Multiple Authors. 'Earth' retrieved from Wikipedia on Jan 1, 2008. http://en.wikipedia.org/wiki/Earth"
is about the level of acknowledgment that I would expect to see currently.
There are several problems with that. I would say there should also be (at least) a revision id, a reference to the history page, and statements about free content. But unless we can agree on the structure we want to see in acknowledgments from reusers, then I don't expect them to do much better than the above.
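As a rough sketch of what such a fuller acknowledgment could look like, the Python below adds a revision id, a history link and a license statement to the citation quoted above. The function name, the wording of the output and the revision id are assumptions made for illustration; only the `title`, `oldid` and `action=history` query parameters are standard MediaWiki URL parameters.

```python
from urllib.parse import urlencode

BASE = "http://en.wikipedia.org/w/index.php"


def build_acknowledgment(title: str, revision_id: int, retrieved: str,
                         license_name: str = "GFDL") -> str:
    """Format an acknowledgment with a permanent link to the exact
    revision and a link to the article history (hypothetical wording)."""
    permalink = f"{BASE}?{urlencode({'title': title, 'oldid': revision_id})}"
    history = f"{BASE}?{urlencode({'title': title, 'action': 'history'})}"
    return (
        f"Multiple Authors. '{title}' (revision {revision_id}) "
        f"retrieved from Wikipedia on {retrieved}. {permalink} "
        f"Full list of authors: {history} "
        f"Text available under the {license_name}."
    )


# The revision id below is a placeholder, not a real revision of 'Earth'.
print(build_acknowledgment("Earth", 1234567890, "Jan 1, 2008"))
```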
Part of agreeing on a structure for reusers could be agreeing on a framework for who should be listed as authors, but until we have a standardized way of providing that information in a useful form, I am mostly surprised when publishers bother to list any authors with a specific acknowledgment at all. The more direct point is that the free content movement should not be expecting other people to solve the authorship problem if we ourselves are unable to do so.
So I welcome the discussion of where to draw lines on authorship, if you really think it is possible to do so. However, I personally am rather skeptical about the ability to have a practical set of rules for defining an author list in a way that would actually satisfy the majority of people in the majority of cases.
-Robert Rohde
Robert Rohde wrote:
To Anthony and Jussi-ville,
Why do you want attribution of work you have done on Wikipedia articles to be acknowledged more prominently in dead tree media than it is online?
That's the sense I get from you when you say that referencing an online publication of the history is not okay.
Surprisingly enough, I cannot speak for Anthony. For myself I would in fact probably like to soften my stance a mite. A link to a web page, if it is specific enough (such as the history of the article), may well be a - not ideal / but will do in a pinch - solution.
What has weakened my opposition to this approach, is that I thought of software distributions which only provide source on demand, and are considered compliant with a non-expansive interpretation of open source.
Perhaps there is a good argument for not trying to be more catholic than the pope.
If one looks at the Wikipedia publication, in general one has to choose to seek out the edit history and (in many cases) put effort into parsing through it before they would even notice that you had contributed significantly to the article. You seem to be suggesting that in the case of dead tree media you have an expectation that attribution be made clearer/easier to access than it is online. Is that a correct understanding of your view point? And if so why?
Having said what I did above, there is one valid argument that would favor providing clearer attribution in a fixed medium publication of wikipedia content, than is on the editable site. That is that ostensibly (yes, I do realize it is in part a fiction) wikipedia is "merely" a work in progress, and not to be used as a finished reference work. A scratch pad as it was originally termed, for Nupedia.
But though I find this argument very persuasive, it is clearly an ethical/editorial one, and not a legal one.
Personally, it feels antithetical to the principles of free content and frankly a bit unethical to demand that reusers give a more prominent acknowledgment to contributors than one receives from the primary publication, i.e. Wikipedia.
Well, like I said, that is assuming wikipedia is a publication rather than a website for the collaborative editing of content that can be used by others for fashioning finished publications.
Do remember that wikipedia relies on the distinction of not being a publisher, for that legal protection under that clause that I forget the number of... 230 or something of some statute or law or another.
Yours,
Jussi-Ville Heiskanen
On Wed, Oct 22, 2008 at 10:26 AM, Jussi-Ville Heiskanen cimonavaro@gmail.com wrote:
For myself I would in fact probably like to soften my stance a mite. A link to a web page, if it is specific enough (such as the history of the article), may well be a - not ideal / but will do in a pinch - solution.
What has weakened my opposition to this approach, is that I thought of software distributions which only provide source on demand, and are considered compliant with a non-expansive interpretation of open source.
That's my perspective for history information, as well as access to the preferred form for editing: they are 'source code'. I'd even drawn the same parallel to software.
GPLv3 even relaxes the requirement for distributors to provide future access to source in some cases (verbatim reproductions; though I'd expect different rules in a free content license). A parallel structure, along with really strong and permissive excerpting rules, would create great justice in a future free content license.
I also hold the same view for attribution, *but only* in cases where the most correct attribution is either too complex for reasonable reproduction in some media (sometimes true for Wikipedia text), or not easily available (always true for Wikipedia text as things stand today).
A good practice, perhaps one worth codifying in a future free content license, might be to make it clear that the URL is not the author with an attribution like: "Multiple Authors (http://en.wikipedia.org/wiki/somearticle)". This method makes it clear that the complete attribution was omitted for brevity, and not as a claim that "Wikipedia" wrote the article (an outrageous claim in some cases, especially for works which originated in other compatibly licensed locations).
For something like the reproduction of an isolated common photograph with a single author, a failure to directly make available the name of the author would be surprising and inconsistent with common practice, as well as unnecessary. So it shouldn't be done there. (Nor should it be done for the frequent case of Wikipedia articles with single effective authors, but we currently have no way of easily identifying them and I do not think it's reasonable to place that burden on the reusers. I think this is a burden that should be shifted somewhat towards authors: if you do not make your attribution clear, don't expect other people to name you.)
On Wed, Oct 22, 2008 at 4:30 AM, Anthony wikimail@inbox.org wrote:
On Tue, Oct 21, 2008 at 10:44 PM, Michael Snow wikipedia@verizon.net wrote:
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement.
For the title page, sure. But the basic practice on Wikipedia is to list the username of every single edit in the page history.
As for online sources, I think there are a lot of people upset about the practices of these "subsequent distributors", but for the most part it's just not worth it to sue them. I suppose it'd be enlightening to send a DMCA takedown notice to a few of the big names, but even that takes quite a bit of effort, and for online sources it's fairly pointless. I might have done it myself by now, except that I changed my username to the generic "Anthony", in part because for a lot of the articles I've contributed to I actually would prefer *not* to be associated as an author. Of course, I've also largely stopped contributing.
For dead-tree distributors, this is mostly untested waters. Personally I would be extremely upset if I made significant contributions (say two paragraphs or more) to a Wikipedia article which was copied into a book, and I was not attributed in the book. Printing a URL absolutely doesn't cut it, in my opinion, when it comes to a printed book. Phoebe and company may have gotten advice from Eben Moglen saying that this was A-OK, but quite frankly I think he was both ethically and legally wrong. I don't think you can draw any conclusions that this practice is an accepted one.
Just a few points: 1) there *isn't* really an accepted practice, which is why we're having this discussion. There just haven't been that many test cases -- there have been very few attempts to reprint Wikipedia content in large scale in print, rather than on another website where standard practice has been to link back to Wikipedia.
2) For HWW, I think everything we used from Wikipedia would qualify under fair use anyway -- we quoted few pages verbatim or at length, so hopefully we're good for you and anyone else who disagrees on that score.
3) For our book particularly -- if you can't get to a computer and type in a URL, it's a pretty useless piece of dead-tree anyway, since it's all about how to use Wikipedia online :P Of course that won't be true for article collection reprints.
4) When you say "significant contributions", that's the sticking point for me. What's significant? A first draft of an article that people then change completely? One paragraph? Two? What about adding paragraphs that are subsequently removed and are not present at the time of quoting the article? Adding some references? Any major edit? Repeated vandalism reversal over time? It seems to me that this is such a loose concept that might be interpreted so differently by various editors that the reprinter is pretty much stuck with an all-or-nothing approach -- either you print all the editors in tiny type, which actually obscures the major contributors to an article, or you use some sort of metric or value judgment in picking out significant contributors, which seems like will always be wrong in some way.
-- phoebe
On Wed, Oct 22, 2008 at 11:48 AM, phoebe ayers phoebe.wiki@gmail.com wrote:
On Wed, Oct 22, 2008 at 4:30 AM, Anthony wikimail@inbox.org wrote:
For dead-tree distributors, this is mostly untested waters. Personally I would be extremely upset if I made significant contributions (say two paragraphs or more) to a Wikipedia article which was copied into a book, and I was not attributed in the book. Printing a URL absolutely doesn't cut it, in my opinion, when it comes to a printed book. Phoebe and company may have gotten advice from Eben Moglen saying that this was A-OK, but quite frankly I think he was both ethically and legally wrong. I don't think you can draw any conclusions that this practice is an accepted one.
Just a few points:
1) there *isn't* really an accepted practice, which is why we're having this discussion.
Absolutely agreed.
2) For HWW, I think everything we used from Wikipedia would qualify under fair use anyway -- we quoted few pages verbatim or at length, so hopefully we're good for you and anyone else who disagrees on that score.
Well, my one statement was qualified with an "if", "if I made significant contributions". My other statement was regarding Eben Moglen, who you said "felt that as long as our metric was consistent and we linked back to the history on Wikipedia, citing all the authors of every article in our print version was not necessary". Maybe he made this comment knowing that the amount quoted was insignificant, in which case I withdraw my statement.
But at the same time, "fair use" may be an excuse for copying, but it isn't an excuse for lack of attribution.
3) For our book particularly -- if you can't get to a computer and type in a URL, it's a pretty useless piece of dead-tree anyway, since it's all about how to use Wikipedia online :P Of course that won't be true for article collection reprints.
I don't buy that as an excuse.
4) When you say "significant contributions", that's the sticking point for me. What's significant? A first draft of an article that people then change completely? One paragraph? Two?
That's a grey area obviously, but I suggested maybe two paragraphs.
What about adding paragraphs that are subsequently removed and are not present at the time of quoting the article? Adding some references? Any major edit? Repeated vandalism reversal over time?
Anything removed shouldn't count. Adding references probably lacks the creative expression necessary for copyright protection.
It seems to me that this is such a loose concept that might be interpreted so differently by various editors that the reprinter is pretty much stuck with an all-or-nothing approach -- either you print all the editors in tiny type, which actually obscures the major contributors to an article, or you use some sort of metric or value judgment in picking out significant contributors, which seems like will always be wrong in some way.
Life (especially with regard to the law and ethics) works that way some times. But just because it's difficult for you to determine exactly where the line is, that doesn't excuse you from clearly crossing it.
Try applying your excuse that it's all or nothing to a few other situations and you'll see how ridiculous it is. Should I drink myself into oblivion because I can't quantify exactly how many beers is too many?
On Wed, Oct 22, 2008 at 9:10 AM, Anthony wikimail@inbox.org wrote:
On Wed, Oct 22, 2008 at 11:48 AM, phoebe ayers phoebe.wiki@gmail.com wrote:
<snip, including some sarcasm that was lost on the replier. Moving on!>
It seems to me that this is such a loose concept that might be interpreted so differently by various editors that the reprinter is pretty much stuck with an all-or-nothing approach -- either you print all the editors in tiny type, which actually obscures the major contributors to an article, or you use some sort of metric or value judgment in picking out significant contributors, which seems like will always be wrong in some way.
Life (especially with regard to the law and ethics) works that way some times. But just because it's difficult for you to determine exactly where the line is, that doesn't excuse you from clearly crossing it.
Actually... my all or nothing phrasing seems to have muddied the waters. It seems to me the options are, when reprinting a Wikipedia article in a book:
1. cite everybody who ever touched the article
2. cite some of the people who touched the article
3. provide a link back to a comprehensive list of everyone who ever touched the article, which also has the benefits of handy diffs so you can see who added what, etc.
Option 1) has the advantage that there are no questions and no judgment that need be applied to the list by the reprinters -- here's all the authors, plain and simple. It has the disadvantages that it is technically difficult to get (there's no clean way currently to get a de-duped history dump for a particular article, hopefully this will change in the future), difficult to work with (we're talking thousands of names for big articles), and arguably obscures the major contributors (check out one of the Wikitravel readers for a great example of this -- in a history section with a long list of all authors, a la Bertelsmann, "Mike" is cited, then the next name on the list is "Mike_sucks." Hmm, I wonder who made more constructive contributions?)
Option 2) has the advantage that you actually (hopefully) highlight the primary contributors to a piece. It has the disadvantage that it's incredibly difficult to figure out a metric for who the primary contributors actually are, and once you've done that, technically producing the list is also hard (to nearly impossible for an average person without the ability to write history-mining scripts or go through the whole thing by hand, if you use a value-laden judgment like "x quantity of significant writing").
Option 3) has the advantage that it's simple, easy to apply consistently and you don't have to worry about getting all the authors listed -- it's already done for you. It's also much more practical for short applications, e.g. reprinting an article in a magazine. It has the disadvantage that according to some contributors (and perhaps, the current license) it doesn't give full and proper attribution to their work.
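To make the "de-duped history" step from option 1 concrete, here is a minimal Python sketch. It assumes the usernames of an article's revisions have already been obtained somehow, in chronological order; the sample list is invented for illustration, and counting edits is only a crude stand-in for any real measure of contribution.

```python
from collections import Counter
from typing import Iterable, List


def dedupe_authors(revision_authors: Iterable[str]) -> List[str]:
    """Return each username once, in order of first appearance."""
    # dict preserves insertion order, so this drops repeats while
    # keeping the chronological order of first edits.
    return list(dict.fromkeys(revision_authors))


# Invented sample history: one username per revision, oldest first.
sample_history = ["Mike", "Alice", "Mike", "Mike_sucks", "Alice", "Mike"]

authors = dedupe_authors(sample_history)
edit_counts = Counter(sample_history)

print(", ".join(authors))
for name, count in edit_counts.most_common():
    print(f"{name}: {count} edits")
```

A flat list like the first line of output treats "Mike" and "Mike_sucks" alike, which is the obscuring effect described under option 1; the per-author counts hint at what a metric for option 2 would have to improve on.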
There may be other problems and advantages that I haven't thought of yet. I'll leave other people to hash out the moral, ethical, and legal advantages of each approach. But these are the practical considerations faced by a reprinter of content. It's also important to remember, I think, that if we are trying, in general, to make reprinting and reuse not just possible but smooth and easy that adds a consideration to the problem. For my part, I think we need to think carefully about this problem and come up with a good solution for the sake of free content distribution in general -- producing content that can be reused is a fundamental part of Wikimedia's mission, so let's do it right.
-- phoebe
phoebe ayers wrote:
1. cite everybody who ever touched the article
2. cite some of the people who touched the article
3. provide a link back to a comprehensive list of everyone who ever touched the article, which also has the benefits of handy diffs so you can see who added what, etc.
...
There may be other problems and advantages that I haven't thought of yet. I'll leave other people to hash out the moral, ethical, and legal advantages of each approach. But these are the practical considerations faced by a reprinter of content. It's also important to remember, I think, that if we are trying, in general, to make reprinting and reuse not just possible but smooth and easy that adds a consideration to the problem. For my part, I think we need to think carefully about this problem and come up with a good solution for the sake of free content distribution in general -- producing content that can be reused is a fundamental part of Wikimedia's mission, so let's do it right.
It all comes down to the risk tolerance of the person doing the printing. I would be satisfied with including a printed link to the relevant Wikipedia history page. This would satisfy what I believe to be my ethical responsibilities in the matter.
When you bring it down to basics you arrive at the core issue, producing re-usable content. On the way there we get diverted by trying to have the language just right, but each elaboration of language brings new vulnerabilities to the fundamental principle. People seem to read laws in a way that puts them at maximum disadvantage. In an attempt to abide by the literal word of the law (which includes private rules) they imagine circumstances that can only remind us of the boys in the George Carlin skit trying to befuddle the aging priest with hypothetical sins. We tie ourselves in knots trying to find legal countermeasures to aspects of the law whose interpretation is tenuous at best.
Maybe it just takes a plain language statement of what we believe.
Ec
Anthony wrote:
On Wed, Oct 22, 2008 at 11:48 AM, phoebe ayers phoebe.wiki@gmail.com wrote:
It seems to me that this is such a loose concept that might be interpreted so differently by various editors that the reprinter is pretty much stuck with an all-or-nothing approach -- either you print all the editors in tiny type, which actually obscures the major contributors to an article, or you use some sort of metric or value judgment in picking out significant contributors, which seems like will always be wrong in some way.
Life (especially with regard to the law and ethics) works that way some times. But just because it's difficult for you to determine exactly where the line is, that doesn't excuse you from clearly crossing it.
That's a self-contradictory statement. If you can't determine exactly where the line is there is nothing clear about having crossed it.
Ec
On Thu, Oct 23, 2008 at 12:46 AM, Ray Saintonge saintonge@telus.net wrote:
Anthony wrote:
On Wed, Oct 22, 2008 at 11:48 AM, phoebe ayers phoebe.wiki@gmail.com wrote:
It seems to me that this is such a loose concept that might be interpreted so differently by various editors that the reprinter is pretty much stuck with an all-or-nothing approach -- either you print all the editors in tiny type, which actually obscures the major contributors to an article, or you use some sort of metric or value judgment in picking out significant contributors, which seems like will always be wrong in some way.
Life (especially with regard to the law and ethics) works that way some times. But just because it's difficult for you to determine exactly where the line is, that doesn't excuse you from clearly crossing it.
That's a self-contradictory statement. If you can't determine exactly where the line is there is nothing clear about having crossed it.
Sure there is. There's clearly right, there's clearly wrong, and then there's a grey area in-between. You know the line is in the grey area, but you're not sure exactly where.
2008/10/22 Michael Snow wikipedia@verizon.net:
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement. If the authors can make this implicit release, then you have to look at whatever attribution is customary in a given context, along with any moral rights issues.
In any case, this discussion has already reached the stage of counting angels dancing on the heads of pins and assuming that law is as brittle as computer code. It just ain't so.
The threat model we're talking about is: what does a reuser say if taken to court by an insane and obsessive author? Would a judge consider the reuser's actions reasonable, given accepted behaviour regarding said licence to date? That sort of squishy, arguable, grey area thing.
- d.
David Gerard wrote:
2008/10/22 Michael Snow wikipedia@verizon.net:
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement. If the authors can make this implicit release, then you have to look at whatever attribution is customary in a given context, along with any moral rights issues.
In any case, this discussion has already reached the stage of counting angels dancing on the heads of pins and assuming that law is as brittle as computer code. It just ain't so.
The threat model we're talking about is: what does a reuser say if taken to court by an insane and obsessive author? Would a judge consider the reuser's actions reasonable, given accepted behaviour regarding said licence to date? That sort of squishy, arguable, grey area thing.
There is no inoculation to prevent insanity and obsession. Whatever model is chosen can provide opportunities for the litigious. Thus if we go with the five principal authors, what's to prevent number six from arguing that he should be in the top five?
In the general case I think that any reuser who exercises a modicum of good faith and due diligence will likely be safe. Accepted behaviour will also be influenced by past practice, including the chronic failure of rights owners (not WMF) to protect their own rights.
Ec
On Thu, Oct 23, 2008 at 2:39 AM, Ray Saintonge saintonge@telus.net wrote:
David Gerard wrote:
2008/10/22 Michael Snow wikipedia@verizon.net:
I might add that the attribution requirement of the GFDL talks about listing at least five principal authors, "unless they release you from this requirement." A fairly straightforward argument can be made that existing and accepted practice on Wikipedia, and for that matter on nearly all wikis, amounts to releasing subsequent distributors from this requirement. If the authors can make this implicit release, then you have to look at whatever attribution is customary in a given context, along with any moral rights issues.
In any case, this discussion has already reached the stage of counting angels dancing on the heads of pins and assuming that law is as brittle as computer code. It just ain't so.
The threat model we're talking about is: what does a reuser say if taken to court by an insane and obsessive author? Would a judge consider the reuser's actions reasonable, given accepted behaviour regarding said licence to date? That sort of squishy, arguable, grey area thing.
There is no inoculation to prevent insanity and obsession. Whatever model is chosen can provide opportunities for the litigious. Thus if we go with the five principal authors, what's to prevent number six from arguing that he should be in the top five?
In the general case I think that any reuser who exercises a modicum of good faith and due diligence will likely be safe. Accepted behaviour will also be influenced by past practice, including the chronic failure of rights owners (not WMF) to protect their own rights.
Going with "the five principal authors" is a terrible idea both from the standpoint of avoiding litigation and from the standpoint of protecting the right to attribution. Of course, the GFDL doesn't mention "the five principal authors", it mentions "five of the principal authors", which may seem like a small difference in the English language, but it represents an enormous difference in terms of meaning. Of course, this phrasing is even worse from the standpoint of protecting the right to attribution, because it means essentially that no one writing an article with six principal authors has a right to attribution. But really, setting a limit to the number of principal authors is meaningless anyway, because *anyone can modify the text without permission*, so even if you work your ass off and produce a 10,000 word text, all a reuser has to do is take 5 other 10,001 word texts, append them to the end, and now you get no attribution at all.
Of course, the phrase "five of the principal authors" only occurs in the GFDL when talking about the title page. This whole section should probably be eliminated, because it offers no protection to authors and only invites litigation - maybe it could be turned into a strong suggestion. Fortunately, there is at least an argument that all authors need to be included in the section entitled History. Of course, there's still the problem, which is fairly specific to wikis, of how to define "all authors". I'd say here that the most expansive view of this would be all logged-in authors who have contributed more than a de minimis amount of copyrightable expression to the final end-product. That's a real-life definition, which maximizes the protection of the right to attribution, but perhaps invites litigation. Even then I'm not so sure. I think most judges would handle a borderline case of this nature and award nominal damages if any. Of course, the drop-off-the-cliff clause of the GFDL that any violation of it results in an immediate revocation of the license needs to be removed.
Maybe that definition is too expansive for Wikipedia, but I'm not going to say this for sure until I see some hard numbers on it. What is the ratio of characters of attribution to characters of text if we include the names of any logged-in non-reverted authors?
Only attributing "the five principal authors" is utterly unacceptable. Only attributing "five of the principal authors" is utterly unacceptable. Any attribution clause which doesn't ensure the attribution of *all* significant contributors, is unacceptable. Within that framework I think there are a lot of reasonable solutions.
Anthony
On Thu, Oct 23, 2008 at 1:58 PM, Anthony wikimail@inbox.org wrote:
has a right to attribution. But really, setting a limit to the number of principal authors is meaningless anyway, because *anyone can modify the text without permission*, so even if you work your ass off and produce a 10,000 word text, all a reuser has to do is take 5 other 10,001 word texts, append it to the end, and now you get no attribution at all. ... Only attributing "the five principal authors" is utterly unacceptable. Only attributing "five of the principal authors" is utterly unacceptable. Any attribution clause which doesn't ensure the attribution of *all* significant contributors, is unacceptable. Within that framework I think there are a lot of reasonable solutions.
I was reading this thread (more or less) carefully and I was wondering how it is possible that the discussion was heading toward attributing only five persons for the whole of Wikipedia (or for some part of it, no matter). So, thanks for mentioning this.
I can just imagine the ironic smile of a friend of mine, a copyright lawyer from Serbia, asking: Would it pass in court? :) At least in Serbia, it would be treated as a typical example of attempted fraud based on a weird interpretation of a license (or of any legal document), or of a "false contract" (something in the sense of: "See, I killed him because we signed a contract saying that I may kill him!").
However, I really think that we would come to a dead end if we insist that every ~300-page book has to print 100 (or 1000) more pages of contributors. This is not in question, it is just a matter of time: maybe it is true even today, maybe it will not be true for the next 5 years, but it will surely become our reality.
So, some way of solving this problem has to be found. I mentioned in my first post in this thread that some kind of "hard copy links", like web links to the history of the page on Wikipedia, may be used instead of printing all the names in the book. Maybe it should be defined that if the list of authors is longer than 10% of the book size, then for the rest of them the book has to refer to the (mentioned) bibliography.
And this is something which the license has to solve. After solving that issue in the license, we would have to convince continental legal systems that such a solution is reasonable.
And, of course, I am sure that others have other ideas about how to address this problem.
On Thu, Oct 23, 2008 at 8:57 AM, Milos Rancic millosh@gmail.com wrote:
However, I really think that we would come into a dead end if we insist that every ~300 pages book has to print 100 (or 1000) more pages of contributors. It is not a questionable issue, it is just a matter of time: it is, maybe, true even today, it could be no true for the next 5 years, but it will become our reality for sure.
No, it really isn't possible. For a 300 page book to require 100 pages of authors, each author could only have contributed 3 times as many characters as their user name. Unless you're going to count vandals or vandal-reverters as authors, it just isn't going to happen.
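As a rough check on that arithmetic, here is a minimal Python sketch. It assumes the credits are a plain run of user names with no separators and that every author is listed exactly once; the function name and the 15-character average name length are illustrative assumptions, not figures from this message.

```python
def avg_contribution_chars(text_pages, credit_pages, name_chars):
    """Average characters of text per credited author.

    authors         = credit_pages * chars_per_page / name_chars
    text per author = text_pages * chars_per_page / authors
    The page size cancels out, leaving the expression below.
    """
    return text_pages * name_chars / credit_pages


# 300 pages of text, 100 pages of credits, names averaging 15 characters
# (the name length is an assumed value for illustration).
per_author = avg_contribution_chars(300, 100, 15)
print(per_author)              # characters of text per author
print(per_author / 15)         # ratio of text to name length
```

Under these assumptions the ratio of text to name length is simply text_pages / credit_pages, whatever the page size or name length.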
Anthony
On Thu, Oct 23, 2008 at 3:35 PM, Anthony wikimail@inbox.org wrote:
On Thu, Oct 23, 2008 at 8:57 AM, Milos Rancic millosh@gmail.com wrote:
However, I really think that we would come into a dead end if we insist that every ~300 pages book has to print 100 (or 1000) more pages of contributors. It is not a questionable issue, it is just a matter of time: it is, maybe, true even today, it could be no true for the next 5 years, but it will become our reality for sure.
No, it really isn't possible. For a 300 page book to require 100 pages of authors, each author could only have contributed 3 times as many characters as their user name. Unless you're going to count vandals or vandal-reverters as authors, it just isn't going to happen.
Imagine that someone is making a 300-page book about the countries of the world, based on Wikipedia articles. All the basic Wikipedia articles about countries (~200) together have, of course, many more than 300 pages. They may have even 2000 pages. But someone wants to use Wikipedia articles to make a shorter book about the subject. The author of the book would probably use the introductions, as well as some other parts of the articles. So, the author cannot even try to count who contributed to the introduction; he has to count on the article as a whole.
If I counted well, the article about France has between 8,000 and 9,000 edits up to this moment. I think that it is reasonable to suppose that this article will have 100 distinct and significant authors -- if not now -- then in 5 or 10 years.
I am reading now a B5 format book with ~40x70=2800 characters per page.
One name has, let's say, 15 characters (btw, I am sure that we will demand listing real names if they are available, not just user names; as I said before, some kind of user box may be used for that). 100 names would consume 1,500 characters (let's say 1,400, half a page). 200 articles about countries with 100 distinct names per article means that the list will be 100 pages long. Even 50 is a lot (if we assume that not all articles about countries have as many contributors as the article about France would have).
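The estimate above can be reproduced with a short Python sketch. It uses only the figures given here (40x70 characters per page, 15 characters per name, 100 names per article, 200 articles); the function name is hypothetical, and the sketch assumes names are run together with no separators and no names are shared between articles, so it is an upper bound.

```python
CHARS_PER_PAGE = 40 * 70   # B5 page from the message above: 2800 characters
NAME_CHARS = 15            # assumed average length of one name


def credit_pages(names, chars_per_page=CHARS_PER_PAGE, name_chars=NAME_CHARS):
    """Pages needed to print `names` names run together with no separators."""
    return names * name_chars / chars_per_page


per_article = credit_pages(100)        # one country article, 100 names
whole_book = credit_pages(200 * 100)   # 200 articles, no names shared

print(f"per article: {per_article:.2f} pages")
print(f"whole book:  {whole_book:.0f} pages")
```

Any overlap between the author lists of different articles would shrink the total, since a name would only need to be printed once.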
And the numbers will only keep rising.
Of course, we may tell such authors to research every single page and find which contributions are still in the article and which are not. So, instead of working on the subject, the author would have to analyze contributions for more than a year (I am not sure that I would be able to analyze the article about France in one working day, even assuming a number of [existing and non-existing] tools for that).
It is simply not reasonable, and it does not serve our goal of spreading free knowledge.
However, I really agree with you that all significant contributors should be attributed.
On Thu, Oct 23, 2008 at 4:29 PM, Milos Rancic millosh@gmail.com wrote:
Of course, we may tell to such authors to make a research for every single page and to find which contributions are still inside of the article and which are not. So, instead of working on the matter, author would have to analyze contributions for more than year (I am not sure that I am able to make analysis of the article about France in one working day; even if I assume a number of [existing and non-existing] tools for that).
It is, simply, not reasonable; as well as it is not toward our goal to spread free knowledge.
However, I really agree with you that all significant contributors should be attributed.
Although it would not solve the problem for your hypothetical writer, I think for the general case it would be good for us to _provide_ this information with the article - either on the article page, or on the history page, or maybe somewhere else (but I would prefer the first, or if that doesn't work, the second). The information could be created automatically from the history file, and a kind of bot could slowly go over the articles to update it, giving each user's contribution to a page a number, stored in the database. When a page (or history page) is then shown, all users with either more than X contribution, or more than Y% of the total contribution, or among the Z (5) largest contributors would be shown (with a quick-and-dirty version of the algorithm to get a 'maximum' contribution for those who contributed to the page after the last time the information was updated). It might not be of much use to your writer, who would still have 200 lists of 10 or 20 names to deal with (still, 4,000 names, many of them duplicates, is much more manageable than 20,000 of them), but for more reasonable cases where whole pages or large portions of pages are used, it could give a good indication of which names to include and which not to include.
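The selection rule described above (more than X, or more than Y% of the total, or among the Z largest) might look roughly like this in Python. It is a sketch only: the function name, the sample numbers, and the idea that each user's contribution is already available as a single stored number are assumptions for illustration, and how that number is computed is left open.

```python
def credited_users(contrib, min_amount, min_share, top_n=5):
    """Return users to show as authors, largest contribution first.

    contrib    -- dict mapping user name to a stored contribution score
    min_amount -- the absolute threshold "X"
    min_share  -- the relative threshold "Y", as a fraction (0.05 for 5%)
    top_n      -- the "Z" largest contributors are always shown
    """
    total = sum(contrib.values())
    # Sort by descending score, breaking ties by name for a stable order.
    ranked = sorted(contrib, key=lambda user: (-contrib[user], user))
    top = set(ranked[:top_n])
    return [
        user for user in ranked
        if contrib[user] > min_amount
        or (total > 0 and contrib[user] / total > min_share)
        or user in top
    ]


# Hypothetical scores for one page.
scores = {"A": 5000, "B": 1200, "C": 300, "D": 40, "E": 10, "F": 5, "G": 2}
print(credited_users(scores, min_amount=1000, min_share=0.05, top_n=3))
```

In this example user C passes neither threshold and is listed only because of the top-Z rule, which shows why the three conditions are combined with "or".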
On Thu, Oct 23, 2008 at 6:57 PM, Andre Engels andreengels@gmail.com wrote:
Although it would not solve the problem for your hypothetical writer, I think for the general case it would be good for us to _provide_ this information with the article - either on the article page, or on the history page, or maybe somewhere else (but I would prefer the first, or if that doesn't work, the second). The information could be created automatically from the history file, and a kind of bot could slowly go over the articles to update it, giving each user's contribution to a page a number, stored in the database. When a page (or history page) is then shown, all users with either more than X contribution, or more than Y% of the total contribution, or among the Z (5) largest contributors would be shown (with a quick-and-dirty version of the algorithm to get a 'maximum' contribution for those who contributed to the page after the last time the information was updated). It might not be that much use to your writer, who still would have 200 lists of 10 or 20 names to deal with (still, 4000 names, many of them duplicates is much more manageable than 20.000 of them), but for more reasonable cases where whole pages or large portions of pages are used, it could give a good indication of which names to include and not to include.
Yes, it would be good to have such a tool as the first step. It would be useful to have it even during this discussion, to get a figure for what we demand from authors who would write books based on Wikipedia.
So, as I hope that you are interested in making that 0:-) could you give numbers for, let's say, the countries [1] of the world and the species of Felidae [2]?
And, of course, we need lists of contributors:
1. Every contributor [let's say, without bots, although that may be disputable, too] with an account and with immediately not reverted edits. -- as the largest group of authors.
2-n. Other ideas which you mentioned.
It would also be good to have an approximation of the sizes of the books based on full article size (without templates and images).
[1] - Let's say, this list lists them inside of the table: http://en.wikipedia.org/wiki/List_of_countries_and_outlying_territories_by_a...
[2] - This template is good enough: http://en.wikipedia.org/wiki/Template:Felidae_nav
On Thu, Oct 23, 2008 at 8:15 PM, Milos Rancic millosh@gmail.com wrote:
And, of course, we need lists of contributors:
- Every contributor [let's say, without bots, while it may be disputable, too] with an account and with immediately not reverted edits. -- as the largest group of authors.
"with immediately not reverted edits" -> with more than immediately not reverted edits
On Thu, Oct 23, 2008 at 8:45 PM, Milos Rancic millosh@gmail.com wrote:
On Thu, Oct 23, 2008 at 8:15 PM, Milos Rancic millosh@gmail.com wrote:
And, of course, we need lists of contributors:
- Every contributor [let's say, without bots, while it may be disputable, too] with an account and with immediately not reverted edits. -- as the largest group of authors.
"with immediately not reverted edits" -> with more than immediately not reverted edits
Ah, I realized now that the first construction was good :)
On Thu, Oct 23, 2008 at 8:15 PM, Milos Rancic millosh@gmail.com wrote:
Yes, it would be good to have such tool as the first step. It would be useful to have it even during this discussion to get a figure about what do we demand from authors who would write books based on Wikipedia.
So, as I hope that you are interested in making that 0:-) may you give numbers for, let's say, countries [1] of the world and species Felidae [2].
I have already made one once (my goal being to compare a few different algorithms to see which one most corresponds to people's ideas of who actually is the author), but it seems to have gotten lost in a computer crash or something like that.
And, of course, we need lists of contributors:
- Every contributor [let's say, without bots, while it may be disputable, too] with an account and with immediately not reverted edits. -- as the largest group of authors.
2-n. Other ideas which you mentioned.
Regarding the bots, my idea would be to exclude not by being a bot, but by only looking at the actual text and images on the page. Much bot work would then be excluded because changing interwiki or changing the target of an internal link would not be counted.
On Thu, Oct 23, 2008 at 10:29 AM, Milos Rancic millosh@gmail.com wrote:
If I counted well, article about France has between 8.000 and 9.000 edits up to this moment. I think that it is reasonable to suppose that this article will have 100 distinctive and significant authors -- if not now -- then in 5 or 10 years.
I am reading now a B5 format book with ~40x70=2800 characters per page.
One name has, let's say, 15 characters (btw, I am sure that we will demand listing the names if they are available, not just user names; as I said before, some kind of user boxes may be used for that). 100 names would consume 1500 characters (let's say, 1400, a half of the page).
Half of a page for the list of authors of France. Now, I just checked, and a printed copy of the article on France takes up about 25 pages. So attribution takes up about 2% overhead, if indeed there are 100 authors like you say.
200 articles about countries with 100 distinctive names per article means that the list will be 100 pages long.
200 articles the size of [[France]], which would be a 5000 page book. I take it this is going to be split into volumes.
I'm sorry, your numbers are pulled too wildly from the air to be useful. A 300 page book about 200 countries? You're better off rewriting everything "ab initio" than copying from Wikipedia for that. The work to cull down the information into that small of a format is going to far outweigh the savings from plagiarizing the content anyway.
On Thu, Oct 23, 2008 at 1:22 PM, Anthony wikimail@inbox.org wrote:
On Thu, Oct 23, 2008 at 10:29 AM, Milos Rancic millosh@gmail.com wrote:
If I counted well, article about France has between 8.000 and 9.000 edits up to this moment. I think that it is reasonable to suppose that this article will have 100 distinctive and significant authors -- if not now -- then in 5 or 10 years.
I am reading now a B5 format book with ~40x70=2800 characters per page.
One name has, let's say, 15 characters (btw, I am sure that we will demand listing the names if they are available, not just user names; as I said before, some kind of user boxes may be used for that). 100 names would consume 1500 characters (let's say, 1400, a half of the page).
Half of a page for the list of authors of France. Now, I just checked, and a printed copy of the article on France takes up about 25 pages. So attribution takes up about 2% overhead, if indeed there are 100 authors like you say.
And *I* just checked that, and there are in fact 4077 authors (2100 IP addresses) for [[France]] on en:wp currently, according to http://vs.aka-online.de/cgi-bin/wppagehiststat.pl. This whole argument is off by an order of magnitude if you assume that only 1 in 4 authors is significant. And how do you tell precisely which of these 4000 authors is, in fact, significant?
-- phoebe
On Thu, Oct 23, 2008 at 3:37 PM, phoebe ayers phoebe.wiki@gmail.com wrote:
On Thu, Oct 23, 2008 at 1:22 PM, Anthony wikimail@inbox.org wrote:
On Thu, Oct 23, 2008 at 10:29 AM, Milos Rancic millosh@gmail.com wrote:
If I counted well, article about France has between 8.000 and 9.000 edits up to this moment. I think that it is reasonable to suppose that this article will have 100 distinctive and significant authors -- if not now -- then in 5 or 10 years.
I am reading now a B5 format book with ~40x70=2800 characters per page.
One name has, let's say, 15 characters (btw, I am sure that we will demand listing the names if they are available, not just user names; as I said before, some kind of user boxes may be used for that). 100 names would consume 1500 characters (let's say, 1400, a half of the page).
Half of a page for the list of authors of France. Now, I just checked, and a printed copy of the article on France takes up about 25 pages. So attribution takes up about 2% overhead, if indeed there are 100 authors like you say.
And *I* just checked that, and there are in fact 4077 authors (2100 IP addresses) for [[France]] on en:wp currently, according to http://vs.aka-online.de/cgi-bin/wppagehiststat.pl. This whole argument is off by an order of magnitude if you assume that only 1 in 4 authors is significant. And how do you tell precisely which of these 4000 authors is, in fact, significant?
To follow up, that's 2.5 pages of non-duplicated names, when you run them together in 10pt font. -- phoebe
On Thu, Oct 23, 2008 at 4:06 PM, phoebe ayers phoebe.wiki@gmail.com wrote:
On Thu, Oct 23, 2008 at 3:37 PM, phoebe ayers phoebe.wiki@gmail.com wrote:
On Thu, Oct 23, 2008 at 1:22 PM, Anthony wikimail@inbox.org wrote:
On Thu, Oct 23, 2008 at 10:29 AM, Milos Rancic millosh@gmail.com wrote:
If I counted well, article about France has between 8.000 and 9.000 edits up to this moment. I think that it is reasonable to suppose that this article will have 100 distinctive and significant authors -- if not now -- then in 5 or 10 years.
I am reading now a B5 format book with ~40x70=2800 characters per page.
One name has, let's say, 15 characters (btw, I am sure that we will demand listing the names if they are available, not just user names; as I said before, some kind of user boxes may be used for that). 100 names would consume 1500 characters (let's say, 1400, a half of the page).
Half of a page for the list of authors of France. Now, I just checked, and a printed copy of the article on France takes up about 25 pages. So attribution takes up about 2% overhead, if indeed there are 100 authors like you say.
And *I* just checked that, and there are in fact 4077 authors (2100 IP addresses) for [[France]] on en:wp currently, according to http://vs.aka-online.de/cgi-bin/wppagehiststat.pl. This whole argument is off by an order of magnitude if you assume that only 1 in 4 authors is significant. And how do you tell precisely which of these 4000 authors is, in fact, significant?
To follow up, that's 2.5 pages of non-duplicated names, when you run them together in 10pt font. -- phoebe
Whoops! I made a mistake. The 2.5 pages is only the first 1000 authors (I got the list from http://vs.aka-online.de/cgi-bin/wppagehiststat.pl, again). So the whole list of authors would be between 9 and 10 pages for the 25-page article.
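For comparison with the earlier 2% figure, the numbers quoted in this exchange can be plugged into a small Python sketch. It assumes the credits scale linearly at 2.5 pages per 1000 names, which is only an extrapolation from the first 1000 names; the function and constant names are hypothetical.

```python
PAGES_PER_1000_NAMES = 2.5   # measured above for the first 1000 names, 10pt
ARTICLE_PAGES = 25           # printed length of [[France]] quoted above


def pages_for_names(names, pages_per_1000=PAGES_PER_1000_NAMES):
    """Linear extrapolation of credit pages from the 1000-name sample."""
    return names * pages_per_1000 / 1000


all_authors = pages_for_names(4077)
print(f"credits for all 4077 authors: {all_authors:.1f} pages")
print(f"overhead relative to the article: {all_authors / ARTICLE_PAGES:.0%}")
```

A straight extrapolation lands at roughly ten pages, close to the 9 to 10 page estimate, so listing every author would add on the order of 40% to the printed article rather than 2%.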
Pity the person who wants to reprint [[George W. Bush]] from en:wp... it has 13228 authors (6366 IP addresses!) Sure, most of them are vandalism, but I haven't seen any tool to pull out significant revisions. Does anyone know of such a tool or script?
--phoebe
On Friday 24 October 2008 01:19:20 phoebe ayers wrote:
Pity the person who wants to reprint [[George W. Bush]] from en:wp... it has 13228 authors (6366 IP addresses!) Sure, most of them are vandalism, but I haven't seen any tool to pull out significant revisions. Does anyone know of such a tool or script?
On Wikitech-l we just had a thread, "WikiTrust and authorship", that discussed how such a tool could be made. It is doable.
On Fri, Oct 24, 2008 at 1:18 PM, Nikola Smolenski smolensk@eunet.yu wrote:
On Friday 24 October 2008 01:19:20 phoebe ayers wrote:
Pity the person who wants to reprint [[George W. Bush]] from en:wp... it has 13228 authors (6366 IP addresses!) Sure, most of them are vandalism, but I haven't seen any tool to pull out significant revisions. Does anyone know of such a tool or script?
On Wikitech-l we just had thread WikiTrust and authorship that discussed how such a tool could be made. It is doable.
For copyright attribution purposes? Show me.
Most greedy "auto-attributing" code I've seen has a tendency to incorrectly attribute text in cases of simple re-ordering. It's reasonable enough for measuring the text churn rate in articles, and it may be good enough as a starting point for attribution, but if there is no way to correct it when it's wrong then it probably can't be used for that purpose. (Also, consider the case where half of an article is copy-and-paste moved from another article.) Not that it shouldn't be done, but I don't expect it could replace other past proposals, such as adding a second 'talk' page entitled "Credits".
On Friday 24 October 2008 20:44:31 Gregory Maxwell wrote:
On Fri, Oct 24, 2008 at 1:18 PM, Nikola Smolenski smolensk@eunet.yu wrote:
On Friday 24 October 2008 01:19:20 phoebe ayers wrote:
Pity the person who wants to reprint [[George W. Bush]] from en:wp... it has 13228 authors (6366 IP addresses!) Sure, most of them are vandalism, but I haven't seen any tool to pull out significant revisions. Does anyone know of such a tool or script?
On Wikitech-l we just had thread WikiTrust and authorship that discussed how such a tool could be made. It is doable.
For copyright attribution purposes? Show me.
Most greedy "auto-attributing" code I've seen has a tendency to incorrectly attribute text in cases of simple re-ordering. It's
That isn't the biggest of our concerns: it is acceptable to have an occasional false positive (a person who didn't make significant edits is listed among the authors) rather than a false negative (a person who did make significant edits is not listed among the authors).
A suggestion by Tei is simple and promising: simply make a list of all the words in each version, sort it alphabetically, and make a diff. The number of changed lines is the number of changed words. Edits that changed only a few words are not significant for our purpose.
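Tei's suggestion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tokenizer (lowercased `\w+` runs) and the function name `changed_words` are hypothetical choices, and a multiset difference stands in for sorting the word lists and counting added and removed lines in a diff, which gives the same count.

```python
import re
from collections import Counter


def changed_words(old_text, new_text):
    """Count words added plus words removed between two revisions.

    Word order is ignored, exactly as it would be after sorting each
    revision's word list alphabetically before diffing.
    """
    old = Counter(re.findall(r"\w+", old_text.lower()))
    new = Counter(re.findall(r"\w+", new_text.lower()))
    removed = sum((old - new).values())
    added = sum((new - old).values())
    return removed + added


# Pure re-ordering counts as no change at all.
print(changed_words("the cat sat on the mat", "on the mat the cat sat"))
# Replacing one word counts as one removal plus one addition.
print(changed_words("the cat sat", "the dog sat"))
```

Because order is discarded, a simple re-ordering of paragraphs registers as zero changed words, so the person who merely moved text is not credited with it. Text pasted in from another article would still count as new words for whoever pasted it.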
be used for that purpose. (Also, consider the case where half of an article is copy and paste moved from another article.) Not that it
And even that could be mostly identifiable, though it would use a lot of resources. Fortunately, it happens relatively rarely.
On Thu, Oct 23, 2008 at 6:37 PM, phoebe ayers phoebe.wiki@gmail.com wrote:
On Thu, Oct 23, 2008 at 1:22 PM, Anthony wikimail@inbox.org wrote:
On Thu, Oct 23, 2008 at 10:29 AM, Milos Rancic millosh@gmail.com wrote:
If I counted well, article about France has between 8.000 and 9.000 edits up to this moment. I think that it is reasonable to suppose that this article will have 100 distinctive and significant authors -- if not now -- then in 5 or 10 years.
I am reading now a B5 format book with ~40x70=2800 characters per page.
One name has, let's say, 15 characters (btw, I am sure that we will demand listing the names if they are available, not just user names; as I said before, some kind of user boxes may be used for that). 100 names would consume 1500 characters (let's say, 1400, a half of the page).
Half of a page for the list of authors of France. Now, I just checked, and a printed copy of the article on France takes up about 25 pages. So attribution takes up about 2% overhead, if indeed there are 100 authors like you say.
And *I* just checked that, and there are in fact 4077 authors (2100 IP addresses) for [[France]] on en:wp currently, according to http://vs.aka-online.de/cgi-bin/wppagehiststat.pl. This whole argument is off by an order of magnitude if you assume that only 1 in 4 authors is significant.
Seems like a poor assumption, considering more than half of the authors are essentially anonymous. I'd bet that less than 250 of those authors are significant.
And how do you tell precisely which of these 4000 authors is, in fact, significant?
Precisely? You'd have to go through each edit one by one.
Anthony
On Thu, Oct 23, 2008 at 7:40 PM, Anthony wikimail@inbox.org wrote:
And *I* just checked that, and there are in fact 4077 authors (2100 IP addresses) for [[France]] on en:wp currently, according to http://vs.aka-online.de/cgi-bin/wppagehiststat.pl. This whole argument is off by an order of magnitude if you assume that only 1 in 4 authors is significant.
Seems like a poor assumption, considering more than half of the authors are essentially anonymous. I'd bet that less than 250 of those authors are significant.
I'd be surprised if it were even that many. Significant at some point in time, yes, but for any particular version?
More importantly: You've picked an extreme corner case. Extreme corner cases shouldn't be neglected completely, but they are bad places to start policy discussions. The overwhelming majority of WP articles have very few authors, and even fairly few accounts that have edited them.
2008/10/24 Gregory Maxwell gmaxwell@gmail.com:
More importantly: You've picked a extreme corner case. Extreme corner cases shouldn't be neglected completely, but they are bad places to start policy discussions.
Please do take into account that the most popular and "interesting" articles are also often the most likely to have a very large history, though. So if you're compiling a collection that's not focused on fringe subjects, you're likely to hit some articles that have very many authors.
On Fri, Oct 24, 2008 at 2:47 PM, Erik Moeller erik@wikimedia.org wrote:
2008/10/24 Gregory Maxwell gmaxwell@gmail.com:
More importantly: You've picked a extreme corner case. Extreme corner cases shouldn't be neglected completely, but they are bad places to start policy discussions.
Please do take into account that the most popular and "interesting" articles are also often the most likely to have a very large history, though. So if you're compiling a collection that's not focused on fringe subjects, you're likely to hit some articles that have very many authors.
That is a fair point.
Though you could still find a more representative article than GWB, even among popular articles: at least at one point in time it had the longest revision history of any article. It's also unusually long, and atypically popular. It's probably a worst case, or close to it, in terms of both possible and actual author count.
I don't know that "fringe" is really the right word either. There are many subject areas which are not at all fringe, things which get whole sections in libraries, where none of the articles are massively multi-authored. So I'd probably reverse the sense of your point: If you're working on anything on a popular media subject you'll certainly come across some articles with long lists of authors.
The end result of both outlooks is, I suppose, the same. But I think the notion that most (or all) Wikipedia articles are massively multi-authored is fairly widespread, and that's not a correct position on an article-by-article basis most of the time (while it's quite true for Wikipedia as a whole), so I like to take the opportunity to point that out.
On Fri, Oct 24, 2008 at 12:10 PM, Gregory Maxwell gmaxwell@gmail.com wrote:
On Fri, Oct 24, 2008 at 2:47 PM, Erik Moeller erik@wikimedia.org wrote:
2008/10/24 Gregory Maxwell gmaxwell@gmail.com:
More importantly: You've picked an extreme corner case. Extreme corner cases shouldn't be neglected completely, but they are bad places to start policy discussions.
Please do take into account that the most popular and "interesting" articles are also often the most likely to have a very large history, though. So if you're compiling a collection that's not focused on fringe subjects, you're likely to hit some articles that have very many authors.
That is a fair point.
Though you could still find a more representative article than GWB, even among popular articles: at least at one point in time it had the longest revision history of any article. It's also unusually long, and atypically popular. It's probably a worst case, or close to it, in terms of both possible and actual author count.
The original example was [[France]], with 4077 authors, which is still 9-10 pages of authors in 10pt type. And I don't think [[France]] is a corner case for reprinting at all -- I would hope that it and its fellow country articles would get included in any typical educational compilation, atlas, children's encyclopedia, etc. based on Wikipedia content that got put out.
Yes, [[George Bush]] is atypical, but the chances of someone wanting to reprint it -- again, for any educational compilation with biographies it seems like a fair choice -- seem pretty high. I think any attribution rule that gets made has to take these cases as well as "more typical" 10-author articles into account.
-- phoebe
On Thursday 23 October 2008 22:22:26 Anthony wrote:
On Thu, Oct 23, 2008 at 10:29 AM, Milos Rancic millosh@gmail.com wrote:
If I counted correctly, the article about France has between 8,000 and 9,000 edits up to this moment. I think that it is reasonable to suppose that this article will have 100 distinct and significant authors -- if not now, then in 5 or 10 years.
I am now reading a B5-format book with ~40x70=2800 characters per page.
One name has, let's say, 15 characters (btw, I am sure that we will demand listing the names if they are available, not just user names; as I said before, some kind of user boxes may be used for that). 100 names would consume 1500 characters (let's say 1400, half of a page).
Half of a page for the list of authors of France. Now, I just checked, and a printed copy of the article on France takes up about 25 pages. So attribution takes up about 2% overhead, if indeed there are 100 authors like you say.
200 articles about countries with 100 distinctive names per article means that the list will be 100 pages long.
200 articles the size of [[France]], which would be a 5000 page book. I take it this is going to be split into volumes.
I'm sorry, your numbers are pulled too wildly from the air to be useful. A 300 page book about 200 countries? You're better off rewriting everything "ab initio" than copying from Wikipedia for that. The work to cull down the information into that small of a format is going to far outweigh the savings from plagiarizing the content anyway.
He's referring to the possibility of creating a book that would have only the introduction from each article, yet would have to list all authors (because you can't determine who wrote the introduction and who didn't).
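The page-count estimates traded in the messages above can be checked with a short calculation. The sketch below is only an illustration: it takes the figures quoted in the thread (about 2800 characters per B5 page, about 15 characters per name, a 25-page printout of [[France]]), and the function names are made up for this example.

```python
# Figures quoted in the thread; both are rough assumptions, not measurements.
CHARS_PER_PAGE = 40 * 70   # ~2800 characters on a B5 page
CHARS_PER_NAME = 15        # assumed average length of one credited name


def attribution_pages(authors, chars_per_name=CHARS_PER_NAME,
                      chars_per_page=CHARS_PER_PAGE):
    """Pages (possibly fractional) needed to list `authors` names."""
    return authors * chars_per_name / chars_per_page


def overhead(authors, article_pages):
    """Attribution pages as a fraction of the article's own page count."""
    return attribution_pages(authors) / article_pages


def collection_totals(n_articles, authors_per_article, pages_per_article):
    """Return (attribution pages, content pages) for a whole collection."""
    credit = n_articles * attribution_pages(authors_per_article)
    content = n_articles * pages_per_article
    return credit, content


if __name__ == "__main__":
    # 100 "significant" authors versus the 4077 total authors cited for [[France]].
    for authors in (100, 4077):
        print(f"{authors} authors: {attribution_pages(authors):.2f} pages, "
              f"{overhead(authors, 25):.1%} of a 25-page article")
    credit, content = collection_totals(200, 100, 25)
    print(f"200 articles: {credit:.0f} pages of credits, {content} pages of content")
```

With these assumptions, 100 names take roughly half a page, as stated above. The result for the full 4077-author list depends heavily on the assumed name length and page density, so it will not necessarily match the other page estimates given in this thread.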
On Fri, Oct 24, 2008 at 1:15 PM, Nikola Smolenski smolensk@eunet.yu wrote:
On Thursday 23 October 2008 22:22:26 Anthony wrote:
I'm sorry, your numbers are pulled too wildly from the air to be useful. A 300 page book about 200 countries? You're better off rewriting everything "ab initio" than copying from Wikipedia for that. The work to cull down the information into that small of a format is going to far outweigh the savings from plagiarizing the content anyway.
He's referring to the possibility of creating a book that would have only the introduction from each article, yet would have to list all authors (because you can't determine who wrote the introduction and who didn't).
Sometimes, sadly, it's not possible to get something for nothing. Is it really part of the mission of the Foundation to allow publishers to create such books? Would such a book even be worth more than the paper it's printed on? It seems like one of those books I can get for $0.10 at the thrift store, or $2.00 at the bargain bin section of a bookstore.
Then again, this whole thread seems to be leading to the conclusion that Wikipedia and the right to attribution are incompatible. If that's truly the case, the only fair thing to do is to start over from scratch under terms that make it clear to all contributors that they have no right to attribution. I honestly hope it isn't the case, and that I'm just missing something.
Anthony
2008/10/23 Ray Saintonge saintonge@telus.net:
David Gerard wrote:
The threat model we're talking about is: what does a reuser say if taken to court by an insane and obsessive author? Would a judge consider the reuser's actions reasonable, given accepted behaviour regarding said licence to date? That sort of squishy, arguable, grey area thing.
There is no inoculation to prevent insanity and obsession. Whatever model is chosen can provide opportunities for the litigious. Thus if we go with the five principal authors, what's to prevent number six from arguing that he should be in the top five?
Precisely - would the reuser's behaviour and demonstrable good faith and actions in accordance with common practice be sufficient for the judge to say "haha no" and throw the case out, possibly awarding costs against the plaintiff? If "yes" then all is well.
- d.
David Gerard wrote:
2008/10/23 Ray Saintonge:
David Gerard wrote:
The threat model we're talking about is: what does a reuser say if taken to court by an insane and obsessive author? Would a judge consider the reuser's actions reasonable, given accepted behaviour regarding said licence to date? That sort of squishy, arguable, grey area thing.
There is no inoculation to prevent insanity and obsession. Whatever model is chosen can provide opportunities for the litigious. Thus if we go with the five principal authors, what's to prevent number six from arguing that he should be in the top five?
Precisely - would the reuser's behaviour and demonstrable good faith and actions in accordance with common practice be sufficient for the judge to say "haha no" and throw the case out, possibly awarding costs against the plaintiff? If "yes" then all is well.
That course of action presupposes that there is someone foolish enough to take the thing to court in the first place, and that that person has the resources to mount a credible case. That credible case must include an estimate of monetary damage.
I would love it if we could find such a fool. That would give us a decision to wave under the noses of the paranoiacs.
Ec