On Tue, Oct 10, 2017 at 9:43 AM, James Salsman jsalsman@gmail.com wrote:
On Tue, Oct 10, 2017 at 9:38 AM, Leinonen Teemu teemu.leinonen@aalto.fi wrote:
Hi all,
This is a super interesting and important discussion. One idea.
On 10 Oct 2017, at 3.44, Erik Moeller eloquence@gmail.com wrote:
And for most of the sources amalgamated in this manner, if provenance is indicated at all, we don't find any of the safeguards we have for Wikimedia content (revisioning, participatory decision-making, transparent policies, etc.). Editability, while opening the floodgate to a category of problems other sources don't have, is in fact also a safeguard: making it possible to fix mistakes instead of going through a "feedback" form that ends up who knows where.
Would it make sense to help, and maybe even require, the proprietary service providers and AI applications (Siri, Google, etc.) that use Wikimedia content to include a statement saying whether their reuse comes from a "native version of live Wikimedia", and in this way also to tell users when it does not?
That is a fantastic idea! CC-BY-SA says, "You must attribute the work in the manner specified by the author or licensor."
Is there anything preventing us from specifying attribution in a manner that makes clear the revision date?
Well, Wikidata was, after some to-and-fro and a little controversy, assigned the CC-0 licence, which does not require any attribution whatsoever from re-users. In my view, that was a really big mistake, because it obscures data provenance for the end user.
Given the amount of data Wikidata bots import from Wikipedia, it was also quite possibly a violation of Wikipedia's content licence.
The legal situation is admittedly complex, but don't let anyone tell you that "facts cannot be copyrighted, and that is the end of it." The WMF's own legal department disagreed with that view.[1]
I would love to see the re-users have to do that. Are there any downsides?
As for re-users of CC-BY-SA Wikipedia content, I refer you to the Amazon Echo discussion that started here on this list in July:
https://lists.gt.net/wiki/foundation/828583
In that discussion, concerns were expressed that the Amazon Echo's "Alexa" voice assistant reads snippets from Wikipedia in response to queries, without identifying Wikipedia as the source. Adele Vrana said she would inquire with Amazon and get back to us probably in September. Last I heard from her, she said she was continuing to ping Amazon, but hadn't heard anything. This month, Adele has been out of the office and will be for another week or so.
I think this is a fairly important matter, and I'm somewhat disappointed with the lack of progress to date. It's a potential thin-end-of-the-wedge thing: if the WMF lets Amazon get away with infringing the CC licence (if indeed it is an infringement – to determine that, we would first need to have a response and legal rationale from Amazon and have lawyers examine it), then others will follow.
My fear – largely based on the Wikidata decision – is that some within the WMF are not really interested in enforcing attribution, preferring to make things as convenient as possible for for-profit companies in order to maximise re-use. I'd find that repugnant, because transparent data provenance is important for a whole host of reasons. But I am not convinced WMF folks see it as important at all. The lack of response to date to the Echo question tends to reinforce my doubts in that regard.
Best, Andreas
[1] https://meta.wikimedia.org/wiki/Wikilegal/Database_Rights