On Fri, Nov 27, 2015 at 11:14 AM, Lila Tretikov lila@wikimedia.org wrote:
What I hear in email from Andreas and Liam is not so much the propagation of the error (which I am sure happens in some percentage of cases), but the fact that the original source is obscured, and therefore it is hard to identify and correct errors, biases, etc. If the source of an error is obscured, that error is that much harder to find and to correct. In fact, we see this even on Wikipedia articles today (wrong dates of birth sourced from publications that don't do enough fact-checking is something I came across personally). This traceability is a powerful and important principle on Wikipedia, but with content re-use it gets lost. Public domain/CC0 in combination with AI opens our content up to slicing, dicing, and re-arranging by others, making it something entirely new, but also detached from our process of validation and verification. I am curious to hear whether people think this is a problem. It definitely worries me.
This conversation seems to have morphed into trying to solve problems that we are speculating Google might have (no one here actually *knows* how the Knowledge Graph works, of course; maybe it's sensitive to manipulation of Wikidata claims, maybe not). That seems like an entirely fruitless line of discourse to me; if the problem exists, it is Google's problem to solve (since they are the ones in a position to tell whether it's a real problem or not, not to mention they have two or three orders of magnitude more resources to throw at it than the Wikimedia movement does). Trying to make our content less free for fear that someone might misuse it is a shamefully wrong frame of mind for an organization that's supposed to be a leader of the open content movement, IMO.