On Sun, Dec 13, 2015 at 6:10 PM, Andrea Zanni zanni.andrea84@gmail.com wrote:
I really feel we are drowning in a glass of water. The issue of "data quality" or "reliability" that Andreas raises is well known: what I don't understand is whether the "scale" of it is much bigger on Wikidata than on Wikipedia, and whether this different scale makes it much more important. The scale of the issue is maybe something worth discussing, and not the issue itself? Is the fact that Wikidata is centralised different from statements on Wikipedia? I don't know, but to me this is a more neutral and interesting question.
Wikidata's (envisaged) centralised nature certainly makes a difference, because the promise was that it would inform the Wikipedias.
Wikipedia started out with people just writing from their personal knowledge. The early articles had no footnotes. Then after a while people noticed problems like cranks filling pages with their abstruse theories (hence the ban on original research), people adding material from their blogs, etc. Over the course of a decade, Wikipedia developed the idea and the culture that you have to cite a professionally published source for everything you add to Wikipedia.
Wikidata is in its early stages. In a way it really is like Wikipedia in 2003. New content welcome! No references required!
But at the same time, Wikidata is supposed to inform the Wikipedias, as a central data repository. This creates a mismatch between Wikidata's "early days -- anything goes, let's just get content in, we'll sort it out later" attitude and the relatively mature Wikipedias where editors insist on sources for any new content added.
This out-of-synch-ness is a real problem if you want Wikipedias to actually use Wikidata content. Wikipedians will not accept content generation models that take Wikipedia back to its bad old days where you could write anything you liked without a source to back it up.
Wikipedia is of course still a long way away from citing such sources for all its content. There are vast amounts of legacy material left over from the early days. But in the pages that are being created now (like developing news stories, an area where the quality of Wikipedia's coverage is often praised), pages that see a lot of traffic, pages that are controversial, etc., it is well established that you have to cite sources for any new assertions.
Unsourced content is unceremoniously deleted.
If Wikipedia's reputation for reliability has improved since 2003, that change in culture from the early days is the reason.
The Age for example published an article the other day that is probably one of the most celebratory articles ever written about Wikipedia.[1] If you're a Wikipedian, you'll probably enjoy reading it.
Among the aspects that the author, Elizabeth Farrelly, said she liked most about Wikipedia was "its ruthless commitment to the printed, demonstrable source." She ended the article as follows:
---o0o---
But most interesting to me is the ban on primary research. The demand that every input be traced to a published and authoritative source doesn't make it true, necessarily, but does enable genuine crowd-sourcing of scholarship. This is a revelation, and a revolution.
So yes, Wikipedia is flawed. Above all, it needs more female input. But the obvious response, for you-and-me users who encounter something stupid or biased or just plain wrong, is to hop in there and fix it. I'll see you there, yes? Oh, and honey? Cite away!
---o0o---
Abandoning the principles that have elicited such praise -- traceability to published sources, verifiable citations -- is not something Wikipedians will entertain. To them, it would be a step back. If Wikidata wants to be an input to Wikipedia, it will have to bear that in mind.
[1] http://www.theage.com.au/comment/why-wikipedia-at-15-is-a-beautiful-exercise...
I often say that the Wikimedia world made quality a "Heisenbergian" feature: you always have to check if it's there. The point is: it has always been like this. We always had to check for quality, even when we used Britannica or authority controls or whatever "reliable" sources we wanted. Wikipedia, and now Wikidata, is made for everyone to contribute to; it's open, and honest about being open, vulnerable, prone to errors. But we are transparent, we say that in advance, we can claim any statement to the smallest detail. Of course it's difficult, but we can do it. Wikidata, as Lydia said, can actually have conflicting statements in every item: we "just" have to put them there, as we did on Wikipedia.
If Google uses our data and they are wrong, that's bad for them. If they correct the errors and do not give us the corrections, that's bad for us and unethical of them. The point is: there is no license (as far as I know) that can force them to contribute to Wikidata. That is, IMHO, the problem with "over-the-top" actors: they can harness collective intelligence and "not give back." Even with CC-BY-SA, they could store (as they are probably already doing) all the data in their knowledge vault, which is secret as it is an incredible asset for them.
I'd be happy to insert a new clause of "forced transparency" in CC-BY-SA or CC0, but it's not there.
So, as we are working via GLAMs with Wikipedia for getting reliable sources and content, we are working with them also for good statements and data. Putting good data in Wikidata makes it better, and I don't understand what the problem is here (I understand, again, the issue of putting in too much data while still having a small community). For example: if we are importing different reliable databases, and the institutions behind them find it useful and helpful to have an aggregator of identifiers and authority controls, what is the issue? There is value in aggregating data, because you can spot errors and inconsistencies. It's not easy, of course, to find a good workflow, but, again, that is *another* problem.
So, in conclusion: I find many issues in Wikidata, but not with the mission/vision, just with the complexity of the project, the size of the dataset, the size of the community.
Can we talk about those?
Aubrey
On Sun, Dec 13, 2015 at 6:40 PM, Andreas Kolbe jayen466@gmail.com wrote:
On Sun, Dec 13, 2015 at 5:32 PM, geni geniice@gmail.com wrote:
On 13 December 2015 at 15:57, Andreas Kolbe jayen466@gmail.com wrote:
Jane,
The issue is that you can't cite one Wikipedia article as a source in another.
However you can within the same article per [[WP:LEAD]].
Well, of course, if there are reliable sources cited in the body of the article that back up the statements made in the lead. You still need to cite a reliable source though; that's Wikipedia 101.
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe