Hoi, There is no problem considering these points. You go in a direction that has little to do with what we are and where we stand. Wikidata is a wiki. That implies that it does not have to be perfect. It implies that approaches are taken that are arguably wacky, and we will see in time how they pan out. For instance, "Frankfurt" "instance of" "big city", where a big city is a city over a certain size. The size is debatable, and consequently it is a really poor concept from a Wikidata point of view. It can be inferred and is therefore even redundant. Does it matter? Not really, because in time "we" will see the light.
Our data is incomplete. Arguably, importing data enables us to share more of the sum of all knowledge with our users. A given percentage of all data is incorrect. However, having no data is arguably 100% incorrect and 100% not in line with our goal of serving the sum of all knowledge. Quality is important, so processes and workflows are exceedingly important to have. We are lacking in that department so far. But comparing against external data sources like VIAF or the DNB in an iterative way is the obvious approach when you want to identify those items and statements that are suspect. The data in Wikidata makes this easy because we have spent considerable effort linking external sources first to Wikipedia and now to Wikidata. It is easy to mark items with issues using qualifiers on the external source ID and have a basis for such workflows and quality markers.
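To make that concrete, here is a minimal sketch of such a cross-check, assuming Python, the public Wikidata Query Service, and the real properties P214 (VIAF ID) and P569 (date of birth). The local VIAF export file name and its column layout are hypothetical and only stand in for whatever dump or API an external partner actually provides; this is an illustration of the workflow, not a finished tool.

# Sketch: pull items that carry a VIAF ID and a date of birth from the
# Wikidata Query Service, then flag those whose birth year disagrees with
# a locally exported VIAF dataset. Disagreements are review candidates,
# not automatic fixes.
import csv
import requests

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?item ?viaf ?dob WHERE {
  ?item wdt:P214 ?viaf ;          # VIAF ID
        wdt:P569 ?dob .           # date of birth
} LIMIT 1000
"""

def wikidata_birth_years():
    """Return {viaf_id: (item_uri, birth_year)} from the query service."""
    r = requests.get(SPARQL_ENDPOINT,
                     params={"query": QUERY, "format": "json"},
                     headers={"User-Agent": "quality-check-sketch/0.1"})
    r.raise_for_status()
    out = {}
    for row in r.json()["results"]["bindings"]:
        year = int(row["dob"]["value"][:4])   # "1953-10-09T00:00:00Z" -> 1953
        out[row["viaf"]["value"]] = (row["item"]["value"], year)
    return out

def viaf_birth_years(path="viaf_birth_years.csv"):
    """Hypothetical local export: one 'viaf_id,birth_year' row per person."""
    with open(path, newline="") as f:
        return {row["viaf_id"]: int(row["birth_year"])
                for row in csv.DictReader(f)}

def suspect_items():
    wd, ext = wikidata_birth_years(), viaf_birth_years()
    for viaf_id, (item, wd_year) in wd.items():
        ext_year = ext.get(viaf_id)
        if ext_year is not None and ext_year != wd_year:
            yield item, viaf_id, wd_year, ext_year

if __name__ == "__main__":
    for item, viaf_id, wd_year, ext_year in suspect_items():
        print(f"{item}  VIAF {viaf_id}: Wikidata says {wd_year}, VIAF says {ext_year}")

The output of such a run is exactly the kind of list that could feed a curation workflow, or be marked on the items themselves via qualifiers on the external source ID as described above.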
When you make a point of external sources trusting Wikidata: these external sources may be consumers, or they can be partners. When they are partners, we can provide RSS feeds informing them of issues that have been found, and they can do their own curation on their data. When they are consumers we can still provide such an RSS feed, but we do not know what they do with it; it is their problem more than it is ours.
As I say so often, Wikidata is immature. It is silly to blindly trust Wikidata. It is largely based on Wikipedia, which has constructs of its own that we do not need or want in Wikidata. "Big city" is one example. We have items, created from interwiki links, that are a mix of all kinds of things, e.g. a listed building and an organisation. This is conceptually wrong at Wikidata and such items need to be split. This is where many Wikipedians become uncomfortable, but hey, Wikidata does not tell them to rewrite their articles.
So yes, you can continue with this point, but it has little impact on Wikidata, and when you think it should, do consider what impact it has on Wikidata as a wiki. It is NOT an academic resource or a reference source per se. It is a wiki; it is allowed to be wrong, particularly when it has proper workflows to improve quality.
If anything, THIS is where we could do with a lot more talk and preferably action. This is where Wikidata is obviously lacking, and when we do have proper workflows in place, we do NOT need the dump that is the "primary sources" tool, as it is the antithesis of a wiki and it prevents us from sharing available knowledge. Thanks, GerardM
On 28 November 2015 at 07:05, Wil Sinclair wllm@wllm.com wrote:
Gergo, do you mind if people continue discussing this? I'm finding it very interesting and fruitful. I hadn't thought through these issues before, and there are likely to be others on this list who haven't either.
Best, Wil
On Fri, Nov 27, 2015 at 5:17 PM, Gergo Tisza gtisza@wikimedia.org wrote:
On Fri, Nov 27, 2015 at 11:14 AM, Lila Tretikov lila@wikimedia.org wrote:
What I hear in email from Andreas and Liam is not so much the propagation of the error (which I am sure happens with some % of the cases), but the fact that the original source is obscured and therefore it is hard to identify and correct errors, biases, etc. Because if the source of an error is obscured, that error is that much harder to find and to correct. In fact, we see this even on Wikipedia articles today (wrong dates of birth sourced from publications that don't do enough fact checking is something I came across personally). It is a powerful and important principle on Wikipedia, but with content re-use it gets lost. Public domain/CC0 in combination with AI lands our content for slicing and dicing and re-arranging by others, making it something entirely new, but also detached from our process of validation and verification. I am curious to hear if people think it is a problem. It definitely worries me.
This conversation seems to have morphed into trying to solve some problems that we are speculating Google might have (no one here actually *knows* how the Knowledge Graph works, of course; maybe it's sensitive to manipulation of Wikidata claims, maybe not). That seems like an entirely fruitless line of discourse to me; if the problem exists, it is Google's problem to solve (since they are the ones in a position to tell if it's a real problem or not; not to mention they have two or three orders of magnitude more resources to throw at it than the Wikimedia movement would). Trying to make our content less free for fear that someone might misuse it is a shamefully wrong frame of mind for an organization that's supposed to be a leader of the open content movement, IMO.