Hoi,
Wikidata is a wiki, and you seem to always forget that.
The corruption of data... how? Each statement is its own data item; how do
you corrupt that? As I say so often, when you get a collection that is 80%
correct, you have an error rate of 20%. When you do not include that data,
you have an error rate of 100%. When you have another source that is 90%
correct, that has similar data, and you have an overlap of 50%, you can be
smart and, at the start or later, compare the data and curate. When you only
import at the start what is the same, you probably get something like 84%
correct data imported. You can gamify the rest, but however you slice it,
what you do not have and could have is 100% wrong.
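To make that arithmetic concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the quantity of interest is the share of wanted statements that are both present and correct. The function names are hypothetical, and the 0.84 is simply the rough estimate quoted above, not something the code derives.

```python
def usable_fraction(coverage: float, accuracy: float) -> float:
    """Share of all wanted statements that are both present and correct.

    coverage: fraction of the wanted statements that were imported (0..1)
    accuracy: fraction of the imported statements that are correct (0..1)
    """
    return coverage * accuracy


def effective_error_rate(coverage: float, accuracy: float) -> float:
    """Count every missing statement as wrong, as the argument above does."""
    return 1.0 - usable_fraction(coverage, accuracy)


# Scenarios from the paragraph above (0.84 is the rough estimate quoted there).
scenarios = {
    "import nothing": effective_error_rate(0.0, 0.80),
    "import the 80% correct collection": effective_error_rate(1.0, 0.80),
    "import only the 50% overlap": effective_error_rate(0.5, 0.84),
}

for name, rate in scenarios.items():
    print(f"{name}: {rate:.0%} missing or wrong")
```

Under these assumptions, importing only the overlap leaves 58% of the wanted statements missing or wrong, against 20% for importing the whole collection and 100% for importing nothing.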
Wikidata is NOT Wikipedia. It is much easier to curate data, and
consequently your argument is FUD. The big thing we have not learned is
cooperation. We do not cooperate. We do not, as standard, provide RSS feeds
for the changes to the items that belong to a specific source. We are happy
to get data, but we do not reach out and give back. For me, the fact that
VIAF uses Wikidata as a link is an opportunity to do better. The cooperation
with the German DNB is the kind of project that we should emulate.
When you talk about quality, you talk in an insular fashion: we have to do
it, our community. At Wikidata, our community can include other
organisations with rich collections of high-quality data. We can
share, compare, curate. Even with our current low quality, we have subsets
of data that shine, subsets of data that are of at least the same quality
as Wikipedia. However, this quality is often marred by a lack of quantity,
quantity we can have when collaboration is what we do.
You are afraid for our reputation. Reputation has many aspects. Jane023
presented at the Dutch Wikimedia conference. She uses a tool that is easier
on her because no Wikipedians bother her, as it is a Wikidata-based
list. A similar list is now used for its quality on the Welsh Wikipedia.
The data is of such quality that Google actually uses it, as she reported.
When I see the religious application of Wikipedia sentiments, I find that
we do not even care for the life of one of our own. Bassel is executed or
likely to be executed soon, and some think our neutrality is so important.
FOR WHAT? So that we may not even protect our own? Is it right to protest
against TTIP (and we should) and not protest for a Wikipedian who embodies
our values?
"Wikipedia think" is not applicable at this stage for Wikidata. Its quality
is arguably piss poor, but better in places. Many items are corrupt because
they follow the structure of Wikipedia articles, a structure Wikipedians
insist on because they wrote that article and "Wikidata is only a service
project".
I do agree that we need more quality. My approach has set theory on its
side and it embodies the wiki approach; yours is one where Pallas Athena is
to rise from the brain of Zeus in full armour. You may have noticed that my
arguments are easy to follow and conform to something that is measurable.
Yours is private; there is no possibility to verify the accuracy of your
argument. I call bullshit on your argument, not because you do not make a
fine argument but because it is an argument that prevents us from improving
Wikidata.
My hope is that we can work constructively on our quality and have a
measurable effect.
Thanks,
GerardM
On 29 November 2015 at 02:05, Gnangarra <gnangarra(a)gmail.com> wrote:
> While I happily agree that Sources are good, I will not ask people to start
> adding Sources at this point of time; it will not improve quality
> significantly. It makes more sense once we are at a stage where multiple
> sources disagree on values for statements. Adding sources is significantly
> more meaningful and useful once we start curating data.
The problem is that by the time Wikidata starts to curate data, it will
have corrupted that data with its own data. Secondly, past experience with
wikis is that fixing data after it has been entered is actually harder and
more time-consuming, and the damage to reputation will have a lasting
impact; fixing that consumes millions of dollars in donor money. As said
earlier, there are lessons in the development of Wikipedia that should be
heeded in an attempt to avoid those same pitfalls.
On 29 November 2015 at 08:37, Gerard Meijssen <gerard.meijssen(a)gmail.com> wrote:
> Hoi,
> It was from the Myanmar Wikipedia that a lot of data was imported to
> Wikidata. Data that did not exist elsewhere. I do not really care what
> "Freedom House" says. I do not know them; I do know that the data is
> relevant and useful. It was even the subject of a blog post.
>
> You may ignore data that is not from a source that you like. This
> indiscriminate POV is not a NPOV.
>
> As to Grasulf, you failed to get the point. It was NOT about the data
> itself but about the presentation. I worked on this item because a
> duplicate was created with even less data.
>
> While I happily agree that Sources are good, I will not ask people to start
> adding Sources at this point of time; it will not improve quality
> significantly. It makes more sense once we are at a stage where multiple
> sources disagree on values for statements. Adding sources is significantly
> more meaningful and useful once we start curating data. Statistically, most
> errors will be found where sources disagree.
>
> When people add conflicting data, it is indeed really relevant to add
> Sources. My practice for adding data is that I will only add data that
> fulfils some minimal criteria. Typically I am not interested in adding data
> that already exists. I will replace less precise data with more precise
> data.
>
> The biggest issue with data is that we do not have enough of it, and the
> second most relevant issue is that we need processes to compare sources
> with Wikidata and have a workflow to curate differences.
> Thanks,
> GerardM
On 28 November 2015 at 19:23, Andreas Kolbe <jayen466(a)gmail.com> wrote:
> Gerard,
>
> On Fri, Nov 27, 2015, Gerard Meijssen <gerard.meijssen(a)gmail.com> wrote:
>
> > When you compare the quality of Wikipedias with what en.wp used to be you
> > are comparing apples and oranges. The Myanmar Wikipedia is better informed
> > on Myanmar than en.wp etc.
>
> Is it? The entire Burmese Wikipedia contains a mere 31,646 content pages at
> the time of writing, covering (or trying to cover) all countries of the
> world, and all aspects of human knowledge.[1]
>
> The English Wikipedia's WikiProject Myanmar, meanwhile, has 6,713 pages
> within its purview.[2] I dare say that's more articles on Myanmar than the
> Burmese Wikipedia contains. As an indication, the English Wikipedia's
> article on Myanmar is more than twice as long as the one in the Burmese
> Wikipedia.
>
> Moreover, according to Freedom House[3], the internet in Myanmar is not
> free:
>
> "The government detained and charged internet users for online activities
> [...] Government officials pressured social media users not to distribute
> or share content that offends the military, or disturbs the functions of
> government."
>
> > When you qualify a Wikipedia as fascist, it does not follow that the data
> > is suspect. Certainly when data in a source that you so easily dismiss is
> > typically the same, there is not much meaning in what you say from a
> > Wikidata point of view.
>
> Data are always generated within a social context, and data generated by
> political extremists or people living under oppressive regimes are suspect
> whenever they have political implications. (Looking at the descriptions of
> Burmese politics, my feeling is the Burmese Wikipedia is not under
> significant government control, but largely written by ex-pats. However, the
> situation is quite different in some other Wikipedias serving countries
> labouring under similar regimes.)
>
> > PS What does your librarian think when she knows
>
> It was a he, but I'll leave him to join in himself if he chooses to.
>
> > I happen to work on Dukes of Friuli. Compare the data from Wikidata and the
> > information by Reasonator based on the same item for one of them.
> >
> > https://tools.wmflabs.org/reasonator/?&q=2471519
> > https://www.wikidata.org/wiki/Q2471519
>
>
> Let's look at this example. Reasonator says of Grasulf II of Friuli, "He
> died in 653". There is no source. Wikidata says he died in 653, and the
> indicated source is the Italian Wikipedia.
>
> However, when you look at the (very brief) Italian Wikipedia article[4],
> you will find that the year 653 is given with a question mark. The English
> Wikipedia, in contrast, states, in its similarly brief article[5],
>
> "Nothing more is known about Grasulf and the date of his death is
> uncertain."
>
> Do you now see the problem about nuance? Reasonator and Wikidata
> confidently proclaim as uncontested fact something that in fact is rather
> uncertain.
>
> The sole source cited by both the English and the Italian Wikipedia is the
> Historia Langobardorum, available in Wikisource.[6] My Latin is a bit
> rusty, but while the Historia mentions that Ago succeeded Grasulf upon the
> latter's death, it says nothing specific about when that was. The
> Historia's time indications are in general very vague, usually limited to
> the phrase "Circa haec tempora", meaning "about this time". So it is in
> this case.
>
> For reference, the Google Knowledge Graph states equally confidently that
> Grasulf II of Friuli died in 651 AD. This may be based on the English
> Wikipedia's unsourced claim (in the template at the bottom of the English
> Wikipedia article) that his reign ended c. 651, or on some other source
> like Freebase.
>
> The other Wikipedias that have articles on Grasulf II provide the following
> death dates:
>
> Catalan: 651
> Galician: 653
> Lithuanian: 653
> Polish: 651
> Romanian: Unknown
> Russian: 653
> Ukrainian: 651
>
> As for published sources, I can offer Ersch's Allgemeine Encyclopädie
> (1849), which states on page 209 that Grasulf II died in 651.[7]
>
> The extreme vagueness of the available dates is pointed out by Thomas
> Hodgkin in Vol. 7 of "Italy and Her Invaders" (1895). Hodgkin puts the end
> of Grasulf's reign at 645, "as a mere random guess", and adds that "De
> Rubeis, following Sigonius", puts the accession of Ago in 661.[8]
>
> There may well be better and more recent sources beyond my reach, but
> having these published dates in Wikidata, with the source references, would
> actually make some sense. Unsourced data, not so much.
>
> Answers are comfortable, but they are not knowledge when they are
> unverifiable and/or wrong.
>
> [1] https://meta.wikimedia.org/wiki/List_of_Wikipedias#10_000.2B_articles
> [2] https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Myanmar_(Burma)/Assessm…
> [4] https://it.wikipedia.org/w/index.php?title=Grasulfo_II_del_Friuli&oldid…
> [5] https://en.wikipedia.org/w/index.php?title=Grasulf_II_of_Friuli&oldid=6…
> [7] https://books.google.co.uk/books?id=FzxYAAAAYAAJ&pg=PA209&dq=grasul…
> [8] https://books.google.co.uk/books?id=8ToOAwAAQBAJ&dq=grasulf+friuli+651%…
_______________________________________________
Wikimedia-l mailing list, guidelines at:
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l(a)lists.wikimedia.org
Unsubscribe:
https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
<mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
--
GN.
President Wikimedia Australia
WMAU: http://www.wikimedia.org.au/wiki/User:Gnangarra
Photo Gallery: http://gnangarra.redbubble.com