<grin> I do work on quality issues. I blog about them. I work towards
implementing solutions. </grin> I have fixed quite a few errors in Wikidata
and I do not rack up as many edits as I could because of it.
In the meantime, with your "I do not want to be involved" attitude, you are
the proverbial sailor who stays on shore. It is your option to get your
hands dirty or not. However, a friend of mine mentioned this attitude and
compared it to the people who said that Wikipedia would never work. That is
fine, so I will just move on from many of your arguments.
I do not care about profit. I have over 2 million edits on Wikidata alone
and I have a few others on other projects as well. They may, as is implicit
in the license, make a profit. The point is that as more data is freed, it
will free more data. With more free data we can inform more people. We can
share more of the sum of all available knowledge.
I wonder: there are many ways in which quality can be improved, and all you
do is refer to others. Why should I bother with your arguments when they
are not yours and when you do not show how to make a difference? My
arguments are plausible and I actively work towards getting them
implemented. I do not need to convince people to do my work. The only thing
I want to do is ask people for their support so that we sooner reach the
stage where we share in the sum of all available knowledge, something
we do not really do at this stage.
On 1 December 2015 at 15:30, Andreas Kolbe <jayen466(a)gmail.com> wrote:
On Sun, Nov 29, 2015 at 2:55 PM, Gerard Meijssen <
So identify an issue and it can be dealt with.
The fact that an issue *can* be dealt with does not mean that it *will* be
dealt with.
For example, in the post that opened this discussion a little over a week
ago, you said:
"At Wikidata we often find issues with data imported from a Wikipedia.
Lists have been produced with these issues on the Wikipedia involved and
arguably they do present issues with the quality of Wikipedia or Wikidata
for that matter. So far hardly anything resulted from such outreach."
These were your own words: "hardly anything resulted from such outreach."
Wikimedia is three years into this project. If people produce lists of
quality issues, that's great, but if nothing happens as a result, that's
not so great.
An example of this is available in this very thread. Three days ago I
mentioned the issues with the Grasulf II of Friuli entries on Reasonator
and Wikidata. I didn't expect that you or anyone else would fix them, and
they haven't been, at the time of writing.
You certainly could have fixed them -- you have made hundreds of edits on
Wikidata since replying to that post of mine -- but you haven't. Adding new
data is more satisfying than sourcing and improving an obscure entry. (If
you're wondering why I didn't fix the entry myself, see the section "And to
answer the obvious question …" in last month's Signpost op-ed.)
This problem is replicated across the Wikimedia universe. Wikimedia
projects are run by volunteers. They work on what interests them, or
whatever they have an investment in. Fixing old errors is not as appealing
as importing 2 million items of new data (including tens or hundreds of
thousands of erroneous ones), because fixing errors is slow work. It
retards the growth of your edit count! You spend one hour researching a
date, and all you get for that effort is one lousy edit in your
contributions history. There are plenty of tasks allowing you to rack up
500 edits in 5 minutes. People seem to prefer those.
That is why Wikipedia has the familiar backlogs in areas like copyright
infringement or AfC. Even warning templates indicating bias or other
problematic content often sit for years without being addressed.
There is a systemic mismatch between data creation and data curation. There
is a lot of energy for the former, and very little energy for the latter.
That is why initiatives like the one started by WMF board member James
Heilman and others, to have the English Wikipedia's medical articles
peer-reviewed, are so important. They are small steps in the right
direction.
When we are afraid about a Seigenthaler type of event based on Wikidata,
rest assured there is plenty wrong in either Wikipedia or Wikidata that
makes it possible for it to happen. The most important thing is to deal
with it responsibly. Just being afraid will not help us in any way. Yes,
we need quality and quantity. As long as we make a best effort to improve
data, we will do well.
That's "eventualism". "Quality is terrible, but eventually it will be
great, because ... we're all trying, and it's a wiki!" To me that sounds
more like religious faith or magical thinking than empirical science.
Things being on a wiki does not guarantee quality; far from it.
As to the Wikipedian in residence, that is his opinion. At the same time
the article on ebola has been very important. It may not be science but
it is certainly encyclopaedic. At the same time this Wikipedian in
residence is involved, makes a positive contribution and while he may make
mistakes he is part of the solution.
I am happy that you propose that work is to be done. What have you done
and, more importantly, what are you going to do? For me there is "Number of
I will do what I can to encourage Wikimedia Foundation board members and
management to review the situation, in consultation with outside academics
like those at the Oxford Internet Institute who are concerned about present
developments, and to consider whether more stringent sourcing policies are
required for Wikidata in order to assure the quality and traceability of
data in the Wikidata corpus.
The public is the most important stakeholder in this, and should be
informed and involved. If there are quality issues, the Wikimedia
Foundation should be completely transparent about them in its public
communications, neither minimising nor exaggerating the issues. Known
problems and potential issues should be publicised as widely as possible in
order to minimise the harm to society resulting from uncritical reuse of
I have started to reach out to scholars and journalists, inviting them to
review this thread as well as related materials, and form their own
conclusions. I may write an op-ed about it in the Signpost, because I
believe it's an important issue that deserves wider attention and debate.
As far as my own contributions are concerned, I am more inclined to boycott
Apart from all the issues discussed over the past few days, there is
another aspect to my reluctance to contribute to Wikidata.
The Knowledge Graph is a major new Google feature. It adds value to
Google's search engine results pages. It stops people from clicking through
to other sources, including Wikipedia. The recent downturn in Wikipedia
pageviews has been widely linked to the Knowledge Graph.
By ensuring that more people visit Google's ad-filled pages, and stay on
them rather than clicking through to other sites, the Knowledge Graph is at
least partly responsible for recent increases in Google's revenue, which
currently stands at around $200 million a day. (Income after expenses is
about a third of that, i.e. $65 million.)
The development of Wikidata was co-funded by Google, which I understand
donated 325,000 Euros (about $345,000) to that effort. A little bit of
arithmetic shows that, with Google's profits running at $65 million a day,
it takes Google less than 8 minutes to earn that amount of money. Given how
much Google stands to benefit from this development, it seems a paltry
This set me thinking. If we assume that Wikipedia's and Wikidata's
contribution to Google's annual revenue via the Knowledge Graph is just
1/365 – the revenue of one day per year – the monetary value of these
projects to Google is still astronomical.
There have been around 2.5 billion edits to Wikimedia projects to date.
If Google chose to give one day's revenue each year to Wikimedia
volunteers, as a thank-you, this would average out at about 200,000,000 /
2,500,000,000 = 8 cents per edit. Someone like Koavf, who's made 1.5
million edits, would stand to receive around $120,000 a year. Even my
paltry 50,000 edits would net me about $4,000 a year. That's the value of
And that's just Google. Other major players like Facebook and Bing profit,
Wikidata seems custom-made to benefit Google and Microsoft, at the expense
of Wikipedia and other sites. Given my other commitments to Wikimedia
projects, the limited number of hours in a day, and all the other concerns
mentioned in this thread, I feel little inclined at present to further
expand my volunteering in order to work for these multi-billion dollar
corporations for free.
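
For anyone who wants to check the back-of-the-envelope arithmetic above, here
is a minimal Python sketch. It only restates the approximate figures quoted in
this message (daily revenue, daily profit, the donation in dollars, the total
edit count); none of them are independently verified, and the function and
constant names are made up for illustration.

```python
# Rough figures as quoted in the message above; all are approximations
# taken from the text, not verified data.
GOOGLE_REVENUE_PER_DAY_USD = 200_000_000   # "around $200 million a day"
GOOGLE_PROFIT_PER_DAY_USD = 65_000_000     # "about a third of that"
GOOGLE_DONATION_USD = 345_000              # 325,000 Euros, "about $345,000"
TOTAL_WIKIMEDIA_EDITS = 2_500_000_000      # "around 2.5 billion edits"

MINUTES_PER_DAY = 24 * 60


def minutes_to_earn(amount_usd: float, profit_per_day_usd: float) -> float:
    """Minutes needed to earn amount_usd at a constant daily profit."""
    return amount_usd / profit_per_day_usd * MINUTES_PER_DAY


def payout_per_edit(shared_revenue_usd: float, total_edits: int) -> float:
    """Dollars per edit if shared_revenue_usd were split evenly over all edits."""
    return shared_revenue_usd / total_edits


def annual_payout(edit_count: int, per_edit_usd: float) -> float:
    """Hypothetical yearly payout for a contributor with edit_count edits."""
    return edit_count * per_edit_usd


if __name__ == "__main__":
    minutes = minutes_to_earn(GOOGLE_DONATION_USD, GOOGLE_PROFIT_PER_DAY_USD)
    per_edit = payout_per_edit(GOOGLE_REVENUE_PER_DAY_USD, TOTAL_WIKIMEDIA_EDITS)

    print(f"Minutes of profit to cover the donation: {minutes:.1f}")
    print(f"One day's revenue per edit: ${per_edit:.2f}")
    print(f"1,500,000 edits: ${annual_payout(1_500_000, per_edit):,.0f} a year")
    print(f"50,000 edits: ${annual_payout(50_000, per_edit):,.0f} a year")
```

With these inputs the sketch gives roughly 7.6 minutes, 8 cents per edit,
$120,000 for 1.5 million edits and $4,000 for 50,000 edits, in line with the
numbers given above. It assumes profit accrues evenly over the day and that
every edit is weighted equally, which is a simplification.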
Wikimedia-l mailing list, guidelines at: