Re: [Wikimedia-l] Quality issues

1 Dec 2015

Hoi,
<grin> I do work on quality issues. I blog about them. I work towards
implementing solutions. </grin> I have fixed quite a few errors in Wikidata
and I do not rack up as many edits as I could because of it.

In the meantime, with your "I do not want to be involved" attitude, you are
the proverbial sailor who stays on shore. It is your option to get your
hands dirty or not. However, a friend of mine mentioned this attitude and
compared it to the people who said that Wikipedia would never work. That is
fine, so I will just move on from many of your arguments.

I do not care about profit. I have over 2 million edits on Wikidata alone
and I have a few others on other projects as well. They may make a profit;
that is implicit in the license. The point is that as more data is freed, it
will free more data. With more free data we can inform more people. We can
share more of the sum of all available knowledge.

I wonder, there are many ways in which quality can be improved and all you
do is refer to others. Why should I bother with your arguments when they
are not yours and when you do not show how to make a difference? My
arguments are plausible and I actively work towards getting them
implemented. I do not need to convince people to do my work. The only thing
I want to do is ask people for their support so that we sooner reach the
stage where we will share in the sum of all available knowledge, something
we do not really do at this stage.
Thanks,
      GerardM

On 1 December 2015 at 15:30, Andreas Kolbe <jayen466(a)gmail.com> wrote:

...
 On Sun, Nov 29, 2015 at 2:55 PM, Gerard Meijssen <gerard.meijssen(a)gmail.com> wrote:

  So identify an issue and it can be dealt with.

 The fact an issue *can* be dealt with does not mean that it *will* be dealt
 with.

 For example, in the post that opened this discussion a little over a week
 ago, you said:

 "At Wikidata we often find issues with data imported from a Wikipedia.
 Lists have been produced with these issues on the Wikipedia involved and
 arguably they do present issues with the quality of Wikipedia or Wikidata
 for that matter. So far hardly anything resulted from such outreach."

 These were your own words: "hardly anything resulted from such outreach."
 Wikimedia is three years into this project. If people produce lists of
 quality issues, that's great, but if nothing happens as a result, that's
 not so great.

 An example of this is available in this very thread. Three days ago I
 mentioned the issues with the Grasulf II of Friuli entries on Reasonator
 and Wikidata. I didn't expect that you or anyone else would fix them, and
 they haven't been, at the time of writing.

 You certainly could have fixed them -- you have made hundreds of edits on
 Wikidata since replying to that post of mine -- but you haven't. Adding new
 data is more satisfying than sourcing and improving an obscure entry. (If
 you're wondering why I didn't fix the entry myself, see the section "And to
 answer the obvious question …" in last month's Signpost op-ed.[1])

 This problem is replicated across the Wikimedia universe. Wikimedia
 projects are run by volunteers. They work on what interests them, or
 whatever they have an investment in. Fixing old errors is not as appealing
 as importing 2 million items of new data (including tens or hundreds of
 thousands of erroneous ones), because fixing errors is slow work. It
 retards the growth of your edit count! You spend one hour researching a
 date, and all you get for that effort is one lousy edit in your
 contributions history. There are plenty of tasks allowing you to rack up
 500 edits in 5 minutes. People seem to prefer those.

 That is why Wikipedia has the familiar backlogs in areas like copyright
 infringement or AfC. Even warning templates indicating bias or other
 problematic content often sit for years without being addressed.

 There is a systemic mismatch between data creation and data curation. There
 is a lot of energy for the former, and very little energy for the latter.
 That is why initiatives like the one started by WMF board member James
 Heilman and others, to have the English Wikipedia's medical articles
 peer-reviewed, are so important. They are small steps in the right
 direction.

  When we are afraid about a Seigenthaler type of event based on Wikidata,
  rest assured there is plenty wrong in either Wikipedia or Wikidata that
  makes it possible for it to happen. The most important thing is to deal
  with it responsibly. Just being afraid will not help us in any way. Yes, we
  need quality and quantity. As long as we make a best effort to improve our
  data, we will do well.

 That's "eventualism". "Quality is terrible, but eventually it will be
 great, because ... we're all trying, and it's a wiki!" To me that sounds
 more like religious faith or magical thinking than empirical science.

 Things being on a wiki does not guarantee quality; far from it.[2][3][4][5]

  As to the Wikipedian in residence, that is his opinion. At the same time
  the article on ebola has been very important. It may not be science but it
  is certainly encyclopaedic. At the same time this Wikipedian in residence is
  involved, makes a positive contribution and while he may make mistakes he
  is part of the solution.

  I am happy that you propose that work is to be done. What have you done but
  more importantly what are you going to do? For me there is "Number of edits:
  2,088,923" <https://www.wikidata.org/wiki/Special:Contributions/GerardM>

 I will do what I can to encourage Wikimedia Foundation board members and
 management to review the situation, in consultation with outside academics
 like those at the Oxford Internet Institute who are concerned about present
 developments, and to consider whether more stringent sourcing policies are
 required for Wikidata in order to assure the quality and traceability of
 data in the Wikidata corpus.

 The public is the most important stakeholder in this, and should be
 informed and involved. If there are quality issues, the Wikimedia
 Foundation should be completely transparent about them in its public
 communications, neither minimising nor exaggerating the issues. Known
 problems and potential issues should be publicised as widely as possible in
 order to minimise the harm to society resulting from uncritical reuse of
 faulty data.

 I have started to reach out to scholars and journalists, inviting them to
 review this thread as well as related materials, and form their own
 conclusions. I may write an op-ed about it in the Signpost, because I
 believe it's an important issue that deserves wider attention and debate.

 As far as my own contributions are concerned, I am more inclined to boycott
 Wikidata.

 Apart from all the issues discussed over the past few days, there is
 another aspect to my reluctance to contribute to Wikidata.

 The Knowledge Graph is a major new Google feature. It adds value to
 Google's search engine results pages. It stops people from clicking through
 to other sources, including Wikipedia. The recent downturn in Wikipedia
 pageviews has been widely linked to the Knowledge Graph.

 By ensuring that more people visit Google's ad-filled pages, and stay on
 them rather than clicking through to other sites, the Knowledge Graph is at
 least partly responsible for recent increases in Google's revenue, which
 currently stands at around $200 million a day.[6] (Income after expenses is
 about a third of that, i.e. $65 million.)

 The development of Wikidata was co-funded by Google, which I understand
 donated 325,000 Euros (about $345,000) to that effort.[8] A little bit of
 arithmetic shows that, with Google's profits running at $65 million a day,
 it takes Google less than 8 minutes to earn that amount of money. Given how
 much Google stands to benefit from this development, it seems a paltry
 investment.
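
 The "less than 8 minutes" figure can be checked with a short Python sketch.
 It uses only the approximate numbers quoted in this message; the variable
 names are illustrative and not taken from any source.

```python
# Approximate 2015 figures as quoted in the message above.
donation_usd = 345_000         # Google's Wikidata donation (about 325,000 euros)
daily_profit_usd = 65_000_000  # Google's income after expenses, per day

MINUTES_PER_DAY = 24 * 60

# Share of one day's profit represented by the donation, expressed in minutes.
minutes_to_earn = donation_usd / daily_profit_usd * MINUTES_PER_DAY

print(f"Minutes of profit needed to cover the donation: {minutes_to_earn:.1f}")
```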

 This set me thinking. If we assume that Wikipedia's and Wikidata's
 contribution to Google's annual revenue via the Knowledge Graph is just
 1/365 – the revenue of one day per year – the monetary value of these
 projects to Google is still astronomical.

 There have been around 2.5 billion edits to Wikimedia projects to date.[7]
 If Google chose to give one day's revenue each year to Wikimedia
 volunteers, as a thank-you, this would average out at about 200,000,000 /
 2,500,000,000 = 8 cents per edit. Someone like Koavf, who's made 1.5
 million edits[9], would stand to receive around $120,000 a year. Even my
 paltry 50,000 edits would net me about $4,000 a year. That's the value of
 free content.
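
 The per-edit arithmetic above can be reproduced with a minimal Python
 sketch. It assumes the rough figures quoted in this message (one day's
 revenue of $200 million, 2.5 billion edits); the function name
 annual_share is hypothetical and used only for illustration.

```python
# Rough figures as quoted in the message above.
daily_revenue_usd = 200_000_000  # one day's Google revenue
total_edits = 2_500_000_000      # edits to Wikimedia projects to date

# Value per edit if one day's revenue were shared evenly across all edits.
per_edit_usd = daily_revenue_usd / total_edits


def annual_share(edits: int) -> float:
    """Hypothetical yearly payout for a contributor with this many edits."""
    return edits * per_edit_usd


print(f"Per edit: {per_edit_usd * 100:.0f} cents")
print(f"1.5 million edits: ${annual_share(1_500_000):,.0f} a year")
print(f"50,000 edits: ${annual_share(50_000):,.0f} a year")
```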

 And that's just Google. Other major players like Facebook and Bing profit,
 too.

 Wikidata seems custom-made to benefit Google and Microsoft, at the expense
 of Wikipedia and other sites. Given my other commitments to Wikimedia
 projects, the limited number of hours in a day, and all the other concerns
 mentioned in this thread, I feel little inclined at present to further
 expand my volunteering in order to work for these multi-billion dollar
 corporations for free.

 [1] https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2015-10-07/Op-ed
 [2] http://www.newsweek.com/2015/04/03/manipulating-wikipedia-promote-bogus-bus…
 [3] http://www.salon.com/2013/05/17/revenge_ego_and_the_corruption_of_wikipedia/
 [4] https://www.washingtonpost.com/news/the-intersect/wp/2015/04/15/the-great-w…
 [5] http://www.dailydot.com/politics/croatian-wikipedia-fascist-takeover-contro…
 [6] https://investor.google.com/earnings/2015/Q2_google_earnings.html
 [7] https://tools.wmflabs.org/wmcounter/
 [8] https://www.wikimedia.de/wiki/Pressemitteilungen/PM_3_12_Wikidata_EN
 [9] https://www.washingtonpost.com/news/the-intersect/wp/2015/07/22/you-dont-kn…
 _______________________________________________
 Wikimedia-l mailing list, guidelines at:
 https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
 Wikimedia-l(a)lists.wikimedia.org
 Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
 <mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
