A few years ago I suggested comparing Wikipedias and producing lists of biographies of people who were alive on your version of Wikipedia but dead on another. A bot writer went away, wrote a bot and at one point we had reports running in 11 languages using data from 80 language versions.
Sadly the bot writer retired and the project is on hiatus. But it might resume if another bot writer comes forth.
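For anyone tempted to pick it up, the core comparison can be sketched roughly as below. This is only an illustration, not the retired bot's code: the function name, the per-language sets of living and dead people, and the Q-style identifiers are all assumptions made for the example, and gathering those sets from each wiki is where the real work lies.

```python
from typing import Dict, List, Set


def alive_here_dead_elsewhere(
    target: str,
    living: Dict[str, Set[str]],
    dead: Dict[str, Set[str]],
) -> Dict[str, List[str]]:
    """Map each person listed as living on `target` to the wikis listing them as dead."""
    report: Dict[str, List[str]] = {}
    for person in sorted(living.get(target, set())):
        # Collect every other language version that records this person as dead.
        disagreeing = sorted(
            lang for lang, people in dead.items()
            if lang != target and person in people
        )
        if disagreeing:
            report[person] = disagreeing
    return report


# Hypothetical data: the identifiers stand in for whatever key links the
# same biography across language versions.
living = {"en": {"Q1", "Q2", "Q3"}, "de": {"Q2"}}
dead = {"de": {"Q1"}, "fr": {"Q1", "Q3"}, "en": {"Q4"}}

report = alive_here_dead_elsewhere("en", living, dead)
for person, langs in report.items():
    print(person, "is listed as dead on:", ", ".join(langs))
```

The output is a list for humans to check, not a list of edits to make automatically; as noted further down, the "dead" version is not always the right one.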
We fixed thousands of errors, mostly where long retired people were logged as dead in their own language but not in other languages. But we also fixed a bunch of anomalies: intrawiki links combining different people of the same name, category errors where someone had put the year of birth into a year of death category, an assortment of other errors, and even the use of a fake news site. One of my few French edits had an edit summary of "death was a hoax" and a link to a site where the agent assured the fans that the chap was still alive. French wasn't one of the eleven languages we produced reports for, but some cross-wiki working resulted in edits in a lot of places.
Three things I remember:
First, the error rate was actually quite low, and the errors were mostly sins of omission.
Secondly, quality of referencing varies a lot by language. Hence some ongoing anomalies where we couldn't change the English version because we didn't have a source to cite, but we weren't confident changing the other language version either; judging from the age, the English version saying the person is still alive might well be the wrong one.
Thirdly, there was an interesting cultural difference in assumptions about the very old. Different projects have different cut-offs for deciding whether a sportsperson who hasn't troubled the global press since they were thirty has shuffled off the mortal coil.
~~~~
WSC
2017-04-16 9:44 GMT+02:00 Gerard Meijssen gerard.meijssen@gmail.com:
Hoi, How can you check for consistency when you are not able to appreciate if certain facts (like date of death) exist and are the same? What can you say about sources when some Wikipedias insist on sources in their own language and sources in other languages you cannot read? How do you check for consistency when we have over 280 Wikipedias with possible content?
Do know that only Wikidata approaches a state where it knows about all our projects and we have not, to the best of my knowledge, assessed what the quality of Wikidata is on interwiki links. Case in point, I fixed an error today about a person that was said to be dead because a Commons category was not correctly linked.
When you study the consistency of English Wikipedia only, you only add to the current bias in research.
When you want to know about the half life of an error, you can find in the history when for instance a date was mentioned for a first time and find the same date in another language. This is not trivial as the format of a language is diverse; think Thai for instance. Thanks, GerardM
On 16 April 2017 at 02:08, John Erling Blad jeblad@gmail.com wrote:
This is more about checking consistency between projects. It is interesting, but not quite what I was asking about. It is very interesting if it would be possible to say something about half-life of an error. I'm pretty sure this follows number of page views if ordinary logged-in editing is removed.
On Sun, Apr 16, 2017 at 12:08 AM, Gerard Meijssen <gerard.meijssen@gmail.com> wrote:
Hoi, Would checking if a date of death exists in articles be of interest to you?
The idea is that Wikidata knows about dates of death and for "living people" the fact of a death should be the same in all projects. When the date of death is missing, there is either an issue at Wikidata (not the same precision is one) or at a project.
When a difference is found, the idea is that it is each project's responsibility to do what is needed. No further automation. Thanks, GerardM
On 15 April 2017 at 23:50, John Erling Blad jeblad@gmail.com wrote:
Is anyone doing any work on automated quality assurance of articles? Not the ORES-stuff, that is about creating hints from measured features. I'm thinking about verifying existence and completeness of citations, and structure of logical arguments.
John
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
Message: 2
Date: Sun, 16 Apr 2017 17:26:41 +0000
From: John Erling Blad jeblad@gmail.com
To: Wikimedia Mailing List wikimedia-l@lists.wikimedia.org
Subject: Re: [Wikimedia-l] Quality assurance of articles
Message-ID: CAJcMX2nBzec0jtWt-4J-vzfVYmMbXiB9exKZx1OnQGhofqTrdA@mail.gmail.com
Content-Type: text/plain; charset=UTF-8
Sorry for the sprellig, I write this on a mobile with Norwegian spellchecker.
Gerard's last question is about coverage and bias, which is part of the overall quality of the project as such.
On Sun, 16 Apr 2017 at 19:22, John Erling Blad jeblad@gmail.com wrote:
I wrote a proposal a few years ago on how we could identify some types of bias. The idea was to compare ranking of pageviews, and notify other projects about missing articles. I don't think anyone has done any follow-up on that.
On Sun, 16 Apr 2017 at 19:12, Gerard Meijssen <gerard.meijssen@gmail.com> wrote:
Hoi, Humans are overrated. I saw this answer on Facebook: [1] and [2]. Compare the two and tell me why we accept the bias in our editors. Why are we satisfied with what we write about when there is more to inform about? Remember what we aim to achieve. It does not say text, it says share the sum of all knowledge. Thanks, GerardM
[1] https://upload.wikimedia.org/wikipedia/commons/0/07/Geotagged_articles_in_en...
[2] https://upload.wikimedia.org/wikipedia/commons/2/2b/WorldmapGeonamesallCount...
On 16 April 2017 at 18:59, Ziko van Dijk zvandijk@gmail.com wrote:
Hello John,
Article quality is an interesting subject. I guess that it depends extremely on what is the scientific discipline you come from, and what questions you want to be answered. A linguist will have a very different approach than a computer scientist, for example. If you ask me, only a human being can judge an article if it comes to content quality and textual quality, by the way. Maybe you want to elaborate on what are your questions?
Kind regards Ziko
Message: 3
Date: Sun, 16 Apr 2017 21:10:12 +0200
From: "Peter Southwood" peter.southwood@telkomsa.net
To: "'Wikimedia Mailing List'" wikimedia-l@lists.wikimedia.org
Subject: Re: [Wikimedia-l] Quality assurance of articles
Message-ID: 00ce01d2b6e5$156e6c90$404b45b0$@telkomsa.net
Content-Type: text/plain; charset="utf-8"
Gerard, I looked at the two images, but have no idea of what point you are trying to make about them. Could you be a bit more descriptive? Cheers, Peter
Subject: Digest Footer
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l New messages to: Wikimedia-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe
End of Wikimedia-l Digest, Vol 157, Issue 43
wikimedia-l@lists.wikimedia.org