In my experience, we are now leaving phase one of filling Wikidata with basic data.
This first phase involved many bot-created items and well-meaning semi-manual mass updates, which has resulted in many problems: bots that fill in the wrong item, one with the same name but for a different object, creating a mess; a bot update that entered death dates from a list of retirement dates, which caused a lot of anger because Google search then showed many living persons as dead; and a lot of bewildering "instance of" values.
Our reaction is that we are now putting major effort into manually going through important classes of Wikidata items in order to enter correct data, especially "instance of". This is to correct existing errors and to stop erroneous data from being entered from now on. It also gives a big advantage, as fact-checking datasets is much easier with Wikidata. Once the base data is correct, mass updates and corrections can also be made by bot.
We also have a vision to "freeze" correct data (data that should not change) by using Listeria lists, at first to make big datasets easy to check, and after a while by adding logic that hampers changes to this data.
So not manual or bot, but both.
Anders
On 2019-03-20 at 12:46, Gerard Meijssen wrote:
Hoi, The biggest benefit of Wikidata is that it knows about more subjects than any Wikipedia has articles. Like Wikipedia it has its own problems but it has its own benefits. The biggest problem with Wikidata is not its quality and the biggest benefit of Wikipedia is not its quality. Both have issues. All Wikimedia projects rely on their communities, that is where things are the same.
The notion that a community and text is better is in itself a fallacy, because the integrity of data is easier to check with data and not so much with text. An example: I have repeatedly indicated that 6% of all the entries in a list in a Wikipedia are wrong. The problem is one of disambiguation. For instance, for a chemistry award you would expect at least scientists, or better, chemists. When a hockey player or a movie star is among them, it follows that you want to check this out. Easy to do at Wikidata, impossible at Wikipedia. It is possible, but only when Wikipedians and Wikidatans collaborate (which they do not really do).
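The kind of check described above can be sketched as a query against the Wikidata Query Service. This is a minimal sketch under stated assumptions, not a method taken from the thread: the helper name `build_outlier_query` is hypothetical, and the caller must supply the item ids of the award and of the expected occupation. It relies on the Wikidata properties P166 ("award received"), P106 ("occupation") and P279 ("subclass of").

```python
import re

# Wikidata item ids look like "Q" followed by a positive integer.
QID = re.compile(r"^Q[1-9][0-9]*$")


def build_outlier_query(award_qid, occupation_qid):
    """Build a SPARQL query listing award recipients who have no
    occupation equal to, or a subclass of, the expected occupation.

    Hypothetical helper: both ids are supplied by the caller.
    """
    for qid in (award_qid, occupation_qid):
        if not QID.match(qid):
            raise ValueError(f"not a Wikidata item id: {qid!r}")
    return (
        "SELECT ?person ?personLabel WHERE {\n"
        f"  ?person wdt:P166 wd:{award_qid} .\n"
        "  FILTER NOT EXISTS {\n"
        f"    ?person wdt:P106/wdt:P279* wd:{occupation_qid} .\n"
        "  }\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}"
    )


if __name__ == "__main__":
    # Q123 and Q456 are placeholders, not the real award or occupation ids.
    print(build_outlier_query("Q123", "Q456"))
```

Recipients with no occupation statement at all are also returned, so the result is a list of candidates to review by hand rather than a list of confirmed errors.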
When you suggest that bots are less secure than humans you are wrong as well. Research shows that a human with the best of intentions has an error rate of something like 6%. However, when a list like a Wikipedia category of alumni of a given university is considered, there are no new errors introduced by a bot. All the errors included in Wikidata are the errors that already exist in a Wikipedia. If we were to have consolidation processes, once a person is known to have studied at a university we could synchronise categories and data. In addition to this, bots import authorised data from ORCID indicating former students of universities. A consolidation process could update both Wikidata and all Wikipedias that take an interest.
In addition, when people search within Wikidata, never mind the language, they will find what Wikidata has to offer. Any Wikipedia is a subset of what Wikidata has to offer.
So as much as both Wikidata and Wikipedia are wonderful products, there is room for improvement. Improvement will only happen when we truly care about sharing in the sum of all knowledge, when we truly care about quality and do not assume that "we" (whoever "we" is) have a superior proposition. Thanks, GerardM
On Wed, 20 Mar 2019 at 10:45, Gabriel Thullen gabriel@thullen.com wrote:
Sorry about this mail, I hate to rain on somebody's parade but: Ever since Wikidata was set up, there have been more edits made by bots than by humans (registered contributors + anonymous contributors), except for a few periods in 2017 and 2018. On the other hand, the activity of the bots on the English Wikipedia has almost always been lower than the activity of anonymous contributors, and that activity has always been well below that of registered contributors. There was one exception, in 2013, when there was a spike of bot activity. We could also talk about the average number of edits per contributor, which appears to be around 100 on the English Wikipedia and 1,200 on Wikidata (these numbers are after removing the estimated edits done by bots). Quite a difference. The different Wikimedia projects rely on the community to police and curate the content of these encyclopedias and data collections. I am therefore a bit wary of what is happening with Wikidata, where more edits are still being done by bots than by real humans (by "real" I mean "real", not "real" as in the TV series "Real Humans").
Best regards Gabe
On Wed, Mar 20, 2019 at 9:25 AM Olushola Olaniyan <olaniyanshola15@gmail.com> wrote:
This is good news.
Cheers!!!
Olaniyan Olushola CEO DataAccess Systems Ltd President, Wikimedia Nigeria Member, Affcom ( Wikimedia Foundation) Co-director Wiki Women Radio www.wikimedia.org.ng shola@wikimedia.org.ng olaniyanshola15@gmail.com +2348167352512
On Wed, Mar 20, 2019, 08:52 Ziko van Dijk <zvandijk@gmail.com> wrote:
Hello Ariel Glenn, Thanks for the notification, very interesting. Well, we all know that making a lot of edits on Wikidata is "easier" or happens quicker than on Wikipedia, for various reasons. But still it is a nice milestone on which to congratulate Wikidata. Hereby. :-) Kind regards Ziko
On Wed, 20 Mar 2019 at 07:58, Gerard Meijssen <gerard.meijssen@gmail.com> wrote:
Hoi, So instead of calling us all Wikipedia, let us be known as Wikidata... HUUUUU Thanks, GerardM
On Wed, 20 Mar 2019 at 07:48, Ariel Glenn WMF ariel@wikimedia.org wrote:
Wikidata surpassed the English language Wikipedia in the number of revisions in the database, about 45 minutes ago today. I was tipped off by a tweet [1] a few days ago and have been watching via a script that displays the largest revision id and its timestamp. Here's the point where Wikidata overtakes English Wikipedia (times in UTC):
[ariel@bigtrouble wikidata-huge]$ python3 ./get_revid_info.py -d www.wikidata.org -r 888603998,888603999,888604000
revid 888603998 at 2019-03-20T06:00:59Z
revid 888603999 at 2019-03-20T06:00:59Z
revid 888604000 at 2019-03-20T06:00:59Z
[ariel@bigtrouble wikidata-huge]$ python3 ./get_revid_info.py -d en.wikipedia.org -r 888603998,888603999,888604000
revid 888603998 at 2019-03-20T06:00:59Z
revid 888603999 at 2019-03-20T06:00:59Z
revid 888604000 at 2019-03-20T06:01:00Z
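The source of get_revid_info.py is not part of this thread. As a rough illustration only, a lookup of this kind could be done through the MediaWiki Action API (`action=query` with `prop=revisions` and `revids`). The function names below are hypothetical, and nothing here claims to match the original script.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(domain, revids):
    """Build an Action API URL asking for the timestamps of the given revids."""
    params = {
        "action": "query",
        "prop": "revisions",
        "revids": "|".join(str(r) for r in revids),
        "rvprop": "ids|timestamp",
        "format": "json",
        "formatversion": "2",
    }
    return "https://" + domain + "/w/api.php?" + urllib.parse.urlencode(params)


def parse_timestamps(response):
    """Return {revid: timestamp} from a formatversion=2 query response."""
    found = {}
    for page in response.get("query", {}).get("pages", []):
        for rev in page.get("revisions", []):
            found[rev["revid"]] = rev["timestamp"]
    return found


def fetch_timestamps(domain, revids):
    """Perform the request (needs network access)."""
    request = urllib.request.Request(
        build_query_url(domain, revids),
        # Assumed placeholder User-Agent; replace with real contact details.
        headers={"User-Agent": "revid-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as resp:
        return parse_timestamps(json.load(resp))


if __name__ == "__main__":
    stamps = fetch_timestamps("www.wikidata.org", [888603998, 888603999, 888604000])
    for revid, stamp in sorted(stamps.items()):
        print(f"revid {revid} at {stamp}")
```

Revision ids that are deleted or unknown come back outside `pages`, so they are simply absent from the returned mapping.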
Only 45 minutes later, the gap is already over 22,000 revisions:
[ariel@bigtrouble wikidata-huge]$ python3 ./compare_sizes.py
Last enwiki revid is 888606979 and last wikidata revid is 888629401
2019-03-20 06:46:03: diff is 22422
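The reported difference is just the Wikidata revision id minus the English Wikipedia one. A small sketch of that arithmetic, using the two ids from the output above, follows; `format_comparison` is a hypothetical name, and how the original compare_sizes.py obtains the latest ids is not shown in the thread.

```python
from datetime import datetime, timezone


def format_comparison(enwiki_revid, wikidata_revid, when):
    """Report both revision ids and how far Wikidata is ahead of enwiki."""
    diff = wikidata_revid - enwiki_revid
    stamp = when.strftime("%Y-%m-%d %H:%M:%S")
    return (
        f"Last enwiki revid is {enwiki_revid} "
        f"and last wikidata revid is {wikidata_revid}\n"
        f"{stamp}: diff is {diff}"
    )


if __name__ == "__main__":
    # Values copied from the output quoted in the thread.
    moment = datetime(2019, 3, 20, 6, 46, 3, tzinfo=timezone.utc)
    print(format_comparison(888606979, 888629401, moment))
```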
Have a nice day!
Ariel
[1] https://twitter.com/MonsieurAZ/status/1106565116508729345
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>