Hi,
I'm looking into ways to use tabular data like
https://commons.wikimedia.org/wiki/Data:Zika-institutions-test.tab
in SPARQL queries but could not find anything on that.
Part of my motivation comes from the query time-out limits: the basic
idea would be to split a query that typically times out into a set of
queries that do not, such that their aggregated results yield what the
original query would have returned had it not timed out.
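To make the splitting idea concrete, here is a minimal sketch of one such
sub-query (untested, with purely illustrative slice boundaries, relying on
the prefixes predefined by the Wikidata Query Service): it restricts the
matched items to a numeric ID range, so that the same query can be run
several times with different ranges and the result sets merged afterwards,
for example into a tabular data page.

  # One slice of a larger query: only items whose numeric ID falls into a
  # given range are considered. Running the same query with different
  # ranges and merging the results afterwards should approximate the full
  # query without hitting the timeout.
  SELECT ?item ?itemLabel WHERE {
    ?item wdt:P31 wd:Q5 .                      # example restriction: instances of human
    BIND(xsd:integer(STRAFTER(STR(?item), "Q")) AS ?id)
    FILTER(?id >= 1000000 && ?id < 2000000)    # placeholder slice boundaries
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
  }

Whether a given slice actually stays below the timeout of course depends on
how the query optimizer handles the filter, so the slicing criterion may
need to be something cheaper to evaluate.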
The second line of motivation is keeping track of how things develop over
time, which would be interesting both for content and maintenance queries
and for following the usage of things like classes, references, lexemes or
properties.
I would appreciate any pointers or thoughts on the matter.
Thanks,
Daniel
Hello,
The following primary database masters will be switched over during the
next few weeks (more details at https://phabricator.wikimedia.org/T230788):
Impact:
*Writes will be blocked*
*Reads will remain unaffected*
These are the concrete days, hours and affected wikis:
* s8: 10th Sept from 05:00-05:30 UTC. The affected wiki is: wikidatawiki -
tracking task: T230762
* s2: 17th Sept from 05:00-05:30 UTC. The list of affected wikis is at:
https://raw.githubusercontent.com/wikimedia/operations-mediawiki-config/mas…
- tracking task: T230785
* s3: 24th Sept from 05:00-05:30 UTC. The list of affected wikis is at:
https://raw.githubusercontent.com/wikimedia/operations-mediawiki-config/mas…
- tracking task: T230783
* s4: 26th Sept from 05:00-05:30 UTC. The affected wiki is: commonswiki -
tracking task: T230784
If everything goes well, we do not expect to use the full 30 minutes of
read-only time, but rather just a few minutes.
We will send an email on the day of each failover, before and after it is
done.
Sorry for any inconvenience this might cause.
Hi all!
I really try not to spam the list too much with pointers to my work on the
Abstract Wikipedia, but this one is probably also interesting for Wikidata
contributors. It is the draft of a chapter submitted to Koerner and
Reagle's Wikipedia@20 book, and talks about knowledge diversity in the
light of centralisation through projects such as Wikidata.
The public commenting phase is open until July 19, and comments are very welcome:
"Collaborating on the sum of all knowledge across languages"
About the book: https://meta.wikimedia.org/wiki/Wikipedia@20
Link to chapter: https://wikipedia20.pubpub.org/pub/vyf7ksah
Cheers,
Denny
Dear all,
I thank you for your efforts. I invite you to see my Wikimania report at https://meta.m.wikimedia.org/wiki/Wikimedia_France/Micro-financement/Wikima…. I am still waiting for the video of my session, entitled "Wikidata and Health: Current situation and perspectives".
Yours Sincerely,
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
Undergraduate Researcher, UR12SP36
GLAM and Education Coordinator, Wikimedia TN User Group
Member, WikiResearch Tunisia
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
Co-Founder, WikiLingua Maghreb
Founder, TunSci
____________________
+21629499418
Hello,
As the importance of Wikidata increases, so do the demands on the quality
of the data. I would like to put the following proposal up for discussion.
Two basic ideas:
1. Each Wikidata page (item) is scored after each edit. This score should
express different dimensions of data quality in a form that can be grasped
quickly.
2. A property is created via which the item refers to the score value.
Certain qualifiers can be used for a more detailed description (e.g. time
of calculation, algorithm used to calculate the score value, etc.).
The score value could be calculated either within Wikibase after each data
change or "externally" by a bot. Among other things, the calculation could
draw on the number of constraints, the completeness of references, the
degree of completeness in relation to the underlying ontology, etc. There are already
some interesting discussions on the question of data quality which can be
used here ( see https://www.wikidata.org/wiki/Wikidata:Item_quality;
https://www.wikidata.org/wiki/Wikidata:WikiProject_Data_Quality, etc).
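To illustrate the second idea, here is a minimal sketch of such a query,
assuming a purely hypothetical property P9999 ("quality score") with a
0-100 scale and using "point in time" (P585) as an example qualifier for
the time of calculation:

  # Sketch: return items about humans whose (hypothetical) quality score is
  # at or above a chosen threshold, together with the time the score was
  # last calculated.
  SELECT ?item ?score ?calculated WHERE {
    ?item wdt:P31 wd:Q5 .                              # example: items about humans
    ?item p:P9999 ?scoreStatement .                    # P9999 = hypothetical "quality score" property
    ?scoreStatement ps:P9999 ?score .
    OPTIONAL { ?scoreStatement pq:P585 ?calculated . } # P585 = point in time, used as a qualifier
    FILTER(?score >= 80)                               # placeholder threshold
  }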
Advantages
- Users get a quick overview of the quality of a page (item).
- SPARQL can be used to query only those items that meet a certain
quality level.
- The idea would probably be relatively easy to implement.
Disadvantages:
- In a way, the data model is abused by generating statements that no
longer describe the item itself, but rather describe the representation
of this item in Wikidata.
- Additional computing power must be provided for the regular
calculation of all changed items.
- The score only describes the quality of a page; if it is insufficient,
improvements still have to be made manually.
I would now be interested in the following:
1. Is this idea suitable to effectively help solve existing quality
problems?
2. Which quality dimensions should the score value represent?
3. Which quality dimensions can be calculated with reasonable effort?
4. How should they be calculated and represented?
5. What is the most suitable way to further discuss and implement this
idea?
Many thanks in advance.
Uwe Jung (UJung <https://www.wikidata.org/wiki/User:UJung>)
www.archivfuehrer-kolonialzeit.de/thesaurus
Hello,
thank you very much for your contributions and comments. I would agree
with most of your remarks without hesitation.
But I would like to clarify some things again:
- The importance of Wikidata grows with its acceptance by an
"unspecialized" audience. This also includes a lot of people who decide
about project funds or donations. As a rule, they have little time to
inform themselves sufficiently about the problems of measuring data
quality. In these hectic times, it is unfortunately common for such an
audience to demand solutions that are as simple and quick to analyse as
possible. (I will leave the last sentence here as a hypothesis.) I think
it is important to try to meet these expectations.
- Recoin is known to me. And yes, it addresses only the dimension of
*relative* completeness. At present, however, it is primarily aimed at
people who enter data manually, so it remains invisible or unusable for
many others. To stick with the idea: would it not be possible to calculate
a one- or multi-dimensional value from the Recoin information, which could
then be stored on the item as a literal via a property "relative
completeness" (a sketch of such a query follows at the end of this list)?
The advantage would be that this value can be queried via SPARQL together
with the item. Decision-makers from a field such as "jam science" could
thus gain an overview of how complete the data from this field are in
Wikidata and for which data completion projects funds may still have to be
provided. As described in my last article, a single property "relative
completeness" is not sufficient to describe data quality.
- I am sorry if I expressed it in a misleading way. I am using this
mailing list to get feedback on an idea. It may be "my" idea (or not), but
it is far from being "my" project. However, if the idea should ever be
realized by anyone in any way, I would be interested in making my own
modest contribution.
- It's true that the number of current Wikidata items is hard to imagine.
If a single instance needed only one minute per item to calculate the
different quality scores, it would take about 113 years (roughly 60
million minutes) to cover them all. The fact that many items are modified
over and over again and therefore have to be recalculated is not yet taken
into account in that estimate. The implemented approach would therefore
have to use strategies that make the first results visible with less
effort. One possibility is to concentrate initially on the part of the
data that is actually being used. This brings us to the question of
dynamic quality.
- People need support so that they can use the data and find and fix its
flaws. In the foreseeable future there will not be enough volunteers to
manually check all 60 million items for errors. This is another reason why
information about the quality of the data should be queryable together
with the data.
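As announced above, here is a sketch of how such a stored value might be
queried, again with a purely hypothetical property P9998 ("relative
completeness") and an arbitrary example class:

  # Sketch: average the (hypothetical) relative-completeness values over
  # all items of one class, to give a decision-maker a rough overview of
  # how complete that field currently is in Wikidata.
  SELECT (COUNT(?item) AS ?items) (AVG(?completeness) AS ?avgCompleteness) WHERE {
    ?item wdt:P31 wd:Q5 .              # example class: human (placeholder for the field of interest)
    ?item wdt:P9998 ?completeness .    # P9998 = hypothetical "relative completeness" property
  }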
Thanks
Uwe Jung
Dear all,
I thank you for your efforts. I am planning to begin writing a survey about wiki sites and to publish it in an appropriate journal... This survey will cover the software used (MediaWiki or other), the size, reference support, topic... I am looking for contributors to this project. Anyone who has published two papers about wikis in a research journal is invited to join the initiative.
Yours Sincerely,
Houcemeddine Turki (he/him)
Medical Student, Faculty of Medicine of Sfax, University of Sfax, Tunisia
Undergraduate Researcher, UR12SP36
GLAM and Education Coordinator, Wikimedia TN User Group
Member, WikiResearch Tunisia
Member, Wiki Project Med
Member, WikiIndaba Steering Committee
Member, Wikimedia and Library User Group Steering Committee
Co-Founder, WikiLingua Maghreb
Founder, TunSci
_____________________
+21629499418