Hey folks,
we plan to drop the wb_entity_per_page table sometime soon [0], because
it is simply not required (we will most likely always have a programmatic
mapping from entity ID to page title) and, as it currently stands, it does
not support non-numeric entity IDs. Because of this, removing it is a
blocker for the Commons metadata work.
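For illustration, the kind of programmatic mapping meant here is
essentially just "namespace prefix + ID". A minimal sketch of my own
(assuming the standard entity namespaces on wikidata.org and ignoring
any future entity types):

# Minimal sketch of an entity-ID -> page-title mapping.
# Assumption: items live in the main namespace and properties under
# "Property:", as on wikidata.org today.
def entity_id_to_page_title(entity_id: str) -> str:
    if entity_id.startswith("Q"):
        return entity_id                # e.g. "Q42"
    if entity_id.startswith("P"):
        return "Property:" + entity_id  # e.g. "Property:P31"
    raise ValueError("unknown entity ID prefix: " + entity_id)

print(entity_id_to_page_title("Q42"))   # -> Q42
print(entity_id_to_page_title("P31"))   # -> Property:P31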
Is anybody using that table for their tools (on Tool Labs)? If so,
please tell us so that we can give you instructions and a longer grace
period to update your scripts.
Cheers,
Marius
[0]: https://phabricator.wikimedia.org/T95685
Hoi,
Jura1 created a wonderful list of people who died in Brazil in 2015 [1].
It is a page that updates regularly from Wikidata thanks to ListeriaBot.
There may well be a few more entries to come, because I am falling ever
further behind in my quest to register deaths in 2015.
I have copied his work and created a page for people who died in the
Netherlands in 2015 [2]. It is trivially easy to do, the result looks
great, and the same approach can be used for any country on any
Wikipedia.
The Dutch Wikipedia has indicated that it nowadays maintains important
metadata at Wikidata. I am really happy that we can showcase their work.
It is important work because, as someone reminded me at some stage, this
is part of what amounts to the policy on living people...
Thanks,
GerardM
[1] https://www.wikidata.org/wiki/User:Jura1/Recent_deaths_in_Brazil
[2]
https://www.wikidata.org/wiki/User:Jura1/Recent_deaths_in_the_Netherlands
We lack several maintenance scripts for the clients, that is,
human-readable special pages with reports on which pages lack special
treatment. In no particular order: we need some way to identify
unconnected pages in general (the present one does not work [1]); we
need some way to identify pages that are unconnected but have some
language links; we need to identify items that are used in some language
and lack labels (almost like [2], but on the client and for items that
are somehow connected to pages on the client); and we need to identify
items that lack specific claims while the client pages use a specific
template.
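To make the kind of report I mean concrete, here is a rough sketch of
one such check done by hand against the API (a hypothetical helper of my
own, not one of the missing special pages; it only assumes the standard
wbgetentities module on wikidata.org):

import requests

API = "https://www.wikidata.org/w/api.php"

def check_client_page(site: str, title: str, lang: str) -> str:
    """Report whether a client page is connected to an item and whether
    that item has a label in the client's language."""
    reply = requests.get(API, params={
        "action": "wbgetentities",
        "sites": site,      # e.g. "nowiki"
        "titles": title,
        "props": "labels",
        "format": "json",
    }).json()
    for entity_id, entity in reply.get("entities", {}).items():
        if "missing" in entity:
            return title + ": unconnected"
        if lang not in entity.get("labels", {}):
            return title + ": connected to " + entity_id + ", no " + lang + " label"
        return title + ": connected to " + entity_id
    return title + ": no reply"

print(check_client_page("nowiki", "Oslo", "nb"))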
There are probably more such maintenance pages; these are the most
urgent ones. Users are now starting to create categories to hack around
the missing maintenance pages, which creates a bunch of categories. [3]
On Norwegian Bokmål there are just a few scripts that use data from
Wikidata, yet the number of categories is already growing large.
For us at the "receiving end" this is a show stopper. We can't convince
the users that this is a positive addition to the pages without the
maintenance scripts, because without them we are more or less flying
blind when we try to fix errors. We can't click through random pages
hoping to stumble on something that is wrong; we must be able to search
for the errors and fix them.
This summer we (nowiki) have added about ten properties to the
infoboxes, some with scripts and some with the property parser function.
Most of my time has not gone into coding, and it has not gone into
fixing errors; it has gone into explaining to the community why Wikidata
is a good idea. At one point the changes were even reverted because
someone disagreed with what we had done. The whole thing basically
revolves around "my article got a Q-id in the infobox and I don't know
how to fix it". We know how to fix it, and I have explained it to the
editors at nowiki several times. They still don't get it, so we need
some way to fix it, and we don't have the maintenance scripts to do it.
Right now we don't need more wild ideas that will swamp development for
months and years to come; we need maintenance scripts, and we need them
now!
[1] https://no.wikipedia.org/wiki/Spesial:UnconnectedPages
[2] https://www.wikidata.org/wiki/Special:EntitiesWithoutLabel
[3] https://no.wikipedia.org/wiki/Spesial:Prefiksindeks/Kategori:Artikler_hvor
John Erling Blad
/jeblad
Hi, it's the first of July and I would like to introduce to you a
quarterly goal that the Engineering Community team has committed to:
Establish a framework to engage with data engineers and open data
organizations
https://phabricator.wikimedia.org/T101950
We are missing a community framework allowing Wikidata content and tech
contributors, data engineers, and open data organizations to collaborate
effectively. Imagine GLAM applied to data.
If all goes well, by the end of September we would like to have basic
documentation and community processes for open data engineers and
organizations willing to contribute to Wikidata, and ongoing projects with
one open data org.
If you are interested, get involved! We are looking for:
* Wikidata contributors with good institutional memory
* people who have been in touch with organizations willing to contribute
their open data
* developers willing to help improve our software and program the
missing pieces
* contributors familiar with the GLAM model(s) and with what has and has
not worked
This goal was created after some conversations with Lydia Pintscher
(Wikidata team) and Sylvia Ventura (Strategic Partnerships). Both are on
board, with Lydia making sure that this work fits what is technically
effective, and Sylvia checking our work against real open data
organizations willing to get involved.
This email effectively starts the bootstrapping of this project. I will
start creating subtasks under that goal based on your feedback and common
sense.
--
Quim Gil
Engineering Community Manager @ Wikimedia Foundation
http://www.mediawiki.org/wiki/User:Qgil
Hey folks :)
We'll be rolling out arbitrary access to a large number of wikis next
week and the week after. See
https://www.wikidata.org/wiki/Wikidata:Arbitrary_access for the full
list.
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
To pick up on a few different comments from this thread:
@revi (Hong, Yongmin)
-- Yes, of course you are correct that it is categories rather than
galleries that are the important structure for finding and navigating
images on Commons.
But even if we were to change the status-quo and ban sitelinks from
Wikidata items to Commons galleries (which might not be 100% popular,
since we currently have 87,000 sitelinks to galleries, up by about 3000
in the last year) -- even if we were to ban sitelinks to galleries, this
would still leave the question of whether Commons categories should be
sitelinked to category items or to article items.
Currently there are 311,000 Commonscats sitelinked to category items on
Wikidata, and 207,000 sitelinked to article items on Wikidata. So the
status-quo is for Commonscats to be sitelinked to category items (and in
the past has been even more so).
The problem is that, the way the software works at the moment, you
can't have both. So if a Commonscat is sitelinked to an article item,
that precludes it from being sitelinked to a category item, and
vice-versa.
At the moment, the expectation is that a Commonscat will be sitelinked
to a category item, if possible. Of 323,825 Commonscats that can be
identified with a Wikidata category item, 311,000 are connected by
corresponding sitelinks. So if people are writing scripts or queries to
look for such relationships, they will most likely look for sitelinks.
On the other hand, of 884,439 Commonscats that can be identified with
article-like Wikidata items, only 207,494 (= 23.4%) are connected by
sitelinks -- even if this number has doubled in the last 12 months, the
expectation in the current status quo is that such a connection is more
likely *not* to be represented in a sitelink, by a three-to-one margin.
Instead, links between Commonscats and article-like Wikidata items are
currently overwhelmingly represented by the P373 property, which at the
moment records 807,776 (= 91.3%) of such identifications.
This is the property that the script
https://commons.wikimedia.org/wiki/User:Jheald/wdcat.js
looks up in order to display a link to Reasonator if there is a Wikidata
article-item for a Commonscat.
To get the best idea of whether there is a corresponding Wikidata item
and Wikipedia articles for a Commonscat, Commons users should therefore
use wdcat.js -- which is easily activated by adding a line to the
common.js file, such as at
https://commons.wikimedia.org/wiki/User:Jheald/common.js
-- because this will pick up four times more connections than currently
exist as sitelinks.
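For anyone scripting against this, here is a rough sketch (my own
illustration using the standard wbgetentities API, not the wdcat.js
logic itself) of reading both kinds of connection from an item, to show
how a P373 claim and a commonswiki sitelink differ:

import requests

API = "https://www.wikidata.org/w/api.php"

def commons_connections(item_id):
    """Return the P373 value and the commonswiki sitelink of an item, if any."""
    entity = requests.get(API, params={
        "action": "wbgetentities",
        "ids": item_id,
        "props": "claims|sitelinks",
        "format": "json",
    }).json()["entities"][item_id]

    p373 = None
    for claim in entity.get("claims", {}).get("P373", []):
        snak = claim["mainsnak"]
        if snak["snaktype"] == "value":
            p373 = snak["datavalue"]["value"]  # plain category name, e.g. "Berlin"
            break

    sitelink = entity.get("sitelinks", {}).get("commonswiki", {}).get("title")
    return p373, sitelink

print(commons_connections("Q64"))  # Q64 = Berlin; output will vary over time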
The important thing is to establish and preserve clear expectations
that software can build against.
At the moment, with the confusion of sitelinks of Commonscats to
articles and to categories, there is no guarantee that it is possible to
create a sitelink to an article. On the other hand, it should always be
possible to create a P373 connection. They are the connections that
systematically *can* be made, so it is important to make sure that they
systematically *are* made, to drag up the return rate from the current
91.3% to nearer 100%.
That would be helped by a policy that was absolutely systematic in
prescribing what should and what should not be sitelinked.
@ RomaineWiki:
You say that actively enforcing the longstanding Wikidata sitelink
policy of only sitelinking Commons categories to category-like items,
and Commons galleries to article-like items would be a plan to
"demolish the navigational structure of Commons".
But it wouldn't change any of the internal structures on Commons, and
would merely underline the current fact that even now only 23% of
Commonscats are linked to articles by sitelinks, compared to 91% by P373.
Isn't it better to get Commons users used to using (and improving) the
wdcat.js script, which uses the P373 property that can always be added,
rather than perpetuating the current muddle of Commonscat <-> article
sitelinks, which are so haphazard?
@ Steinsplitter
As I understand it, the long-term plans for a new Wikibase structure
specifically for Commons are currently no longer an immediate
development priority; but will presumably start to move forward again
sooner or later.
On the other hand, this discussion was specifically about sitelinks.
Here I believe what has driven the Wikidata side has been the desire to
have a rule that is simple and consistent and predictable, because that
is the foundation needed to develop queries and scripts and tools and
user-interfaces on top of.
Combining that desideratum with the technical restriction of only
allowing one sitelink to each item from each wiki and vice-versa, is
what has led to the recommended scheme of linking
Commons categories <-> category-like items
Commons galleries <-> article-like items
with property P373 to handle identification of Commons categories <->
article-like items.
This fulfils the requirements of simplicity, consistency and predictability.
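Expressed as data, the recommended scheme amounts to something like
this (just an illustrative sketch of the rule as I understand it, not
anything the software enforces):

# Recommended linking scheme (illustration only).
RECOMMENDED_LINKS = {
    ("Commons category", "category-like item"): "sitelink",
    ("Commons gallery",  "article-like item"):  "sitelink",
    ("Commons category", "article-like item"):  "P373 (Commons category) property",
}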
It's not ideal from a user-interface point of view (or a philosophical
point of view). But so long as the rule is applied consistently, the
limitations it leads to can be worked round with appropriate software
improvements -- eg in the first instance the wdcat.js script.
But to encourage people to develop and improve such software, it is
helpful for the above structure to be applied consistently.
In contrast, perpetuating inconsistency and muddle blurs what is
needed, and works against the stable, predictable basis needed to make
such software work.
All best,
James.
On 29/08/2015 14:39, Steinsplitter Wiki wrote:
> Wikidata needs to ask the Commons community before making Commons-related changes.
>
> It is very hard to understand what the Wikidata people want to do with Commons. Tons of text, hard to read. I don't understand what they want to do, but if this change affects Commons, then Commons community consensus is needed.
>
>
> Date: Fri, 28 Aug 2015 17:34:28 +0200
> From: romaine.wiki(a)gmail.com
> To: wikidata(a)lists.wikimedia.org; commons-l(a)lists.wikimedia.org
> Subject: Re: [Commons-l] [Wikidata] Trends in links from Wikidata items to Commons
>
> As I wrote before, that reasoning is too simple. If you only say that a zero belongs with a zero and a two belongs with a two, you describe only the type of page and ignore the subject of the page. That subject matters much more than the namespace number.
>
> Wikinews in particular is a bad example, as most categories on Commons do not have a one-to-one relationship with Wikinews pages.
> Articles on Wikipedia, however, mostly do have a one-to-one relationship with categories on Commons.
>
> Romaine
>
> 2015-08-28 17:09 GMT+02:00 Luca Martinelli <martinelliluca(a)gmail.com>:
> 2015-08-28 12:09 GMT+02:00 Romaine Wiki <romaine.wiki(a)gmail.com>:
>> And I agree completely with what Revi says:
>>> Wikidata ignores this Commons' fact by trying to enforce ridiculous rules
>>> like this.
>
> It's not such a ridiculous rule, if you think of the rationale behind
> it: if gallery = ns0 and category = ns2, linking ns0 <--> ns2 in the
> same item is IMHO not a rational thing to do (not even for Wikinews if
> you ask me, but I'm digressing).
>
> So the *practical* problem that we have to address is the list of
> links in the left column. We really don't have any possibility to
> exploit P373 in any way, not even with a .js, to fix this?
>
> L.
>
> _______________________________________________
> Wikidata mailing list
> Wikidata(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikidata
> _______________________________________________
> Commons-l mailing list
> Commons-l(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/commons-l
Hey folks,
As you know, Wikidata is one of the 100 winners of the "Germany - Land
der Ideen" competition 2015. We now have the chance to win the Public’s
Choice Award as well! At the moment, Wikidata is ranked 15th. We need to
be among the top 10 to enter the next round and win the Public’s Choice
Award. Voting is open until 23 August.
The information about the competition and the projects is available in
English (https://www.land-der-ideen.de/en/projects-germany/landmarks-land-ideas/comp…
and https://www.land-der-ideen.de/en/projects-germany/landmarks-land-ideas/2015…),
but the voting process is only in German. It’s still pretty easy to
vote:
* Go to the Wikidata voting page at
https://www.land-der-ideen.de/ausgezeichnete-orte/preistraeger/wikidata
* Click the yellow button on the right ("Jetzt abstimmen").
* Type in your email address and tick the box to agree to the voting rules.
* You will receive a link via email. Click the link within 24 hours.
You can repeat this daily until 23 August. You have one vote every day.
Let's win this! :D
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
Tell me if I am right or wrong about this.
If I am coining a URI for something that has an identifier in an
outside system, it is straightforward to append the identifier (possibly
modified a little) to a prefix, such as
http://dbpedia.org/resource/Stellarator
Then you can write
@prefix dbpedia: <http://dbpedia.org/resource/>
and then refer to the concept (in either Turtle or SPARQL) as
dbpedia:Stellarator.
I will take this one step further and say that, for pedagogical and
other coding situations, the extra length of prefix declarations is an
additional cognitive load on top of all the other cognitive loads of
dealing with the system, so in the name of concision you can do
something like
@base <http://dbpedia.org/resource/>
@prefix : <http://dbpedia.org/ontology/>
and then you can write :someProperty and <Stellarator>, and your
queries end up looking very simple.
The production for a QName cannot begin with a number so it is not correct
to write something like
dbpedia:100
or expect to have the full URI squashed to that. This kind of gotcha will
drive newbies nuts, and the realization of RDF as a universal solvent
requires squashing many of them.
Another example is
isbn:9971-5-0210-0
If you look at the @base declaration above, you see a way to get around
this, because with the base above you can write
<100> which works just fine in the dbpedia case.
I like what Wikidata did in using fairly dense sequential integers for
the IDs, so that a Wikidata resource URI looks like
http://www.wikidata.org/entity/Q4876286
whose local name is always a valid QName, so you can write
@base <http://www.wikidata.org/entity/>
@prefix wd: <http://www.wikidata.org/entity/>
and then you can write
wd:Q4876286
<Q4876286>
and it is all fine, because (i) Wikidata added the alpha prefix, (ii)
used it from the beginning, and (iii) made up a plausible explanation
for why it is that way. Freebase MIDs have the same property, so :BaseKB
has it too.
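To make that concrete, here is a small sketch (Python with rdflib,
purely for illustration; the ex:sameSpelling predicate is made up)
showing that the prefixed and the relative spellings resolve to the
same IRI under those declarations:

from rdflib import Graph, URIRef

ttl = """
@base <http://www.wikidata.org/entity/> .
@prefix wd: <http://www.wikidata.org/entity/> .
@prefix ex: <http://example.org/> .

wd:Q4876286 ex:sameSpelling <Q4876286> .
"""

g = Graph()
g.parse(data=ttl, format="turtle")

for s, p, o in g:
    # Both spellings end up as the same full IRI.
    assert s == o == URIRef("http://www.wikidata.org/entity/Q4876286")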
I think customers would expect to be able to give us
isbn:0884049582
and have it just work, but because a QName cannot begin with a number,
you can encode the URI like this:
http://isbn.example.com/I0884049582
and then write
isbn:I0884049582
<I0884049582>
which is not too bad. Note, however, that you cannot simply write
<0884049582>
and have it resolve to
http://isbn.example.com/I0884049582
because, at least with the Jena framework, the relative reference
resolves the same way whether you write
@base <http://isbn.example.com/I>
or
@base <http://isbn.example.com/>
so you can't choose a representation which supports that mode of
expression and a :-prefixed mode at the same time.
Now what bugs me is what to do in the case of something which "might or
might not be numeric". What internal prefix would be most acceptable to
end users?
--
Paul Houle
*Applying Schemas for Natural Language Processing, Distributed Systems,
Classification and Text Mining and Data Lakes*
(607) 539 6254 paul.houle on Skype ontology2(a)gmail.com
:BaseKB -- Query Freebase Data With SPARQL
http://basekb.com/gold/
Legal Entity Identifier Lookup
https://legalentityidentifier.info/lei/lookup/
Join our Data Lakes group on LinkedIn
https://www.linkedin.com/grp/home?gid=8267275
Hi everyone :)
We've finally done all the groundwork for unit support. I'd love for
you to give the first version a try on the test system here:
http://wikidata.beta.wmflabs.org/wiki/Q23950
There are a few known issues still, but since this is one of the things
holding back Wikidata I made the call to release now and work on the
remaining issues afterwards. What I know is still missing:
* We're showing the label of the unit's item. In the future we should
be showing the unit's symbol instead.
(https://phabricator.wikimedia.org/T77983)
* We can't convert between units yet - we only have the groundwork for
it so far. (https://phabricator.wikimedia.org/T77978)
* The items representing often-used units should be ranked higher in
the selector. (https://phabricator.wikimedia.org/T110673)
* When editing an existing value you see the URL of the unit's item.
This should be replaced by the label.
(https://phabricator.wikimedia.org/T110675)
* When viewing a diff of a unit change you see the URL of the unit's
item. This should be replaced by the label.
(https://phabricator.wikimedia.org/T108808)
* We need to think some more about the automatic edit summaries for
unit-related changes. (https://phabricator.wikimedia.org/T108807)
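For those who want to poke at it programmatically: as far as I can tell
from the data model, a quantity value with a unit looks roughly like
the sketch below in the JSON. The numbers are made up for illustration,
and the example assumes Q11573 is the item for the metre; the unit is
given as a full entity URI, or "1" for unitless quantities.

# Sketch of a quantity datavalue with a unit (illustrative values only).
quantity_value = {
    "amount": "+8848",
    "unit": "http://www.wikidata.org/entity/Q11573",  # assumed: metre
    "upperBound": "+8849",
    "lowerBound": "+8847",
}

# A plain number without a unit uses "1" as its unit:
plain_number = {"amount": "+42", "unit": "1"}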
If you find any bugs or if you are missing other absolutely critical
things please let me know here or file a ticket on
phabricator.wikimedia.org. If everything goes well we can get this on
Wikidata next Wednesday.
Cheers
Lydia
--
Lydia Pintscher - http://about.me/lydia.pintscher
Product Manager for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.