Wikidata January 2018

wikidata@lists.wikimedia.org

37 participants
27 discussions

Wiki PageID
by Gintautas Sulskus 05 Oct '18


Hi,

I have a couple of questions regarding the Wiki Page ID. Does it always stay unique for the page, where the page itself is just a placeholder for any kind of information that might change over time?

Consider the following cases:

1. The first time someone creates page "Moon" it is assigned ID=1. If at some point the page is renamed to "The_Moon", the ID=1 remains intact. Is this correct?
2. What if we have page "Moon" with ID=1, and someone creates a second page "The_Moon" with ID=2? Is it possible that page "Moon" is transformed into a redirect? Then "Moon" would be redirecting to page "The_Moon"?
3. Is it possible for page "Moon" to become a category "Category:Moon" with the same ID=1?

Thanks,
Gintas
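One way to check the three cases empirically is to ask the MediaWiki API for a page's ID, namespace, and redirect flag before and after a rename. The following is only a minimal sketch: the endpoint (English Wikipedia), the helper names `page_info_url` and `summarize`, and the sample response are assumptions for illustration, and no network call is made.

```python
from urllib.parse import urlencode

# Assumed endpoint; any MediaWiki installation exposes the same api.php module.
API = "https://en.wikipedia.org/w/api.php"


def page_info_url(title: str, follow_redirects: bool = False) -> str:
    """Build an action=query URL that returns page ID, namespace and redirect flag."""
    params = {"action": "query", "titles": title, "prop": "info", "format": "json"}
    if follow_redirects:
        params["redirects"] = "1"
    return f"{API}?{urlencode(params)}"


def summarize(response: dict) -> list:
    """Reduce a decoded JSON response to (pageid, namespace, title, is_redirect) tuples."""
    pages = response["query"]["pages"]
    return [(p["pageid"], p["ns"], p["title"], "redirect" in p) for p in pages.values()]


# Hypothetical response for a page "Moon" that has been turned into a redirect.
sample = {"query": {"pages": {"1": {"pageid": 1, "ns": 0, "title": "Moon", "redirect": ""}}}}
print(page_info_url("Moon"))
print(summarize(sample))
```

Comparing the tuples for "Moon" and "The_Moon" before and after a move would show whether the ID follows the content or the title, and the namespace field would show whether a page has ended up in the category namespace.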

7 11

Wikidata HDT dump
by Laura Morales 02 Oct '18


Hello everyone,

I'd like to ask if Wikidata could please offer an HDT [1] dump along with the already available Turtle dump [2]. HDT is a binary format for storing RDF data. It is pretty useful because it can be queried from the command line, it can be used as a Jena/Fuseki source, and it uses orders of magnitude less space to store the same data.

The problem is that it's very impractical to generate an HDT file, because the current implementation requires a lot of RAM to convert a file. For Wikidata it will probably require a machine with 100-200GB of RAM. This is unfeasible for me because I don't have such a machine, but if you guys have one to share, I can help set up the rdf2hdt software required to convert Wikidata Turtle to HDT.

Thank you.

[1] http://www.rdfhdt.org/
[2] https://dumps.wikimedia.org/wikidatawiki/entities/
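To give a rough feel for where the space saving mentioned above can come from, here is a toy dictionary encoding of triples in Python. This is an illustration only and not the HDT file format or the rdf2hdt implementation: the function name `dictionary_encode` and the three sample triples are assumptions chosen for the example.

```python
def dictionary_encode(triples):
    """Map every distinct term to a small integer and rewrite triples as ID tuples."""
    term_ids = {}
    encoded = []
    for triple in triples:
        # setdefault assigns the next free ID the first time a term is seen.
        row = tuple(term_ids.setdefault(term, len(term_ids) + 1) for term in triple)
        encoded.append(row)
    return term_ids, encoded


# Illustrative triples using Wikidata-style prefixed names.
triples = [
    ("wd:Q42", "wdt:P31", "wd:Q5"),
    ("wd:Q42", "wdt:P106", "wd:Q36180"),
    ("wd:Q1339", "wdt:P31", "wd:Q5"),
]

term_ids, encoded = dictionary_encode(triples)
print(len(term_ids), "distinct terms")
print(encoded)
```

Each long term is stored once and every triple shrinks to three integers. Note that a naive in-memory dictionary like this one grows with the number of distinct terms in the input, so a sketch of this kind would not scale to a full dump without further work.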

15 81

GraFa: Faceted browser for RDF/Wikidata [feedback requested]
by Aidan Hogan 30 Mar '18


Hey all,

A Master's student of mine (José Moreno in CC) has been working on a faceted navigation system for (large-scale) RDF datasets called "GraFa". The system is available here, loaded with a recent version of Wikidata:

http://grafa.dcc.uchile.cl/

Hopefully it is more or less self-explanatory for the moment. :)

If you have a moment to spare, we would hugely appreciate it if you could interact with the system for a few minutes and then answer a quick questionnaire that should only take a couple more minutes:

https://goo.gl/forms/h07qzn0aNGsRB6ny1

Just for the moment while the questionnaire is open, we would kindly ask you to send feedback to us personally (off-list) so as not to affect others' responses. We will leave the questionnaire open for a week until January 16th, 17:00 GMT. After that time of course we would be happy to discuss anything you might be interested in on the list. :)

After completing the questionnaire, please also feel free to visit or list something you noticed on the Issue Tracker:

https://github.com/joseignm/GraFa/issues

Many thanks,
Aidan and José

4 9

GlobalFactSync
by Magnus Knuth 16 Feb '18


Dear all,

last year, we applied for a Wikimedia grant to feed qualified data from Wikipedia infoboxes (i.e. missing statements with references) via the DBpedia software into Wikidata. The evaluation was already quite good, but some parts were still missing and we would like to ask for your help and feedback for the next round. The new application is here: https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSync

The main purpose of the grant is:

- Wikipedia infoboxes are quite rich, are manually curated and have references. DBpedia is already extracting that data quite well (i.e. there is no other software that does it better). However, extracting references is not a priority on our agenda. They would be very useful to Wikidata, but there are no user requests for this from DBpedia users.
- DBpedia also has all the infos of all infoboxes of all Wikipedia editions (>10k pages), so we also know quite well where Wikidata is used already and where information is available in Wikidata or one language version and missing in another.
- side-goal: bring the Wikidata, Wikipedia and DBpedia communities closer together

Here is a diff between the old and new proposal:

- extraction of infobox references will still be a goal of the reworked proposal
- we have been working on the fusion and data comparison engine (the part of the budget that came from us) for a while now and there are first results:

    6823 birthDate_gain_wiki.nt
    3549 deathDate_gain_wiki.nt
  362541 populationTotal_gain_wiki.nt
  372913 total

  We only took three properties for now and showed the gain where no Wikidata statement was available. birthDate/deathDate is already quite good. Details here: https://drive.google.com/file/d/1j5GojhzFJxLYTXerLJYz3Ih-K6UtpnG_/view?usp=…

  Our plan here is to map all Wikidata properties to the DBpedia Ontology and then have the info to compare coverage of Wikidata with all infoboxes across languages.
- we will remove the text extraction part from the old proposal (which is here for your reference: https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/CrossWikiFact). This will still be a focus during our work in 2018, together with Diffbot and the new DBpedia NLP department, but we think that it distracted from the core of the proposal. Results from the Wikipedia article text extraction can be added later once they are available and discussed separately.
- We proposed to make an extra website that helps to synchronize all Wikipedias and Wikidata with DBpedia as its backend. While the external website is not an ideal solution, we are lacking alternatives. The Primary Sources Tool is mainly for importing data into Wikidata, not so much synchronization. The MediaWiki instances of the Wikipedias do not seem to have any good interfaces to provide suggestions and pinpoint missing info. Especially to this part, we would like to ask for your help and suggestions, either per mail to the list or on the talk page: https://meta.wikimedia.org/wiki/Grants_talk:Project/DBpedia/GlobalFactSync

We are looking forward to a fruitful collaboration with you and we thank you for your feedback!

All the best
Magnus

--
Magnus Knuth
Universität Leipzig
Institut für Informatik
Abt. Betriebliche Informationssysteme, AKSW/KILT
Augustusplatz 10
04109 Leipzig DE
mail: knuth(a)informatik.uni-leipzig.de
tel: +49 177 3277537
webID: http://magnus.13mm.de/
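The "gain" described in the message above (values found in infoboxes for which no Wikidata statement was available) can be pictured as a simple filter. The sketch below is a hedged illustration and not the DBpedia fusion engine: the function name `gain`, the `ex:` identifiers and the sample values are hypothetical. Only the three per-property counts are taken from the message, and the sum confirms the reported total.

```python
def gain(dbpedia_values: dict, wikidata_subjects: set) -> dict:
    """Keep values extracted from infoboxes whose subject has no Wikidata statement yet."""
    return {s: v for s, v in dbpedia_values.items() if s not in wikidata_subjects}


# Hypothetical data: two subjects with an extracted birth date, one already covered.
extracted = {"ex:PersonA": "1900-01-01", "ex:PersonB": "1950-05-05"}
already_in_wikidata = {"ex:PersonB"}
print(gain(extracted, already_in_wikidata))

# Per-property counts as reported in the message; the sum matches the stated total.
reported = {"birthDate": 6823, "deathDate": 3549, "populationTotal": 362541}
total = sum(reported.values())
print(total)
```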

2 3

Installations of Wikibase outside Wikimedia projects
by Jens Ohlig 04 Feb '18


Hello,

I'm interested in installations of Wikibase (the software behind Wikidata) that are used outside Wikimedia projects. While we have already collected some, we are interested in getting to know more of them.

Do you run an installation? Do you know of installations of Wikibase "in the wild"? Comments and links are much appreciated!

Cheers,
Jens

--
Jens Ohlig
Software Communication Strategist
Wikimedia Deutschland e.V. | Tempelhofer Ufer 23-24 | 10963 Berlin
Tel. (030) 219 158 26-0
http://wikimedia.de

Imagine a world in which every person can freely share in the sum of all knowledge. Help us make it happen! http://spenden.wikimedia.de/

Wikimedia Deutschland - Society for the Promotion of Free Knowledge e.V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 B. Recognized as a non-profit by the Finanzamt für Körperschaften I Berlin, tax number 27/681/51985.

4 4

Who's on First
by David Barratt 01 Feb '18


Once upon a time there was an awesome maps company called Mapzen <https://mapzen.com/>. All of their software was open source and all their data sets were licensed under a free license. Sadly they are shutting down <https://mapzen.com/blog/shutdown/>.

One of their projects was Who's on First <https://www.whosonfirst.org/>, which is a gazetteer <https://en.wikipedia.org/wiki/Gazetteer> of places. The data is a huge collection <https://github.com/whosonfirst-data/whosonfirst-data> of GeoJSON files hosted on GitHub. The place files include a concordance of IDs in other systems... including Wikidata. :)

Mapzen used the data in Wikidata in their Places API <https://mapzen.com/documentation/places/> (source code <https://github.com/whosonfirst/whosonfirst-www>) to translate place names.

I'm not sure if we would have any interest in using/importing this data in Wikidata, but I thought I'd make more people aware of an awesome free license data set that needs some help now that Mapzen is going away.

Thanks for everything you do!
David Barratt

3 3

Is there a Q item in Wikidata for every symbol in all 8,475 languages total?
by Info WorldUniversity 31 Jan '18


Is there a Q item in Wikidata for every symbol in all 8,475 languages (per Glottolog - http://glottolog.org/glottolog/language) total? For example, here's the letter "A" in Wikidata: https://www.wikidata.org/wiki/Q9659 .

Could such be used for planning for Wikimedia developing into all of the 7,000 languages, by 2030, that Katherine Maher mentioned last August 2017 at Wikimania? Or for Wiktionary?

How many symbols are there in all 8,475 languages (per Glottolog - http://glottolog.org/glottolog/language …) total? https://wiki.worlduniversityandschool.org/wiki/Languages

"Unicode 7 has 113,021 assigned code points (excluding reserved and unassigned ones)" https://www.quora.com/How-many-characters-are-there-in-all-the-worlds-langu… …

Re Gutenberg Press, how to combine these anew? https://twitter.com/WorldUnivAndSch/status/957695447497244672

Cheers,
Scott

Languages-World Univ: https://twitter.com/sgkmacleod/status/957695289644609536

--
- Scott MacLeod - Founder & President
- 415 480 4577
- http://scottmacleod.com
- World University and School
- http://worlduniversityandschool.org
- CC World University and School - like CC Wikipedia with best STEM-centric CC OpenCourseWare - incorporated as a nonprofit university and school in California, and is a U.S. 501 (c) (3) tax-exempt educational organization.

IMPORTANT NOTICE: This transmission and any attachments are intended only for the use of the individual or entity to which they are addressed and may contain information that is privileged, confidential, or exempt from disclosure under applicable federal or state laws. If the reader of this transmission is not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. If you have received this transmission in error, please notify me immediately by email or telephone.

World University and School is sending you this because of your interest in free, online, higher education. If you don't want to receive these, please reply with 'unsubscribe' in the body of the email, leaving the subject line intact. Thank you.

2 3

Next IRC office hour on January 30th
by Léa Lacroix 30 Jan '18


Hello all,

The Wikidata team will organize an IRC office hour, *January 30th, at 17:00 UTC <https://www.timeanddate.com/worldclock/converter.html?iso=20180130T170000&p…>* (18:00 Berlin time). It will take place in the Wikimedia office IRC channel #wikimedia-office (connect <http://webchat.freenode.net/?channels=#wikimedia-office>).

As usual, we will present some news from the development team and the projects to come, and collect your feedback.

This year, we would like to try something different, and have a topic to focus on during the office hour. If you have any topic you'd like to bring for the first meeting, please share it here!

Thanks,

--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Society for the Promotion of Free Knowledge e. V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 Nz. Recognized as a non-profit by the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.

1 2

Wikidata vandalism dashboard (for Wikipedians)
by Amir Ladsgroup 29 Jan '18


Hello,

People usually ask me how they can patrol edits that affect their Wikipedia or their language. The proper way to do so is by using the watchlist and recent changes (with the "Show Wikidata edits" option enabled) in Wikipedias, but sometimes they show too many unrelated changes.

Also, it would be good to patrol edits for languages you know, because the descriptions are shown and editable in the Wikipedia app, making them vulnerable to vandalism (a lot of vandalism in this area goes unnoticed for a while and sometimes gets fixed by another reader, which is suboptimal).

So Lucas [1] and I had a pet project to allow you to see unpatrolled edits related to a language in Wikidata. It has some basic integration with ORES, and if you see a good edit and mark it as patrolled, it goes away from this list. What I usually do is check this page twice a day for the Persian language, which is enough given its size.

It's at https://tools.wmflabs.org/wdvd/index.php. The source code is at https://github.com/Ladsgroup/Vandalism-dashboard and you can report issues, bugs and feature requests at https://github.com/Ladsgroup/Vandalism-dashboard/issues

Please spread the word. Any feedback about this tool is very welcome :)

[1]: https://www.wikidata.org/wiki/User:Lucas_Werkmeister_(WMDE)

Best

5 5

Weekly Summary #297
by Léa Lacroix 29 Jan '18


*Here's your quick overview of what has been happening around Wikidata over the last week.*

Discussions

- Closed request for comments: Defining account creators <https://www.wikidata.org/wiki/Wikidata:Requests_for_comment/Defining_accoun…>

Events <https://www.wikidata.org/wiki/Special:MyLanguage/Wikidata:Events>/ Press/Blogs <https://www.wikidata.org/wiki/Special:MyLanguage/Wikidata:Press_coverage>

- Upcoming: IRC office hour on the channel #wikimedia-office, January 30th, at 18:00 (UTC+1). Special topic: how to address the growth of Wikidata
- Upcoming: Wikidata hackathon in London <https://www.eventbrite.co.uk/e/wikidata-hackathon-in-london-all-levels-of-e…>, February 3rd
- Past: PIDapalooza 2018 <https://pidapalooza.org/> - Slides on FigShare <https://pidapalooza.figshare.com/>
- Mapping Wikidata to Bibframe <https://medium.com/@thisismattmiller/mapping-wikidata-to-bibframe-f674a4eab…> (representation of books)

Other Noteworthy Stuff

- Breaking change: wbcheckconstraints status parameter <https://www.wikidata.org/wiki/Wikidata:Project_chat#BREAKING_CHANGE:_wbchec…>
- Wikidata-driven infoboxes <https://commons.wikimedia.org/wiki/Commons:Village_pump#Would_small_Wikidat…>, with multilingual labels, are now available on Wikimedia Commons category pages
- Wikidata vandalism dashboard for Wikipedians <https://lists.wikimedia.org/pipermail/wikidata/2018-January/011741.html>
- Grant proposals looking for review: WikidataJS <https://meta.wikimedia.org/wiki/Grants:Project/WikidataJS>, GlobalFactSync <https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/GlobalFactSync>

Did you know?

- Newest properties <https://www.wikidata.org/wiki/Special:ListProperties>:
  - General datatypes: none
  - External identifiers: cinematografo film ID <https://www.wikidata.org/wiki/Property:P4786>, CiNii author ID (articles) <https://www.wikidata.org/wiki/Property:P4787>
- Query examples:
  - Buildings higher than the Eiffel Tower <https://query.wikidata.org/#%23defaultView%3AMap%0ASELECT%20DISTINCT%20%3Fs…> (source <https://twitter.com/SciHiBlog/status/957677571595960321>)
  - Descendants of Charlemagne by year of birth <https://query.wikidata.org/#%23defaultView%3AScatterChart%0ASELECT%20%3Fyea…> (source <https://twitter.com/belett/status/957571805048397826>)
  - Cinemas in Berlin <https://query.wikidata.org/#%23defaultView%3AImageGrid%0ASELECT%20%3Fct%20%…> (source <https://twitter.com/AndreasP_RV/status/956909575109783555>)
  - Things (or people) depicted in the xkcd webcomic <https://query.wikidata.org/#SELECT%20%3Fobject%20%3FobjectLabel%20%28COUNT%…> (source <https://twitter.com/johl/status/956170983555002369>)
  - Image grid of inductees to WITI Hall of Fame <https://query.wikidata.org/#%23defaultView%3AImageGrid%0ASELECT%20%3Fawarde…> (source <https://twitter.com/WikiDigi/status/956157242855784448>)
  - Movies filmed in Austria <https://query.wikidata.org/#%23defaultView%3AImageGrid%0ASELECT%20%3Fmovie%…> (source <https://twitter.com/WikidataAT/status/955867103797575687>)
- Newest WikiProjects <https://www.wikidata.org/wiki/Special:MyLanguage/Wikidata:WikiProjects>:
- Newest gadgets: Mix'n'Match gadget <https://www.wikidata.org/wiki/User:Magnus_Manske/mixnmatch_gadget.js>: loads all Mix'n'match entries about an item, shows descriptions, lets you drag'n'drop entries as references
- Newest database reports: tennis federations, Davis Cup and Fed Cup teams <https://www.wikidata.org/wiki/Wikidata:WikiProject_Tennis/Lists/federations…>
- Showcase items <https://www.wikidata.org/wiki/Wikidata:Showcase_items>:

Development

- Constraint violations can now be checked on qualifiers and references (phab:T168532 <https://phabricator.wikimedia.org/T168532>)
- Implemented usage tracking deduplication to reduce database load (phab:T178079 <https://phabricator.wikimedia.org/T178079>). This should not have any effect on what users see on recent changes and watchlists.
- Redirects on client wikis that are connected to a Wikidata item can have a tracking category <https://en.wikipedia.org/wiki/MediaWiki:Connected-redirect-category>, if set up (phab:T185743 <https://phabricator.wikimedia.org/T185743>). Thanks, Matěj!
- Improved documentation <https://www.wikidata.org/w/api.php?action=help&modules=query%2Bpageterms> of the pageterms query module (gerrit:406240 <https://gerrit.wikimedia.org/r/406240>). Thanks, Niedzielski!
- Improved empty "content was:" in deletion logs for entities (phab:T184025 <https://phabricator.wikimedia.org/T184025>)
- Fixed links to external user pages in recent changes (phab:T183019 <https://phabricator.wikimedia.org/T183019>)
- Fixed user names beginning with a star sometimes being rendered as a list (phab:T182800 <https://phabricator.wikimedia.org/T182800>)
- Fixed constraint check results possibly showing up in the wrong language (phab:T185688 <https://phabricator.wikimedia.org/T185688>)
- Fixed ArticlePlaceholders possibly not showing up in search results (gerrit:406168 <https://gerrit.wikimedia.org/r/406168>)
- Some of the developers attended the Wikimedia Developer Summit 2018. You can find some notes on the Phabricator board <https://phabricator.wikimedia.org/project/view/3119>

You can see all open tickets related to Wikidata here <https://phabricator.wikimedia.org/maniphest/query/4RotIcw5oINo/#R>.

Monthly Tasks

- Add labels, in your own language(s), for the new properties listed above.
- Comment on property proposals: all open proposals <https://www.wikidata.org/wiki/Wikidata:Property_proposal/Overview>
- Suggested and open tasks <https://www.wikidata.org/wiki/Wikidata:Contribute/Suggested_and_open_tasks>!
- Contribute to a Showcase item <https://www.wikidata.org/wiki/Special:MyLanguage/Wikidata:Showcase_items>.
- Help translate <https://www.wikidata.org/wiki/Special:LanguageStats> or proofread the interface and documentation pages, in your own language!
- Help merge identical items <https://www.wikidata.org/wiki/User:Pasleim/projectmerge> across Wikimedia projects.
- Help write the next summary! <https://www.wikidata.org/wiki/Wikidata:Status_updates/Next>

--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de

Wikimedia Deutschland - Society for the Promotion of Free Knowledge e. V. Registered in the register of associations of the Amtsgericht Berlin-Charlottenburg under number 23855 Nz. Recognized as a non-profit by the Finanzamt für Körperschaften I Berlin, tax number 27/029/42207.
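The query example links in the summary above carry the SPARQL text percent-encoded after the `#` of the query service URL. As a small, hedged sketch, the Python below decodes such a fragment and builds one in the other direction. The helper names `decode_query_fragment` and `build_query_url` are made up for this illustration, and only the visible, untruncated prefix of the first link is used as input.

```python
from urllib.parse import quote, unquote

QUERY_SERVICE = "https://query.wikidata.org/"


def decode_query_fragment(url: str) -> str:
    """Return the percent-decoded text that follows the first '#' in a query link."""
    _, _, fragment = url.partition("#")
    return unquote(fragment)


def build_query_url(sparql: str) -> str:
    """Percent-encode a SPARQL string and append it as the URL fragment."""
    return QUERY_SERVICE + "#" + quote(sparql, safe="")


# Visible prefix of the "Buildings higher than the Eiffel Tower" link (the rest is elided).
prefix = "https://query.wikidata.org/#%23defaultView%3AMap%0ASELECT%20DISTINCT%20%3Fs"
decoded = decode_query_fragment(prefix)
print(decoded)
print(build_query_url(decoded) == prefix)
```

The decoded prefix starts with a `#defaultView:Map` comment line, which is how these links ask the query service for a map rendering of the results.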

1 0
