Wikidata June 2015

wikidata@lists.wikimedia.org

69 participants
44 discussions

by Thad Guidry

In Freebase, we had bot scripts that went through and removed "Lists of Things" topic entities since they are lists of entities and not useful clumped together and normalized in a graph database. Does Wikidata have something similar or a user review process for deletion of these ? Ex. List of tallest buildings in Wuhan - https://www.wikidata.org/wiki/Q6642364 Thad +ThadGuidry <https://www.google.com/+ThadGuidry>

8 years, 10 months

University automatically being College alias ?

by Thad Guidry

There are times when a few keywords are often interchanged with each other in different languages and dialects. It seems advantageous to somehow tell Wikidata Search that when someone types Harvard College to interchange and also look for Harvard University, and vice versa. An interchange mapping table might suffice just for this use case, or something else, dunno... How far in the future is this feature ? Roadblocks ? Thad +ThadGuidry <https://www.google.com/+ThadGuidry>

8 years, 10 months

weekly summary #162

by Lydia Pintscher

Hey folks :) Here's what's been happening around Wikidata over the past week: Discussions - Should we have a Wikidata User Group <https://meta.wikimedia.org/wiki/Wikidata_Community_User_Group>? Events <https://www.wikidata.org/wiki/Wikidata:Events>/Press/Blogs <https://www.wikidata.org/wiki/Wikidata:Press_coverage> - Past: Thanks to everyone who participated in the Wikidata:Menu Challenge <https://www.wikidata.org/wiki/Wikidata:Menu_Challenge> which made Wikidata shine at *A Taste of Stockholm*. The live results are still up at https://wikimedia-sverige.github.io/tastydata . - Upcoming: Donostia-San Sebastián Wikidata Editathon <https://sites.google.com/site/donostiasansebastianenwikidata/> - Upcoming: Ateliers Wikidata in Paris <https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Ateliers_Wikidata> - News about Wikidata FIST <http://magnusmanske.de/wordpress/?p=319> - Review of the big interwiki link migration <http://addshore.com/2015/06/review-of-the-big-interwiki-link-migration/> - Timeline of Christopher Lee films <http://histropedia.com/blog/timeline-of-christopher-lee-films/> - Wikidata map - 19 months on <http://addshore.com/2015/06/wikidata-map-19-months-on/> - A challenge was raised <http://ultimategerardm.blogspot.nl/2015/06/wikidata-six-degrees-of-separati…> to calculate the degrees of separation to Kevin Bacon. Already one routine has been produced, the next step is to use live data so that we can work on reducing the number of intermediary steps. Other Noteworthy Stuff - ViziData has been improved significantly <https://lists.wikimedia.org/pipermail/wikidata/2015-June/006365.html> - Wikidata Visualization Challenge winners announced <https://se.wikimedia.org/wiki/Projekt:%C3%96ppna_data_2015/Visualisering/T%…>: first prize to ViziData. - New paper: Peer-production system or collaborative ontology development effort: what is Wikidata? <http://eprints.soton.ac.uk/377397/> - Jura1 posted some data around property usage on items about people <https://www.wikidata.org/wiki/User:Jura1/People_charts> - Tpt is interning at Google now to free up more content from Freebase for us. He's adding more data to the Primary Sources Tool <https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool>. Did you know? - Newest properties: street key <https://www.wikidata.org/wiki/Property:P1945>, relief location map <https://www.wikidata.org/wiki/Property:P1944>, location map <https://www.wikidata.org/wiki/Property:P1943>, McCune-Reischauer romanization <https://www.wikidata.org/wiki/Property:P1942>, conifers.org ID <https://www.wikidata.org/wiki/Property:P1940>, Dyntaxa ID <https://www.wikidata.org/wiki/Property:P1939>, Project Gutenberg ID <https://www.wikidata.org/wiki/Property:P1938>, UN/LOCODE <https://www.wikidata.org/wiki/Property:P1937>, Digital Atlas of the Roman Empire ID <https://www.wikidata.org/wiki/Property:P1936>, DBCS ID <https://www.wikidata.org/wiki/Property:P1935>, Animator.ru ID <https://www.wikidata.org/wiki/Property:P1934>, MobyGames ID <https://www.wikidata.org/wiki/Property:P1933>, stated as <https://www.wikidata.org/wiki/Property:P1932>, PGCH ID <https://www.wikidata.org/wiki/Property:P1931> Development - Deployed arbitrary access on all Wikivoyage and Wikiquote projects and announced the next ones. See d:Wikidata:Arbitrary access <https://www.wikidata.org/wiki/Wikidata:Arbitrary_access> for more. - More fine tuning on entity usage tracking (relevant for arbitrary access) - Fixed bug that sometimes allowed multiple properties to have the same label in a given language (phabricator:T102148 <https://phabricator.wikimedia.org/T102148>) - More work on automatically creating redirects when merging items - More code review of the Wikidata Quality extensions (improved constraint reports and checks against 3rd party databases). Starting to look good for a first deployment. - The sitelinks heading hierarchy changed and includes a “Site links” heading now that is only shown on mobile. This is a DOM change needed to make Wikidata work better on mobile. - Started working on having PHPCS coverage for the major Wikibase.git code base to find small code issues more easily - Prepared Wikibase.git for the DataModel 3.0 switch - Released Wikibase DataModel Serialization 1.4 - Released Wikibase Internal Serialization 1.4 - New releases of several DataValue components, including DataValues Number 0.5, DataValues JavaScript 0.7 and ValueView 0.14.5 You can see all open bugs related to Wikidata here <https://phabricator.wikimedia.org/maniphest/query/4RotIcw5oINo/#R>. Monthly Tasks - Hack on one of these <https://phabricator.wikimedia.org/maniphest/query/R8GRzX1eH0tb/#R>. - Help fix these items <https://www.wikidata.org/wiki/Wikidata:The_Game/Flagged_items> which have been flagged using Wikidata - The Game. - Help develop the next summary here! <https://www.wikidata.org/wiki/Wikidata:Status_updates/Next> - Contribute to a Showcase item <https://www.wikidata.org/wiki/Wikidata:Showcase_items> - Help translate <https://www.wikidata.org/wiki/Special:LanguageStats> or proofread pages in your own language! - Add labels, in your own language, for the new properties listed above. Anything to add? Please share! :) Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

8 years, 11 months

Wikidata data dump: JSON versus XML

by gnosygnu

Hi. Currently, the dump service offers two different dumps for wikidata: * XML: http://dumps.wikimedia.org/wikidatawiki/latest/ * JSON: http://dumps.wikimedia.org/wikidatawiki/entities/ According to http://www.wikidata.org/wiki/Wikidata:Database_download, the JSON dump is listed as the recommended dump format. Also, at the time of writing, the JSON dump has been generating regularly every week whereas the XML dump has been delayed for 2+ months. Going forward, will both dumps continue to be supported? Or will the XML dump be phased out and only the JSON dump remain? Or are these plans still to be determined based on upcoming changes to the dumping infrastructure as per https://phabricator.wikimedia.org/T88728? If the JSON dump is to be the sole data format, is there any way to address the following omissions? * '''Non-JSON pages not available''': The JSON dump only provides JSON content-type pages in the main and property namespaces. Pages in other namespaces are not available, including the Main Page. For example, here are the counts from the 2015-03-30 dump id name count ---- ------------ ----- 4 Wikidata 10280 8 MediaWiki 2244 10 Template 4701 12 Help 779 14 Category 3073 828 Module 175 1198 Translations 83524 * '''Page metadata not available''' : For the JSON pages, the page_touched and page_id is not available. * '''Other tables not provided''': Other tables are not provided, notably categorylinks and page_props Thanks in advance for any information.

8 years, 11 months

accessing data from a wikidata concept page

by Benjamin Good

I recently introduced wikidata to a (very computationally savvy) colleague by sending him this link: https://www.wikidata.org/wiki/Q423111 His response is indicative of an interface problem that I think is actually very important: "Is there a simple way to get the RDF for a given concept? The page seems to only present the english names for the concept and its linked concepts." Leaving aside RDF, it is really not straightforward for newcomers to get from a concept page like that to the corresponding structured data. This could be solved with the consistent addition of a simple link like "view json/xml/rdf" to each of the concept pages on wikidata. They would just be links to the API calls: e.g. http://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q423111 in this case. As the concept pages themselves get tossed around a lot, such an addition could be extremely valuable in teaching the uninitiated what its all about and would come at very little cost - to me, this button is akin to the 'view source' action on web pages - an absolutely fundamental part of how the web grows - even now. -Ben

8 years, 11 months

Data amplitude and representation

by David Cuenca Tudela

Hi, I am investigating some concepts about signal processing and relating them to data manipulation. It is somehow difficult because the way computer scientists relate to concepts is very dogmatic, something is either black or white, however I have not found much on "things that under certain circumstances can be considered black-ish, and under another set of circumstances can be considered white-ish" http://freethoughtblogs.com/singham/2012/06/25/shades-of-grey-optical-illus… In signal processing there is the concept of amplitude which is just the signal strength. For humans language is like an amplitude communication process where the receiver picks up not only the signal, but also its amplitude depending on context, awareness, previous knowledge, etc. factors which in turn can be considered waves being processed by the ontological biological-organizational complex, the body-mind. It is tough to describe that a certain concept might have a certain amplitude in some situations and other amplitude in other situations, and perhaps even harder to make a human interface for it. Has anyone attempted it in the past? If Q items are not static entities, what is the best way to convey that they are not? And is it possible or desirable at all? Perhaps these questions are more suitable for a Wikidata 2.0, or perhaps it is already doable, who knows. Any thoughts? Cheers, Micru

8 years, 11 months

Re: [Wikidata] Historic buildings/National trust

by Mike Cummins

I have had an interesting correspondence with Historic England and have had a couple of spreadsheets detailing different information. I have linked these into a Fusion table: https://www.google.com/fusiontables/DataSource?docid=14xbBg7o-h-1IY2RwKji9n… Is any of this of use to Wikidata? I can extract, massage, whatever to most formats... Mike

8 years, 11 months

Way to check how many claims there are per item-sitelink from WP article category or list

by Jane Darnell

Hi all, After looking at the list of items without any claims, I was wondering if I could help with the cleanup by checking the categories I am familiar with. Is there any way to get a breakdown of #claims per item given a list of items from, say, a Wikipedia category? Thanks in advance, Jane

8 years, 11 months

No links, wrong data: Scottland's orphans need help

by Markus Krötzsch

Hi all, Looking at more "orphaned items", I found several pairs of items that look like these two: https://www.wikidata.org/wiki/Q17574663 https://www.wikidata.org/wiki/Q17569687 Same label and description, same coordinates, no Wikidata articles, "identified" by different Historic Scotland IDs. If you follow the ID links, however, you can see that the first of the items has data that does not match the ID, while the second is correct. The direct question is: How to fix these errors? There are other cases, such as Q17572335 and Q17570206. I did not do a systematic study, but something seems to have gone wrong here in more than one case. I cannot fix mass edits one by one without having a clue what has happened and why. The indirect question is: How can I find out who did this and maybe ask the person to fix it? The history is of no help (Reinheitsgebot/Widar). Posting every error in Wikidata to this list to ask also seems like a bad idea. Finally, the technical question is: Why is this even possible? I thought that, in each language, label+description are a key (globally unique), yet here we have many pairs of items with exactly the same label and description. Or is the problem that no description was entered and so the system does not apply the key? In any case, a data integration helper application that looks at equal labels+descriptions would probably make sense, especially for orphaned items. (As I know Wikidata, someone might well reply to this email with a link to where this is already found ;-). Regards Markus

8 years, 11 months

weekly summary #161

by Lydia Pintscher

Hey folks :) Here's what has been happening around Wikidata over the past week for you: Discussions - RfC on usage of Wikidata on German-language Wikipedia <https://de.wikipedia.org/wiki/Wikipedia:Meinungsbilder/Nutzung_von_Daten_au…> Events <https://www.wikidata.org/wiki/Wikidata:Events>/Press/Blogs <https://www.wikidata.org/wiki/Wikidata:Press_coverage> - Past: Pigsonthewing and aude were at State of the Map US to talk about Wikidata/OpenStreetMap cooperation and more - Past: Lucie and Marius gave an intro to Wikidata at Gulaschprogrammiernacht in Karlsruhe (slides <https://www.wikidata.org/wiki/File:Wikidata_Workshop_-_GPN_2015.pdf>) - Upcoming: Office hour on IRC on 19th <https://lists.wikimedia.org/pipermail/wikidata/2015-June/006376.html> Other Noteworthy Stuff - The English-language Wikipedia has decided to deprecate Persondata <https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28proposals%29/Archiv…> in favour of Wikidata. - New tool by Magnus to make it easier to add references <http://magnusmanske.de/wordpress/?p=310> - The results are in for the Wikimedia Foundation's board election <https://blog.wikimedia.org/2015/06/05/board-election-results/>. One name should sound familiar to you ;-) - Addshore created new maps of the geocoordinates on Wikidata <https://twitter.com/wikidata/status/606086887820955649>. More coming. - Sylvain made an overview of the overlap in topics that the biggest Wikipedias have <https://gist.github.com/Ash-Crow/ae1a3ae8318994afe45d>. - Wikidata has passed German Wikipedia in number of items with images (743850 dewp articles vs. 804885 items) - now only second to English-language Wikipedia. - Catalan Wikipedia is moving Taxon IDs to Wikidata. - Zolo did a comparison <https://www.wikidata.org/wiki/User:Zolo/Wikipedia_content> of English/French/German/Chinese and Cebuano Wikipedias in terms of main types of articles. Biographies were compared by period, nationality and occupation. - Connectivity <https://www.wikidata.org/wiki/User:Pasleim/Connectivity> statistics were updated. Did you know? - Newest properties: DMS V <https://www.wikidata.org/wiki/Property:P1930>, Clinvar Accession Number <https://www.wikidata.org/wiki/Property:P1929>, Vaccine Ontology ID <https://www.wikidata.org/wiki/Property:P1928>, VIOLIN ID <https://www.wikidata.org/wiki/Property:P1925>, vaccine for <https://www.wikidata.org/wiki/Property:P1924>, participating team <https://www.wikidata.org/wiki/Property:P1923>, first line <https://www.wikidata.org/wiki/Property:P1922>, URI pattern for RDF resource <https://www.wikidata.org/wiki/Property:P1921>, Commonwealth War Graves Commission burial ground identifier <https://www.wikidata.org/wiki/Property:P1920>, Ministry of Education of Chile school ID <https://www.wikidata.org/wiki/Property:P1919> - Newest gadgets: Mark as patrolled <https://www.wikidata.org/wiki/MediaWiki:Gadget-Mark_as_patrolled> by User:Petr Matas <https://www.wikidata.org/wiki/User:Petr_Matas> Development - Italian Wikipedia and all remaining Wikisource projects now have arbitrary access. The rollout will continue. The schedule for the next projects is at d:Wikidata:Arbitrary access <https://www.wikidata.org/wiki/Wikidata:Arbitrary_access>. - The Content Translation tool now automatically connects translated articles to Wikidata. (Thanks Content Translation developers!) Previously translated but unconnected articles have been connected by a bot. (Thanks Amir!) - Code review of the extensions written by a team of students to improve the constraint reports and make it possible to automatically check our data against other databases. A first version will go live soon pending further codereview and fixes. - Two new methods have been added to the lua library provided by Wikibase: wikibase.resolvePropertyId <https://www.mediawiki.org/wiki/Extension:Wikibase_Client/Lua#mw.wikibase.re…> and wikibase.entity:getBestStatements <https://www.mediawiki.org/wiki/Extension:Wikibase_Client/Lua#mw.wikibase.en…> . - Further work on making unserializable values editable in the UI (This can happen for example if a property is deleted.) - Made the Wikidata JSON dumps available on Labs in the standard dumps location there (/public/dumps). - Worked on automatically creating a redirect when merging items via the API. - Rewrote the script that generates a map image based on geocoordinates in Wikidata. (Result see above.) - Released Wikibase DataModel 3.0 You can see all open bugs related to Wikidata here <https://phabricator.wikimedia.org/maniphest/query/4RotIcw5oINo/#R>. Monthly Tasks - Hack on one of these <https://phabricator.wikimedia.org/maniphest/query/R8GRzX1eH0tb/#R>. - Help fix these items <https://www.wikidata.org/wiki/Wikidata:The_Game/Flagged_items> which have been flagged using Wikidata - The Game. - Help develop the next summary here! <https://www.wikidata.org/wiki/Wikidata:Status_updates/Next> - Contribute to a Showcase item <https://www.wikidata.org/wiki/Wikidata:Showcase_items> - Help translate <https://www.wikidata.org/wiki/Special:LanguageStats> or proofread pages in your own language! - Add labels, in your own language, for the new properties listed above. Anything to add? Please share! :) Cheers Lydia -- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

8 years, 11 months

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

Wikidata June 2015