Hi,
I'm looking into ways to use tabular data like
https://commons.wikimedia.org/wiki/Data:Zika-institutions-test.tab
in SPARQL queries but could not find anything on that.
My motivation partly comes from the query timeout limits. The basic idea
would be to split queries that typically time out into sets of queries
that do not, such that aggregating their results would yield the results
the original query would have produced had it not timed out.
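One way such splitting could look in practice (a hypothetical sketch; the class and property used here are just placeholders) is to paginate a heavy query with LIMIT/OFFSET and aggregate the pages client-side:

```sparql
# Hypothetical sketch: page through a heavy result set and aggregate
# the pages client-side. wd:Q5 / wdt:P569 are placeholder examples.
SELECT ?item ?dob WHERE {
  ?item wdt:P31 wd:Q5 ;
        wdt:P569 ?dob .
}
ORDER BY ?item      # a stable order is needed for OFFSET to be meaningful
LIMIT 100000
OFFSET 0            # then 100000, 200000, ... in follow-up queries
```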
The second line of motivation is keeping track of how things develop over
time, which would be interesting both for content and maintenance queries
and for tracking the usage of things like classes, references, lexemes or
properties.
I would appreciate any pointers or thoughts on the matter.
Thanks,
Daniel
Hello all,
This is an important announcement for all the tool builders and maintainers
who access Wikidata’s data by *directly querying the Labs database replicas*.
In May-June 2019, the Wikidata development team will drop the wb_terms
table from the database in favor of a new, optimized schema. Over the
years, this table has become too big, causing various issues.
This change requires tools using wb_terms to be updated. Developers and
maintainers will need to *adapt their code* to the new schema before the
migration starts, and switch to the new code once it does.
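As an illustration of the kind of code affected (the announcement itself does not describe the new schema, and the column names below are from the old schema as I remember it, so double-check them against the task), a typical direct lookup against the replicas looks like:

```sql
-- Hypothetical example of an affected query: a direct label lookup
-- against the old wb_terms table on the Labs replicas.
SELECT term_text
FROM wb_terms
WHERE term_full_entity_id = 'Q42'
  AND term_type = 'label'
  AND term_language = 'en';
```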
The migration will start on *May 29th*. On May 15th, a test system will be
available for you to test your code.
Because the table is used by plenty of external tools, we are setting up a
process to make sure that the change can be carried out together with the
developers and maintainers, without causing issues or broken tools. Most
of the documentation and updates will take place on Phabricator:
- In this Phabricator task <https://phabricator.wikimedia.org/T221764>,
you can find a description of the changes and the process, and you can ask
for more details or for help in the comments. This is also where updates
will be announced if necessary.
- On the Tool Builders Migration board
<https://phabricator.wikimedia.org/tag/wb_terms_-_tool_builders_migration>
you will find all the details about the migration, how to update your
tool <https://phabricator.wikimedia.org/T221765>, and you can add your
own tasks.
- If you need to discuss with the Wikidata developers or get more
specific help, we have set up two dedicated IRC meetings and a session at
the Wikimedia Hackathon. More information is in this task
<https://phabricator.wikimedia.org/T221764>.
We are aware that this change will require some significant changes in
your code, and we are willing to help you as much as our resources allow.
We hope that you will understand that this change is made to avoid bigger
issues in the near future.
Note that this change does not impact Wikibase instances outside of
Wikidata. A dedicated migration plan and announcement will follow.
We strongly encourage you not to wait until the last minute to make the
changes in your code. If you have any questions or issues, we will be
happy to help.
In order to keep the discussions in one place, please ask questions or
raise issues directly in the Phabricator task and board.
Thanks for your understanding,
Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
Hello all,
This change is relevant for everyone using the *wbeditentity* endpoint of
Wikidata’s API.
While working on editing the termbox from mobile, we discovered a bug in
the code of the wbeditentity endpoint: it does not conform to the
implicit interpretation of the documentation
<https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/docs/change-…>.
A request including {"aliases":{"en":[]}} should, according to that
implicit interpretation, replace all English aliases with an empty list,
meaning removing all of them. However, at the moment this action is not
actually performed: such a request leaves the aliases untouched.
We want to fix this bug because we need this request to work in order to
remove all aliases in the new termbox on mobile as well. We are treating
this bug fix as a breaking change because the documentation was ambiguous,
and some tools may currently be sending requests with empty alias arrays,
intentionally or not, when nothing needs to be changed.
If you are maintaining a tool, please *inspect your tool's usage of the
wbeditentity endpoint*, and make sure that no calls with empty alias
arrays are sent unless the intention is to remove those aliases.
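To make the affected case concrete, here is a minimal sketch of a request body whose behavior changes with the fix (assuming the standard wbeditentity parameters; the item ID is just an example):

```python
import json

# Sketch of a wbeditentity request body. After the bug fix, the empty
# alias array below will REMOVE all English aliases instead of leaving
# them untouched. Q4115189 (the sandbox item) is just an example ID.
payload = {
    "action": "wbeditentity",
    "id": "Q4115189",
    "format": "json",
    # "data" carries the JSON-encoded edit:
    "data": json.dumps({"aliases": {"en": []}}),
}

# A tool that does not intend to remove aliases should simply omit the
# "aliases" key from its edit data:
safe_edit = {"labels": {"en": {"language": "en", "value": "example"}}}
assert "aliases" not in safe_edit
```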
According to our breaking change policy, this bug fix will first be
deployed on beta.wikidata.org on May 28th, then on wikidata.org on *June
12th*.
If you have any questions or issues, feel free to discuss them in the
related ticket <https://phabricator.wikimedia.org/T203337>.
Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
GerardM's post prompted me to post to the mailing list. As you might know,
I am working on a functional quadstore, that is, a quadstore that keeps
old versions of the data around, like a wiki, but structured as a
directed acyclic graph. It only stores differences between commits, and
relies on a snapshot of the latest version for fast reads. My ultimate
goal is to build some kind of portable knowledge base: something like
Wikibase + Blazegraph that you can spin up on a regular machine at the
press of a button.
Enough bragging about me. I won't reply to all the messages in the thread
one by one, but:
Here is what SHOULD BE possible:
- incremental dumps
- time traveling queries
- full dumps
- federation of Wikibase instances SHOULD BE possible, since the data is
stored as history, like Git, and git pull / git push are planned in the
ROADMAP
- online editing of the quadstore
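The core idea above can be sketched as a toy model (a hypothetical illustration, not the actual implementation): commits form a DAG and store only the quads added or removed relative to their parent, while a snapshot of the latest version serves fast reads and older versions are reconstructed by replaying the commit chain.

```python
# Toy model of a versioned quadstore: commits store only diffs, a
# snapshot of the latest version is kept for fast reads, and older
# versions ("time travel") are rebuilt by replaying commits.

class Commit:
    def __init__(self, parents, added, removed):
        self.parents = parents            # parent commits (DAG structure)
        self.added = frozenset(added)     # quads added in this commit
        self.removed = frozenset(removed) # quads removed in this commit

class QuadStore:
    def __init__(self):
        self.head = Commit([], set(), set())  # empty root commit
        self.snapshot = set()                 # materialized latest version

    def commit(self, added=(), removed=()):
        self.head = Commit([self.head], added, removed)
        self.snapshot |= set(added)
        self.snapshot -= set(removed)

    def read(self):
        # fast read path: just return the snapshot
        return set(self.snapshot)

    def at(self, commit):
        # time-traveling read: replay the first-parent chain from the root
        chain = []
        c = commit
        while c.parents:
            chain.append(c)
            c = c.parents[0]
        state = set()
        for c in reversed(chain):
            state |= c.added
            state -= c.removed
        return state

store = QuadStore()
store.commit(added={("s", "p", "o", "g")})
store.commit(added={("s2", "p", "o", "g")}, removed={("s", "p", "o", "g")})
assert store.read() == {("s2", "p", "o", "g")}
assert store.at(store.head.parents[0]) == {("s", "p", "o", "g")}
```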
Access Control Lists are not designed yet; I expect this to be enforced by
the application layer.
I planned to start working on a data management system (something like
CKAN) with a search feature, but I would gladly work with Wikimedia
instead.
Also, given that it is modeled after Git, one can build merge-request-like
features, e.g. for reviewing massive imports.
What I would need is logs, ideally with the timing of queries (reads and
writes), to run benchmarks.
Maybe I should ask Wikimedia for funding?
FWIW, I got 2 times faster than Blazegraph on a microbenchmark.
> Hoi,
> Wikidata grows like mad. This is something we all experience in the really
> bad response times we are suffering. It is so bad that people are asked
> what kind of updates they are running because it makes a difference in the
> lag times there are.
>
> Given that Wikidata is growing like a weed, it follows that there are two
> issues. Technical - what is the maximum that the current approach supports
> - how long will this last us. Fundamental - what funding is available to
> sustain Wikidata.
>
> For the financial guys, growth like Wikidata is experiencing is not
> something you can reliably forecast. As an organisation we have more money
> than we need to spend, so there is no credible reason to be stingy.
>
> For the technical guys, consider our growth and plan for at least one
> year. When the impression exists that the current architecture will not
> scale beyond two years, start a project to future proof Wikidata.
>
> It will grow and the situation will get worse before it gets better.
> Thanks,
> GerardM
>
> PS I know about phabricator tickets, they do not give the answers to the
> questions we need to address.
>
Hello all,
The Celtic Knot Conference
<https://meta.wikimedia.org/wiki/Celtic_Knot_Conference_2019>, dedicated to
languages in the Wikimedia projects, will take place on 4th and 5th of July
in Penryn, Cornwall, UK.
The conference aims to bring people together to share their experiences of
making information available in minority languages, to help people learn
how to encourage the flow of information across language barriers, and to
support the associated communities.
During the previous editions
<https://wikimedia.org.uk/wiki/Celtic_Knot_Conference_2018#Programme>, a
lot of Wikidata-related sessions took place, and the conference especially
helped bring together editors from small Wikipedia communities who wanted
to make more use of Wikidata
<https://blog.wikimedia.org.uk/2018/08/celtic-knot-2018-how-can-wikidata-sup…>,
but lacked the resources or technical skills. We hope that this gathering
will again bring together plenty of people who are enthusiastic about
sharing knowledge across (project) borders and languages!
The call for program submissions is now open until May 16th. More
information about attendance and possible ways to get funded will certainly
follow.
I have already submitted a Query Booth
<https://meta.wikimedia.org/wiki/Celtic_Knot_Conference_2019/Submissions/Que…>
- a format where people exchange knowledge about SPARQL and help each
other build queries, which has been successfully tested at the WikidataCon
and other events. If you plan to join the conference and would like to
help at the booth, feel free to register on the page!
If you have any questions, feel free to reach out to me or the main
organisation contact, Mark Trevethan.
Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
My Query:
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q2085381 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  # FILTER(CONTAINS(LCASE(?itemLabel), "simon"))
  # FILTER (LANG(?itemLabel)="en")
}
and if I enable any of the FILTER lines, it returns 0 results.
What changed / Why ?
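For comparison, a variant that fetches labels explicitly does return results for me (my assumption: the label service binds ?itemLabel only after the rest of the WHERE clause is evaluated, so a FILTER on it never matches):

```sparql
# Filter on an explicitly fetched rdfs:label instead of the
# label-service variable:
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q2085381 ;
        rdfs:label ?itemLabel .
  FILTER(LANG(?itemLabel) = "en")
  FILTER(CONTAINS(LCASE(?itemLabel), "simon"))
}
```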
Thad
https://www.linkedin.com/in/thadguidry/
Hello all,
After several months of development and testing together with the WikiProject
ShEx <https://www.wikidata.org/wiki/Wikidata:WikiProject_ShEx>, Shape
Expressions are about to be enabled on Wikidata.
*First of all, what are Shape Expressions?*
ShEx (Q29377880) <https://www.wikidata.org/wiki/Q29377880> is a concise,
formal modeling and validation language for RDF structures. Shape
Expressions can be used to define shapes within the RDF graph. In the case
of Wikidata, this would be sets of properties, qualifiers and references
that describe the domain being modeled.
See also:
- a short video about ShEx <https://www.youtube.com/watch?v=AR75KhEoRKg>
made by community members during the Wikimedia hackathon 2019
- introduction to ShEx <http://shex.io/shex-primer/>
- more details about the language <http://shex.io/shex-semantics/>
*What can it be used for?*
On Wikidata, the main goal of Shape Expressions is to describe the basic
structure of an item. For example, for a human, we probably want to have a
date of birth, a place of birth, and many other important statements. But
we would also like to make sure that if a statement with the property
“children” exists, the value(s) of this property are humans as well.
Schemas will describe in detail what is expected in the structure of
items, statements and the values of these statements.
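The human example above could be written in ShEx Compact Syntax roughly like this (a hypothetical illustration, not an official Wikidata schema; P31, P569, P19 and P40 are the usual instance-of, date-of-birth, place-of-birth and child properties):

```shex
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

<#human> {
  wdt:P31  [ wd:Q5 ] ;        # instance of: human
  wdt:P569 xsd:dateTime ? ;   # date of birth (at most one)
  wdt:P19  IRI ? ;            # place of birth
  wdt:P40  @<#human> *        # any children must also match <#human>
}
```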
Once Schemas are created for various types of items, existing items can be
tested against a Schema to highlight possible errors or missing
information. Subsets of the Wikidata graph can be checked for conformance
to a specific shape through the use of validation tools. Schemas will
therefore be very useful to help editors improve the data quality. We
imagine this being especially useful for WikiProjects to more easily
discuss and ensure the modeling of items in their domain. In the spirit of
Wikidata not restricting the world, Shape Expressions are a tool to
highlight errors, not prevent them.
On top of this, one could imagine other uses of Schemas in the future, for
example a tool that would suggest, when creating a new item, the basic
structure for this item, and help add statements or values - a bit like
the existing tool Cradle
<https://tools.wmflabs.org/wikidata-todo/cradle/#/>, which is currently
not based on ShEx.
*What is going to change on Wikidata?*
- A new extension will be added to Wikidata: EntitySchema
<https://www.mediawiki.org/wiki/Extension:EntitySchema>, defining the
Schema namespace and its behavior as well as special pages related to it.
- A new entity type, EntitySchema, will be enabled to store Shape
Expressions. Schemas will be identified with the letter E.
- The Schemas will have multilingual labels, descriptions and aliases
(quite similar to the termbox on Items), plus the schema text itself,
written in a syntax called ShEx Compact Syntax (ShExC)
<http://shex.io/shex-semantics/#shexc>. You can see an example here
<https://wikidata-shex.wmflabs.org/wiki/EntitySchema:E2>.
- The external tool shex-simple
<https://tools.wmflabs.org/shex-simple/wikidata/packages/shex-webapp/doc/she…>
is linked directly from the Schema pages, so you can check entities of
your choice against the schema.
*When is this happening?*
Schemas will be enabled on test.wikidata.org on May 21st and on
wikidata.org on May 28th. After this release, they will be integrated into
regular maintenance just like the rest of Wikidata’s features.
*How can you help?*
- Before the release, you can try to edit or create Shape Expressions on
our test system <https://wikidata-shex.wmflabs.org/wiki/Main_Page>
- If you find an issue or have a feature request, feel free to create a
new task on Phabricator with the tag shape-expressions
- Once Schemas are enabled, you can discuss them on your favorite
WikiProjects: for example, what types of items would you like to model?
- You can also get more information about how to create a Schema
<https://www.wikidata.org/wiki/Wikidata:WikiProject_ShEx/How_to_get_started%…>
*See also: *
- Main Phabricator board
<https://phabricator.wikimedia.org/tag/shape_expressions/>
- Technical documentation of the extension
<https://meta.wikimedia.org/wiki/Extension:EntitySchema>
- To enhance the interface, you can use this user script
<https://www.wikidata.org/wiki/User:Zvpunry/EntitySchemaHighlighter.js>
to highlight items and properties in the schema code and turn the IDs into
links
If you have any questions, feel free to reach me. Cheers,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Hi all, I find it hard to know what the connections are between Wikidata
and authority control organizations and big live databases.
We have plenty of properties that map entities to external data in
Wikidata, but information about their freshness and the depth of the
collaboration between the community and the organisations is harder to
find.
There is a spectrum of collaboration, ranging from « the community does
all the work, maintenance and synchronisation, and updates on Wikidata are
not reflected in the database » to « the organisation fully collaborates
and maintains a two-way synchronization of the mapping and data with
Wikidata (meaning corrections on Wikidata are reflected in the database
and the other way around) ».
Do we know where we are on this scale for the major authorities? Pages
like https://en.wikipedia.org/wiki/Wikipedia_talk:Authority_control/VIAF
seem to be a little dead, which suggests the collaboration with VIAF is
somewhat dormant right now. Did I miss anything?
CALL FOR CONTRIBUTIONS
VOILA 2019 - Visualization and Interaction for Ontologies and Linked Data
5th International Workshop at ISWC 2019, 18th International Semantic Web Conference
October 26 or 27, Auckland, New Zealand
http://voila2019.visualdataweb.org
--------------------------------------------------
Abstracts Deadline: June 21, 2019
Submission Deadline: June 28, 2019
--------------------------------------------------
We are looking for submissions addressing one or more of the following topics, subjects, and contexts (or related ones):
* Topics:
- visualizations
- user interfaces
- visual analytics
- requirements analysis
- case studies
- user evaluations
- cognitive aspects
* Subjects:
- ontologies
- linked data
- knowledge graphs
- ontology engineering (development, collaboration, ontology design patterns, alignment, debugging, evolution, provenance, etc.)
* Contexts:
- classical interaction contexts (desktop, keyboard, mouse, etc.)
- modern interaction contexts (mobile, touch, gesture, speech, etc.)
- special settings (large, high-resolution, and multiple displays, etc.)
- specific user groups and needs (people with disabilities, domain experts, etc.)
Submission Guidelines
==========
The following types of contributions are welcome. The recommended page length is given in brackets. There is NO strict page limit but the length of a paper should be commensurate with its contribution.
- Full research papers (8-12 pages);
- Experience papers (8-12 pages);
- Position papers (6-8 pages);
- Short research papers (4-6 pages);
- System papers (4-6 pages).
Accepted papers will be published as a volume in the CEUR Workshop Proceedings series.
Important Dates
==========
Abstract: June 21, 2019
Submission: June 28, 2019
Notification: July 24, 2019
Camera-ready: August 16, 2019
Workshop: October 26 or 27, 2019
Organizers
==========
Catia Pesquita, University of Lisbon, Portugal
Patrick Lambrix, Linköping University, Sweden
Steffen Lohmann, Fraunhofer IAIS, Germany
Valentina Ivanova, RISE Research Institutes of Sweden, Sweden
Vitalis Wiens, University of Bonn, Fraunhofer IAIS & TIB, Germany
Looking forward to your submissions & meeting you there!
Catia, Patrick, Steffen, Valentina, and Vitalis