According to https://www.youtube.com/watch?v=TLuM4E6IE5U : "Semantic
annotation is the process of attaching additional information to various
concepts (e.g. people, things, places, organizations etc) in a given
text or any other content. Unlike classic text annotations for reader's
reference, semantic annotations are used by machines to refer to."
On Wikipedia a red link is a link to an article that hasn't been created
(yet) in that language. Often another language does have an article
about the subject or at least we have a Wikidata item about the subject.
Take for example
https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris . It has over
250 incoming links, but the person doesn't have an article in Dutch. We
have a Wikidata item with links to 7 Wikipedias at
https://www.wikidata.org/wiki/Q116510 , but no way to relate the red
link on Wikipedia to that item.
Wouldn't it be nice to be able to make a connection between the red link
on Wikipedia and the Wikidata item?
Let's assume we have this list somewhere. We would be able to offer all
sorts of nice features to our users like:
* Hover over the link to get a hovercard in your favorite fallback language
* Generate an article placeholder for the user with basic information in
the local language
* Pre-populate the translate extension so you can translate the article
from another language
(probably plenty of other good uses)
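As a rough sketch of the first step any of these features would need -
checking whether an item already has an article in a given language - a
tool could look at the item's sitelinks. This is a minimal illustration
assuming the standard wbgetentities JSON shape; the entity data is
synthetic and the helper name is made up:

```python
def has_sitelink(entity, lang):
    """True if the Wikidata entity JSON already has a sitelink to the
    given language's Wikipedia (e.g. 'nl' -> 'nlwiki'), i.e. the link
    would NOT be red there."""
    return f"{lang}wiki" in entity.get("sitelinks", {})

# Synthetic entity modelled on the Friedrich Ris case: articles exist
# in some languages, but not in Dutch.
entity = {
    "id": "Q116510",
    "sitelinks": {
        "enwiki": {"title": "Friedrich Ris"},
        "dewiki": {"title": "Friedrich Ris"},
    },
}

print(has_sitelink(entity, "de"))  # True
print(has_sitelink(entity, "nl"))  # False -> candidate for a red-link hint
```

A real tool would fetch the entity JSON from the wbgetentities API
instead of hard-coding it.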
Where to store this link? I'm not sure about that. On some Wikipedias,
people have experimented with local templates around the red links. That
approach isn't structured data, it clutters up the wikitext, it doesn't
scale, and local communities generally don't seem to like it, so that's
not the way to go. Maybe a better option would be to create a new property
on Wikidata to store the name of the future article, something like
Q116510: Pxxx -> (nl)"Friedrich Ris". That would be easiest because the
infrastructure is already there and you can just build tools on top of it,
but I'm afraid it would cause a lot of noise on items. A couple of
suggestions wouldn't be a problem, but what is keeping people from adding
suggestions in 100 languages? Or maybe restrict usage so that a red link
must have at least 1 (or n) incoming links on that Wikipedia before people
are allowed to add the suggestion?
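The incoming-link threshold could be enforced by tooling before a
suggestion is stored. A minimal sketch of that rule, assuming the counts
have already been fetched from the local wiki's backlink API (the
function name and data here are made up):

```python
def allowed_suggestions(suggestions, backlink_counts, min_links=1):
    """Keep only red-link targets that have at least min_links
    incoming links on the local Wikipedia."""
    return [title for title in suggestions
            if backlink_counts.get(title, 0) >= min_links]

# Hypothetical counts from the local wiki's backlink API
counts = {"Friedrich Ris": 250, "Barely Linked Person": 0}
print(allowed_suggestions(["Friedrich Ris", "Barely Linked Person"], counts))
# ['Friedrich Ris']
```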
We could also create a new project on Wikimedia Cloud to store the links,
but that would be quite an extra time investment to set up.
What do you think?
I'm looking for Wikidata bots that perform accuracy audits: for example,
comparing a person's birth date on Wikidata with the date given in
databases linked to the item by an external-id.
I do not even know if they exist. Bots are often poorly documented, so I
appeal to the community for some examples.
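For what it's worth, the core of such an audit bot would be a
precision-aware date comparison, since Wikidata's P569 values carry a
precision (9 = year, 10 = month, 11 = day) while external databases often
store a plain ISO date. A minimal sketch of that comparison - not an
existing bot, just an illustration:

```python
def dates_match(wikidata_value, precision, external_iso):
    """Compare a Wikidata time string like '+1867-01-08T00:00:00Z'
    against an external 'YYYY-MM-DD' date, honouring the Wikidata
    precision (9 = year, 10 = month, 11 = day)."""
    wd_parts = wikidata_value.lstrip("+")[:10].split("-")
    ext_parts = external_iso.split("-")
    # Compare only as many fields as the Wikidata precision supports
    fields = {9: 1, 10: 2, 11: 3}[precision]
    return wd_parts[:fields] == ext_parts[:fields]

# Wikidata knows only the year; the external database has the full date:
print(dates_match("+1867-00-00T00:00:00Z", 9, "1867-05-02"))   # True
# Day-precision values must match exactly:
print(dates_match("+1867-01-08T00:00:00Z", 11, "1867-01-09"))  # False
```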
I don't have enough knowledge about neural nets to evaluate the email
below, but I'm forwarding it in case it's of interest to others on these
two lists.
( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: John Erling Blad <jeblad(a)gmail.com>
Date: Wed, Sep 26, 2018 at 6:23 PM
Subject: [Wikimedia-l] Captioning Wikidata items?
To: Wikimedia Mailing List <wikimedia-l(a)lists.wikimedia.org>
Just a weird idea.
It is very interesting how neural nets can caption images. It is done by
building a state-model of the image, which is fed into a kind of neural
net (an RNN), and that net (a black box) transforms the state-model into
running text. In some cases the neural net is steered; that is called
attention control, and it creates relationships between parts of the
image.
Swap out the image with an item, and a virtually identical setup can
generate captions for items. The caption for an item is what's called the
description in Wikidata. It is also the first sentence of the lead-in in
Wikipedia articles. It is possible to steer the attention, that is, to
tell the network which items should be used, and thus to shape what the
later sentences will be about.
What that means is that we could create meaningful stub entries for the
article placeholder, that is the "AboutTopic" special page. We can't
automate this for very small projects, but somewhere between small and
mid-sized languages it will start to make sense.
To make this work we need some very special knowledge, which we probably
don't have, like how to turn an item into a state-model by using the highly
specialized rdf2vec algorithm (hello Copenhagen) and verifying the stateful
language model (hello Helsinki and Tromsø).
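The pipeline sketched above (item -> state-model -> attention-steered
decoder -> text) can be shown as a data-flow skeleton. This sketch
replaces both the rdf2vec embedding and the RNN decoder with trivial
stand-ins, so it only illustrates the stages and the role of attention,
not any learning:

```python
def item_to_state(item):
    """Stand-in for the rdf2vec step: reduce an item to an ordered
    list of (property, value) facts."""
    return list(item.items())

def decode(state, attention):
    """Stand-in for the RNN decoder: verbalize only the facts that
    the attention indices point at, in that order."""
    return ", ".join(value for _, value in (state[i] for i in attention))

item = {"occupation": "entomologist", "country of citizenship": "Swiss"}
state = item_to_state(item)
# Steering the attention decides which facts end up in the caption:
print(decode(state, [1, 0]))  # "Swiss, entomologist"
```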
I wonder if the only real problems are what do the community want, and what
is the acceptable error limit.
John Erling Blad
Do you people think that the regex https?://(\S+\.)+\S+(/\S*)? (
https://regex101.com/?regex=https?://(\S+\.)+\S+(/\S*)? ) can be
improved? I think it can.
For instance, why require the protocol spec? For most use cases (just
clicking it), it's not useful. Arguably, it's not part of the website.
Also, the regex is lax. That's not a big issue, but a stricter one could
prevent duplicate URLs from being added by bots and data imports. Aside
from the http/s discussion, constraints like these:
- warn against URLs ending in /
- warn against uppercase
would help data imports from external databases or general batch edits
with bots avoid adding duplicate values.
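To illustrate, here is what the current pattern plus the two suggested
warnings could look like in Python; the pattern is the one under
discussion, while the warning function is just a sketch of what a
constraint checker might flag:

```python
import re

# The current (lax) pattern from the discussion
URL_RE = re.compile(r"https?://(\S+\.)+\S+(/\S*)?")

def url_warnings(url):
    """Warnings that would help bots and batch imports avoid adding
    near-duplicate URL values for the same site."""
    warnings = []
    if not URL_RE.fullmatch(url):
        warnings.append("does not match the URL pattern")
    if url.endswith("/"):
        warnings.append("ends in /")
    if any(c.isupper() for c in url):
        warnings.append("contains uppercase")
    return warnings

print(url_warnings("https://example.org/page"))  # []
print(url_warnings("https://Example.org/"))      # ['ends in /', 'contains uppercase']
```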
Hi Team!
+Dan Brickley <danbri(a)google.com> +Lydia Pintscher
Schema.org mapping is progressing with every new Weekly Summary's "Newest
properties" section. That's great! And thanks to Léa and the team for
providing the new properties.
What's not great is that many times we cannot apply a "broader external
class" to map to a Schema.org type. This is because "broader concept" (
https://www.wikidata.org/wiki/Property:P4900 ) is constrained to
"qualifiers only and not for use on statements".
We are able to use the existing "narrower external class" (
https://www.wikidata.org/wiki/Property:P3950 ), for example on this
topic: https://www.wikidata.org/wiki/Q7406919 . But from what we can see,
there is no "broader external class" property in Wikidata yet.
It would be *awesome* if someone could advocate for that new property to
help map Wikidata to external vocabularies that have broader concepts quite
often, such as Schema.org.
The next IRC office hour with the Wikidata team will take place on
September 25th, from 18:00 to 19:00 (UTC+2, Berlin time) on the channel
#wikimedia-office.
As usual, we will present some news from the development team and the
projects to come, and collect your feedback.
If you have any special topic you'd like to see as a focus, please share
it with us.
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Registered in the register of associations of the Amtsgericht
Berlin-Charlottenburg under number 23855 Nz. Recognized as a charitable
organization by the Finanzamt für Körperschaften I Berlin, tax number
27/029/42207.