Hi everyone,
According to https://www.youtube.com/watch?v=TLuM4E6IE5U : "Semantic
annotation is the process of attaching additional information to various
concepts (e.g. people, things, places, organizations etc) in a given
text or any other content. Unlike classic text annotations for reader's
reference, semantic annotations are used by machines to refer to."
(more at
https://ontotext.com/knowledgehub/fundamentals/semantic-annotation/ )
On Wikipedia a red link is a link to an article that hasn't been created
(yet) in that language. Often another language does have an article
about the subject or at least we have a Wikidata item about the subject.
Take for example
https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris . It has over
250 incoming links, but the person doesn't have an article in Dutch. We
have a Wikidata item with links to 7 Wikipedias at
https://www.wikidata.org/wiki/Q116510 , but no way to relate
https://nl.wikipedia.org/w/index.php?title=Friedrich_Ris with
https://www.wikidata.org/wiki/Q116510 .
Wouldn't it be nice to be able to make a connection between the red link
on Wikipedia and the Wikidata item?
Let's assume we have this list somewhere. We would be able to offer all
sorts of nice features to our users like:
* Hover over the link to get a hovercard in your favorite backup language
* Generate an article placeholder for the user with basic information in
the local language
* Pre-populate the translate extension so you can translate the article
from another language
(probably plenty of other good uses)
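The hovercard feature could be sketched quite simply, assuming the hypothetical red-link-to-item store already exists. Everything below (the mapping, QIDs, and descriptions) is a hard-coded placeholder, not a real lookup:

```python
# Toy hovercard lookup, assuming a hypothetical red-link -> item store.
# RED_LINK_MAP and ITEM_DESCRIPTIONS are hard-coded placeholders; a real
# tool would query the store and the Wikidata API instead.

RED_LINK_MAP = {("nl", "Friedrich Ris"): "Q116510"}

ITEM_DESCRIPTIONS = {
    "Q116510": {
        "de": "Schweizer Entomologe",
        "en": "Swiss entomologist",
    },
}

def hovercard(wiki, title, fallback_langs):
    """Resolve a red link to (QID, description), trying each backup
    language in order; returns None if the red link is not mapped."""
    qid = RED_LINK_MAP.get((wiki, title))
    if qid is None:
        return None
    descriptions = ITEM_DESCRIPTIONS.get(qid, {})
    for lang in fallback_langs:
        if lang in descriptions:
            return qid, descriptions[lang]
    return qid, None

print(hovercard("nl", "Friedrich Ris", ["de", "en"]))
```

The same lookup would drive the article placeholder and translation pre-population: once the QID is known, all the existing Wikidata infrastructure applies.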
Where to store this link? I'm not sure about that. On some Wikipedias
people have experimented with local templates around the red links.
That's not structured data, it clutters up the wikitext, it doesn't
scale, and the local communities generally don't seem to like the
approach. That's not the way to go. Maybe a better option would be to
create a new property
on Wikidata to store the name of the future article. Something like
Q116510: Pxxx -> (nl)"Friedrich Ris". That would be easiest because the
infrastructure is already there and you can just build tools on top of
it, but I'm afraid this will cause a lot of noise on items. A couple of
suggestions wouldn't be a problem, but what is keeping people from
adding the suggestion in 100 languages? Or maybe restrict usage so that
a red link must have at least 1 (or n) incoming links before people are
allowed to add the suggestion?
We could create a new project on the Wikimedia Cloud to store the
links, but that would be quite an extra time investment to set
everything up.
What do you think?
Maarten
Dear all,
I'm looking for Wikidata bots that perform accuracy audits. For example,
comparing the birth dates of persons with the same date indicated in
databases linked to the item by an external-id.
I do not even know if they exist. Bots are often poorly documented, so I
appeal to the community for some examples.
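For what it's worth, the core of such an audit is easy to sketch. This is not an existing bot, just an illustration with placeholder data; a real bot would fetch the dates via SPARQL or the API and from the database behind the external-id:

```python
# Illustrative accuracy audit (not an existing bot): compare a person's
# date of birth on Wikidata with the date in a database linked via an
# external-id. All three dicts are placeholder stand-ins for real queries.

WIKIDATA_DOB = {"Q1001": "1867-01-08"}        # hypothetical item and date
EXTERNAL_ID = {"Q1001": "viaf:12345"}         # hypothetical external-id link
EXTERNAL_DOB = {"viaf:12345": "1867-01-08"}   # hypothetical remote date

def audit_birth_date(qid):
    """Return 'match', 'mismatch', or 'missing' for one item."""
    local = WIKIDATA_DOB.get(qid)
    remote = EXTERNAL_DOB.get(EXTERNAL_ID.get(qid))
    if local is None or remote is None:
        return "missing"
    return "match" if local == remote else "mismatch"

print(audit_birth_date("Q1001"))
```

The interesting part of a real bot is everything around this comparison: precision handling (year-only vs. full dates), calendar models, and what to do with the mismatch reports.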
Many thanks.
Ettore Rizza
I don't have enough knowledge about neural nets to evaluate the email
below, but I'm forwarding it in case it's of interest to others on two
relevant lists.
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: John Erling Blad <jeblad(a)gmail.com>
Date: Wed, Sep 26, 2018 at 6:23 PM
Subject: [Wikimedia-l] Captioning Wikidata items?
To: Wikimedia Mailing List <wikimedia-l(a)lists.wikimedia.org>
Just a weird idea.
It is very interesting how neural nets can caption images. It is done by
building a state-model of the image, which is fed into a kind of neural
net (an RNN), and that net (a black box) transforms the state-model into
running text. In some cases the neural net is steered. That is called
attention control, and it creates relationships between parts of the
image.
Swap out the image with an item, and a virtually identical setup can
generate captions for items. The caption for an item is what's called the
description in Wikidata. It is also the first sentence of the lead in
Wikipedia articles. It is possible to steer the attention, that is, to
tell the network which items should be used, so that the later sentences
will be meaningful.
What that means is that we could create meaningful stub entries for the
article placeholder, that is the "AboutTopic" special page. We can't
automate this for very small projects, but somewhere between small and
mid-sized languages it will start to make sense.
To make this work we need some very special knowledge, which we probably
don't have, like how to turn an item into a state-model by using the highly
specialized rdf2vec algorithm (hello Copenhagen) and verifying the stateful
language model (hello Helsinki and Tromsø).
I wonder if the only real problems are what the community wants and what
the acceptable error limit is.
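To make the input side slightly more concrete: this is only a toy encoding of the "item to state-model" step, not rdf2vec and not a learned embedding. The QIDs are illustrative placeholders, and a real pipeline would feed dense vectors into an RNN decoder with attention:

```python
# Toy sketch of the "item -> state-model" step only (NOT rdf2vec):
# encode an item's statements as a sparse feature vector over
# (property, value) pairs. A real pipeline would learn dense embeddings
# instead. The QIDs below are illustrative placeholders.

ITEM = {
    "P31": ["Q5"],        # instance of: human
    "P106": ["Q000000"],  # occupation (placeholder QID)
}

def state_model(item):
    """One feature dimension per (property, value) pair."""
    return {f"{prop}={val}": 1.0
            for prop, values in item.items()
            for val in values}

print(sorted(state_model(ITEM)))
```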
John Erling Blad
/jeblad
_______________________________________________
Wikimedia-l mailing list, guidelines at:
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: Wikimedia-l(a)lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
<mailto:wikimedia-l-request@lists.wikimedia.org?subject=unsubscribe>
Hi there!
Do you people think that https?://(\S+\.)+\S+(/\S*)? <https://regex101.com/?regex=https?://(\S+\.)+\S+(/\S*)?> can be improved? I think it can.
For instance, why require the protocol spec? For most use cases (just clicking it), it's not useful. Arguably, it's not part of the website.
Also, the regex is lax. That's not a big issue, but a stricter one could prevent duplicate URLs from being added by bots and data imports. Aside from the http/s discussion, constraints like these:
- warn against urls ending in /
- warn against uppercase
would help data imports from external databases or general batch edits with bots to avoid adding duplicate values.
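As a sketch in plain Python: the base pattern is the current lax one quoted above, and the two extra checks implement the suggested warnings:

```python
# Sketch of the stricter checks suggested above. URL_RE is the current
# lax pattern from the thread; the two extra checks flag trailing
# slashes and uppercase, which commonly cause duplicate values.
import re

URL_RE = re.compile(r"https?://(\S+\.)+\S+(/\S*)?")

def check_url(url):
    """Return a list of constraint warnings for one URL value."""
    warnings = []
    if not URL_RE.fullmatch(url):
        warnings.append("does not match the property regex")
    if url.endswith("/"):
        warnings.append("ends in /")
    if url != url.lower():
        warnings.append("contains uppercase")
    return warnings

print(check_url("https://www.Wikidata.org/"))
```

Note that uppercase is only safe to warn about, not forbid: URL paths can be case-sensitive, so these should stay soft constraints.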
Att,
Victor
Hi Team !
+Dan Brickley <danbri(a)google.com> +Lydia Pintscher
<lydia.pintscher(a)wikimedia.de>
Schema.org mapping is progressing with every new Weekly Summary "Newest
properties" listing.
That's great ! And thanks to Léa and team for providing the new properties
listing !
What's not great is that many times we cannot apply a "broader external
class" to map to a Schema.org Type. This is because "broader concept"
https://www.wikidata.org/wiki/Property:P4900 is constrained to "qualifiers
only and not for use on statements".
We are able to use the existing "narrower external class"
<https://www.wikidata.org/wiki/Property:P3950> , for example like here on
this topic, https://www.wikidata.org/wiki/Q7406919 , but there is no
"broader external class" property in Wikidata yet from what we see.
It would be *awesome* if someone could advocate for that new property to
help map Wikidata to external vocabularies that quite often have broader
concepts, such as Schema.org.
-Thad
+ThadGuidry <https://plus.google.com/+ThadGuidry>
Hello all,
The next IRC office hour with the Wikidata team will take place on
September 25th, from 18:00 to 19:00 (UTC+2, Berlin time
<https://www.timeanddate.com/worldclock/converter.html?iso=20180925T160000&p…>),
on the channel #wikimedia-office.
As usual, we will share some news from the development team and upcoming
projects, and collect your feedback.
If you have any special topic you'd like to see as a focus, please share it
here!
Thanks,
--
Léa Lacroix
Project Manager Community Communication for Wikidata
Wikimedia Deutschland e.V.
Tempelhofer Ufer 23-24
10963 Berlin
www.wikimedia.de
Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/029/42207.
Hi everyone,
Last week I presented Wikidata at the Semantics conference in Vienna (
https://2018.semantics.cc/ ). One question I asked people was: What is
keeping you from using Wikidata? One of the common responses is that
it's quite hard to combine Wikidata with the rest of the semantic web.
We have our own private ontology that's a bit of an island. Most of our
triples are in our own private format and not available in a more
generic, more widely used ontology.
Let's pick an example: Claude Lussan. No clue who he is, but my bot
seems to have added some links and the item isn't too big. Our URI is
http://www.wikidata.org/entity/Q2977729 and this is equivalent to
http://viaf.org/viaf/29578396 and
http://data.bibliotheken.nl/id/thes/p173983111 . If you look at
http://www.wikidata.org/entity/Q2977729.rdf this equivalence is
represented as:
<wdtn:P214 rdf:resource="http://viaf.org/viaf/29578396"/>
<wdtn:P1006 rdf:resource="http://data.bibliotheken.nl/id/thes/p173983111"/>
Outputting it in a more generic way as well would probably make it
easier to use than it is right now. The last discussion about this was at
https://www.wikidata.org/wiki/Property_talk:P1921 , but no response
since June.
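To make the shape of a more generic output concrete, here is a toy emitter using owl:sameAs. The predicate choice is an assumption on my part; whether owl:sameAs (or something weaker like skos:exactMatch) is appropriate is exactly the open question on that talk page:

```python
# Toy illustration of a "more generic" serialization of the same links:
# emit N-Triples with owl:sameAs alongside the wdtn: form. The choice of
# owl:sameAs is an assumption, not settled Wikidata practice.

def same_as_triples(entity_uri, external_uris):
    """Render one owl:sameAs N-Triple per external URI."""
    return [f"<{entity_uri}> <http://www.w3.org/2002/07/owl#sameAs> <{ext}> ."
            for ext in external_uris]

for triple in same_as_triples(
        "http://www.wikidata.org/entity/Q2977729",
        ["http://viaf.org/viaf/29578396",
         "http://data.bibliotheken.nl/id/thes/p173983111"]):
    print(triple)
```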
That's one way of linking up, but another way is using equivalent
property ( https://www.wikidata.org/wiki/Property:P1628 ) and equivalent
class ( https://www.wikidata.org/wiki/Property:P1709 ). See for example
sex or gender ( https://www.wikidata.org/wiki/Property:P21 ) for how it's
mapped to other ontologies. This won't produce easier RDF, but some
smart downstream users have figured out some SPARQL queries. So linking
up our properties and classes to other ontologies will make using our
data easier. This is a first step. Maybe it will be used in the future
to generate more RDF, maybe not and we'll just document the SPARQL
approach properly.
The equivalent property and equivalent class are used, but not that
much. Did anyone already try a structured approach with reporting? I'm
considering parsing popular ontology descriptions and producing reports
of what is linked to what so it's easy to add missing links, but I
don't want to do double work here.
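The report generator could start as small as this, assuming the ontology is available as RDF/XML. The sample fragment is hand-written for illustration, not a real download:

```python
# Rough sketch of the reporting idea: parse an ontology description
# (RDF/XML) and list its declared classes, so they can be checked
# against Wikidata's "equivalent class" (P1709) statements. SAMPLE is a
# hand-written fragment, not fetched from anywhere.
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:about="http://xmlns.com/foaf/0.1/Person"/>
  <owl:Class rdf:about="http://xmlns.com/foaf/0.1/Organization"/>
</rdf:RDF>"""

def ontology_classes(xml_text):
    """Return the URIs of all owl:Class declarations in an RDF/XML doc."""
    root = ET.fromstring(xml_text)
    return [el.get(f"{{{RDF}}}about") for el in root.iter(f"{{{OWL}}}Class")]

print(ontology_classes(SAMPLE))
```

The report itself would then be a set difference: declared classes minus the URIs already present as P1709 values.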
Which ontologies are important because they are used a lot? Some of the
ones I came across:
* https://www.w3.org/2009/08/skos-reference/skos.html
* http://xmlns.com/foaf/spec/
* http://schema.org/
* https://creativecommons.org/ns
* http://dbpedia.org/ontology/
* http://vocab.org/open/
Any suggestions?
Maarten
Hi Community !
Is there a property for "indigenous to" or "native in" for use with
plants ? I only see https://www.wikidata.org/wiki/Property:P2341 , which
is constrained to people and not animals or plants.
Any discussion links about this before ?
Thanks in advance !
-Thad
+ThadGuidry <https://plus.google.com/+ThadGuidry>