I'm sure this is documented somewhere, but...
In the Labs DB replicas, I see the wbc_entity_usage table. I assume that
logs the Wikidata entities used for a particlar page.
* What is eu_aspect? What do "L.de"/O/S/T/X mean?
* Is one of these the item ID of the page (as opposed to "arbitary use")?
If so, which one? If not, how could I find this information on the replicas?
I am trying to query the wiki data for entities with labels that matches a
regex. I am new in the sparql world. So could you please help me with it.
Here is what I have for now.
https://gist.github.com/anonymous/2810eb5747e51a9ae746183a43f20771
But I don't think it is the right way. Any help will be much appreciate.
Thanks
Hi,
I have encountered a problem with federating Wikidata entities with LOD
based on the abundant external resource "literal identifiers" that are
represented in Wikidata. An external resource identifier should include
the entire IRI to the resource, not just the "id", otherwise the external
resource cannot be federated in a SPARQL query without concatenating the
IRI to the literal identifier and binding to a new variable.
I think that Wikidata properties that expect explicit Identifiers in the
range should thus be defined as "object properties" rather than "datatype
properties". The javascript id linking gadget in the UI also typically
borks the link when a IRI is inputted as an identifier making it less
likely for people to do it properly. I assume that this is a "policy" that
was established for bot id imports, that unfortunately has produced a lot
of "not so useful" data.
The triple pattern fragment implementation that I am working with does not
allow bind grouping in the sparql query, so this unfortunately makes
federation of Wikidata entities by Identifiers nearly impossible. Hence,
the only federation that I have been able to do effectively is "to"
Wikidata not "from" it....
If anyone is interested, I have setup Wikidata as a TPF datasource here:
http://orbeon-bb.wmflabs.org/wdqs-sparql The API works by passing an
encoded subject, predicate or object parameter like:
http://orbeon-bb.wmflabs.org/wdqs-sparql?object=%22Berlin%22%40de
Christopher
Hi,
I am developing a scientific terms thesaurus and have discovered that
existing Wikipedia "page redirect titles" provide a useful way to resolve
an odd or archaic form to a "canonical" term label as it is represented by
the Wikipedia page title (aka Wikidata "sitelink"). For example,
https://en.wikipedia.org//w/api.php?action=query&format=xml&prop=redirects&…
In Wikidata, these "page redirect titles" are not represented in the data
model except very inconsistently and sparsely as skos:altLabel or
("alias"). My use case is that I would like to be able to query Wikidata
for these page redirect titles in order to resolve odd multi-linguistic
names to an single concept.
My question is that if I were to create a bot that imported all "page
redirect titles" for a given sitelink and created them with the
skos:altLabel property en masse, is this a valid semantic relationship?
Or, should it rather be represented as ?sitelink owl:sameAs <page redirect
URI>? Or both?
Furthermore,, in some cases (z.B. mis-spellings), skos:hiddenLabel may be
more appropriate, but this has no definition in the data model. There
potentially would be a lot of clutter in the UI without a hiddelLabel alias
property. Also, there are no types for page redirects in Wikipedia, afaik.
Additional value for the searching in the WIkidata UI could probably be
obtained from indexing these alternate page titles as well.
Regards,
Christopher Johnson
If the page redirect titles exist in Wikipedia, they are valid in Wikidata
as data, regardless of what they represent in *your view* of "quality". If
cleanup needs to be done, it should be done in the context of the source
first. Evaluating the value of a specific "alias" to a Wikidata item is a
judgment that should be based entirely on a *referenceable* data source.
Wikidata aliases (as well as descriptions and preferred labels) are
completely arbitrary and unreferenced, and in my judgment worthless,
without a primary source or clearly defined semantic relationship. The
judgmental curation of Wikidata is in fact, not that useful. Wikidata
should simply seek to represent data *as it exists* (errors or not) in the
primary source.
Furthermore, apparently you do not get why skos:hiddenLabel exists. Why
you feel that it is not worthwhile is not relevant to its primary function,
which is to facilitate searching. (see
https://www.w3.org/2012/09/odrl/semantic/draft/doco/skos_hiddenLabel.html)
And, it is not difficult to argue that the searching in Wikidata could use
improvement.
On 16 March 2016 at 13:00, <wikidata-tech-request(a)lists.wikimedia.org>
wrote:
> Send Wikidata-tech mailing list submissions to
> wikidata-tech(a)lists.wikimedia.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
> or, via email, send a message with subject or body 'help' to
> wikidata-tech-request(a)lists.wikimedia.org
>
> You can reach the person managing the list at
> wikidata-tech-owner(a)lists.wikimedia.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Wikidata-tech digest..."
>
>
> Today's Topics:
>
> 1. Re: Wikipedia Page Redirect Titles in Wikidata (Lydia Pintscher)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 15 Mar 2016 16:49:40 +0000
> From: Lydia Pintscher <Lydia.Pintscher(a)wikimedia.de>
> To: wikidata-tech(a)lists.wikimedia.org
> Subject: Re: [Wikidata-tech] Wikipedia Page Redirect Titles in
> Wikidata
> Message-ID:
> <
> CABfqUgJj3haDoaA+Oi6WkOT9Zr6HBcnq9w40ZtXtxz9-+vh1mQ(a)mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Sat, Mar 12, 2016 at 2:14 PM Christopher Johnson <
> christopher.johnson(a)wikimedia.de> wrote:
>
> > Hi,
> >
> > I am developing a scientific terms thesaurus and have discovered that
> > existing Wikipedia "page redirect titles" provide a useful way to resolve
> > an odd or archaic form to a "canonical" term label as it is represented
> by
> > the Wikipedia page title (aka Wikidata "sitelink"). For example,
> >
> >
> https://en.wikipedia.org//w/api.php?action=query&format=xml&prop=redirects&…
> >
> > In Wikidata, these "page redirect titles" are not represented in the data
> > model except very inconsistently and sparsely as skos:altLabel or
> > ("alias"). My use case is that I would like to be able to query Wikidata
> > for these page redirect titles in order to resolve odd multi-linguistic
> > names to an single concept.
> >
> > My question is that if I were to create a bot that imported all "page
> > redirect titles" for a given sitelink and created them with the
> > skos:altLabel property en masse, is this a valid semantic relationship?
> > Or, should it rather be represented as ?sitelink owl:sameAs <page
> redirect
> > URI>? Or both?
> >
> > Furthermore,, in some cases (z.B. mis-spellings), skos:hiddenLabel may be
> > more appropriate, but this has no definition in the data model. There
> > potentially would be a lot of clutter in the UI without a hiddelLabel
> alias
> > property. Also, there are no types for page redirects in Wikipedia,
> afaik.
> >
> > Additional value for the searching in the WIkidata UI could probably be
> > obtained from indexing these alternate page titles as well.
> >
>
> There are several points to address:
> 1) Should redirects from Wikipedia be imported as aliases on Wikidata? No.
> This has been done before and created a massive amount of cleanup work
> because the redirects contained a lot of not meaningful misspellings and
> more. Please do not import them to Wikidata without approval through the
> bot approval process and clear quality control.
> 2) Should we allow more fine-grained distinction between real aliases and
> misspellings in the UI and datamodel? No. I don't believe this is worth the
> complexity and resulting discussions/edit wars and more.
>
>
> Cheers
> Lydia
> --
> Lydia Pintscher - http://about.me/lydia.pintscher
> Product Manager for Wikidata
>
> Wikimedia Deutschland e.V.
> Tempelhofer Ufer 23-24
> 10963 Berlin
> www.wikimedia.de
>
> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V.
>
> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
> der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für
> Körperschaften I Berlin, Steuernummer 27/029/42207.
>
Dear Wikibase experts,
I have a (hopefully) trivial question to ask: is there any specific
configuration to set in order to make a wikibase extension accessible via
libraries such as Pywikibot?
Should I maybe activate the wikibase_client
<https://www.mediawiki.org/wiki/Extension:Wikibase_Client> ?
Here's the problem described from the Pywikibot point of view: when I try
to connect to my wikibase installation, I can access the mediawiki pages...
In [1]: import pywikibot
In [2]: site = pywikibot.Site()
In [9]: page = pywikibot.Page(site, u"Main_Page")
In [10]: page.get()
Out[10]: u"<strong>MediaWiki ha ........................
...but I cannot access the wikibase Items (created in the same mediawiki
installation), since the "site.data_repository()" method (of my local
Pywikibot site-object) returns an EMPTY value...
In [37]: site.data_repository()
In [38]: type(site.data_repository())
Out[38]: NoneType
While, if I access (the same way) the wikibase instance of
https://test.wikidata.org (of course) I get a correct response from the
"site.data_repository()" method (of my local Pywikibot site-object):
In [39]: wikidatatest_site = pywikibot.Site("test", "wikidata")
In [40]: wikidatatest_site.data_repository()
Out[40]: DataSite("test", "wikidata")
In [41]: type(wikidatatest_site.data_repository())
Out[41]: pywikibot.site.DataSite
Here are the main details of my mediawiki/wikibase installation:
- MediaWiki: 1.26.2
- PHP: 5.6.15 (apache2handler)
- MariaDB: 10.1.9-MariaDB
- ICU: 4.8.1.1
...with these extensions (with specific versions) for Wikibase:
- Purtle: 0.1-dev
- Wikibase DataModel: 4.4.0
- Wikibase DataModel JavaScript: 1.0.2-dev
- Wikibase DataModel Serialization: 2.1.0
- Wikibase Internal Serialization: 2.1.0
- Wikibase JavaScript API: 1.1.0
- Wikibase Repository: 0.5 alpha (32d7ef0) 00:03, 30 September 2015
- Wikibase Serialization JavaScript: 2.0.6
- Wikibase View: 0.1-dev
- WikibaseLib: 0.5 alpha (32d7ef0) 00:03, 30 September 2015
If this question is asked in the wrong context I'm really sorry (I searched
with no success the official pages of Wikibase and the archive of this
mailing-list, http://blog.gmane.org/gmane.org.wikimedia.wikidata.technical
):
please let me know whether I should better ask this question to the
Pywikibot library guys.
Thanks a lot for attention!
Hi,
I found that Special:EntityData returns outdated JSON data that is not
in agreement with the page. I have fetched the data using wget to ensure
that no browser cache is in the way. Concretely, I have been looking at
https://www.wikidata.org/wiki/Special:EntityData/Q17444909.json
where I recently changed the P279 value from Q217594 to Q16889133. Of
course, this might no longer be a valid example when you read this email
(in case the cache gets updated at some point).
Is this a bug in the configuration of the HTTP (or other) cache, or is
this the desired behaviour? When will the cache be cleared?
Thanks,
Markus