Hi,
I am developing a scientific terms thesaurus and have found that existing Wikipedia "page redirect titles" provide a useful way to resolve an odd or archaic form to a "canonical" term label, as represented by the Wikipedia page title (aka Wikidata "sitelink"). For example, https://en.wikipedia.org//w/api.php?action=query&format=xml&prop=red...
In Wikidata, these "page redirect titles" are not represented in the data model, except very inconsistently and sparsely as skos:altLabel ("alias"). My use case is that I would like to query Wikidata for these page redirect titles in order to resolve odd multilingual names to a single concept.
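To make the lookup concrete, here is a minimal sketch of how the redirect titles for a page could be requested and read from the MediaWiki API. It assumes the `prop=redirects` query module with JSON output (the link above uses `format=xml` and is truncated, so this may not match it exactly). The helper names and the sample response are hypothetical, written by hand in the shape the API returns, not real query output.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_redirects_url(title, limit=50):
    """Build a query URL that asks for the redirects pointing at `title`."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "redirects",  # assumption: the redirects query module
        "titles": title,
        "rdlimit": limit,
    }
    return API_ENDPOINT + "?" + urlencode(params)


def redirect_titles(response):
    """Collect redirect titles from an already decoded JSON response."""
    titles = []
    for page in response.get("query", {}).get("pages", {}).values():
        for redirect in page.get("redirects", []):
            titles.append(redirect["title"])
    return titles


# Hypothetical response (made-up page ids and titles), for illustration only.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "1": {
                "pageid": 1,
                "ns": 0,
                "title": "Aspirin",
                "redirects": [
                    {"pageid": 2, "ns": 0, "title": "Acetylsalicylic acid"},
                    {"pageid": 3, "ns": 0, "title": "Asprin"},
                ],
            }
        }
    }
}

print(build_redirects_url("Aspirin"))
print(redirect_titles(SAMPLE_RESPONSE))
```

The sketch only builds the URL and parses a decoded response; fetching and paging through long redirect lists are left out.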
My question: if I were to create a bot that imported all "page redirect titles" for a given sitelink and added them en masse with the skos:altLabel property, would this be a valid semantic relationship? Or should it rather be represented as ?sitelink owl:sameAs <page redirect URI>? Or both?
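For illustration, the two candidate representations could be written out as plain triples roughly like this. This is only a sketch: the function names are hypothetical, the example.org URIs are placeholders rather than real Wikidata or Wikipedia URIs, and the literal escaping covers only backslashes and double quotes.

```python
SKOS_ALT_LABEL = "http://www.w3.org/2004/02/skos/core#altLabel"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def escape_literal(text):
    """Escape backslashes and double quotes for a quoted RDF literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def alt_label_triple(item_uri, redirect_title, lang):
    """Option 1: the redirect title as a language-tagged skos:altLabel."""
    return '<{}> <{}> "{}"@{} .'.format(
        item_uri, SKOS_ALT_LABEL, escape_literal(redirect_title), lang
    )


def same_as_triple(sitelink_uri, redirect_uri):
    """Option 2: the redirect page URI linked to the sitelink via owl:sameAs."""
    return "<{}> <{}> <{}> .".format(sitelink_uri, OWL_SAME_AS, redirect_uri)


# Placeholder URIs, not real identifiers.
print(alt_label_triple("http://example.org/item/1", "Acetylsalicylic acid", "en"))
print(same_as_triple("http://example.org/wiki/Aspirin",
                     "http://example.org/wiki/Acetylsalicylic_acid"))
```

The first form attaches a literal to the concept, while the second relates two page URIs, which is the difference the question turns on.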
Furthermore, in some cases (e.g. misspellings), skos:hiddenLabel may be more appropriate, but it has no definition in the data model. Without a hiddenLabel alias property, there would potentially be a lot of clutter in the UI. Also, there are no types for page redirects in Wikipedia, afaik.
Indexing these alternate page titles could probably also add value to searching in the Wikidata UI.
Regards, Christopher Johnson
On Sat, Mar 12, 2016 at 2:14 PM Christopher Johnson <christopher.johnson@wikimedia.de> wrote:
> [...]
There are several points to address:
1) Should redirects from Wikipedia be imported as aliases on Wikidata? No. This has been done before and created a massive amount of cleanup work, because the redirects contained many meaningless misspellings and more. Please do not import them to Wikidata without approval through the bot approval process and clear quality control.
2) Should we allow a more fine-grained distinction between real aliases and misspellings in the UI and data model? No. I don't believe this is worth the complexity and the resulting discussions, edit wars and more.
Cheers Lydia
Hi,
I believe one can (and possibly should) use these "page redirect titles" as an additional (but low-ranked) source for our growing search algorithm. Think of a user typing a page title that exists as a redirect on the English Wikipedia. It would be pretty cool if the related Wikidata item could be found when using the same search term on Wikidata.
But as Lydia already explained, the semantic relationship is weak, especially on the English Wikipedia, which usually does not delete redirects, no matter how "wrong" they are. This is different in other wikis. The German Wikipedia, for example, usually deletes redirects that do not have a strong semantic relationship to the target article. The probability that these German redirects can become aliases on Wikidata is much higher, but still not high enough for a bot import.
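The idea of treating redirect titles as a low-ranked search source could be sketched roughly as follows. This is not how the Wikidata search works; the source ranking, the function names and the item id "Q_A" are assumptions made up for the example.

```python
# Lower number means a stronger source; redirects come last.
SOURCE_RANK = {"label": 0, "alias": 1, "redirect": 2}


def build_index(entries):
    """Index (term, item_id, source) entries by case-folded term."""
    index = {}
    for term, item_id, source in entries:
        key = term.casefold()
        index.setdefault(key, []).append((SOURCE_RANK[source], item_id, source))
    return index


def search(index, query):
    """Return (item_id, source) hits, strongest source first, one per item."""
    hits = sorted(index.get(query.casefold(), []))
    seen = set()
    results = []
    for _rank, item_id, source in hits:
        if item_id not in seen:
            seen.add(item_id)
            results.append((item_id, source))
    return results


# Hypothetical entries; "Q_A" is a placeholder, not a real item id.
ENTRIES = [
    ("Aspirin", "Q_A", "label"),
    ("Acetylsalicylic acid", "Q_A", "alias"),
    ("Asprin", "Q_A", "redirect"),
    ("aspirin", "Q_A", "redirect"),
]

index = build_index(ENTRIES)
print(search(index, "asprin"))
print(search(index, "Aspirin"))
```

Because the redirect titles live only in the search index, a misspelled redirect can still lead a user to the item without being stored as an alias on it.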
Best Thiemo
wikidata-tech@lists.wikimedia.org