Hi Gerard,
Actually, I was not referring to Wikidata descriptions in this thread.
Those seem simple to fix once one finds a way to fix the
classification. The bigger problem here is that the data is inconsistent
(claiming that VW Polo is a page that is produced by Volkswagen).
Markus
On 20.08.2014 12:31, Gerard Meijssen wrote:
Hoi,
Markus, I am not surprised at all that such problems exist. The problem
is inherent in the descriptions. They were added and made sense at the
time. Now they no longer apply, because statements have since been made
that mean the item is no longer a disambiguation page. These descriptions
are not seen. You only see the descriptions in *YOUR* language.
The first obvious remedial task would be to remove all the texts in all
languages where the "instance of" is different from "Wikimedia
disambiguation page".
I am really happy that you notice the fragility of descriptions.
Automated descriptions do not suffer from this.
Related to what you have noticed are the list articles. There is a
policy where list articles are turned into the singular and then describe
whatever they were a list of. They are no longer list articles, and the
texts indicating that they are lists are wrong as well.
Thanks,
GerardM
On 20 August 2014 11:40, Markus Krötzsch <markus(a)semantic-mediawiki.org> wrote:
Hi all,
We have a lot of statements saying that something is an instance of
a "Wikipedia disambiguation page" (Q4167410). Unfortunately, this
kind of information says something about a particular Wikipedia
article in a particular language, and often is not true for other
languages. Moreover, even if there is a language where the corresponding
article is marked as a disambiguation page, it is still common that
the page gives a description of a real item.
Example (bad use of instance of:Wikipedia disambiguation page)
==============================================================
https://www.wikidata.org/wiki/Q247819 (VW Polo)
Enwiki (like many languages) has a normal article here that is not a
disambiguation page. It says "The Volkswagen Polo is a supermini car
produced by the German manufacturer Volkswagen". That's very
different from "The Volkswagen Polo is a disambiguation page."
Even Wikipedias where the VW-Polo article is marked as a
disambiguation page do not claim that the thing they are talking
about is the disambiguation page. For instance, frwiki has the
article in Catégorie:Homonymie, yet it says:
"Volkswagen Polo est une automobile, de la gamme des polyvalentes,
de la marque allemande Volkswagen"
(roughly: "Volkswagen Polo is a car, in the supermini range, from
the German brand Volkswagen")
Again, it is not said that VW Polo is a disambiguation page, even
though the page (not the car) is marked as one.
Proper use of instance of:Wikipedia disambiguation page
=======================================================
Now there are also many proper disambiguation pages. They do not
have a joint concept, other than the ambiguous title in a particular
language.
Examples:
https://en.wikipedia.org/wiki/Jaguar_(disambiguation)
and, entertainingly:
https://en.wikipedia.org/wiki/Disambiguation_(disambiguation)
An item that is "instance of:Wikipedia disambiguation page":
* should not have sitelinks to pages that are not disambiguation
pages (an item can either be about a Wikipedia page or about a car,
but these should be kept separate),
* should always use the exact page title as the label (because this
is the real label of the page; the page "Jaguar (disambiguation)" is
not called "Jaguar" by anybody),
* should hardly have any statements at all, since there is almost
nothing that you can truthfully say about a group of pages in many
different languages, and since we want to avoid project-specific
statements (that's one reason we have badges as part of site links).
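The three rules above could be checked mechanically. The following is only a minimal sketch in Python, not an existing tool: it assumes a simplified item shape (plain dicts, not the real Wikidata JSON), and `is_disambiguation_page` is a hypothetical callback that would have to ask the wiki in question whether a page is really a disambiguation page. P31 is Wikidata's "instance of" property.

```python
DISAMBIGUATION = "Q4167410"  # "Wikipedia disambiguation page", as cited in this thread
INSTANCE_OF = "P31"          # Wikidata's "instance of" property


def check_disambiguation_item(item, is_disambiguation_page):
    """Return a list of rule violations for one item.

    `item` uses a simplified, assumed shape (not the real Wikidata JSON):
        {"labels": {"en": "Jaguar (disambiguation)"},
         "sitelinks": {"enwiki": "Jaguar (disambiguation)"},
         "statements": {"P31": ["Q4167410"]}}
    `is_disambiguation_page(site, title)` is a hypothetical callback that
    reports whether the linked page really is a disambiguation page.
    """
    statements = item.get("statements", {})
    if DISAMBIGUATION not in statements.get(INSTANCE_OF, []):
        return []  # the rules only concern disambiguation items

    problems = []
    labels = item.get("labels", {})
    for site, title in sorted(item.get("sitelinks", {}).items()):
        # Rule 1: every sitelink must point to a disambiguation page.
        if not is_disambiguation_page(site, title):
            problems.append(f"{site}: '{title}' is not a disambiguation page")
        # Rule 2: the label should be the exact page title.
        # Assumption: the site id is the language code plus "wiki".
        lang = site[:-4] if site.endswith("wiki") else site
        if lang in labels and labels[lang] != title:
            problems.append(
                f"{lang}: label '{labels[lang]}' differs from title '{title}'"
            )

    # Rule 3: hardly any statements besides "instance of".
    extra = sorted(p for p in statements if p != INSTANCE_OF)
    if extra:
        problems.append("unexpected statements: " + ", ".join(extra))
    return problems
```

An empty result means the item passes all three checks. Anything else is a candidate for manual review rather than an automatic fix, since the right repair (splitting the item, dropping the claim) depends on the case.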
Whether disambiguation pages should have more than a single sitelink
at all is another question. In my view, if we are talking about a
"page", it is not the same page in French as it is in English (most
properties that pages could naturally have, such as authors,
language, creation date, etc. apply to a single page only). However,
I can see that it is practical to group such pages nonetheless.
Conclusion
==========
It would be nice if somebody could analyse this problem in more
detail (how many of our "disambiguation page" items have statements
that are obviously not about a page but about a car make, animal,
etc.). We might need some manual effort to clean this up (basically,
a kind of un-merging game).
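As a rough illustration of what such an analysis might look like, here is a small Python sketch that tallies, over some collection of items, how many claim to be disambiguation pages and which other properties they carry. The input shape (a mapping from item id to a dict of property id to values) is an assumption made for the example, and how to obtain that data from a dump is left open.

```python
from collections import Counter

DISAMBIGUATION = "Q4167410"  # "Wikipedia disambiguation page", as cited in this thread
INSTANCE_OF = "P31"          # Wikidata's "instance of" property


def summarize_disambiguation_items(items):
    """Summarize items that claim to be disambiguation pages.

    `items` maps an item id to {property id: [values]} (assumed shape).
    Returns (total, suspicious_ids, property_counts), where an item is
    "suspicious" if it has any statement other than "instance of".
    """
    total = 0
    suspicious = []
    property_counts = Counter()
    for item_id, statements in sorted(items.items()):
        if DISAMBIGUATION not in statements.get(INSTANCE_OF, []):
            continue  # not claimed to be a disambiguation page
        total += 1
        extra = [p for p in statements if p != INSTANCE_OF]
        if extra:
            suspicious.append(item_id)
            property_counts.update(extra)
    return total, suspicious, property_counts
```

The property counts give a first hint of where to look: a manufacturer or taxon statement on a "disambiguation page" almost certainly describes a car or an animal, not a page.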
The immediate conclusion is that we need to be much more careful
importing this type of information from one Wikipedia, since it is
(by its very nature) not project-independent and not universal
across languages.
Cheers,
Markus
_______________________________________________
Wikidata-l mailing list
Wikidata-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l