In preparation for next week's quarterly planning, I'd like to restate some
of my concerns regarding Wikidata descriptions and flesh them out more
comprehensively, since we're featuring them more prominently in the
upcoming quarter.
(n.b. These are more like "devil's advocate" thoughts, lest I make it sound
like the Apps team isn't unified in its vision, which it certainly is.)
My reservations fall into two categories:
== Philosophical ==
Wikidata is a superbly valuable repository of *data* -- data that a machine
can use to generate all kinds of results that we humans can consume. The
"description" field, on the other hand, is the only thing that is *not*
data, and is not usable by a machine in any way.
To allow users to manually fill in the Wikidata description (i.e. to
manually duplicate the contents of Wikipedia) is to miss the point of the
true potential of Wikidata, which is to be able to *use* the data to
generate the description automatically!
Of course, the counterargument to this is that the current state of
auto-generated descriptions is not very good (they often sound strange or
nonsensical), but that's only because the tools we have at our disposal for
generating descriptions are still in their infancy. I don't deny that this
will be a hard problem to solve, but in my view, this is ultimately the
*correct* problem to solve.
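
To make the idea concrete, here is a rough sketch of what "use the data to
generate the description" could look like. This is not real Wikidata API
code: the flat dictionary and its keys (occupation, nationality, birth_year,
death_year) are made up for illustration, and the example entry is fictional.

```python
def describe(claims):
    """Build a short description from a simplified, hypothetical claims dict.

    The keys used here are illustrative assumptions, not Wikidata's schema.
    """
    occupations = claims.get("occupation", [])
    nationality = claims.get("nationality")
    born = claims.get("birth_year")
    died = claims.get("death_year")

    if not occupations:
        # Not enough data: the fix is to improve the data, not hand-write text.
        return None

    if len(occupations) == 1:
        role = occupations[0]
    else:
        role = ", ".join(occupations[:-1]) + " and " + occupations[-1]

    text = f"{nationality} {role}" if nationality else role
    if born and died:
        text += f" ({born}-{died})"
    elif born:
        text += f" (born {born})"
    return text


# Fictional entry, for illustration only.
example = {
    "occupation": ["painter", "sculptor"],
    "nationality": "French",
    "birth_year": 1850,
    "death_year": 1920,
}
print(describe(example))
```

When the generated text comes out wrong or empty, the remedy in this model is
to correct the underlying data, which benefits every other consumer of that
data as well.
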
The other thing (a more obvious one) that makes Wikidata descriptions
redundant is the first sentence of every Wikipedia article which, on its
own, is intended to provide a concise description of the article (and many
articles already do this with rather good consistency). In fact, as we
speak, we're working on programmatically "cleaning up" the first sentence
to make it even more concise. Why not simply use this as the description?
Is the first sentence sometimes too long to be a good description? No
problem: create a markup annotation that will denote the *portion* of the
first sentence that will serve as the description. In any case, making
users manually copy content from the first sentence (which is where most
of the current Wikidata descriptions appear to come from) seems entirely
unnecessary. On top of that, it creates a synchronization cost between the
two sources of data, one that only a human contributor can pay.
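
As a sketch of the annotation idea, suppose the first sentence could carry a
marker around the part that should serve as the description. The <desc> tag
below is purely hypothetical (no such markup exists today, as far as I know),
and the function names are mine:

```python
import re

# Hypothetical annotation: <desc>...</desc> wraps the descriptive portion.
DESC_TAG = re.compile(r"<desc>(.*?)</desc>", re.DOTALL)


def description_from_lead(first_sentence):
    """Return the annotated portion, or the whole sentence if none is marked."""
    match = DESC_TAG.search(first_sentence)
    if match:
        return match.group(1).strip()
    return first_sentence.strip()


def strip_annotation(first_sentence):
    """Return the sentence as readers would see it, with the markers removed."""
    return DESC_TAG.sub(r"\1", first_sentence)


lead = "A widget is <desc>a small mechanical device</desc> used in examples and tutorials."
print(description_from_lead(lead))
print(strip_annotation(lead))
```

With something like this, the description and the article text live in one
place, so there is nothing to keep in sync.
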
So, what I mean to say is: every edit to the Wikidata description is a
missed opportunity to edit the Wikipedia article in such a way that the
description could be auto-generated correctly (or, similarly, a missed
opportunity to edit the *data* of the Wikidata entry in such a way that the
description could be auto-generated correctly).
== Practical ==
If we open the floodgates to editing the Wikidata description (i.e. if we
make it too easy to edit the description), I predict that we'll be very
disappointed by the quality of the contributions we'll get. I can see it
quickly devolving into a whole lot of noise, spam, and vandalism.
This means that we would need to implement the same kind of
moderation/administration schemes that currently exist on Wikipedia
itself. I'm by no means qualified to speak for the Community, but I doubt
that many Wikipedians will want to double their workload by having to
"watch" the Wikidata description of their favorite articles, in addition to
the articles themselves.
I'll also point out that we do not yet expose any administrative mechanisms
in the mobile apps. This means that users will routinely see their edits
disappear or be reverted without any notification or explanation. This is
already the case for the general editing of article content in the apps,
but since the description is featured much more prominently, any edits (or
reverts) to it will be much more noticeable, and will surely add to the
confusion and frustration.
If we really want to get it right, we have to figure this out before
proceeding.
-Dmitry