[WikiEN-l] Proposal: WikiData
Philip Sandifer
snowspinner at gmail.com
Wed Feb 13 21:34:09 UTC 2008
One of the frequent inclusion/deletion arguments has been over "cruft"
of various sorts - plot summaries, "in popular culture" sections,
strange but interesting lists ("List of songs that mention the title
over n times" where n was something weird and large was an old
favorite), etc. The basic problem in these cases is that while the
information is often verifiable, it seems somewhat tangential to a
reasoned and well-organized presentation of major facts on a subject.
On the other hand, it is not irrelevant as such either - quite the
contrary, the information is often valuable, if not in a strictly
encyclopedic sense. Certainly the drive to eliminate articles that are
just plot summaries, while well-intentioned, would serve to destroy a
useful resource that is not duplicated by other free content projects
at present.
In print publications such interesting tangents exist in the form of
sidebars. Open Time or Newsweek and you'll see them all over the place
- articles will hum merrily along, and off on the side will be small
explanations, graphs, and other tangential pieces of information. But
for whatever reason, in the online structure, we've largely declined
to take advantage of that. As a result, we have a messy structure of
sub-articles and chunks of what are essentially sidebar data dropped
into an article. And this affects a wide range of articles. On the one
hand you have something like [[School Hard]] - an article on an
episode of Buffy - that is interrupted by a credits section, numerous
lists, and a huge table of in-universe chronology. On the other you
have something like [[Democratic Party (United States) presidential
primaries, 2008]] where a narrative of what happened is abandoned in
favor of graphs, charts, etc.
In both cases the problem is the same - relevant chunks of data are
choking out the article. And in the latter case, a fair amount has
already been done about this - there are 8 sub-articles
breaking out lengthy chunks of data from this article.
A quick tour of a number of major topics shows the same result with
sub-articles. [[Hillary Rodham Clinton presidential campaign, 2008]]
has [[Political positions of Hillary Rodham Clinton]] - a vital chunk
of information that still amounts to a list of positions, and is not
meaningfully an article.
I propose that we need to dramatically rethink how we treat chunks of
data on Wikipedia. In many cases - from fictional topics to real-world
ones - there is a large chunk of information that is worth
presenting, but that does not present well in article form. Our
current method of spin-off and sub-articles leaves us with a mass of
pages that often make poor articles even as they contain valuable
information. (And I would say that [[School Hard]] and [[Political
positions of Hillary Rodham Clinton]] are articles of more or less
exactly equal quality.)
I propose that we start an active repository for "sidebar content" -
large chunks of data, lists, tables, summaries, etc. This could be
done as a namespace - the Sidebar or Data namespace - or as a separate
project - WikiData. But in either case, the goal would be the same -
verifiable information that is useful in researching and learning
about a topic, but that does not present well in the format of an
encyclopedic overview of the topic. We'd need to come up with a good
navigation engine - something, in other words, that avoids the litany
of mistakes in the category system. But I think that this would let us
dramatically reconceptualize how we cover a number of topics in a way
that allows both the depth of (at times idiosyncratic) information
that is widely recognized as one of our great strengths and the clear,
well-organized prose that we strive for in encyclopedia articles.
In more practical terms, what I'm imagining would be an article on,
say, Buffy the Vampire Slayer that had some clear link to data and
sidebars. Click on it, and a navigational engine comes up that guides
you through the sidebar content - a list of episodes that one could
delve into and, from there, get plot summaries, credits, overviews of
reviews, etc. A list of characters, an overview of critical
commentaries, heck, a huge link collection of reviews of the series or
of episodes. In other words, a way of having our article - structured
with a clear lead section, and specific, well-sourced sections - be
the top layer of a mass of well-organized content. Something that
gives us an option for a topic beyond "have an article on it," "don't
have an article on it," or "throw it into a messy list that doesn't
quite function as an article."
Thoughts?
-Phil