First, let me say that, despite all appearances, I'm not trying to
criticize Magnus (surely no one should be criticized for doing work for
Wikipedia ;-) ). Or Simon! I just think we need to think harder about
all of these issues.
Maybe Simon can answer the question Magnus couldn't (or at least not very
well, as far as I could tell): what is the *purpose* of this feature? What is it
supposed to accomplish?
On Sun, 6 Jan 2002, Simon Kissane wrote:
I like Magnus' idea a lot. Yes, we could just use
ordinary old links. But I like this better for a
couple of reasons:
1. Software can understand the {{CATEGORY ...}}
notation much better than it can understand plain old
links. What if I want to extract all the articles on
linguistics from Wikipedia? With categories like this,
I could have a script that could do it easily. But
with just plain old links, writing a script to do it
would be difficult.
OK, this is helpful: categories could be used to sort articles into broad
compilations (associated with academic fields like linguistics) that, for
whatever reason (and who can predict what reasons they would be?), people
might like to have.
I can see a use for that, but I think it needs to be clarified further: is
the claim that it would be useful to have articles sorted only at the
top-level categories (those on the HomePage, just for example) or at some
expanded level? If at some expanded level, depending on *how* expanded,
the purpose(s) of the feature might change considerably.
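
To make Simon's point 1 concrete, here is a minimal sketch of the kind of extraction script he seems to have in mind. It is only an illustration: the notation is given above as {{CATEGORY ...}} without saying what goes inside, so the `{{CATEGORY Name}}` form, the function names, and the sample articles are all assumptions, not Magnus' actual implementation.

```python
import re

# ASSUMPTION: a tag looks like {{CATEGORY Name}}. The notation is only
# described as "{{CATEGORY ...}}", so this pattern is hypothetical.
CATEGORY_TAG = re.compile(r"\{\{CATEGORY\s+([^}]+?)\s*\}\}")


def categories_of(text):
    """Return the set of (lower-cased) category names tagged in one article."""
    return {name.strip().lower() for name in CATEGORY_TAG.findall(text)}


def articles_in(category, articles):
    """Return the sorted titles of all articles tagged with `category`."""
    wanted = category.strip().lower()
    return sorted(
        title for title, text in articles.items()
        if wanted in categories_of(text)
    )


# Made-up sample data: title -> article text.
articles = {
    "Phoneme": "A phoneme is a unit of sound. {{CATEGORY Linguistics}}",
    "Syntax": "{{CATEGORY Linguistics}} {{CATEGORY Philosophy}} Syntax is ...",
    "Plato": "Plato was a philosopher. {{CATEGORY Philosophy}}",
    "Banana": "A banana is a fruit.",
}

print(articles_in("Linguistics", articles))
```

Note that nothing here limits what a user may type inside the tag, so the list of categories grows with whatever anyone writes.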
Moreover, it is *not* at all clear to me (particularly given, as Simon
says, we might want to support multiple category schemes--though this too
I might doubt, as I'll have to explain) that this particular
implementation is the best. What *would* be the best way to sort articles
into broad categories? Why not, for example, have special pages that
simply list article titles, so that, e.g., [[linguistics category]] would
contain nothing but links to articles that someone asserts belong to the
category of linguistics? We could just as easily autogenerate lists of
uncategorized pages, and for each article page we could have the script
look at the category: pages to see whether the article is in a category.
I'm not saying that this is a better idea; I'm just saying that there are
multiple ways of going about doing something, and it would be a mistake
not to consider them.
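
The alternative just described, category pages that contain nothing but links, can be sketched roughly as follows. This is a hedged illustration only: the " category" title suffix is taken from the [[linguistics category]] example, while the data layout and function names are assumptions of mine.

```python
import re

# Matches plain [[Title]] links; piped or otherwise decorated links are ignored.
LINK = re.compile(r"\[\[([^\]|]+)\]\]")


def links_on(page_text):
    """Return the set of titles linked from a page."""
    return {title.strip() for title in LINK.findall(page_text)}


def build_index(pages, suffix=" category"):
    """Map each category page (ASSUMED to end in `suffix`) to its members."""
    return {
        title: links_on(text)
        for title, text in pages.items()
        if title.endswith(suffix)
    }


def categories_for(article, index):
    """Look at the category pages to see which ones list `article`."""
    return sorted(cat for cat, members in index.items() if article in members)


def uncategorized(pages, index):
    """Autogenerate the list of articles that no category page links to."""
    listed = set().union(*index.values())
    return sorted(t for t in pages if t not in index and t not in listed)


# Made-up sample wiki: title -> page text.
pages = {
    "linguistics category": "[[Phoneme]] [[Syntax]]",
    "philosophy category": "[[Syntax]] [[Plato]]",
    "Phoneme": "...",
    "Syntax": "...",
    "Plato": "...",
    "Banana": "...",
}
index = build_index(pages)
print(categories_for("Syntax", index))
print(uncategorized(pages, index))
```

Here the category information lives on the category pages, and the article pages themselves carry no tags at all.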
2. We can easily find "loose" pages that don't belong
to a category. (On the other hand, we could mark the
category pages as category pages, and then search for
pages that aren't linked from a page marked as a category
page.)
Larry suggested having a drop-down box, to limit the
categories the users could choose from. I think
Wikipedia should support multiple category schemes,
and should allow anyone to add their own categories.
Well, we could have multiple drop-down boxes, eh? How else would we
individuate our multiple category schemes? The whole point of a drop-down
box is to keep a category scheme from metastasizing for frivolous
reasons (such as that somebody didn't know that an article really belongs
under the existing category XYZ, and so creates a new category for it--just
an example).
That way we can experiment, and see what works best.
How exactly would we experiment? What would we be seeing "works best"?
I think at this point we need to be very clear on what we mean by
"category scheme." On the one hand, there are schemes of the sort we have
on the Home Page, or the Library of Congress catalog scheme. Those
schemes (1) list subjects, (2) arrange those subjects under larger
headings ("supercategories"), and even (3) provide an ordering of some
sort within the "supercategories." So when you say we can experiment and
see what works best, which of these (or what combination) do you mean?
We already do experiment with multiple category schemes in the sense of a
combination of (1) and (2). But this doesn't list all articles, of
course.
But when Magnus proposes to allow us to list the category, or categories,
of a particular article on an article's page itself (even within the body
of the article itself), he provides us no particular category scheme in
*any* of the senses in (1), (2), or (3) (which is fine). What Simon
asserts now is that Magnus' feature allows "multiple category schemes."
I don't see this, though. What it allows us, rather, is multiple
*categories*, at the whim of the user: so we could expand the list
mentioned in (1) at will. That's the only sense in which I can see how
Magnus' feature allows us "multiple category schemes."
By contrast, there's nothing about Magnus' feature that allows us to
produce multiple category schemes in the sense in which we've *usually*
talked about them on Wikipedia (and Nupedia), viz., the *combination* of
(1) (a specific, delimited set of subject categories) and (2) (an
arrangement of that delimited set into supercategories). As far as I can
tell, Magnus' feature at present would give us *a single* infinitely
expandable set of categories.
People with alternative views of how to categorise
things can create their own category schemes (and
categorising things is one area where there are often
as many views as there are people, probably because
there is no one right answer.)
Wouldn't that make categorization particularly pointless? No one person
is going to categorize all our articles, I imagine--no one person is
competent to do so, probably. That means we have to work together on
this. Now, I can see multiple competing category schemes (maybe--but I'd
like to know what the purpose of *that* would be). Say, two or three.
More than that, and, again, we've got a veritable babel; in that case, I
doubt any one scheme would succeed in categorizing all the articles. Even
two or three is a little confusing: won't "philosophy" be a category in
any plausible scheme? Similarly with other traditional subjects. So how
will the competing category lists (not schemes, really) be distinguished?
On the other hand, if we can agree in advance on one set of categories,
then, *for the purpose of sorting articles into broad academic fields*
(which, as I said, seems like a clear, reasonably useful purpose), we can
*work together* on sorting all the articles. That would be a good thing:
it could be a useful, accurate piece of metadata.
I think it would be nice if we could have different
"category namespaces", to support multiple category
schemes. There should also be a way to lock category
namespaces: so I can have my own category namespace,
and only I am allowed to assign pages to categories
within it; or so (like Larry seems to be suggesting)
people can't create their own categories, but they can
assign pages to pre-existing ones.
I'm not sure I understand, exactly, but is the idea here somewhat like the
one I suggested above? Viz., we put the metainformation about categories
not on article pages but on special categorization pages?
I also think that information on "what pages this
category belongs to" belongs in the category, not in
the page (although I don't know how you were actually
installing this.)
Aha, yes. Might be better, yes.
Finally, even if we don't want this sort of feature
for Wikipedia, why not keep the code, but make it an
administrator configurable option? So if Larry & Jimbo
don't want to use it on the Wikipedia.com server, they
can switch it off, but if other people want to take
PHP Wikipedia's code to use for some other purpose
they can turn it on if they want to?
I'd be 100% in favor of something like that.
I really don't like to sound contrary (really, I don't!), but I think that
whenever we propose new features that could potentially complicate the
process of building Wikipedia, and that could be abused or misused
(resulting in confusion if nothing else), we should think more carefully
about what we are doing and why we're doing it, exactly.
So far, along these lines, there are *two* actual reasons that I have
spotted for why we might want to do *one specific* thing, viz., list
(somewhere) the metadata about what *general categories* articles are
classed under. To wit:
(1) It would help sort the Recent Changes page nicely, so that specialists
can, if they want, focus just on articles in their areas. (Others could
view all categories at once.) If this were all we needed to accomplish,
then we might as well sort *edits* into categories, not *articles*.
(2) It would allow us to produce a list of all the articles in one broad
area of study, which would no doubt be useful for a variety of purposes.
For this, of course, we need to sort articles, not edits.
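
Purpose (1) might look something like the following sketch, in which a specialist views only the recent changes that touch articles in chosen categories. The `Edit` record, its fields, and the sample data are hypothetical; the sketch assumes the per-article category metadata already exists somewhere.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    """One entry on a Recent Changes list (fields are hypothetical)."""
    title: str
    editor: str
    summary: str


def filter_changes(edits, article_categories, wanted=None):
    """Keep edits to articles in any `wanted` category; None means show all."""
    if wanted is None:
        return list(edits)
    wanted = set(wanted)
    return [
        edit for edit in edits
        if article_categories.get(edit.title, set()) & wanted
    ]


# Made-up metadata: article title -> set of general categories.
article_categories = {
    "Plato": {"Philosophy"},
    "Syntax": {"Linguistics", "Philosophy"},
    "Phoneme": {"Linguistics"},
}
recent = [
    Edit("Plato", "A", "fixed dates"),
    Edit("Banana", "B", "new article"),
    Edit("Syntax", "C", "added example"),
]

for edit in filter_changes(recent, article_categories, {"Philosophy"}):
    print(edit.title, "-", edit.summary)
```

An edit to an article with no category ("Banana" here) shows up only in the unfiltered view.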
Now, there are a variety of ways we could accomplish both of these
purposes. The best I think we've heard so far would work like this:
On each article page, there is a multiple-selection box that allows us to
place articles into one or more categories from among a set of categories
that is previously decided upon by Wikipedia members and probably Nupedia
as well (it would be nice if the categories corresponded to Nupedia review
groups). This particular datum is editable like anything else in the
article. There is *also* a set of pages sorting existing articles into
these categories based on the metadata found on the articles; these might
or might not be editable. There is, as well, a page or several of
unsorted articles; from that page one could visit different pages and sort
them quickly.
This proposal would accomplish purposes (1) and (2) as follows: the
metadata would allow us to sort the Recent Changes page so we can view
only those categories of articles we're interested in; it would also allow
us to generate (and further organize, perhaps) broad categories of
articles. One can see an autogenerated Wikipedia Encyclopedia of
Mathematics, for example!
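
As a rough sketch of how the pieces of this proposal fit together (a fixed, agreed list of categories, a multiple selection per article, autogenerated category pages, and a page of unsorted articles), assuming a purely illustrative category list and function names of my own invention:

```python
# ASSUMPTION: a stand-in for whatever list would actually be agreed upon.
ALLOWED = ("Linguistics", "Mathematics", "Philosophy")


def set_categories(metadata, title, chosen, allowed=ALLOWED):
    """Record a multiple selection, rejecting categories not on the list."""
    unknown = set(chosen) - set(allowed)
    if unknown:
        raise ValueError(f"not in the agreed list: {sorted(unknown)}")
    metadata[title] = set(chosen)


def category_pages(metadata, allowed=ALLOWED):
    """Autogenerate one sorted list of article titles per category."""
    return {
        cat: sorted(t for t, cats in metadata.items() if cat in cats)
        for cat in allowed
    }


def unsorted_articles(all_titles, metadata):
    """List the articles that nobody has placed in any category yet."""
    return sorted(t for t in all_titles if not metadata.get(t))


metadata = {}
set_categories(metadata, "Group theory", ["Mathematics"])
set_categories(metadata, "Logic", ["Mathematics", "Philosophy"])

titles = ["Group theory", "Logic", "Banana"]
print(category_pages(metadata)["Mathematics"])
print(unsorted_articles(titles, metadata))
```

The check in `set_categories` plays the role of the selection box: a user can pick several categories, but cannot invent a new one by typing it.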
Larry