[WikiEN-l] Category destruction

Andrew Gray shimgray at gmail.com
Wed May 2 17:54:31 UTC 2007


On 02/05/07, Marc Riddell <michaeldavid86 at comcast.net> wrote:
> Perhaps this is why we are having a problem finding common ground in this
> discussion. I am looking at WP Categorization strictly from the POV of a
> reader.

...from the point of view of a high-level specialised reader who wants
to do somewhat complex searches on datasets, though, not from the
point of view of the average user. Very different beasts.

> I came to your library, went to your catalogue system, and found it
> did not meet my needs as a researcher. You, as the librarian, asked me what
> my needs were, and I told you. Your answer seems to be that we cannot meet
> your needs without revamping the entire system. I say: OK, I'll wait :-) -
> as long as you are serious about doing so.

My problem is this: we cannot revamp the entire system to meet your
needs, in the way you suggest, *without breaking it for everyone
else*; we cannot start increasing the number and size of categories
without making them much harder to use as a navigational tool.

Basically, there's a division here between metadata and search.
Categories are metadata, applied to the individual articles. Opening a
category and looking at it is a very crude form of search, surprising
though it may seem at first glance; it tells you all the articles
listed under that particular heading.

You want to do a more advanced form of search, on a larger scale than
simple navigation, producing results like "all rivers in Eurasia" or
"all people who were born in the 20th century". There are two ways to
do this:

a) Add more metadata, but keep the existing 'search'. Start slapping
more categories onto articles, so that - to pick an example - "1987
births" is joined by "1980s births" and "20th century births".

b) Get a better search, but keep the existing metadata. Develop a
search system that can parse the existing categories, combine them in
various ways, and spit out a list for the researcher.

(a) is quick and dirty, but has problems; it makes it harder to
navigate the existing category system from the point of view of the
normal reader. It also rapidly becomes unworkable if we start thinking
about more complex searches than just "the members of all daughter
categories" - imagine if we had "1980s births of folk singers" and
"20th century births of folk singers" and all the other possible
intersections of the already-existing categories being added to pages!
There are just so many possible category intersections for any given
page... this really won't scale.
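
To put a rough number on that, here is a minimal Python sketch that
lists every possible intersection of two or more categories for a
single page. The page and two of its categories ("Living people",
"English musicians") are made up for illustration, and the function
name is hypothetical; only the counting is the point.

```python
from itertools import combinations


def intersection_categories(categories):
    """Return every combination of two or more of the given categories.

    Each combination stands for one possible "intersection category"
    that option (a) might end up adding to the page.
    """
    cats = sorted(categories)
    return [
        combo
        for size in range(2, len(cats) + 1)
        for combo in combinations(cats, size)
    ]


# Hypothetical page with only four categories on it.
page = ["1987 births", "Folk singers", "Living people", "English musicians"]

combos = intersection_categories(page)
print(len(combos))  # 11

# In general a page with n categories has 2**n - n - 1 such intersections.
n = len(page)
assert len(combos) == 2**n - n - 1
```

Four categories already give 11 possible intersections; by the same
formula, ten categories give 1013.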

(b) is elegant, and scales well. It has the major advantage of being
completely disassociated from the metadata, unlike our existing
'search' method, as it's overlaid on top; it doesn't impact in any way
the existing system beyond perhaps making us streamline and
rationalise it a little.
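
As a rough sketch of what (b) might look like (not an existing tool,
and not how MediaWiki stores categories), assume we have two lookup
tables: one from a category to its daughter categories, and one from a
category to the articles tagged with it directly. All the data and
names below (SUBCATEGORIES, ARTICLES, members, in_all, the article
titles) are hypothetical.

```python
# Hypothetical toy data; real names and memberships would come from the wiki.
SUBCATEGORIES = {
    "20th century births": ["1980s births"],
    "1980s births": ["1987 births"],
}
ARTICLES = {
    "1987 births": {"Alice Example", "Bob Example"},
    "Folk singers": {"Alice Example", "Carol Example"},
}


def members(category, seen=None):
    """Collect articles in a category and in all of its daughter categories."""
    seen = set() if seen is None else seen
    if category in seen:  # guard against loops in the category graph
        return set()
    seen.add(category)
    found = set(ARTICLES.get(category, ()))
    for child in SUBCATEGORIES.get(category, ()):
        found |= members(child, seen)
    return found


def in_all(*categories):
    """Return the articles that fall under every one of the given categories."""
    sets = [members(c) for c in categories]
    return set.intersection(*sets) if sets else set()


# "Folk singers born in the 20th century", using only the existing tags.
print(sorted(in_all("20th century births", "Folk singers")))  # ['Alice Example']
```

Nothing is added to the articles themselves here: the page still only
carries "1987 births" and "Folk singers", and the broader query is
answered by walking and combining what is already there.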

Unfortunately, it needs someone to go write it. This is what's holding
things up.

> All I am asking, in this age of creative technology, is that someone step up
> and create a system that can meet the needs of as many of our readers as
> possible.

And all I'm asking is that if we're wanting to create a system, we
create a system; we don't try to make the existing one sort-of-useful
for everyone at the price of making it not-really-useful to anyone :-)

-- 
- Andrew Gray
  andrew.gray at dunelm.org.uk


