Indexing - Wikitech-l

28 Feb 2008


With the advent of __HIDDENCAT__, I've been wondering about using 
hidden categories to create indexes.  My initial hope with Wikipedia was 
that we could reorganize categories so that categories could function as 
broad indexes of single attributes such as "People", "Films", "Bridges", 
etc., and hide all the categories that are intersections of those 
parents.  Later, if and when category intersection was implemented, the 
hidden categories would no longer be needed.  However, implementing 
major changes seems nearly impossible in a project as large and set in its 
ways as Wikipedia.  There is just too much resistance to change.  If 
category intersection were implemented, there would be a compelling 
technical reason to make the change, but short of that upgrade, it 
seems like a very difficult -- if not impossible -- sell.

It really bothers me (and others, especially librarians) that Wikipedia 
is not indexed.  You cannot find a master index of people, places, 
books, films, etc.  To find anything, you have to know in advance 
where it is subcategorized.  This only works if you know where to 
browse, and you only want to browse in a small, well-defined 
place.  One of the big joys of libraries is being able to find 
things you didn't know about in broad swaths of knowledge.  This ability 
is often lacking in Wikipedia because categories are constantly 
broken into smaller pieces.  For example, if I want to browse through 
the bridges in Europe, I have to look at a category for each country 
separately, and in some countries (like the UK) I have to look at one for 
each county.  It is just too difficult and time-consuming a task to be a 
pleasurable, leisurely browse.

So I've been thinking of alternative approaches.  One possibility is to 
use hidden categories to create index categories.  For instance, 
[[Category:Index-Films]] could contain all films, 
[[Category:Index-People]] could contain all people, etc.  However, 
this would be difficult to maintain because the categories would be 
hidden, and it would take a tremendous amount of work to populate these 
categories.  It seems crazy to have people doing all the mindless 
busywork necessary to create categories like these.  That is why we have 
computers.

This is where developers come in...

I'm wondering about creating a new namespace, called (you guessed it) 
INDEX.  Any category of people could be put in an index by adding 
[[Index:People]] on the category page.  The "People" INDEX page, into 
which the category gets put, would have links to all the articles and 
subcategories from the categories in the INDEX.  The contents of the 
subcategories of those categories would NOT be added automatically.  
Each subcategory would have to be manually added to the index if 
appropriate.  Just like a category, each INDEX page would have text 
that could be edited.  So in essence, an INDEX is a way to do category 
unions.  This would be much, much easier than trying to create and 
maintain these indexes manually using categories.
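To make the union idea concrete, here is a minimal Python sketch.  It is 
only an illustration of the behaviour described above, not MediaWiki 
code: the data layout, the function name build_index, and the category 
names are all assumptions made up for the example.

```python
# Hypothetical data: each category lists only its DIRECT articles and
# subcategories. Category names are illustrative, not real Wikipedia ones.
categories = {
    "Bridges in France": {
        "articles": {"Millau Viaduct", "Pont Neuf"},
        "subcategories": {"Bridges in Paris"},
    },
    "Bridges in Paris": {
        "articles": {"Pont Alexandre III", "Pont Neuf"},
        "subcategories": set(),
    },
    "Bridges in Italy": {
        "articles": {"Rialto Bridge"},
        "subcategories": set(),
    },
}


def build_index(tagged, categories):
    """Union the direct members of every category tagged with the INDEX.

    Subcategories show up as links, but their contents are NOT pulled in
    unless the subcategory itself has been tagged.
    """
    articles, subcats = set(), set()
    for name in tagged:
        articles |= categories[name]["articles"]
        subcats |= categories[name]["subcategories"]
    return sorted(subcats), sorted(articles)


# Categories carrying a (hypothetical) [[Index:Bridges]] tag.
subcats, articles = build_index(
    ["Bridges in France", "Bridges in Italy"], categories
)
print(subcats)
print(articles)
```

In this sketch "Pont Alexandre III" stays out of the index until 
"Bridges in Paris" is tagged as well, which matches the rule that 
subcategory contents are never added automatically.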
It would be great if an INDEX page could be viewed two different ways 
(and easily switched).  The first way would look similar to current 
categories, showing a category tree at the top, and all the articles 
below arranged alphabetically.  The second way would show the categories 
hierarchically, like an index in a book.  The categories would be 
listed alphabetically, and then all the subcategories and articles in 
each category would be listed together alphabetically and indented.  
The categories could be differentiated by making them bold or italic, 
or by labeling them as categories.  If the subcategories have also 
been included in the index, their contents would appear indented one 
more level (this could be closed at first and opened using a "+", the 
same way category trees look).  Users might also be able to set the 
default number of levels that appear -- perhaps two?
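The book-style view could be sketched roughly as follows, again in 
Python and again purely as an illustration.  The function render_index, 
the max_levels parameter, the square brackets used to label categories, 
and the sample data are all assumptions; the sketch also assumes that 
every tagged category is listed at the top level, even if it also 
appears nested under a parent.

```python
# Hypothetical sample data; category names are illustrative only.
categories = {
    "Bridges in France": {
        "articles": {"Millau Viaduct", "Pont Neuf"},
        "subcategories": {"Bridges in Paris"},
    },
    "Bridges in Paris": {
        "articles": {"Pont Alexandre III", "Pont Neuf"},
        "subcategories": set(),
    },
}


def render_index(tagged, categories, max_levels=2):
    """Return the lines of a book-style index.

    Categories are labeled with [brackets]. Under each category, its
    subcategories and articles are listed together alphabetically and
    indented. A subcategory that is itself in the index is expanded one
    level deeper, up to max_levels; beyond that it is shown collapsed
    with a leading "+".
    """
    tagged = set(tagged)
    lines = []

    def emit_members(name, level):
        cat = categories[name]
        members = [(m, True) for m in cat["subcategories"]]
        members += [(m, False) for m in cat["articles"]]
        indent = "    " * level
        for member, is_category in sorted(members):
            if not is_category:
                lines.append(f"{indent}{member}")
            elif member in tagged and level < max_levels:
                lines.append(f"{indent}[{member}]")
                emit_members(member, level + 1)
            elif member in tagged:
                lines.append(f"{indent}+ [{member}]")  # collapsed
            else:
                lines.append(f"{indent}[{member}]")    # link only

    for name in sorted(tagged):
        lines.append(f"[{name}]")
        emit_members(name, 1)
    return lines


index_lines = render_index(["Bridges in France", "Bridges in Paris"], categories)
print("\n".join(index_lines))
```

Because the recursion stops at max_levels, a loop in the category graph 
cannot make the listing run forever.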
I don't think there is any need to be able to add anything but 
categories to an INDEX.  Adding anything else would probably make it 
harder to maintain the INDEX, and would probably confuse newbies.  Of 
course, you should be able to create a link to an index page by typing 
[[:Index:People|Index of people]].

If you think this idea has merit and is a possibility, would it be 
difficult to implement?  It has long been my understanding that category 
unions would be much less server-intensive than category intersections.  
Perhaps each INDEX page could be generated dynamically when displayed?
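As one possible shape for a dynamic display, the Python sketch below 
merges per-category member lists on the fly and stops after one page.  
It assumes each category's members can already be read in sorted order; 
whether the server can supply them that way cheaply is exactly the 
question for developers.  The function names and sample titles are 
made up for the example, and it says nothing about real server load.

```python
import heapq
from itertools import islice


def union_sorted(member_lists):
    """Lazily merge already-sorted title lists, dropping duplicates."""
    last = None
    for title in heapq.merge(*member_lists):
        if title != last:  # duplicates are adjacent after a sorted merge
            yield title
            last = title


def index_page(member_lists, page_size, start_after=None):
    """Return one page of the union, optionally continuing after a title."""
    titles = union_sorted(member_lists)
    if start_after is not None:
        titles = (t for t in titles if t > start_after)
    return list(islice(titles, page_size))


# Hypothetical sorted member lists for three tagged categories.
lists = [
    ["Millau Viaduct", "Pont Neuf"],
    ["Pont Alexandre III", "Pont Neuf"],
    ["Rialto Bridge"],
]
first_page = index_page(lists, 2)
second_page = index_page(lists, 2, start_after=first_page[-1])
print(first_page)
print(second_page)
```

The merge is lazy, so producing the first page only reads as far into 
each list as that page requires.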
Thanks,
Samuel Wantman
[[en:User:Sam]]