[Wikipedia-l] Re:WikiIndex (idea)

James R. Johnson modean52 at comcast.net
Thu Mar 17 19:52:50 UTC 2005


I'd also agree with augmenting the existing category system, so that the
categories relate to one another logically, e.g.

On the page on Liam Neeson, having the categories in order

[[Category:Human]]
[[Category:Men]]
[[Category:Americans]]
[[Category:Actors]]

Would place him in a logical tree from human->men->Americans->actors.  It
could be arranged differently, maybe with other Category tags that are
automatically hidden, similar to those that were suggested earlier:

[[Gender:Male]]
[[Animal:Homo Sapiens]]
[[Nationality:American]]

The Category+ tags would be set in an "All Category Tags" special page, so
they would be translated for each language's Wikipedia, but be useful across
Wikipedias, enabling cross-wiki searching/indexing.  I would hope to see
some basic categories that allow any article to be categorized, and
logically structured in the Wiki-DB, such as: Person, Place, Animal, Thing,
Event (wars, time periods, etc.), then Fictitious/Non-Fictitious, and so
forth.




I would like to see recursive category listings, so that if an article is in
Cats, Black Cats, then on the category page for "Black Cats", I can click on
Cats to go one category up, and find cats of other colors.

Page:
Category: Cats
    Subcategory: Black Cats

------------------

If it gets to more than 3 subcategories down, then have the uppermost
category at top, a vertical ellipsis, then the bottom two categories.  The
ellipsis would be clickable to reveal the categories that are hidden
from view to save page space.

Page: Cats
:
:
    Subcategory: Himalayan Cats in Entertainment
        Subcategory: Cats on Television

------------------------
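
The collapsing rule above could be sketched in Python roughly as follows.
This is only an illustration: the function name `collapse_path`, the
`max_sub` parameter and the use of a single ":" as the ellipsis marker are
assumptions, not anything that exists in MediaWiki.

```python
ELLIPSIS = ":"  # stand-in for the clickable vertical ellipsis


def collapse_path(path, max_sub=3):
    """Return the category levels to display for one page.

    `path` lists categories from the uppermost one down to the deepest.
    If there are more than `max_sub` subcategories below the top one,
    keep the top category, an ellipsis marker, and the bottom two.
    """
    subcategories = path[1:]
    if len(subcategories) <= max_sub:
        return list(path)
    return [path[0], ELLIPSIS] + list(path[-2:])
```

A short path such as Cats, Black Cats comes back unchanged; a path with four
or more subcategories is cut down to four displayed entries.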

Something like that.

James

-----Original Message-----
From: wikipedia-l-bounces at Wikimedia.org
[mailto:wikipedia-l-bounces at Wikimedia.org] On Behalf Of Marc O'Morain
Sent: Tuesday, March 15, 2005 7:25 PM
To: wikipedia-l at wikimedia.org
Subject: Re: [Wikipedia-l] Re:WikiIndex (idea)

> > How about this:
> > [[Category:1929 births]]
> > [[Category:German authors]]
> > [[Category:Women]]


Hi all, this is my first mail to the list,

Firstly, a small gripe of mine: The current category system does not
recursively list pages. Consider a Category:Cats, with two
subcategories: Category:WhiteCats and Category:BlackCats. If I go to
the Category:Cats page, I am not shown any pages that are part of
Category:WhiteCats or Category:BlackCats. Also, someone could make
Category:BlackCats have a subcategory Category:Cats. This creates a
loop in the category hierarchy. Cats -> BlackCats -> Cats.
In my opinion:
- When displaying Category:Cats, all members of this category and
subcategories should be displayed to the user.
- If a user adds a subcategory to a category, wikipedia should check
for loops as a post-condition of the edit, and reject the edit if it
does create a loop.
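
As a rough sketch of both points, assuming the category hierarchy is held
in memory as a simple graph (the class `CategoryGraph` and its method names
are made up for illustration and are not MediaWiki code). The loop check
here runs just before the edit is applied rather than after it, which has
the same effect of rejecting the edit:

```python
from collections import defaultdict


class CategoryGraph:
    def __init__(self):
        self.subcats = defaultdict(set)  # category -> direct subcategories
        self.pages = defaultdict(set)    # category -> pages tagged directly

    def descendants(self, category):
        """All categories reachable from `category`, including itself."""
        seen = {category}
        stack = [category]
        while stack:
            current = stack.pop()
            for child in self.subcats.get(current, ()):
                if child not in seen:  # also keeps us safe if a loop exists
                    seen.add(child)
                    stack.append(child)
        return seen

    def all_pages(self, category):
        """Pages in `category` and in every subcategory below it."""
        result = set()
        for cat in self.descendants(category):
            result |= self.pages.get(cat, set())
        return result

    def add_subcategory(self, parent, child):
        """Add child under parent, rejecting edits that would form a loop."""
        if parent in self.descendants(child):
            raise ValueError(f"{parent} -> {child} would create a loop")
        self.subcats[parent].add(child)
```

With Cats -> BlackCats already stored, calling
add_subcategory("BlackCats", "Cats") raises an error instead of creating
the Cats -> BlackCats -> Cats loop.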


Now, this thread is about adding meta data (data about the data) to
pages. Broadly speaking, there seem to be two suggestions put forward
here to accomplish the same goal.

1. To use better categories to store data about a page.
2. To invent a new, separate way of classifying pages.

There are benefits and drawbacks associated with each suggestion:

1. To use better categories to store data about a page:

This method does not store any information about the links between
data. For example, there are subtle semantic differences between
defining a person as Category:Irish and defining a ship as
Category:Irish.

2. To invent a new, separate way of classifying pages:

This way of adding meta data to pages would be custom designed to
achieve the goal, and would be better than the current category
system. However, new code would have to be written to add this, which
would take time, labour, money and introduce bugs that would need to
be fixed, costing more time, labour and money. The current Category
system exists, and works.  I therefore feel that it would be a better
decision to augment the current category system, than to create a
second system that shares a lot of common features with the category
system. Also, most of the pages in Wikipedia currently have
categories, and if a new system was created, all the existing pages
would have to be tagged (mostly by hand) for the new system.


If I were in charge of Wikipedia, I would have an overhaul of the
current categories, removing the very specific categories, and adding
the ability to perform the basic set of set operations to categories.

For example:
Currently there exists a Category:Irish_Poets. I would delete this
category and add all pages in it to Category:Irish and Category:Poets.
If a user wants a list of Irish Poets, he would ask Wikipedia for
(Irish) Intersection (Poets).
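
A minimal sketch of such set operations, assuming each category is simply
a set of page titles (the `members` mapping, the function names and the
sample pages are illustrative only):

```python
members = {
    "Irish": {"W. B. Yeats", "Seamus Heaney", "Mary Robinson"},
    "Poets": {"W. B. Yeats", "Seamus Heaney", "Emily Dickinson"},
}


def intersection(categories, a, b):
    """Pages that are in both category a and category b."""
    return categories.get(a, set()) & categories.get(b, set())


def union(categories, a, b):
    """Pages that are in category a, category b, or both."""
    return categories.get(a, set()) | categories.get(b, set())


def difference(categories, a, b):
    """Pages that are in category a but not in category b."""
    return categories.get(a, set()) - categories.get(b, set())


irish_poets = intersection(members, "Irish", "Poets")
```

Here irish_poets contains only the two pages tagged with both categories,
so no separate Category:Irish_Poets needs to be maintained by hand.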

Also I would create a category Person. All people in Wikipedia would
be part of this category. Also I would create Category:Men and
Category:Women. These would be subcategories of Category:Person. All
people in Wikipedia would be part of either Category:Men or
Category:Women. If a user is looking for articles about people, they
can simply search within Category:Person to find the information that
they are looking for.

Now we must think about data storage. Consider the article on Liam
Neeson. Do we mark it as Category:Person and Category:Men? Or do we
mark the page as just Category:Men? If we store it as both we have
redundancy - it can be deduced from the fact that a page is part of
Category:Men that it must be part of Category:Person too, since Person
is a superset of Men. But for Wikipedia to reason that a Man is a
Person takes computation. The fact that a Man is a Person should never
change, so we could mark the page as Category:Person automatically
when a page is marked Category:Men. We are storing more information
than is necessary, but future reasoning becomes faster.
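
The trade-off can be sketched like this, assuming a hypothetical `PARENTS`
mapping from each category to its direct supersets and a plain dict of
page -> categories (neither is how Wikipedia actually stores this):

```python
PARENTS = {"Men": {"Person"}, "Women": {"Person"}}  # category -> supersets


def ancestors(category, parents):
    """Every superset of `category`, found by walking up the hierarchy."""
    seen = set()
    stack = [category]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def tag_page(page_categories, page, category, parents):
    """Store the category and, redundantly, all of its supersets."""
    cats = page_categories.setdefault(page, set())
    cats.add(category)
    cats |= ancestors(category, parents)


page_categories = {}
tag_page(page_categories, "Liam Neeson", "Men", PARENTS)
```

After the call the page carries both Men and Person, so asking whether it
is in Category:Person is a single lookup rather than a walk up the
hierarchy at query time.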

Disjointness must also be considered:
Category:Men is disjoint from Category:Women. How can we store this
information in the database? We may have to add a new tag to category
pages:

Category:Men
[[DisjointFrom:Women]]

(A page cannot be a member of both Men and Women - For the sake of
argument please ignore trans genders, transsexuals etc, this is just a
blackboard example.)

Also, when someone adds the above tag, the page Category:Women must
automatically be marked as [[DisjointFrom:Men]].

What happens when a Wikipedia user edits the Liam Neeson page and adds
Category:Women? Should Wikipedia search what categories have been
marked as disjoint (this is a computationally expensive operation), and
not allow the edit?
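
One possible shape for this, as a sketch only: keep the disjointness
declarations in an index keyed by category, so that the check on an edit
only has to look at the categories already on that page. The names
`DisjointnessIndex`, `declare` and `add_category` are hypothetical.

```python
from collections import defaultdict


class DisjointnessIndex:
    def __init__(self):
        self.disjoint = defaultdict(set)  # category -> categories it excludes

    def declare(self, a, b):
        """Record the declaration in both directions automatically."""
        self.disjoint[a].add(b)
        self.disjoint[b].add(a)

    def conflicts(self, existing, new_category):
        """Categories already on the page that exclude `new_category`."""
        return set(existing) & self.disjoint.get(new_category, set())


def add_category(page_categories, new_category, index):
    """Add a category to a page, rejecting the edit on a disjointness clash."""
    clash = index.conflicts(page_categories, new_category)
    if clash:
        raise ValueError(f"{new_category} is disjoint from {sorted(clash)}")
    page_categories.add(new_category)
```

Under these assumptions the cost of the check grows with the number of
categories on the page being edited, not with the total number of
categories; whether that holds for the real database is a separate
question.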

Also, with the category system we could save Categories as virtual
Categories. For example, consider that a VCategory is a virtual
category:

VCategory:Irish_Poets would simply be a re-direct to:
Category:Irish (intersection) Category:Poets
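
As an illustration, a virtual category could be stored as nothing more
than the list of real categories it intersects, and resolved on demand.
The `VIRTUAL` table, the `resolve` function and the sample pages are
assumptions made for this sketch:

```python
CATEGORIES = {
    "Irish": {"W. B. Yeats", "Seamus Heaney", "Mary Robinson"},
    "Poets": {"W. B. Yeats", "Seamus Heaney", "Emily Dickinson"},
}

# A virtual category holds no pages itself, only the categories to intersect.
VIRTUAL = {"Irish_Poets": ("Irish", "Poets")}


def resolve(name, categories, virtual):
    """Return the pages in a real category or in a virtual one."""
    if name in virtual:
        parts = [categories.get(part, set()) for part in virtual[name]]
        return set.intersection(*parts) if parts else set()
    return set(categories.get(name, set()))
```

Calling resolve("Irish_Poets", CATEGORIES, VIRTUAL) gives the pages tagged
with both Irish and Poets, while a real category name is looked up
directly.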

What do other people think?

Best Regards,

Marc O'Morain
_______________________________________________
Wikipedia-l mailing list
Wikipedia-l at Wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikipedia-l




