(At least, I hope it is a major new thread!)
Well, we've argued for a few days, and a lot of ideas have been thrown around, and the tension between various competing principles and/or ideals has been explored fairly effectively.
So I wonder if we could work towards some consensus that we can all (or nearly all) agree on. What steps might we take that all parties to the discussion might agree on?
Can everyone chip into this thread with accommodative ideas that you think everyone might agree with?
By way of example, in the "fair use" debate, the one thing that all parties could agree on is the importance of prioritizing the _tagging_ of images with their status. Whatever might end up being done, even if -nothing much- is the answer, that tagging is something that no one has objected to.
So what might we do in the content metadata arena that no one (or nearly no one) would object to?
I think that there is broad support for a categorization system which is broad and flexible, and not necessarily aimed at content advisories, but which might, in part, be used to address that issue as well.
What I propose -- and I'm talking at the level of policy, not at the level of technical implementation, because there's a lot more to be thought about and said in that regard -- is that we move towards the implementation of a content metadata or categorization system with the following features:
1. Categories should be non-normative in nature. That is, "mature content" is an invalid category, because it suggests a value judgment that we want to leave to the end user. "Explicit sexual content", while still perhaps vague in some respects, is at least non-normative... it might be a good thing or a bad thing, depending on your purpose.
2. Categories should be infinitely wiki-editable, especially in the beginning, because "a priori" categorization is impossible and likely to lead to a lot of problems. Ideas that only sysops can create new categories should be avoided until we actually determine empirically that it's necessary. (One thing we know from our wiki experience is that something as obviously insane as letting anyone in the world edit works amazingly well!)
3. We should quite possibly accept, _as an editing principle_, a sort of Ockham's razor for categories -- not multiplying them needlessly, while at the same time not hesitating to let people experiment with categories as they see fit. But we'll try not to fight about it too much, especially at first.
4. The categorization system should be simple -- i.e., articles can be tagged with as many categories as we like, and that's that. They are not required, and if people want to work on them, they can, and if they don't want to work on them, they don't have to do so.
5. Especially initially, website impacts of categories should be very minimal... i.e. we don't try anything radical regarding filtered searches or automatic index pages or anything too exciting like that.
--------------
If we did that, we'd be introducing nothing harmful, and doing something positive for *multiple* purposes, not just the "content advisory" purpose. (See below...)
And we'd get to find out, in an experimental environment, whether categorization prompts massive flamewars, etc.
And then in a while, we can revisit the issue, and see what we think could be done with these categories.
--Jimbo
p.s. Examples of useful categories that are non-normative:
biography, math, statistics, graphic sexual content, advanced mathematical content, gay and lesbian studies, European history, American history, U.S. Presidents, sports, tennis
All of these might be useful for content re-users who might like to extract a subset of our data. For example, an "Encyclopedia of Sexual Practices" might want to extract just those articles flagged with 'graphic sexual content' or 'sexual content'. An "edupedia" project could automatically extract articles that avoid certain topics, or focus on certain other topics.
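The kind of subset extraction described here could look roughly like the following. This is only a minimal sketch: the data model (a dict from article title to a set of category names), the function name `extract_subset`, and the sample titles are all assumptions made for illustration, not anything proposed in the thread.

```python
def extract_subset(articles, include=None, exclude=None):
    """Return the sorted titles that pass a re-user's category filter.

    articles: dict mapping title -> set of category names (hypothetical model).
    include:  if given, keep only articles carrying at least one of these.
    exclude:  if given, drop articles carrying any of these.
    """
    include = set(include or [])
    exclude = set(exclude or [])
    selected = []
    for title, cats in articles.items():
        if include and not (cats & include):
            continue  # carries none of the wanted categories
        if cats & exclude:
            continue  # carries a category the re-user wants to avoid
        selected.append(title)
    return sorted(selected)


# Made-up sample data; the category names are taken from the list above.
articles = {
    "Sample article A": {"sports", "tennis"},
    "Sample article B": {"biography", "U.S. Presidents", "American history"},
    "Sample article C": {"graphic sexual content"},
    "Sample article D": {"math", "statistics"},
}

# A re-user who wants only the flagged articles:
flagged = extract_subset(
    articles, include={"graphic sexual content", "sexual content"}
)
# A re-user who wants everything except the flagged articles:
unflagged = extract_subset(
    articles, exclude={"graphic sexual content", "sexual content"}
)
```

Note that the same tags serve both re-users; neither filter is built into the categories themselves, which is the non-normative point made in item 1.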
----- End forwarded message -----
Jimmy-
all of the requirements you describe would be met nicely by my [[Category:Foo]] scheme. :-) So far nobody has done any work implementing it, and I won't do much coding for the next couple of months, but it seems like a reasonable solution. I would like to add that any category system should support hierarchies, to avoid unnecessary redundancy. E.g. when you have [[Category:US president with a beard]], you don't need to flag the same article with [[Category:US president]], because the former is a child category of the latter.
Regards,
Erik
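The hierarchy point in Erik's message (a child category implying its parent) can be sketched as follows. This is an illustration under assumptions: the `parents` mapping, the function name `implied_categories`, and the extra "Person" parent are hypothetical and not part of Erik's scheme as described.

```python
def implied_categories(tagged, parents):
    """Expand the categories an article is tagged with by all their ancestors.

    tagged:  iterable of category names placed directly on the article.
    parents: dict mapping category -> set of parent categories (hypothetical).
    """
    seen = set()
    stack = list(tagged)
    while stack:
        cat = stack.pop()
        if cat in seen:
            continue  # already expanded; also protects against cycles
        seen.add(cat)
        stack.extend(parents.get(cat, ()))
    return seen


# "Person" as a parent of "US president" is an assumption for the example.
parents = {
    "US president with a beard": {"US president"},
    "US president": {"Person"},
}

result = implied_categories({"US president with a beard"}, parents)
```

With a lookup like this, the article only needs the most specific tag; the broader ones are derived rather than entered by hand.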
On Wed, 25 Jun 2003, Toby Bartels wrote:
Toby Bartels wrote:
Erik Moeller wrote:
all of the requirements you describe would be met nicely by my [[Category:Foo]] scheme. :-)
I would just like to not here that I'm in full favour of Erik's scheme.
^^^
s/not/note
You're so contrary, Toby! Don't deny it...
(For the record, I also favor an Erik-style madcap scheme with [[Category:foo]] links, based largely on improved backlinks magic.)
-- brion vibber (brion @ pobox.com)
OK, if you're all in favor of that, I might be willing to code (what a surprise!). I deeply regret, however, that I can't remember the details of Erik's scheme.
Before you go looking for the original mail, was it something like this:
- A [[category:foo]] *link* in an article works similar to a language link (shows somewhere in the surrounding framework rather than in the text)
- The [[category:foo]] *article* has a description of the category, and some kind of automated list of backlinks.
If Erik suggested something completely different, please find that mail for me...
Magnus
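The two parts of the scheme as Magnus recalls it (category links lifted out of the article text, and a category page listing its backlinks) might be prototyped along these lines. This is a rough sketch, not the actual wiki code: the regular expression, the function names, and the in-memory `pages` dict are all assumptions, and no title normalization is attempted.

```python
import re
from collections import defaultdict

# Matches [[Category:Foo]] in any letter case, with optional inner spaces.
CATEGORY_LINK = re.compile(
    r"\[\[\s*category\s*:\s*([^\]|]+?)\s*\]\]", re.IGNORECASE
)


def split_category_links(wikitext):
    """Return (body without category links, list of category names)."""
    cats = [m.group(1) for m in CATEGORY_LINK.finditer(wikitext)]
    body = CATEGORY_LINK.sub("", wikitext).strip()
    return body, cats


def build_backlinks(pages):
    """Map each category name to the sorted titles of pages linking to it.

    pages: dict mapping page title -> raw wikitext (hypothetical storage).
    """
    backlinks = defaultdict(list)
    for title in sorted(pages):
        _, cats = split_category_links(pages[title])
        for cat in cats:
            backlinks[cat].append(title)
    return dict(backlinks)


# Placeholder article text; only the category links matter here.
pages = {
    "Henry Droop": "Some article text.\n[[Category:Person]]",
    "Poland": "Some article text. [[category: Place ]]",
    "1427": "Some article text.",
}

body, cats = split_category_links(pages["Henry Droop"])
backlinks = build_backlinks(pages)
```

The body returned by `split_category_links` is what would be rendered as the article, while the extracted names would be shown in the surrounding framework, much as language links are.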
Wikipedia-l mailing list Wikipedia-l@wikipedia.org http://www.wikipedia.org/mailman/listinfo/wikipedia-l
Magnus Manske wrote:
OK, if you're all in favor of that, I might be willing to code (what a surprise!). I deeply regret, however, that I can't remember the details of Erik's scheme.
Before you go looking for the original mail, was it something like this:
- A [[category:foo]] *link* in an article works similar to a language
link (shows somewhere in the surrounding framework rather than in the text)
- The [[category:foo]] *article* has a description of the category,
and some kind of automated list of backlinks.
If Erik suggested something completely different, please find that mail for me...
Magnus
ABOUT THE CODE
That sounds pretty much like Erik's proposal as I remember it, and it seems both harmless and non-intrusive to existing work. Yay! Let's do it!
There was at one point a proposal for a separate DB field for metadata, like the language and category links. It's a nice idea for the long term, but it's simply too much complexity for now, since it would require lots of new UI code. So let's just keep the category links in the main body, and keep them at the top or bottom by convention.
Presumably:
* The description of the category should be in the "page text" for [[Category:xxx]], like some of the special pages.
* category page bodies should be able to be edited by anyone, just like ordinary pages, with the same code for protection by admins, deletion, etc.
* category page bodies should also be able to contain category links themselves, allowing a directed graph of categories to be created.
* there should be a "list of all categories" special page
A comment:
* moving category pages does not really make sense in this simple scheme as the names are "hardwired" in the [[Category:XYZ]] links in the back-linking articles
* but there's probably no reason to disable this operation...
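Because category pages may themselves carry category links, the result is a general directed graph rather than a tree, so any code that walks it has to tolerate cycles. A minimal sketch of listing every article under a category, assuming category pages are titled "Category:Name" and assuming a hypothetical `page_categories` mapping (the "City" category is invented for the example):

```python
from collections import defaultdict


def members_of(category, page_categories):
    """Collect all articles in a category, following subcategories.

    page_categories: dict mapping page title -> list of category names the
    page links to (hypothetical model). Category pages are assumed to be
    titled "Category:Name".
    """
    prefix = "Category:"
    # Invert the links: category name -> pages that point at it.
    children = defaultdict(set)
    for page, cats in page_categories.items():
        for cat in cats:
            children[cat].add(page)

    articles = set()
    visited = {category}  # guards against cycles in the graph
    queue = [category]
    while queue:
        current = queue.pop()
        for page in children[current]:
            if page.startswith(prefix):
                sub = page[len(prefix):]
                if sub not in visited:
                    visited.add(sub)
                    queue.append(sub)
            else:
                articles.add(page)
    return articles


page_categories = {
    "Poland": ["Place"],
    "London, England": ["City"],
    "Category:City": ["Place"],
    "Category:Place": ["City"],  # deliberate cycle: must not loop forever
    "Henry Droop": ["Person"],
}

places = members_of("Place", page_categories)
```

The keys of the inverted `children` mapping are also one plausible source for a "list of all categories" page.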
Once we have some code, the next thing to do is to agree how to use this feature: but this should not hold up writing and releasing the code.
ABOUT USAGE
I expect
* lots of arguments about category schemes
* pragmatic agreement on the most obvious categories
* much useful entry of data according to various schemes
* eventual and wonderful automatic indexes, Bayesian auto-classification, etc. etc.
I'd like to suggest a few conventions, before the category wars begin. There are a number of existing category schemes at the moment, such as the Library of Congress system, and SUMO. I suggest that, rather than using these codes raw, we prefix them with their scheme name, thus:
[[Category: SUMO ExistingThingName]] or [[Category: LOC BD-300]]
This will greatly simplify collating them together, or machine-processing them later, and helps forestall the flamewars about Dewey vs. LOC vs. Cyc vs. SUMO vs. ... each set of enthusiasts can simply code away as they like...
This leaves the "no-prefix" namespace ready for the Wikipedia category free-for-all....
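To show how the prefix convention would simplify machine processing, here is a small sketch that separates scheme-prefixed categories from the no-prefix free-for-all. The registry `KNOWN_SCHEMES` and the function names are assumptions; the convention itself says nothing about how prefixes would be recognized.

```python
# Hypothetical registry of scheme prefixes; only the two used in the
# examples above are listed.
KNOWN_SCHEMES = {"SUMO", "LOC"}


def split_scheme(category):
    """Split 'LOC BD-300' into ('LOC', 'BD-300').

    Categories without a known prefix come back as (None, name).
    """
    name = category.strip()
    prefix, _, rest = name.partition(" ")
    if prefix in KNOWN_SCHEMES and rest:
        return prefix, rest.strip()
    return None, name


def group_by_scheme(categories):
    """Group category names by scheme; None holds the unprefixed ones."""
    grouped = {}
    for cat in categories:
        scheme, code = split_scheme(cat)
        grouped.setdefault(scheme, []).append(code)
    return grouped


grouped = group_by_scheme(
    [" SUMO ExistingThingName", "LOC BD-300", "Person", "Time period"]
)
```

An explicit registry keeps an ordinary multi-word category such as "Time period" from being misread as a scheme called "Time".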
My first suggestions:
[[Category:Person]] eg [[Henry Droop]]
[[Category:Place]] eg [[Poland]] or [[London, England]]
[[Category:Time period]] eg [[1427]] or [[15th century]]
...and then we can use the information from the more sensible [[List of XXX topics]] articles, for mathematics, physics, politics, medicine, rare diseases, etc. to auto-mark the given articles.
If we do this right, we can probably _eventually_ get rid of most of the list articles entirely: ''but'' there is one fly in the ointment with doing things this way: we lose the ability to create links in category lists to non-existing articles. This would be a great pity, as it is one of the most useful ways to create lists of new candidates for articles within a topic.
Suggestions anyone?
-- Neil
Neil Harris wrote in part:
- category page bodies should also be able to contain category links
themselves, allowing a directed graph of categories to be created.
This is one of the most important features of Erik's idea that IMO sets it above the others.
Another is the possibility of category #REDIRECT pages.
- moving category pages does not really make sense in this simple scheme
as the names are "hardwired" in the [[Category:XYZ]] links in the back-linking articles
But this would work perfectly well with #REDIRECT pages -- we shouldn't have to alter the code for moving pages one bit.
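As an illustration of why redirects would cover the move case: if the old category name becomes a redirect, the hardwired [[Category:XYZ]] links can be resolved at lookup time without touching the articles. A minimal sketch, assuming a hypothetical `redirects` mapping from old name to new name:

```python
def resolve_category(name, redirects, max_hops=10):
    """Follow #REDIRECT entries until a real category name is reached.

    redirects: dict mapping old category name -> new name (hypothetical).
    Raises ValueError on a redirect loop or an overly long chain.
    """
    seen = set()
    while name in redirects:
        if name in seen or len(seen) >= max_hops:
            raise ValueError("redirect loop or chain too long at %r" % name)
        seen.add(name)
        name = redirects[name]
    return name


# "XYZ" was moved to a hypothetical new name; old links keep working.
redirects = {"XYZ": "XYZ (renamed)"}

target = resolve_category("XYZ", redirects)
unchanged = resolve_category("Person", redirects)
```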
If we do this right, we can probably _eventually_ get rid of most of the list articles entirely: ''but'' there is one fly in the ointment with doing things this way: we lose the ability to create links in category lists to non-existing articles. This would be a great pity, as it is one of the most useful ways to create lists of new candidates for articles within a topic.
We can always revive the [[Wikipedia:xxx basic topics]] pages, which have been replaced (and greatly expanded upon) by the bastard category pages.
-- Toby
--- Neil Harris usenet@tonal.clara.co.uk wrote:
If we do this right, we can probably _eventually_ get rid of most of the list articles entirely: ''but'' there is one fly in the ointment with doing things this way: we lose the ability to create links in category lists to non-existing articles. This would be a great pity, as it is one of the most useful ways to create lists of new candidates for articles within a topic.
If the category pages are editable, which they should be for various reasons, then people can maintain a list of articles-to-be-written on the very category page.
Axel
Jimmy Wales jwales@bomis.com writes:
And then in a while, we can revisit the issue, and see what we think could be done with these categories.
Whatever you do please avoid cluttering the user interface -- there is already much too much meta info and too many links on every page...