I hate to bring up what I know is a repeating drag on the project, but our most important sub-project (COM:TOL of course) advocates removal of content categorisation from useful imagery. I think this misrepresents the broader lack of consensus on the issue - never mind that it's contrary to what I think should be done.
I'd like to see the project brought to heel and just use the usual "keep image in most precise category" methodology. The totally different method is harmful to the project (merely by being different) and by its existence is causing confusion at best (like what I'm reporting now) and downright antagonism at worst.
In fact as far as I can tell *lots* of effort is being wasted by some users doing one thing (according to the TOL rule) and some doing the exact opposite. Is there any way we can get this mess over and done with?
Nilfanion wrote:
I hate to bring up what I know is a repeating drag on the project, but our most important sub-project (COM:TOL of course) advocates removal of content categorisation from useful imagery. I think this misrepresents the broader lack of consensus on the issue - never mind that it's contrary to what I think should be done.
I'd like to see the project brought to heel and just use the usual "keep image in most precise category" methodology. The totally different method is harmful to the project (merely by being different) and by its existence is causing confusion at best (like what I'm reporting now) and downright antagonism at worst.
In fact as far as I can tell *lots* of effort is being wasted by some users doing one thing (according to the TOL rule) and some doing the exact opposite. Is there any way we can get this mess over and done with?
Well, you could start by dropping the escalating verbiage, like "brought to heel". :-)
As I'm sure you know, various people have proposed solutions, and they've all been shot down, in many cases by the vociferous objection of one person or another. Given the small number of people participating (while lots of people upload TOL images, far fewer try to organize them), that has always seemed indicative of a lack of consensus, and so nothing changes. It's not clear that pushing a preferred solution via the mailing list is going to somehow create consensus where none exists now.
For myself, after a certain point the arguing starts to demotivate, so I go back to working on my backlog instead, and watch bemused as people ceaselessly move my images around. They only completely delink them once in a while. :-)
Stan
On Tue, Mar 11, 2008 at 1:34 PM, Stan Shebs stanshebs@earthlink.net wrote: [...]
Given the small number of people participating (while lots of people upload TOL images, far fewer try to organize them), that has always seemed indicative of a lack of consensus, and so nothing changes.
[...]
Unfortunately that seems to be always the case when it comes to categorizing. Few people are interested in categorization at all, and even fewer people seem to be interested in discussing it. It is a major problem for Commons I think, but I don't know whether we can solve it.
Bryan
On 11/03/2008, Bryan Tong Minh bryan.tongminh@gmail.com wrote:
On Tue, Mar 11, 2008 at 1:34 PM, Stan Shebs stanshebs@earthlink.net wrote:
Given the small number of people participating (while lots of people upload TOL images, far fewer try to organize them), that has always seemed indicative of a lack of consensus, and so nothing changes.
Unfortunately that seems to be always the case when it comes to categorizing. Few people are interested in categorization at all, and even fewer people seem to be interested in discussing it. It is a major problem for Commons I think, but I don't know whether we can solve it.
Active work is in progress on category intersections in MediaWiki, which will allow us to use categories more like tags - which should solve really quite a lot of our category problems in one swoop.
(The next swoop will be cats/tags with different names for the same category, so we don't have to do everything in English. That's likely to be much easier than getting category intersections to work well in MySQL ...)
- d.
Unfortunately that seems to be always the case when it comes to categorizing. Few people are interested in categorization at all, and
IMO there is no point in discussing it at the present time. I've tried it and failed.
We have to make categories useful for commons, and that means turning them into a tagging system (what they technically already are). And that can only be accomplished by getting category intersections to work. That would also mean getting rid of the "most precise category" paradigm.
* Who needs a category "Bridges in Switzerland" when you can tag images with "Bridge" and "Switzerland".
* Who needs small browsable categories when you can have small browsable intersection results (and galleries for that matter)
So when we get the infrastructure we should have the discussion, and create a stringently enforced policy on the issue
We have to make categories useful for commons, and that means turning them into a tagging system (what they technically already are). And that can only be accomplished by getting category intersections to work. That would also mean getting rid of the "most precise category" paradigm.
- Who needs a category "Bridges in Switzerland" when you can tag images
with "Bridge" and "Switzerland".
- Who needs small browsable categories when you can have small browsable
intersection results (and galleries for that matter)
For a tagging system, we also need to change our content providers, the users. I dare say that they can't categorize and they will not tag in a sensible way. That bridge in Switzerland would probably be tagged with "Ginevra", "holiday", "aunt", and "Isabel" (plus a few prepositions). Theoretically tagging is a great system but in the hands of a gazillion amateurs it will most likely lead to missing images, not found ones.
How is categorizing different, you may ask. Categories help us find images that are not categorized well enough and improve them. To be able to look for images which are not tagged well enough is a different matter.
sensible way. That bridge in Switzerland would probably be tagged with "Ginevra", "holiday", "aunt", and "Isabel" (plus a few prepositions).
You cannot be serious!?!
Theoretically tagging is a great system but in the hands of a gazillion amateurs it will most likely lead to missing images, not found ones.
Yeah, and a Wiki doesn't work in theory either, fortunately it works in practice. Have a little more faith in our users. We don't have any "Holiday" or "Aunt Isabell" categories either. Claiming this will change when we use categories as tags is completely unfounded speculation (or FUD).
sensible way. That bridge in Switzerland would probably be tagged with "Ginevra", "holiday", "aunt", and "Isabel" (plus a few prepositions).
You cannot be serious!?!
Well, I wasn't quite. But have a look at http://commons.wikimedia.org/wiki/User:Orgullobot/Welcome_log To say that 1 of 10 new users get it right - even nearly - is probably exaggerating.
Theoretically tagging is a great system but in the hands of a gazillion amateurs it will most likely lead to missing images, not found ones.
Yeah, and a Wiki doesn't work in theory either, fortunately it works in practice. Have a little more faith in our users. We don't have any "Holiday" or "Aunt Isabell" categories either. Claiming this will change when we use categories as tags is completely unfounded speculation (or FUD).
To make my point clearer: what we have almost works because errors and omissions (plenty of which are made) are easier to find. With tags, not so.
To make my point clearer: what we have almost works because errors and omissions (plenty of which are made) are easier to find. With tags, not so.
I see what your point is, but there are ways to avoid your horror scenario :-)
* Watch categories/tags using the category rss feeds to monitor new additions.
* Sort category namespace pages to list new additions first
This would make them manageable even if, due to less specific tagging, the category pages would fill up with thousands of images. An alternative would be to keep using precise categories, but take all their top-level categories into account when doing intersections!
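A minimal sketch of that alternative, assuming a toy category graph: files keep their precise categories, and every parent category is folded in at query time before intersecting. The category names, file names, and the `PARENTS` and `FILES` tables are hypothetical illustrations, not real Commons data, and nothing here reflects how MediaWiki actually stores categories.

```python
# Hypothetical toy data: category -> its parent categories.
PARENTS = {
    "Bridges in Switzerland": {"Bridges", "Switzerland"},
    "Railway bridges in Switzerland": {"Bridges in Switzerland", "Railway bridges"},
    "Railway bridges": {"Bridges"},
}

# Hypothetical toy data: file -> the precise categories users assigned.
FILES = {
    "Koblenz_Aare-Rhein.jpg": {"Bridges in Switzerland"},
    "Landwasserviadukt.jpg": {"Railway bridges in Switzerland"},
    "Golden_Gate.jpg": {"Bridges"},
}


def ancestors(category, parents=PARENTS):
    """Return the category itself plus everything reachable via parent links."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in the category graph
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def effective_tags(filename):
    """Expand a file's precise categories into all implied categories."""
    tags = set()
    for category in FILES[filename]:
        tags |= ancestors(category)
    return tags


def intersect(*wanted):
    """Files whose expanded categories contain every wanted category."""
    wanted = set(wanted)
    return sorted(name for name in FILES if wanted <= effective_tags(name))


print(intersect("Bridges", "Switzerland"))
print(intersect("Railway bridges", "Switzerland"))
```

The expansion here happens per query, which is fine for a toy table but says nothing about how this would perform at Commons scale.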
On 12/03/2008, Daniel Schwen lists@schwen.de wrote:
We have to make categories useful for commons, and that means turning them into a tagging system (what they technically already are). And that can only be accomplished by getting category intersections to work. That would also mean getting rid of the "most precise category" paradigm.
- Who needs a category "Bridges in Switzerland" when you can tag images
with "Bridge" and "Switzerland".
- Who needs small browsable categories when you can have small browsable
intersection results (and galleries for that matter)
So when we get the infrastructure we should have the discussion, and create a stringently enforced policy on the issue
Yes, WHEN we get the infrastructure. When we see what it looks like and how responsive it is. When we see how usable it is.
From the search box - how do I get from typing "bridges switzerland" in the search box to [[category:bridges in Switzerland]]? Currently it's the #1 search result.
From a similar file - when I'm viewing http://commons.wikimedia.org/wiki/Image:Koblenz_Aare-Rhein.jpg, which is in [[category:bridges in Switzerland]], if in future it has [[category:bridges]] and [[category:switzerland]] (and a whole lot more besides), is there some kind of link for every conceivable pairing among those tags, where I can view their intersection?
I suppose we can javascript-up something on the search page so you put a tag in each of two boxes and that automatically looks up Special:Categoryintersection/foo+bar or whatever.
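To put a number on "every conceivable pairing", here is a rough sketch that builds one link per unordered pair of tags on a file. The `Special:Categoryintersection/foo+bar` page name is only the placeholder floated above, not an existing MediaWiki special page, and `intersection_links` is a hypothetical helper.

```python
from itertools import combinations
from urllib.parse import quote


def intersection_links(tags, base="Special:Categoryintersection"):
    """Return one hypothetical intersection page title per unordered tag pair."""
    links = []
    for first, second in combinations(sorted(tags), 2):
        # quote() percent-encodes characters that are unsafe in a URL path
        links.append(f"{base}/{quote(first)}+{quote(second)}")
    return links


links = intersection_links(["bridges", "switzerland", "rivers"])
for link in links:
    print(link)
```

A file with n tags yields n(n-1)/2 pairs, so 10 tags would already mean 45 links on the image page, which suggests a two-box search form may scale better than precomputed links.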
cheers Brianna
On Tue, Mar 11, 2008 at 8:28 PM, Brianna Laugher brianna.laugher@gmail.com wrote:
Yes, WHEN we get the infrastructure. When we see what it looks like and how responsive it is. When we see how usable it is.
...
Perpetual catch-22.
Finding things using category intersections is totally useless when things have been reduced to a single category, ... and not very useful when the categories applied are "Bridges made of stone in the Ukraine with a rated capacity greater than 200 poot"
The idea of 'a category implies its children' isn't workable in high-performance solutions because a single user action might then have to touch millions of elements of data. (Cat:C contains all plants; C is put into A; now the internal references for A have to have all plants added.) A pure tagging system, by contrast, is either an O(1) or an O(number of cats on an image) operation per update, depending on whether the data is forward or inverse indexed.
There is no huge technical challenge in making ultrafast intersections. Any full-text indexing system can do it. I put up an example external system built on PostgreSQL some months ago that could intersect/exclude *any* mixture of categories in a few milliseconds. I turned it off because no one was using it and one day MSN's web spider decided to enumerate all possible intersections. ;)
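For illustration only, a minimal in-memory sketch of an inverse index that can intersect and exclude arbitrary mixtures of tags. It stands in for the general technique and is not the PostgreSQL system mentioned above; the `TagIndex` class, the file names, and the tags are hypothetical.

```python
from collections import defaultdict


class TagIndex:
    """Toy inverse index: tag -> set of file names carrying that tag."""

    def __init__(self):
        self._files_by_tag = defaultdict(set)

    def add(self, filename, tags):
        # One set insertion per tag, so the update cost depends only on
        # how many tags this file has, not on the size of the index.
        for tag in tags:
            self._files_by_tag[tag].add(filename)

    def query(self, include, exclude=()):
        """Files carrying every tag in `include` and none in `exclude`."""
        if not include:
            return set()
        # Start from the smallest posting set to keep the intersection cheap.
        postings = sorted(
            (self._files_by_tag.get(tag, set()) for tag in include), key=len
        )
        result = set(postings[0])
        for files in postings[1:]:
            result &= files
        for tag in exclude:
            result -= self._files_by_tag.get(tag, set())
        return result


index = TagIndex()
index.add("Koblenz_Aare-Rhein.jpg", ["Bridges", "Switzerland", "Rivers"])
index.add("Landwasserviadukt.jpg", ["Bridges", "Switzerland", "Railways"])
index.add("Matterhorn.jpg", ["Mountains", "Switzerland"])

bridges_in_ch = index.query(["Bridges", "Switzerland"])
non_rail = index.query(["Bridges", "Switzerland"], exclude=["Railways"])
print(sorted(bridges_in_ch), sorted(non_rail))
```

Note that this only works if files carry the broad tags directly; a file reduced to a single precise category would never match a two-tag query.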
I don't see why anyone is going to have much interest in building fast intersection systems which will have limited usefulness given how many users believe we should be using categories. ... so if people will only *consider* possibly changing their behavior after the system is fully formed and production ready, we seem to have a stalemate.
(Ha, Enwp can keep their inclusionists and deletionists ... At commons we have intersectionalists and classificationists!)
Yes, WHEN we get the infrastructure.
Yep, that's what I said.
From the search box - how do I get from typing "bridges switzerland" in the search box to [[category:bridges in Switzerland]]? Currently it's the #1 search result.
Heh :-), good example. That category doesn't show me railway bridges in Switzerland! Or what if I'm looking for the Landwasserviadukt (a famous 'bridge' in Switzerland), or what if I'm looking for pictures that show a Krokodil (train engine) on a bridge in Switzerland (0 search results)?
Either the search has to be improved, or, and I very much prefer this, a ''semantic'' (and machine-readable) retrieval system should be implemented. Categories perform poorly in this respect; tags would be perfect.
P.S.: Gregory, evaluate parent and grandparent (etc.) categories too for CI (category intersection).