Robert Stojnic wrote:
Aryeh Gregor wrote:
Right. Supporting category intersection and search in category with better UI (we already sort of support it if you know the right magic terms) is what we should be aiming for here.
Last year, just around this time, we came to exactly the same conclusion. And just like then, there is no shortage of good opinions on how to do it, only of people to actually do the programming.
r.
Wikimedia Germany has contracted Neil Harris to work on implementing deep category intersection. The goal is basically a rewrite of my sucky CatScan tool. The result is hopefully fast & generic enough so it can be used as a service that integrates with the current search infrastructure.
The project has started; there is funding and a project plan. I expect to see usable results soon. In fact, I hope to present this at the developer meeting in April (Neil, contact me about attending) and discuss the integration into Lucene search.
I agree that full recursive flattening of the current category structure sometimes leads to bad results (especially on the English Wikipedia; Commons is quite bad too). A depth of 5, however, is generally useful. One common use case is intersecting a content category with a maintenance category, for organizing editorial work in a wiki project. In that case, at least one category comes from a template.
Atomic categorization, aka tagging, however also sucks: the tags are either too generic (so it's hard to find stuff) or too specific (you never know what to search for). Tags implying/including other tags is very useful, which is exactly what categories with deep intersection will provide.
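To make the depth-limited idea concrete, here is a minimal sketch in Python. It is not the CatScan code or the planned implementation; the function names, the in-memory dictionaries and the toy data are all made up for illustration. It walks each category's subcategories breadth-first down to a fixed depth, collects the pages found on the way, and intersects the two resulting sets.

```python
from collections import deque

def pages_within_depth(category, subcats, pages, max_depth=5):
    """Collect pages in `category` and its subcategories, down to max_depth levels."""
    seen = {category}                 # guards against cycles in the category graph
    queue = deque([(category, 0)])
    result = set()
    while queue:
        cat, depth = queue.popleft()
        result |= pages.get(cat, set())
        if depth == max_depth:
            continue                  # do not descend any further
        for sub in subcats.get(cat, ()):
            if sub not in seen:
                seen.add(sub)
                queue.append((sub, depth + 1))
    return result

def deep_intersection(cat_a, cat_b, subcats, pages, max_depth=5):
    """Pages that fall under both categories when each is flattened to max_depth."""
    return (pages_within_depth(cat_a, subcats, pages, max_depth)
            & pages_within_depth(cat_b, subcats, pages, max_depth))

# Hypothetical toy data: a content tree and a flat maintenance category.
subcats = {
    "France": ["Normandy"],
    "Normandy": ["Castles in Normandy"],
}
pages = {
    "France": {"Paris"},
    "Normandy": {"Rouen"},
    "Castles in Normandy": {"Chateau Gaillard"},
    "Articles needing cleanup": {"Chateau Gaillard", "Berlin"},
}

print(deep_intersection("France", "Articles needing cleanup", subcats, pages))
```

With the default depth the castle article is found two levels below "France"; with `max_depth=1` the walk stops at "Normandy" and the intersection is empty, which is the trade-off between recall and noise discussed here.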
-- daniel
On Thu, Feb 4, 2010 at 1:50 PM, Daniel Kinzler daniel@brightbyte.de wrote: [...]
Ok, all that sounded oddly familiar...
Atomic categorization aka tagging however also sucks:
Well, I certainly would not say it sucks. After all _every_ major image library uses it. Will it be perfect? Probably not, but perfect is the enemy of good enough ;-)
the tags are either too generic (so it's hard to find stuff) or too specific (you never know what to search for). tags implying/including other tags is very useful. which is exactly
I do not see this problem at all. In my example above we would have _both_ specific (Normandy, Guernsey) and general (France) tags. Search for whatever you like and narrow down using intersection. How can you not know what to search for? This is a problem we have _now_! Our categories are ridiculously specific. Going atomic will only make this situation better in this respect.
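For illustration only, a rough Python sketch of what "search and narrow down using intersection" means with atomic tags. The file names, the tags and the `search` helper are invented for this example and do not describe any existing tool.

```python
def search(tagged, *tags):
    """Return the items that carry every one of the given tags."""
    wanted = set(tags)
    return {item for item, item_tags in tagged.items() if wanted <= item_tags}

# Hypothetical images, each with both general and specific tags.
tagged = {
    "Mont-Saint-Michel.jpg": {"France", "Normandy", "Abbey"},
    "Eiffel_Tower.jpg": {"France", "Paris", "Tower"},
    "Castle_Cornet.jpg": {"Guernsey", "Castle"},
}

broad = search(tagged, "France")               # start general
narrow = search(tagged, "France", "Normandy")  # narrow down by adding a tag
print(sorted(broad), sorted(narrow))
```

Each added tag can only shrink the result set, so a reader can start from whichever general tag comes to mind and refine from there.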