Tim Landscheidt wrote:
Daniel Kinzler daniel@brightbyte.de wrote:
In the meantime: http://toolserver.org/~magnus/catscan_rewrite.php
(toolserver seems to have a problem ATM, though...)
Yes, lots more options than my old thingy, thanks Magnus :) But it is still bound to recursive calls to the database, which is what I really want to get rid of. The lookup needs to be snappy.
Is there any reason not to have a flattened structure somewhere on the toolserver (or, in the long run, in MediaWiki)? A quick look at recentchanges for dewp shows about 22000 changes per month, about one every two minutes. With about 80000 categories in all, it should be feasible to update the structure incrementally, with daily/weekly/monthly clean new full "dumps" (or even dispense with up-to-the-second data and just dump the flat structure hourly).
Basically: yes, this is the idea, but detecting categorization changes isn't trivial. Also, keeping a full copy of the flat content of each category would be redundant in the extreme. It would result in hundreds of millions of entries and would be hard to handle. A data structure for fast recursive lookup makes more sense. Neil is working on this.
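To illustrate the difference, here is a rough sketch of a lookup that holds only the category-to-subcategory edges plus each category's direct members, and computes the flat content on demand instead of storing it per category. This is not Neil's design (which is not described here); the names `subcats`, `members`, `descendants` and `pages_in_tree` and the toy data are made up for the example.

```python
from collections import deque

def descendants(subcats, root):
    """Return root plus every category reachable below it (safe with cycles)."""
    seen = {root}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        for child in subcats.get(cat, ()):
            if child not in seen:  # skip categories already visited
                seen.add(child)
                queue.append(child)
    return seen

def pages_in_tree(subcats, members, root):
    """Union of the direct members of root and of all its subcategories."""
    result = set()
    for cat in descendants(subcats, root):
        result |= members.get(cat, set())
    return result

# Hypothetical toy data; note the Math <-> Algebra cycle.
subcats = {"Science": ["Math", "Physics"], "Math": ["Algebra"], "Algebra": ["Math"]}
members = {"Math": {"Number"}, "Algebra": {"Group"}, "Physics": {"Quark"}}

print(sorted(pages_in_tree(subcats, members, "Science")))
```

Only the edges and the direct memberships are stored, so nothing is duplicated per ancestor category; the cost is a graph walk on each lookup.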
As to the general approach: I hope that by providing a way to intersect categories, we can get rid of most of the "Foo in Bar" cross-section categories. I still believe hierarchical structuring/inclusion of categories is useful. Or, to put it differently: let people use "flat tagging", but let's keep the notion of one tag implying another, i.e. math implying science and texas implying america.
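A minimal sketch of what I mean by flat tags plus implication, assuming the implications are kept as a simple tag-to-broader-tags mapping. The function names and the pages are hypothetical; only the math/science and texas/america pairs come from the example above.

```python
def implied_tags(implies, tags):
    """Expand a page's tags with every tag they imply, directly or indirectly."""
    closure = set(tags)
    stack = list(tags)
    while stack:
        tag = stack.pop()
        for broader in implies.get(tag, ()):
            if broader not in closure:  # also guards against implication loops
                closure.add(broader)
                stack.append(broader)
    return closure

def intersect(implies, page_tags, wanted):
    """Pages whose expanded tag set contains every wanted tag."""
    wanted = set(wanted)
    return {page for page, tags in page_tags.items()
            if wanted <= implied_tags(implies, tags)}

# One tag implying another, as in the example above.
implies = {"math": ["science"], "texas": ["america"]}

# Hypothetical pages with flat tags only.
page_tags = {
    "Page A": ["math", "texas"],
    "Page B": ["math"],
    "Page C": ["texas"],
}

print(intersect(implies, page_tags, ["science", "america"]))
```

With this, "science in america" needs no cross-section category of its own: Page A matches because math implies science and texas implies america.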
-- daniel