In short, our category system is completely screwed up.
I've complained about this in the past, but it only seems to be getting worse.
If you follow the category hierarchy you'll find over 26,000 sub*cats* of Category:GFDL, for example.
On average, a commons category has 110 descendant subcategories after full expansion.
There are 29 categories from which you can reach at least 90% of the total categories used on commons.
Just a list of all the categories and their children is over 400 MB of text.
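For anyone who wants to reproduce numbers like these, here is a minimal sketch of what "full expansion" means. It assumes the category links have been loaded into a plain dict mapping each category to its direct subcategories; the toy data and the function name are made up for illustration, and this is not the code behind the figures above. The visited set matters because the real hierarchy contains loops.

```python
from collections import deque

def descendants(children, root):
    """Return every category reachable below root (cycle-safe)."""
    seen = set()
    queue = deque(children.get(root, ()))
    while queue:
        cat = queue.popleft()
        if cat in seen:
            continue
        seen.add(cat)
        queue.extend(children.get(cat, ()))
    # A loop in the hierarchy can lead back to root; don't count it.
    seen.discard(root)
    return seen

# Hypothetical toy hierarchy, including a loop (Licenses -> GFDL).
children = {
    "GFDL": ["GFDL images", "Licenses"],
    "Licenses": ["GFDL"],
    "GFDL images": ["Flowers"],
    "Flowers": [],
}

counts = {cat: len(descendants(children, cat)) for cat in children}
mean_descendants = sum(counts.values()) / len(counts)
```

On the toy data the loop means "Licenses" picks up everything under "GFDL" as well, which is the same effect that inflates the real counts.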
Limiting the depth of traversal doesn't work, because doing so would result in terrible brokenness: a search for "flowers" could randomly fail to return a particular flower picture just because someone decided to split Category:White Flowers into "Grey flowers" and "light white flowers", thus moving some of the images a level too deep.
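A small sketch of that failure mode, under the assumption that categories and images are held in two plain dicts (the names and the depth limit of 1 are illustrative only):

```python
def images_within_depth(children, images, root, max_depth):
    """Collect images in root and in subcategories at most max_depth levels down."""
    found = set(images.get(root, ()))
    frontier = [root]
    seen = {root}
    for _ in range(max_depth):
        next_frontier = []
        for cat in frontier:
            for child in children.get(cat, ()):
                if child not in seen:
                    seen.add(child)
                    found.update(images.get(child, ()))
                    next_frontier.append(child)
        frontier = next_frontier
    return found

# Before the split: the picture sits one level below "Flowers".
children_before = {"Flowers": ["White flowers"]}
images_before = {"White flowers": ["daisy.jpg"]}

# After the split: same picture, now two levels below "Flowers".
children_after = {
    "Flowers": ["White flowers"],
    "White flowers": ["Grey flowers", "Light white flowers"],
}
images_after = {"Light white flowers": ["daisy.jpg"]}

before = images_within_depth(children_before, images_before, "Flowers", 1)
after = images_within_depth(children_after, images_after, "Flowers", 1)
```

With the same depth limit, `before` contains the picture and `after` does not, even though nothing about the image itself changed.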
Many of the broken linkages aren't especially deep.
This is brokenness which goes far beyond the semantic drift issues that I've argued make implicit category assignment fundamentally flawed (although some of it is through semantic drift: cat:astronomical objects includes everything on earth through a completely useless but piecewise sensible path).
What I need to know is: What do you need from me in order to fix this?
I can provide, via JS insertion of data from a toolserver tool, a list on the category page of all the children of that category. However, with many cats having over 1000 descendants, you might be waiting a while for some pages to load.
Alternatively, I could provide, via the same means, a list of the children of a category at the top of the category (working around the subcats display complaint in bugzilla).
I can also provide reports, such as which of the deepest linkages carry the most children, and which categories have the most descendants.
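As a rough idea of what such a report could look like, here is a sketch that ranks each parent-to-child link by how many categories it pulls in. It assumes the same kind of dict of direct subcategories; the function names and toy data are hypothetical, and recomputing reachability per link like this is only practical on small inputs, so a real report would need something smarter.

```python
def reach(children, start):
    """Return start plus everything reachable below it (cycle-safe)."""
    seen, stack = set(), [start]
    while stack:
        cat = stack.pop()
        if cat not in seen:
            seen.add(cat)
            stack.extend(children.get(cat, ()))
    return seen

def heaviest_links(children, top=10):
    """Rank parent -> child links by the number of categories they carry."""
    rows = []
    for parent, kids in children.items():
        for child in kids:
            rows.append((len(reach(children, child)), parent, child))
    # Largest first; ties broken by name so the output is stable.
    rows.sort(key=lambda r: (-r[0], r[1], r[2]))
    return rows[:top]

# Hypothetical toy data echoing the astronomical-objects path.
children = {
    "Astronomical objects": ["Planets"],
    "Planets": ["Earth"],
    "Earth": ["Continents", "Oceans"],
    "Mars": ["Craters on Mars"],
}

report = heaviest_links(children, top=2)
```

A link near the top of such a list that doesn't make semantic sense is a good candidate for cutting.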
What do we need to fix this? I've already given my solution (apply all categories that make sense to images, and only use the hierarchy to help with finding and suggesting cats).
Some background:
I recently put up a tool which enables very fast intersections of commons categories. The same back end can be used for blindingly fast geographic searches, and text searches. http://tools.wikimedia.de/~gmaxwell/cgi-bin/cattersect.py
It's already a useful tool for finding images, but it's substantially limited by the fact that categories are broken into tiny groups and the tool cannot dynamically walk the category hierarchy.
I've long argued that we need to avoid the category hierarchy issue by applying all applicable categories directly to the image, and reserve the hierarchy for pure category maintenance purposes.
I've realized that despite the merits of my position (argued elsewhere), it simply isn't going to get traction in our community.
So in the interest of making progress I started working on a way to make the tool follow the category hierarchy. I've got something which I think will work fairly well: it doesn't break live update for image data, though it does require batch updates of the category tree data. I would have had it up tonight were our category data not so badly broken.
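To illustrate the general shape of that split (and only the shape; this is a guess at one possible approach, not the tool's actual code), here is a sketch in which the category tree is expanded in a batch step while the image-to-category data stays live. All names and data are hypothetical.

```python
def build_closure(children):
    """Batch step: map each category to itself plus everything below it."""
    cats = set(children) | {c for kids in children.values() for c in kids}
    closure = {}
    for cat in cats:
        seen, stack = set(), [cat]
        while stack:
            cur = stack.pop()
            if cur not in seen:
                seen.add(cur)
                stack.extend(children.get(cur, ()))
        closure[cat] = frozenset(seen)
    return closure

def search(closure, images_in, *categories):
    """Intersect categories, each expanded through the precomputed closure."""
    result = None
    for cat in categories:
        expanded = set()
        # Categories missing from the closure are treated as leaves.
        for sub in closure.get(cat, {cat}):
            expanded |= images_in.get(sub, set())
        result = expanded if result is None else result & expanded
    return result or set()

children = {
    "Flowers": ["White flowers"],
    "White flowers": ["Light white flowers"],
}
images_in = {
    "Light white flowers": {"daisy.jpg"},
    "GFDL": {"daisy.jpg", "map.png"},
}

closure = build_closure(children)          # rebuilt only in batch
first = search(closure, images_in, "Flowers", "GFDL")

# Live image update: no closure rebuild needed.
images_in["White flowers"] = {"lily.jpg"}
images_in["GFDL"].add("lily.jpg")
second = search(closure, images_in, "Flowers", "GFDL")
```

The catch is the one described above: the closure is only as good as the hierarchy it was built from, so a broken tree gives broken search results no matter how fast the intersection is.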