Aerik,
Very interesting, but what puzzling results. I tried intersecting some of the largest categories I could think of and got wildly different timings. The first time I intersected American actors with Living people, it took 1.5 seconds; the next time, it took 0.0011 seconds. What does this mean? Is the long time due to server traffic? Is the short time close to what the query actually took? Are the results being cached? Also, intersecting large categories that have numerous articles in common seems to take longer than intersecting categories of similar size with no articles in common. This is not what I expected. If you are just looking for the first 30 results, I'd expect the first case to be faster. The second has to look through the entire category for a match.
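To make that expectation concrete, here is a rough Python sketch. It is not how the actual implementation works (I don't know what query it runs); it only assumes each category is available as a sorted list of page IDs, and the function name and the made-up ID ranges are purely illustrative.

```python
def first_n_common(a, b, limit=30):
    """Walk two sorted lists of page IDs in step, stopping after `limit` matches.

    Returns (matches, steps), where steps counts loop iterations.
    """
    matches, steps = [], 0
    i = j = 0
    while i < len(a) and j < len(b) and len(matches) < limit:
        steps += 1
        if a[i] == b[j]:
            matches.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches, steps


# Hypothetical data: two categories of 100,000 page IDs each.
# Case 1: every article is in both categories.
overlapping = first_n_common(list(range(100_000)), list(range(100_000)))
# Case 2: no article is in both (even IDs vs. odd IDs).
disjoint = first_n_common(list(range(0, 200_000, 2)), list(range(1, 200_000, 2)))

print(overlapping[1])  # 30 steps: stops as soon as 30 matches are found
print(disjoint[1])     # 199999 steps: walks both lists to the end, finds nothing
```

With this kind of scan, heavy overlap is the cheap case and no overlap is the expensive one, which is the opposite of what I measured. That suggests the real query is doing something different, for example building the full intersection before cutting it down to 30.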
My question to the developers is: what are the criteria for an acceptable solution? How much time can an intersection be allowed to take? How often do we think people will be looking for intersections? Can we save the results of popular intersections (like American people intersected with Actors) so that they are recomputed less frequently?
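As a sketch of what I mean by saving results, something along these lines would do. This is only an illustration under my own assumptions: the class name, the `compute` callback, and the one-hour default are hypothetical, and a real version would presumably live in the database or a shared cache rather than in memory.

```python
import time


class IntersectionCache:
    """Keep intersection results for a while instead of recomputing every time."""

    def __init__(self, compute, max_age_seconds=3600.0, clock=time.monotonic):
        self._compute = compute          # hypothetical function doing the real query
        self._max_age = max_age_seconds  # how stale a saved result may get
        self._clock = clock
        self._entries = {}               # (cat_a, cat_b) -> (saved_at, result)

    def get(self, cat_a, cat_b):
        # Sort the pair so that A-with-B and B-with-A share one entry.
        key = tuple(sorted((cat_a, cat_b)))
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]
        result = self._compute(*key)
        self._entries[key] = (now, result)
        return result
```

The trade-off is that a saved intersection can be out of date by up to `max_age_seconds`, which is the "updated less frequently" part; how much staleness is acceptable is exactly the kind of criterion I am asking about.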
User:Rick Block, User:Radiant! and I have been discussing ways to optimize or limit intersections so that server load is less of a problem and the feature is, hopefully, feasible. Please see our discussions at [[en:User talk:Radiant!]]. Also, in case anyone hasn't seen it, we've done quite a bit of work on possible user interfaces. These are at [[en:Wikipedia:Category intersection]].
There are many ways intersections can be constrained to put less load on the servers. I'm hoping that we can implement this at some level soon. The categorization system is being turned into a database search system. Instead of having broad, rich indexes for each category, everything is being chopped into tiny bits. Look at [[:Category:Battles of the American Civil War]] for an example of this. The battles have been broken into so many tiny pieces that there is no way to browse through all of them. It is becoming harder to browse through many topics, and over-categorization is increasing. Implementing category intersection will mean that users can browse through categories at whatever level of specificity they desire. The longer we wait for this to be implemented, the more broken up categories will become (at least on the English Wikipedia; the German Wikipedia doesn't seem to have this problem).
Thanks, and keep up the great work.
Samuel Wantman [[en:User:SamuelWantman]]
wikitech-l@lists.wikimedia.org