Sorry, took me longer to get around to this than I'd planned. So, I
restored the 6 million row categorylinks table to a local computer for
testing and threw some sql at it. I got mixed results - in the first
pass I'm using my "count... group by" approach and did different
queries to get the pages at the intersection of 2 categories. I used
at least semi-meaningful categories to try to make the testing at
least somewhat representative of real possible usage.
I got several sets of results in under 1 second (the lowest time being
.3 seconds), one query returned in 8 seconds and another in 36
seconds. I'm going to try re-running the same queries after the query
cache is empty (gotta go learn about how to do that) several times to
see what the repeatability is, then see if I can glean what the long
query times correlate to (intuitively I'm guessing the come from the
intersections of large categories, but I haven't tested that yet even
though it's easy to do). I'll publish detailed results with figures
and actual queries once I've got more data. (Plan to do this tonight
or tomorrow night.)
Best Regards,
Aerik