Okay, so we (or those few of us that spoke up about it) generally agree that
category intersections will be popular. I also have the impression that SQL
queries against the existing table will likely not be fast enough (Domas,
Brion, anyone else care to comment more explicitly?) - based on a few tests
on my wiki (only about 25,000 rows in my categorylinks table) the sql query
alone to get the intersection of 2 categories takes about 2/10 of a second,
so certainly for more categories and more rows it will take longer (anyone
have any suggestions on optimising sql queries for use on the existing
table? I tried a couple of methods, and they took the same amount of time).
So, *if* throwing sql at the existing table is too slow (and it looks like
it probably is), then the next alternative is to create records for the
intersections themselves, so the retrieval query is a simple one. This
would make the write time longer, but save a lot on the many reads. So,
here's a table of how many combinations that would have to be written for up
to 10 categories. Anybody have a histogram of how many categories are on a
given page? I'd also guess pages would tend to get *more* categories once
the intersection tool is available.
categories distinct
combinations 1 1 2 3 3 7 4 15 5 31 6 63 7 127 8 255 9 511 10
1023
So you can see that the number of combinations would get to be quite a lot.
I would propose writing them into the categorylinks table, and making the
existing category pages smart enough to deal with them, rather than writing
them into a new table. I think sorting the categories on a given page, then
concatenating the each possible with some character (perhaps a tab or
newline) between them, and then posting them into the database would work.
Only downside I can think of, is I'd guess it would take a *long* time to
post hundreds of records. I guess I'll try it and see. Limiting the number
of categories in an intersection to 3 or 4 would reduce the number of
combinations very significantly.
Any ideas? Feedback?
Best Regards,
Aerik