Domas Mituzas wrote:
Timwi,
Sure, now there's another issue: what happens (well, it already happens with Living People) when people use categories like tags? Simply put, the number of big categories increases.
Obviously I know that, which is why I suggested going for a three-way intersection table right from the start.
When you do an intersection of a small category and a big one, you end up with a nested loop starting on the small one, checking for entry existence in the big one. Then you sort the small resultset. When you do an intersection of two big categories, you end up with a nested loop going over two big tables, and finally a big resultset to sort. Right now we try to avoid sorting for categories at all - we use index-based sorts.
If you already know that having the proper indexes eliminates the need for sorting, why did you first claim that any sorting was necessary?
Additionally, I'm sure you're aware that if you intersect two big categories, you end up looping over only ONE big "table" (by which I think you mean resultset) and looking each value up in the index for the other (which makes its size irrelevant).
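As a minimal sketch of the lookup described above (illustrative Python, not MediaWiki code): one category is scanned in index order and each page ID is probed against the other category. The function name, the sorted list standing in for an index-ordered scan, and the set standing in for the other category's index are all assumptions made for the example.

```python
def intersect_by_probe(outer_sorted_ids, inner_index):
    """Scan one category in index order and probe the other category's index.

    outer_sorted_ids: page IDs of the category being scanned, already in
        index order (hypothetical stand-in for an index-ordered scan).
    inner_index: a set of page IDs (hypothetical stand-in for an index on
        the other category; a set probe is O(1) on average, whereas a real
        B+tree probe would be logarithmic in the index size).
    """
    # Only the outer category is iterated; the inner one is never scanned.
    # The result inherits the outer order, so no separate sort is needed.
    return [page_id for page_id in outer_sorted_ids if page_id in inner_index]


living_people = [3, 8, 15, 21, 42, 57]   # made-up page IDs, in index order
physicists = {8, 42, 99, 1000}           # made-up page IDs

print(intersect_by_probe(living_people, physicists))  # prints [8, 42]
```

The work done is one probe per row of the scanned category, so the cost is driven by the size of the scanned side rather than by the size of the probed side.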
You haven't really explained to me where you see a problem :)
B+Trees are not that good for intersections.
Which is (again) why I suggested storing the intersections directly.
Allow me to repeat that I think the paranoia about the three-way intersection table becoming too large is unfounded.
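For concreteness, here is a rough sketch of what storing the intersections directly could look like. The layout is only an assumption for illustration (rows keyed by a pair of categories plus a page ID, held here in a Python dict); the thread does not spell out the actual schema, and all names below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_intersection_table(page_categories):
    """Precompute, for every pair of categories, the pages that are in both.

    page_categories: dict mapping a page ID to the categories it belongs to
        (hypothetical input shape).
    Returns a dict mapping (cat_a, cat_b), with cat_a < cat_b, to a list of
    page IDs in ascending order.
    """
    table = defaultdict(list)
    # Visiting pages in ascending ID order keeps every stored list sorted.
    for page_id in sorted(page_categories):
        cats = sorted(set(page_categories[page_id]))
        for cat_a, cat_b in combinations(cats, 2):
            table[(cat_a, cat_b)].append(page_id)
    return table


def lookup(table, cat_a, cat_b):
    """Read a stored intersection; nothing is joined or sorted at query time."""
    key = tuple(sorted((cat_a, cat_b)))
    return table.get(key, [])


pages = {
    1: ["Living people", "Physicists"],
    2: ["Living people"],
    3: ["Physicists", "Living people", "Nobel laureates"],
}
table = build_intersection_table(pages)
print(lookup(table, "Physicists", "Living people"))  # prints [1, 3]
```

Under this assumed layout, a page that belongs to k categories contributes k(k-1)/2 rows, so the table's size depends on how many categories each page has rather than on how many pages each category has.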
Timwi