Hi all, I was really hoping to get some feedback on the performance of my proof-of-concept intersections page at http://aerik.com/wikintersections.php - anybody?
This is using the MyISAM table with each page's categories stored as words in a single row, with a fulltext index on that column. It was a bit faster and much more consistent on my local machine, but I'd really like anybody interested in intersections to throw queries at it and beat it up, to see whether this might be an efficient enough solution for prime time.
Of course, some factors are hard to anticipate: if category intersections are adopted and become popular, we will likely see a movement towards more implied categories ("Americans" and "Actors" instead of "American Actors") and fewer deep categories. I can imagine the effect this will have on the index (fewer keywords, each with more entries). I think this works okay, since "+Living_people +Articles_with_unsourced_statements" (two very large categories) performs well.
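For anyone who wants to reproduce a query like "+Living_people +Articles_with_unsourced_statements", here is a rough sketch of how such a boolean-mode fulltext query could be assembled. The table name page_categories and the columns page_id and categories are made up for illustration, not the actual schema behind the page, and the %s placeholder assumes a MySQL driver that uses that parameter style.

```python
def boolean_expr(categories):
    """Build a MySQL boolean-mode expression in which every category is required.

    Spaces become underscores so each category is a single indexed word,
    and a leading '+' marks the word as mandatory.
    """
    terms = []
    for name in categories:
        token = name.strip().replace(" ", "_")
        if token:
            terms.append("+" + token)
    return " ".join(terms)


# Hypothetical table and column names, for illustration only.
INTERSECTION_SQL = (
    "SELECT page_id FROM page_categories "
    "WHERE MATCH (categories) AGAINST (%s IN BOOLEAN MODE) "
    "LIMIT 200"
)


def intersection_query(categories):
    """Return (sql, params) ready to hand to a driver's cursor.execute()."""
    return INTERSECTION_SQL, (boolean_expr(categories),)
```

Passing the expression as a parameter rather than pasting it into the SQL string leaves the quoting to the driver.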
Thanks, Aerik
P.S. I've been testing this by clicking the links, then picking some other existing article from the Wikipedia entry at the previous intersection. There is a lot of noise in my result times. In its pure form I think the approach is good, with results often coming in under 0.5 seconds, but sometimes the same query, or a query of similar complexity, will come in at 2 seconds. I don't know how to extrapolate this to a theory of how it would perform on the live servers.
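
One way to make the noisy timings easier to compare might be to repeat each query several times and look at the median and the slow tail rather than individual runs. This is only a sketch: run_query stands for whatever callable issues the request (an HTTP fetch of the page, or a direct database call), and the 20 repeats and the nearest-rank 95th percentile are arbitrary choices of mine.

```python
import math
import statistics
import time


def time_query(run_query, repeats=20):
    """Call run_query() repeatedly and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Summarize timings by median, nearest-rank 95th percentile, and maximum."""
    ordered = sorted(durations)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }
```

A median near 0.5 seconds with a p95 near 2 seconds would describe the pattern above more precisely than any single run.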