Sorry - here are the queries I was playing with - I had intended to
send a rather more detailed and thought out analysis, but haven't had
time in the last couple of days to play with it much more. These
times are from running the queries on phpMyAdmin on my home computer
that wasn't doing anything else strenuous at the time. As you can
see, I used count... group by instead of join. They could be the
same, inside MySQL, but it doesn't seem like it. I tried doing some
large joins last night, and they never returned any results, but that
could be an instability in my computer - I need to try it again.
Just to get this out of the way: These number are of course, only
relevant to themselves, but I was thinking of comparing the times to
return intersections against the time it takes to do something that is
known on Wikipedia (say, get the first 200 results of the
"Living_People" category?) and try to extrapolate a meaningful
estimate.
Showing rows 0 - 15 (16 total, Query took 0.3316 sec)
SQL query: SELECT cl_sortkey, count( * ) AS catcount
FROM `categorylinks`
WHERE `cl_to` = 'Fantasy_films'
OR `cl_to` = 'Disney_films '
GROUP BY cl_sortkey
HAVING catcount =2
LIMIT 0 , 30
Showing rows 0 - 9 (10 total, Query took 0.4407 sec)
SQL query: SELECT cl_sortkey, count( * ) AS catcount
FROM `categorylinks`
WHERE `cl_to` = 'United_States_Army_soldiers'
OR `cl_to` = 'German-Americans '
GROUP BY cl_sortkey
HAVING catcount =2
LIMIT 0 , 30
Showing rows 0 - 19 (20 total, Query took 0.8908 sec)
SQL query: SELECT cl_sortkey, count( * ) AS catcount
FROM `categorylinks`
WHERE `cl_to` = 'Drama_films'
OR `cl_to` = 'World_War_II_films '
GROUP BY cl_sortkey
HAVING catcount =2
LIMIT 0 , 30
Showing rows 0 - 29 (2,169 total, Query took 36.3109 sec)
SQL query: SELECT cl_sortkey, count( * ) AS catcount
FROM `categorylinks`
WHERE `cl_to` = 'Prisoners_of_war'
OR `cl_to` = 'Living_people'
GROUP BY cl_sortkey
HAVING catcount =2
LIMIT 0 , 30
Showing rows 0 - 1 (2 total, Query took 8.6470 sec)
SQL query: SELECT cl_sortkey, count( * ) AS catcount
FROM `categorylinks`
WHERE `cl_to` = 'Articles_with_unsourced_statements'
OR `cl_to` = 'American_World_War_II_veterans '
GROUP BY cl_sortkey
HAVING catcount =2
LIMIT 0 , 30