Re: [Wikitech-l] Category Intersections - some early testing - Wikitech-l

1 Oct 2006


      Timwi  wrote:
...
...
I thought I had mentioned before (maybe not to you) that this query is
unnecessarily inefficient (because it scans both categories completely).
Why do you use this adventurous approach when there is a much simpler
and more obvious one, which also happens to be faster?
SELECT c.cl_from FROM categorylinks c WHERE c.cl_from IN (
        SELECT c2.cl_from FROM categorylinks c2
        WHERE c2.cl_to='Fantasy_films'
) AND c.cl_to='Disney_films'
Showing rows 0 - 8 (9 total, Query took 0.9231 sec) <-- took longer than the
count/group by query
SQL query: SELECT c.cl_from
FROM categorylinks c
WHERE c.cl_from
IN (
SELECT c2.cl_from
FROM categorylinks c2
WHERE c2.cl_to = 'Fantasy_films'
)
AND c.cl_to = 'Disney_films'
LIMIT 0 , 30
I tried your way before, an a smaller data set, and got basically identical
results for your way and mine.  Trying your way on the larger dataset (copy
of en), it appears it may be slower than mine.  I think I recall that when
you tested it, it was in MSSQL, and not MySQL.  If the efficiency is equal,
I like this approach becuase it scales seasily to intersections of more than
two cateogories.  It looks like the count/group by approach may be faster in
MySQL, based on this example.
Aerik