There are two things I don't understand here. At the test site this query now seems to take around 7 seconds. So have you already implemented this there?
Yes, it's in place already. I tested it on my own server first, as always, but this and other bug fixes that I judge either non-risky or important enough to justify some risk I'm also putting on the live server, so they can get hit with Neil's bots.
Secondly, I don't see why counting links should be easier unless you are doing something really weird like allowing duplicates in the table 'brokenlinks'. You are still using this table, aren't you?
There are no unique keys in the links or brokenlinks tables, so yes, there are "duplicates" if you want to call them that--but they represent real information: if page A links to non-existent page B twice, there will be exactly two "A -> B" records in the table. And of course there are lots and lots of duplicates of the individual "from" and "to" fields, because there have to be.
I could easily enough add a unique key on the combination of the "from" and "to" fields. Is there a good reason to do that? I never access the table by that combination--I only access it by the individual fields (i.e., there's never a query in which both fields appear in the WHERE clause, or in an ORDER BY or GROUP BY).
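To make the duplicates point concrete, here is a small sketch using SQLite from Python in place of MySQL (the column names bl_from/bl_to are my invention, not necessarily the real schema). It shows duplicate rows carrying real link-count information, and how a composite unique key would reject them:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# No unique key: duplicate (from, to) pairs are allowed and carry meaning --
# two identical "A -> B" rows mean page A links to missing page B twice.
cur.execute("CREATE TABLE brokenlinks (bl_from TEXT, bl_to TEXT)")
cur.executemany("INSERT INTO brokenlinks VALUES (?, ?)",
                [("A", "B"), ("A", "B"), ("C", "B")])

# Counting links to B is then a plain row count.
count = cur.execute(
    "SELECT COUNT(*) FROM brokenlinks WHERE bl_to = 'B'").fetchone()[0]
print(count)  # 3

# With a composite unique key, the second "A -> B" insert is rejected,
# and the per-pair multiplicity information is lost.
cur2 = sqlite3.connect(":memory:").cursor()
cur2.execute(
    "CREATE TABLE brokenlinks (bl_from TEXT, bl_to TEXT, "
    "UNIQUE (bl_from, bl_to))")
cur2.execute("INSERT INTO brokenlinks VALUES ('A', 'B')")
duplicate_rejected = False
try:
    cur2.execute("INSERT INTO brokenlinks VALUES ('A', 'B')")
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```

(SQLite enforces UNIQUE the same way MySQL does here, so the behavior carries over.)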
Temp file? Are you now doing the sorting yourself?
MySQL is doing it, but it uses a temp file ("EXPLAIN" on the query says so, anyway). And the error we were getting before was the temp file's size bumping into a disk quota.
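The same effect can be seen with SQLite's EXPLAIN QUERY PLAN (again standing in for MySQL, with assumed column names): sorting or grouping on a column with no usable index forces a temporary structure, which is the in-SQLite analogue of the temp file MySQL's EXPLAIN reports.

```python
import sqlite3

cur = sqlite3.connect(":memory:").cursor()
cur.execute("CREATE TABLE brokenlinks (bl_from TEXT, bl_to TEXT)")

# No index covers the grouping or the sort, so the planner has to
# build a temporary B-tree -- roughly what MySQL reports as
# "Using temporary; Using filesort".
plan = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT bl_to, COUNT(*) FROM brokenlinks "
    "GROUP BY bl_to ORDER BY COUNT(*) DESC").fetchall()
for row in plan:
    print(row[-1])
```

The exact wording of the plan varies by SQLite version, but it will mention a temp B-tree for the GROUP BY/ORDER BY; an index on bl_to would remove the temp step for the grouping, though not for a sort on the computed count.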