[Wikipedia-l] "Wanted pages" fix
Jan.Hidders
hidders at uia.ua.ac.be
Mon Jul 8 21:43:03 UTC 2002
On Mon, Jul 08, 2002 at 01:03:45PM -0700, lcrocker at nupedia.com wrote:
> One of the things I found was that the present query for Wanted pages
> counts only the distinct pages with a broken link to the wanted page,
> even if that page has two or more broken links to the same title. It
> seems to me that's not important--I'd just as soon have it count all
> links so I know how many to fix, and that's just as good a metric
> of "wantedness", I think. And it's not very different in any case--
> multiple broken links to the same title on one page are rare.
>
> Changing it to count all links speeds it up quite a bit (from 30-40
> seconds to 6-7).
There are two things I don't understand here. On the test site this query now
seems to take around 7 seconds, so did you already implement this there?
Secondly, I don't see why counting links should be easier unless you are
doing something really weird like allowing duplicates in the table
'brokenlinks'. You are still using this table, aren't you?
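To make the distinction concrete, here is a minimal sketch of the two ways of counting. It only illustrates the point about duplicates and is not the actual Wikipedia query. The column names bl_from and bl_to, the helper wanted_counts, and the sample rows are assumptions made up for this example; only the table name 'brokenlinks' comes from the thread.

```python
import sqlite3

def wanted_counts(rows):
    """Return (title, distinct_pages, all_links) per wanted title."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: one row per broken link, duplicates allowed.
    conn.execute("CREATE TABLE brokenlinks (bl_from TEXT, bl_to TEXT)")
    conn.executemany("INSERT INTO brokenlinks VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT bl_to,
               COUNT(DISTINCT bl_from) AS pages,  -- distinct linking pages
               COUNT(*) AS links                  -- every broken link row
        FROM brokenlinks
        GROUP BY bl_to
        ORDER BY links DESC, bl_to
        """
    ).fetchall()
    conn.close()
    return result

rows = [
    ("PageA", "Foo"),
    ("PageA", "Foo"),  # PageA links to the missing title Foo twice
    ("PageB", "Foo"),
    ("PageB", "Bar"),
]
print(wanted_counts(rows))
```

The two counts differ only for titles that have duplicate rows. If the table held at most one row per (bl_from, bl_to) pair, both columns would always be equal, which is why a speed difference between them suggests duplicates are being stored.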
> Also, I'm throwing away all wanted pages with only
> a single link--that reduces the size of the temp file needed for
> sorting by number of links.
Temp file? Are you now doing the sorting yourself?
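For reference, dropping the single-link titles and sorting can both be left to the database. The sketch below assumes that approach; the thread does not say where the sorting actually happens. The column names, the helper wanted_pages, and its min_links parameter are hypothetical, as above.

```python
import sqlite3

def wanted_pages(rows, min_links=2):
    """Return (title, links) for titles with at least min_links broken links."""
    conn = sqlite3.connect(":memory:")
    # Assumed schema: one row per broken link.
    conn.execute("CREATE TABLE brokenlinks (bl_from TEXT, bl_to TEXT)")
    conn.executemany("INSERT INTO brokenlinks VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT bl_to, COUNT(*) AS links
        FROM brokenlinks
        GROUP BY bl_to
        HAVING COUNT(*) >= ?        -- discard titles below the threshold
        ORDER BY links DESC, bl_to  -- the database sorts what is left
        """,
        (min_links,),
    ).fetchall()
    conn.close()
    return result
```

With HAVING applied before ORDER BY, only the surviving groups need to be sorted, which is the same saving as the smaller temp file described in the quoted text.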
-- Jan Hidders