On 10/5/05, Neil Harris <usenet(a)tonal.clara.co.uk> wrote:
Neil Harris wrote:
The corpus-based measures are particularly interesting; they mean we
don't need to bug Google for a million search keys.
Although if anyone from Google is monitoring this list, and wants to
give me a Google Account with 1.25M search keys, I'd be happy to set off
the appropriate script... or send it to you to run.
In any case, the number of results reported is a very approximate
estimate. See, for example,
http://blog.outer-court.com/archive/2005-02-08.html#n72
I think it'd be much easier to use a standard measure of usefulness:
look at the access logs on Wikipedia's end. If article A gets twice as
many hits per day as article B, it seems natural that someone would be
twice as likely to look it up in a paper-based encyclopedia. (There are
certainly exceptions, like hot news stories or controversial topics
during a revert war, but I think it'd take you a long way...)
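
To make the access-log idea concrete, here is a rough sketch of the counting step. It assumes a hypothetical log format in which each line contains a request like "GET /wiki/Article_Title HTTP/1.0"; the real server logs may well look different, and the names count_hits and rank_articles are made up for illustration.

```python
import re
from collections import Counter

# Assumption: each log line contains a quoted request such as
#   "GET /wiki/Article_Title HTTP/1.0"
# This pattern is hypothetical and would need adapting to the real logs.
WIKI_REQUEST = re.compile(r'"GET /wiki/(\S+) HTTP')


def count_hits(log_lines):
    """Count how many requests each article title received."""
    hits = Counter()
    for line in log_lines:
        match = WIKI_REQUEST.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits


def rank_articles(hits):
    """Return (title, share of all counted hits) pairs, most-hit first."""
    total = sum(hits.values())
    return [(title, n / total) for title, n in hits.most_common()]
```

Feeding a day's worth of log lines through count_hits and then rank_articles gives each article's share of traffic, which could be used directly as the "twice as likely" weighting. Averaging over many days would presumably dampen the news-story and revert-war spikes.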
I like Neil's list too, but that, as they observed, is a lot more work.
-- Evan, monitoring this list :)