[WikiEN-l] Stats on most frequently linked sites

Anthony wikilegal at inbox.org
Tue May 1 13:17:46 UTC 2007


On 5/1/07, Angela <beesley at gmail.com> wrote:
> > http://www.online-utility.org/wikipedia/top_500_websites_wikipedia.jsp
>
> Unless I'm misreading something, that's not even close to accurate.
> They claim Amazon had only 1721 links in Wikipedia in November 2006,
> but according to Special:Linksearch they had over 19000 links a month
> before that and they have over 25000 now.
>
How do you get Special:Linksearch to give you the count?  Just
increase the view size until it fits?

I parsed the external links table (streaming it through zcat, as it's
over 2 GB uncompressed), and managed to extract 14181297 links before
my script broke.  I realized a problem though: I'm really only
interested in links from the article namespace, so I've gotta download
and parse another table to get that.  Anyway, by my count
www.amazon.com had 22985 links, so I guess my script broke before it
got them all.  If you want any more information, contact me.  Here's
my table of the top 20:

6122405 | en.wikipedia.org
0642644 | www.google.com
0349654 | wikimediafoundation.org
0322938 | tools.wikimedia.de
0155488 | www.britannica.com
0121251 | www.bartleby.com
0110458 | encarta.msn.com
0108980 | scienceworld.wolfram.com
0095577 | www.imdb.com
0073350 | maps.google.com
0064746 | creativecommons.org
0061350 | www.rhaworth.myby.co.uk
0057467 | news.bbc.co.uk
0056602 | local.live.com
0051487 | www.nlm.nih.gov
0044527 | www.findagrave.com
0038422 | babelfish.altavista.com
0034436 | www.wikimapia.org
0033752 | terraserver-usa.com
0031066 | topozone.com
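
For anyone who wants to reproduce a count like the one above, here is a
minimal Python sketch of the general approach (stream the compressed dump,
pull out each link target, tally by hostname). It is not the script that
produced the table. It assumes the dump is a gzipped SQL file whose rows look
like (el_from,'el_to','el_index'); the function names, the regular expression,
and the optional page_ids filter are all hypothetical. The set of
article-namespace page IDs would have to come from a separate table and is
not built here.

```python
import gzip
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed row shape inside each INSERT statement: (el_from,'el_to','el_index').
# Quoted fields may contain backslash-escaped characters.
ROW_RE = re.compile(r"\((\d+),'((?:[^'\\]|\\.)*)','((?:[^'\\]|\\.)*)'\)")


def iter_urls(lines, page_ids=None):
    """Yield el_to URLs from SQL dump lines, one line at a time.

    If page_ids is given, only rows whose el_from is in that set are kept
    (for example, a set of article-namespace page IDs built elsewhere).
    """
    for line in lines:
        if not line.startswith("INSERT INTO"):
            continue
        for match in ROW_RE.finditer(line):
            if page_ids is not None and int(match.group(1)) not in page_ids:
                continue
            yield match.group(2)


def count_hosts(urls):
    """Tally URLs by hostname, skipping anything that does not parse."""
    counts = Counter()
    for url in urls:
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue
        if host:
            counts[host] += 1
    return counts


def top_hosts_from_dump(path, n=20, page_ids=None):
    """Stream a gzipped dump from disk and return the n most linked hosts."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as fh:
        return count_hosts(iter_urls(fh, page_ids)).most_common(n)
```

Reading the file through gzip.open plays the same role as piping it through
zcat: the dump is never fully decompressed on disk or held in memory, and
only the per-host counter grows.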


