On Wed, 25 Aug 2010, Jodi Schneider wrote:
> Ed Summers has done some nice analysis of the top
> hosts referenced in article space, based on SQL dumps:
> http://inkdroid.org/journal/2010/08/25/top-hosts-referenced-in-wikipedia-pa…
> People with more in-depth knowledge might make something of this -- for
> instance the importance of bots in external links, or the prevalence of
> certain types of information.

A few weeks ago I looked at the hosts in the cite news template, and I get
results in line with Ed Summers', with news.bbc.co.uk at the top and the
New York Times second:

Rank  Count  Host
   1  14443  news.bbc.co.uk
   2   3224  www.nytimes.com
   3   2729  query.nytimes.com
   4   2675  www.washingtonpost.com
   5   1838  www.cnn.com
   6   1781  www.guardian.co.uk
   7   1584  www.time.com
   8   1443  www.telegraph.co.uk
   9   1420  www.smh.com.au
  10   1278  www.usatoday.com
  11   1198  www.abc.net.au
  12   1119  www.variety.com
  13   1026  select.nytimes.com
  14   1006  www.theage.com.au
  15   1005  www.timesonline.co.uk
  16    987  www.sfgate.com
  17    975  sports.espn.go.com
  18    969  www.msnbc.msn.com
  19    913  findarticles.com
  20    904  www.news.com.au

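
For anyone who wants to try a similar count, here is a minimal sketch in Python. It is not the script behind the numbers above; it assumes you already have article wikitext as a string, and the function name count_cite_news_hosts and both regular expressions are illustrative. The template regex is a simplification that skips cite news templates containing nested templates.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Simplified match for {{cite news ...}}; templates that contain nested
# {{...}} templates are skipped (an assumption made to keep the sketch short).
CITE_NEWS = re.compile(r"\{\{\s*cite[ _]news\b([^{}]*)\}\}", re.IGNORECASE)

# The url= parameter only; "|archiveurl=" does not match because the
# parameter name must follow the pipe directly (ignoring whitespace).
URL_PARAM = re.compile(r"\|\s*url\s*=\s*([^|\s}]+)", re.IGNORECASE)


def count_cite_news_hosts(wikitext):
    """Count host names of url= parameters in cite news templates."""
    hosts = Counter()
    for template in CITE_NEWS.finditer(wikitext):
        match = URL_PARAM.search(template.group(1))
        if match is None:
            continue
        try:
            host = urlsplit(match.group(1)).hostname
        except ValueError:
            # Malformed URL; ignore it rather than abort the whole count.
            continue
        if host:
            hosts[host] += 1
    return hosts


if __name__ == "__main__":
    # Hypothetical sample text, not taken from any real article.
    sample = (
        "{{cite news |url=http://news.bbc.co.uk/1/hi/a.stm |title=A}} "
        "{{Cite news|title=B|url=http://www.nytimes.com/b.html}} "
        "{{cite web |url=http://example.org/c |title=C}}"
    )
    for rank, (host, n) in enumerate(count_cite_news_hosts(sample).most_common(), 1):
        print(rank, n, host)
```

Note that this counts full host names, so www.nytimes.com, query.nytimes.com and select.nytimes.com are kept apart, as in the list above.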
There is a short commentary here:
http://fnielsen.posterous.com/top-news-cites-referenced-from-wikipedia
/Finn
___________________________________________________________________
Finn Aarup Nielsen, DTU Informatics, Denmark
Lundbeck Foundation Center for Integrated Molecular Brain Imaging
http://www.imm.dtu.dk/~fn/ http://nru.dk/staff/fnielsen/
___________________________________________________________________