Re: [Analytics] Backlinks TO Wikipedia

3 Dec 2015

As Greg said, Common Crawl is an excellent data source for answering these questions; see:

http://blog.commoncrawl.org/2015/04/announcing-the-common-crawl-index/
http://blog.commoncrawl.org/2015/02/wikireverse-visualizing-reverse-links-w…
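
As a rough illustration of the kind of processing involved (a minimal sketch, not how WikiReverse or Common Crawl actually do it), the snippet below counts links into wikipedia.org from a set of already-fetched pages and keeps the anchor text. The names WikipediaLinkParser and count_inlinks are hypothetical, and the input is assumed to be plain (page_url, html) pairs; reading real crawl archives is left out.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


def is_wikipedia_host(host):
    """True for wikipedia.org and any of its subdomains (e.g. en.wikipedia.org)."""
    return host == "wikipedia.org" or host.endswith(".wikipedia.org")


class WikipediaLinkParser(HTMLParser):
    """Hypothetical helper: collect (target URL, anchor text) for links into Wikipedia."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # absolute URL of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            host = urlsplit(self._href).hostname or ""
            if is_wikipedia_host(host):
                self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def count_inlinks(pages):
    """Count links per (source host, target URL), skipping pages hosted on Wikipedia.

    `pages` is assumed to be an iterable of (page_url, html) pairs.
    """
    counts = Counter()
    for page_url, html in pages:
        source_host = urlsplit(page_url).hostname or ""
        if is_wikipedia_host(source_host):
            continue  # only non-Wikipedia pages count as external inlinks
        parser = WikipediaLinkParser(page_url)
        parser.feed(html)
        parser.close()
        for target, _anchor in parser.links:
            counts[(source_host, target)] += 1
    return counts
```

Keying the counts by source host makes it easy to aggregate at the domain level afterwards. Raw counts like these are not weighted by the rank or popularity of the linking page.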

For aggregate stats about referrals to individual articles, by traffic and
aggregated at the domain level, you may also be interested in this dataset:

http://figshare.com/articles/Wikipedia_Clickstream/1305770

  On Dec 2, 2015, at 8:06 AM, Greg Lindahl <lindahl(a)pbm.com> wrote:

 On Tue, Dec 01, 2015 at 07:50:23PM +0100, Federico Leva (Nemo) wrote:

  Edison Nica, 29/11/2015 16:56:
  how many non-wikipedia pages point to a certain wikipedia page

 I guess the only way we have to know this (other than grepping
 request logs for referrers, which would be quite a nightmare) is to
 access the Google Webmaster account for wikipedia.org (to which a
 couple employees had access, IIRC).

 There are a couple of other ways to figure out inlinks:

 * Common Crawl
 * Commercial SEO services like Moz or Ahrefs

 In the medium term the Internet Archive is going to be generating this
 kind of link data as part of the Wayback Machine search engine effort.

 And finally, Edison, counting the number of inlinks without
 considering their rank or popularity will probably leave you
 vulnerable to people orchestrating googlebombs. And you might want to
 also know the anchortext, that's extremely valuable for search
 indexing.

 -- greg

 _______________________________________________
 Analytics mailing list
 Analytics(a)lists.wikimedia.org
 https://lists.wikimedia.org/mailman/listinfo/analytics 

Dario Taraborelli  Head of Research, Wikimedia Foundation
wikimediafoundation.org • nitens.org • @readermeter
