Wikimedia-l September 2018

wikimedia-l@lists.wikimedia.org

72 participants
51 discussions

Re: [Wikimedia-l] Responses to en-wp sourcing question
by sashi 01 Sep '18

Hi Adam, Pine, Robert,

Thanks for the suggestions! In particular, Adam's link to Ford et al., where I read:

-- We used Apache's Map Reduce framework on Amazon's Elastic Map Reduce (EMR) cloud computing infrastructure to efficiently extract the history of references to all articles.

That sounds like power tools! I've been using more of a clunky bucket chain procedure which only captures part of what has "stuck" in the river. (They downloaded a corpus with all its deletion history.) We do reach some of the same conclusions, but the data are very different... (Twitter & Facebook weren't quite as weighty back in 2012, for example). That said, I suspect it would be much wiser to work on a database dump as they have.

The classified version (linked below) is getting more interesting now. Left papers often do better than their circulation figures would suggest, with Brazil & Germany being the notable exceptions. In any case, what's very clear is that on en-wp, *Pitchfork* does much better than the *Poetry Foundation*.

>> http://www.creoliste.fr/docs/WikiInSources_cat.pdf <<

Not to worry, Robert, *Wikipediocracy* barely makes the list...

I'll have a look at the research mailing list once I've finished exploring Adam's suggestion, Pine.

Thanks to the three of you for taking the time to respond!

sashi
