Re: [Wikiquality-l] Using PageRank to ascertain quality (Foundation help needed!)

9 Nov 2007

As another note, quality is typically associated with something of
high value. The problem is that value is a vague and subjective
concept. Determining the value of a particular object or page to a
particular individual is effectively impossible. However, there are
numerous proxy features for value. One such feature is popularity, and
that's what the PageRank algorithm gets at. Another such feature is the
number of readers who navigate to a page. This is somewhat like
popularity, except that it also encompasses the many-eyes principle:
the more people see an article, the better (i.e. the higher in quality)
it will become.

Of course this may be disputed, but if you think value is what you're
really trying to get at, a potential direction is to go with page views
(as do Priedhorsky et al. in "Creating, Destroying, and Restoring Value
in Wikipedia" -- http://tinyurl.com/269lpq ).

ivan.

On Nov 9, 2007, at 2:08 AM, John Erling Blad wrote:

...
  Interesting reading!

 I believe that a correct evaluation of article quality must be combined
 with the writer's reputation, and most likely also with how the writer
 interacts with other users. The article itself also doesn't exist in a
 vacuum, as you suggest in the final notes about the PageRank algorithm.
 Incoming links are very useful for evaluating article quality, but it
 takes time for them to emerge. It is therefore highly likely that it
 will be necessary to use different approaches to assess article
 quality, depending not only on the category of the article but also on
 its age.

 A lot of those measures will interact. For example, person A writes an
 article but has previously written articles that don't rate as very
 good due to factual errors. He does, however, write good English (most
 likely, that's not me.. ;) Now person B writes rotten English (oh,
 that's me!) but writes factually correct articles. Because of his very
 bad English, other contributors revert his edits or rewrite them. Both
 of these two persons will rank very badly, and their articles even
 worse. Still, when they team up they can produce excellent articles.

 When I first started to look into estimating writers' reputation and
 article quality, I expected to find some fairly obvious features to
 use. What I did find was that there were several connected systems, and
 that all of them (at least the most prominent ones) should be taken
 into account. Even so, there will be a fairly large number of erroneous
 classifications.

 John E

 Brian wrote:
 Several collaborators and I are preparing to expand on previous work
 to automatically ascertain the quality of Wikipedia articles on the
 English Wikipedia (presented at Wikimania '07 [0]). PageRank is
 Google's hallmark quality metric, and the foundation actually has
 access to these numbers through the Google Webmaster Tools website. If
 a foundation representative were to create a Google account and verify
 that they were a "webmaster," they could download the PageRank for
 every article on the English Wikipedia in a convenient tabular format.
 This data would likely serve as a fantastic predictor. I would also
 like to compare the Google-computed PageRank to the PageRank computed
 via Wikipedia's internal link structure. I don't see any privacy
 implications in releasing this data. It also doesn't seem to help
 spammers much, as they already know the pages that have a very high
 PageRank, and we include rel="nofollow" on outbound links.
 Nonetheless, I would of course be willing to keep the data private.
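
 For readers unfamiliar with the idea, computing PageRank from an
 internal link structure can be sketched roughly as below. This is only
 a minimal illustration by power iteration, not the method used in the
 work cited above: the `pagerank` function, the damping factor of 0.85,
 the iteration count, and the `toy_links` graph are all assumptions
 made up for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links: dict mapping a page title to the list of titles it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution over all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            p: (1.0 - damping) / n + damping * dangling / n for p in pages
        }
        # Each page passes an equal share of its rank to every link target.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-article link graph, for illustration only.
toy_links = {
    "Physics": ["Mathematics", "Energy"],
    "Mathematics": ["Physics"],
    "Energy": ["Physics"],
    "Stub": ["Physics"],
}
ranks = pagerank(toy_links)
for title in sorted(ranks, key=ranks.get, reverse=True):
    print(title, round(ranks[title], 3))
```

 In this toy graph the article with the most incoming links ends up
 ranked highest and the article with none ranked lowest, which also
 shows the weakness noted earlier in the thread: a new article has no
 incoming links yet, whatever its quality.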

 This would only take a few minutes if it were approved. Is anyone out
 there who has the power to make it happen?

 Cheers :)
 Brian

 [0]
 http://upload.wikimedia.org/wikipedia/wikimania2007/d/d3/RassbachPincockMingus07.pdf

 ------------------------------------------------------------------------

 _______________________________________________
 Wikiquality-l mailing list
 Wikiquality-l(a)lists.wikimedia.org
 http://lists.wikimedia.org/mailman/listinfo/wikiquality-l

