Wikiquality-l November 2007

wikiquality-l@lists.wikimedia.org

7 participants
3 discussions

Implicit vs Explicit metadata
by Waldir Pimenta 02 Dec '07

I am sure this has already been discussed, but just in case, here are my two cents:

The post at http://breasy.com/blog/2007/07/01/implicit-kicks-explicits-ass/ explains why implicit metadata (like Google's PageRank) is better than explicit metadata (like Digg votes). Applying the comparison to Wikimedia, I'd say that Prof. Luca's trust algorithm is a more reliable way to determine the quality of an article's text than the Flagged Revision Extension. However, the point of the latter is to provide a stable version to readers who choose it, while the former shows the degree to which the information can be trusted but still displays the untrusted text.

What I'd like to suggest is a filter based on the trust calculations of Prof. Luca's algorithm, which would use the editors' calculated reliability to automatically choose which revision of an article to display. It could be implemented in three ways:

1. Show the last revision of an article made by an editor whose trust score is higher than a value provided by the reader. The trusted editor implicitly sets a minimum quality flag on the article by saving a revision without changing other parts of the text. This is the simplest approach, but it doesn't prevent untrusted text from showing up if the trusted editor leaves untrusted parts of the text unchanged.

2. Filter the full history. The idea is to show only the parts of the article written by users whose trust score is higher than the value provided by the reader. This would work like Slashdot's comment filtering system, for example. This is clearly the most complicated approach, since it would require an automated conflict resolution system, which might not be possible.

3. A mixed option: hide revisions by editors whose trust value is below the chosen threshold, going as far back in the article history as possible until a content conflict is found.

Instead of trust values, this could also work by setting the threshold just above unregistered users or newbies (I think this is approximately equivalent to accounts younger than 4 days).

Anyway, these are just rough ideas, on which I'd like to hear your thoughts.
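To make the first option concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the Revision record, its editor_trust field and the latest_trusted_revision function are hypothetical names, not part of Prof. Luca's implementation or of MediaWiki, and the trust scores are assumed to be already computed per editor.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Revision:
    """Hypothetical revision record; field names are assumptions."""
    rev_id: int
    editor: str
    editor_trust: float  # assumed precomputed reliability of the editor
    text: str


def latest_trusted_revision(
    history: Sequence[Revision], threshold: float
) -> Optional[Revision]:
    """Option 1: return the newest revision whose editor's trust score
    is higher than the reader's threshold, or None if there is none.

    `history` is assumed to be ordered from oldest to newest.
    """
    for rev in reversed(history):
        if rev.editor_trust > threshold:
            return rev
    return None


# Example history with made-up editors and scores.
history = [
    Revision(1, "alice", 0.9, "first draft"),
    Revision(2, "bob", 0.2, "first draft plus dubious claim"),
    Revision(3, "carol", 0.7, "copyedit, dubious claim left in"),
    Revision(4, "anon", 0.1, "vandalised"),
]

shown = latest_trusted_revision(history, threshold=0.5)
```

With a threshold of 0.5 the reader is shown revision 3, which still contains the text added in revision 2. That is exactly the weakness of option 1 described above. The same function would cover the registered-user variant if editor_trust were replaced by, say, a flag or an account age.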

Using PageRank to ascertain quality (Foundation help needed!)
by Brian 09 Nov '07

Several collaborators and I are preparing to expand on previous work to automatically ascertain the quality of articles on the English Wikipedia (presented at Wikimania '07 [0]).

PageRank is Google's hallmark quality metric, and the Foundation actually has access to these numbers through the Google Webmaster Tools website. If a Foundation representative were to create a Google account and verify that they were a "webmaster," they could download the PageRank for every article on the English Wikipedia in a convenient tabular format. This data would likely serve as a fantastic predictor. I would also like to compare the Google-computed PageRank to the PageRank computed from Wikipedia's internal link structure.

I don't see any privacy implications in releasing this data. It also doesn't seem to help spammers much, as they already know which pages have a very high PageRank, and we include rel="nofollow" on outbound links. Nonetheless, I would of course be willing to keep the data private.

This would only take a few minutes if it were approved. Is there anyone out there with the power to make it happen?

Cheers :)
Brian

[0] http://upload.wikimedia.org/wikipedia/wikimania2007/d/d3/RassbachPincockMin…
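For the comparison with a PageRank computed from the internal link structure, a rough sketch of the standard power-iteration method is given below. It is an illustration only: the pagerank function, the damping factor of 0.85, the fixed iteration count and the three-page toy graph are all assumptions, and a real run would need the full wikilink graph from a database dump.

```python
from typing import Dict, List


def pagerank(
    links: Dict[str, List[str]],
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Power-iteration PageRank over a dict mapping each page to the
    pages it links to. Returned scores sum to 1."""
    # Collect every page, including ones that only appear as link targets.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Toy link graph standing in for internal wikilinks.
toy_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_links)
```

In the toy graph, C ends up ranked highest because both A and B link to it, and B lowest because it receives only half of A's share. The resulting scores could then be set side by side with the Google-computed values, for instance with a rank correlation.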

Technical report available on Wikipedia trust coloring
by Luca de Alfaro 02 Nov '07

Dear All,

We have finally finished a technical report describing in detail the algorithms we use for the Wikipedia trust coloring (see http://trust.cse.ucsc.edu/). The report not only gives the algorithms, but also describes several ways to quantify notions of text trust for Wikipedia, and provides detailed quantitative results on the quality of our coloring.

The technical report is available here: http://www.soe.ucsc.edu/~luca/papers/07/trust-techrep.html

We would of course appreciate comments and feedback.

I wish a happy weekend to you all,
Luca
