[Wikipedia-l] trust metrics

Ray Saintonge saintonge at telus.net
Sat Feb 14 18:58:33 UTC 2004


Jimmy Wales wrote:

>One system that I think does actually work, though, is the system at
>Ebay.  At Ebay, after every transaction, buyers and sellers can leave
>feedback for each other, positive or negative.  Whenever people want
>to buy or sell something, they can look at the feedback of the
>potential counterparty and see how much total feedback there is, how
>many positive, how many negative.
>
>For us, we have nothing like a 'transaction', but the system could be
>generalized from that.  Each person could assign positive or negative
>feedback to others.  Just a simple 'up' or 'down' rating with an
>optional comment.  Some people might give an 'up' rating to everyone,
>some might give a 'down' rating to everyone, some might just abstain
>totally.  
>
In eBay you can only give feedback if you are directly involved in the 
transaction.  There is no such thing as third-party feedback.

>But most would adopt a personal policy of giving mostly positives or
>abstaining, reserving negatives for worst case scenarios.
>
The eBay statistical norm for feedback ratings would appear to be in the 
high 90 percent range.  Abstentions are not, and perhaps cannot be, 
counted.  If this kind of thing were implemented, I suspect that the 
statistical norm for us would be a lot lower.

>Newcomers would have no rating at all, obviously.  Very prominent
>people would have lots of ratings, mostly positive I would have to
>assume.  I would probably have 95% positive rating, but not perfect,
>since beloved though I am and obviously deserve to be (*wink*), I am a
>target.
>
Hmmm!  Ego may be a negative factor in attributing ratings. :-)

>We'd likely see perfect positive ratings for people like Michael
>Hardy, who keeps his nose to the grindstone editing topics that aren't
>controversial, and who stays out of internal politics almost
>completely as far as I know.
>
People like this don't get noticed.  That's the one thing that may tend 
to keep their ratings low.

>Some sysops have taken enormous and weighty responsibilities on
>themselves to do important but controversial work like VfD or banning
>trolls or mediating disputes or editing articles about the Middle
>East.  We'd naturally expect them to get mixed reviews, but we might
>be surprised... lots of people would give them positive ratings just
>for doing those jobs, acknowledging the difficulty and risk involved.
>
Agreed.

>Some virtues of this concept:
>
>1.  Easy to program... it's just a single table in the database,
>feedback, with 5 fields -- 'from', 'to', 'up/down', 'comment',
>'timestamp'.  (The timestamp is so old feedback can be expired.  Maybe
>positive votes expire after 1 year, negatives after 1 month, so as to
>encourage more positivity!)
>
eBay feedback never expires, but it is aged.  If a person has had a 
period of significant negative feedback, one can see whether it is 
recent or more than a year old.  It is also important that the person 
giving the feedback is identified.  The person receiving negative 
feedback then has the opportunity to reply to the feedback, and perhaps 
give negative feedback in return.
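
For concreteness, here is a rough sketch of what the single feedback 
table described in point 1 above, together with the proposed expiry 
rule, might look like.  The column names, the sqlite3 usage, and the 
exact expiry logic are only my own illustration of the idea, not 
anything that exists in MediaWiki:

import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE feedback (
        from_user TEXT NOT NULL,     -- who gave the rating
        to_user   TEXT NOT NULL,     -- who received it
        rating    INTEGER NOT NULL,  -- +1 (up) or -1 (down)
        comment   TEXT,              -- optional comment
        timestamp TEXT NOT NULL      -- ISO date, so old feedback can expire
    )
""")
conn.execute("INSERT INTO feedback VALUES ('alice', 'bob', 1, 'helpful', '2004-01-10')")
conn.execute("INSERT INTO feedback VALUES ('carol', 'bob', -1, 'edit war', '2003-06-01')")

def current_score(to_user, today):
    """Net score under the proposed expiry: positives last 1 year, negatives 1 month."""
    score = 0
    for rating, ts in conn.execute(
            "SELECT rating, timestamp FROM feedback WHERE to_user = ?", (to_user,)):
        age_days = (today - date.fromisoformat(ts)).days
        if age_days <= (365 if rating > 0 else 30):
            score += rating
    return score

print(current_score("bob", date(2004, 2, 14)))  # 1: the old negative has expired

Aging rather than expiring, as eBay does, would just mean reporting the 
score in recent and older buckets instead of dropping old rows.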

>2.  Difficult to game -- it is NOT automated, so there's no way to
>game the system by engaging in repetitive actions to score points.
>
I think that it would NEED to be automated to work.  We already track 
the number of edits that a person has made.  Perhaps the number of 
reverts that a person receives could be an automated negative.  This 
could quickly put our problem people into an overall negative rating. 
It would also have a negative effect on the people who clean up the 
messes, but at the same time it could be an incentive for them to find 
more conciliatory approaches when dealing with problems.  Those who 
perpetually revert without discussion would soon find their ratings go 
down.
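
To make the point concrete, here is one hypothetical way such an 
automated piece might combine with manual up/down feedback.  The 
function name and the one-point-per-revert weighting are my own 
assumptions, purely for illustration:

def overall_rating(manual_ups, manual_downs, reverts_received, revert_weight=1):
    """Net rating: manual feedback minus a penalty for each revert received."""
    return (manual_ups - manual_downs) - revert_weight * reverts_received

# Someone with modest positive feedback but many reverted edits goes negative:
print(overall_rating(manual_ups=10, manual_downs=2, reverts_received=15))  # -7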

>3.  Easy for end users -- no complex system of approving or
>disapproving of individual edits.  You can just give someone a smile
>or a frown, as you wish, when you wish, or not.
>
How often?

>4.  Rewards co-operativeness and friendliness and neutrality, because
>to get a high rating, you have to please lots of people.
>
>5.  It would be a relatively simple matter to also calculate a
>"weighted" score, where the weight is based on the raw scores of those
>who have rated this user.  What I mean is that if someone has a
>perfect positive score, then their impact on the weighted score
>calculations would be higher than the impact of someone with a high
>negative score.  I consider this weighted score to be optional and
>possibly dangerous, but it is at least easy enough to do.
>
In some ways this would be better, but see 2 above.
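
For what it's worth, the weighted score in point 5 could be as simple as 
the following sketch, where each rater's vote counts in proportion to 
that rater's own raw score.  Clamping the weight to a 0..1 range is my 
own assumption; a rater with a negative raw score simply carries no 
weight here rather than a negative one:

def weighted_score(ratings_for_user, raw_scores_of_raters):
    """ratings_for_user: list of (rater, +1 or -1) pairs.
    raw_scores_of_raters: dict mapping each rater to their own raw net score."""
    total = 0.0
    for rater, vote in ratings_for_user:
        raw = raw_scores_of_raters.get(rater, 0)
        weight = min(max(raw / 10.0, 0.0), 1.0)  # new or negative raters carry no weight
        total += weight * vote
    return total

# A vote from a well-rated user outweighs one from a poorly rated user:
print(weighted_score([("alice", +1), ("bob", -1)], {"alice": 12, "bob": -5}))  # 1.0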

>6.  It *might* lead to a lot less demands and bickering.  It isn't
>uncommon for someone to write to the lists or to me personally asking
>that such-and-such prominent wikipedian be banned or desysopped.  This
>is exhausting and divisive.  Possibly instead we could have a system
>where people have a way to express displeasure, and to privately
>advocate that others express displeasure, but in a way that doesn't
>involve long drawn out flamewars.
>
I don't think that we will ever be able to do away with these attempts 
at covert influence.  Some off-list comments are warranted, especially 
when real confidential information is at issue.  If your proposal were 
operational, it might be helpful to reveal the names of unwarranted 
secret accusers, and to give them a negative score for failure to be 
open with the community.  That could happen irrespective of whether 
their accusations toward the other contributor were justified.  That 
would put the cat among the (stool)-pigeons! :-)

>7.  It is reflective of what we actually do in practice, i.e. it's
>just a formalization of how we actually do operate.  Everyone has an
>opinion, and when I think about who is good and prominent, I think
>about my *own* opinion, but I also think about the opinions of
>*others*.  But this can only be guessed at, not measured.  Tim
>Starling held a quick poll on whether 172 should be de-sysopped, and
>when it went heavily in one direction, he followed through.  If the
>system I am talking about were in place, we'd instead just see 172's
>positive rating start to evaporate.
>
I've not had any particular problem with 172, although he can be 
confrontational, and his overall political sympathies are closer to 
mine than to Jimbo's.  I've seen a lot on the list lately about 168. 
I'm even wondering whether Jimbo said 172 when he really meant 168.

That being said, I've had very little contact with 168, and thus would 
not have a valid basis for either approving or disapproving his 
sysophood.  Do I really want to take a large amount of time just to 
review what he has been doing?  I think my time would be better spent 
trying to do something constructive elsewhere, but without taking that 
time I cannot give a fair vote.

>-------
>
>Some possible downsides, and there are many...
>
>1.  Unintended consequences 
>
>2.  People might be dissuaded from taking controversial and brave
>stands, if it's going to get them some negative feedback.
>
>3.  People might be incentivized to create sham accounts just to give
>themselves positive feedback.  
>
>4.  Well, I thought up the system, so I'm having a hard time seeing
>other downsides, but I'm sure they exist.  :-)
>
Probably the biggest downside is the practical one.  Even on eBay, 
people need to be constantly encouraged to leave feedback, and still 
often don't.  Implementing this will mean a lot of extra work for 
everybody.

Philosophically and theoretically I do not oppose such schemes.  My only 
reservation is that it won't work.

Ec





