On 07/14/11 5:56 AM, WereSpielChequers wrote:
Do we have stats yet that measure whether this is
or diverting even more people from improving the pedia to critiquing
It's difficult to see any logical connection between an article rating
system, and encouraging new editors.
Remember there is a risk that this could exacerbate
trend. Just as we need to value edits that fix problems and remove
templates above edits that add to the hundreds of thousands of
maintenance templates on the pedia;
If templates were subject to a similar rating system as articles we
would soon see which are being ignored by users, and are thus of no value.
So we need to value a talkpage
comment that explains why someone has a specific concern about an
article over a bunch of "feedback" that says people like or dislike an
article without indicating why. Better still we should be encouraging
readers to improve articles that they see as flawed.
This dream has been around since the stone age.
So we need to
measure this tool in terms of its success at getting readers to edit,
not in terms of its success at getting readers to rate articles. I
hope it is successful, and I'm happy to take the long view and measure
a trial over months to see how effectively we convert article raters
into article editors.
I seriously doubt that it will head in that direction.
But we do need to be prepared to remove this if
it has a net effect of diverting potential editors into merely rating
articles for others to fix.
It's not a problem if they do. If many readers do this for a single
article, that article is worth paying attention to. A claim from a
single person can be suspected of being eccentric.
We also need to be careful how we compare this 374k to
"90%", not least because with 3,682,158 articles on En wiki as I
write, 374k is about 6k more than a random 10% sample would be.
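The arithmetic behind that comparison can be checked with a short sketch. It assumes "374k" means exactly 374,000 articles, which is a rounding the post does not confirm; the variable names are illustrative only.

```python
# Figures quoted in the post; TRIAL_ARTICLES is the rounded "374k" (an assumption).
TOTAL_ARTICLES = 3_682_158
TRIAL_ARTICLES = 374_000

# Size of an exact 10% sample, and how far the trial set exceeds it.
ten_percent = TOTAL_ARTICLES / 10
excess = TRIAL_ARTICLES - ten_percent

# Actual share of En wiki covered by the trial set.
share = TRIAL_ARTICLES / TOTAL_ARTICLES

print(f"10% sample: {ten_percent:,.0f} articles")
print(f"excess over 10%: {excess:,.0f} articles")
print(f"trial share: {share:.2%}")
```

Under that rounding the excess comes to a little under 6,000 articles, consistent with the "about 6k" in the post.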
It's all a matter of statistical trends, and for this a 100-point scale
would have been more useful than a 5-point scale. I actually suggested a
10-point scale many years ago. The first statistical measure that should
be developed is a cumulative rating across all articles. Its mean will
be the measure of the average article, and any article falling within a
certain deviation of that mean could be judged average. As overall quality
of WP increases so too will the average rating, but only extremely
slowly. Other measures could be developed from there.
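As a rough sketch of that measure: take each article's mean rating, compute the mean and standard deviation across all articles, and call an article "average" when it falls within a chosen band around the overall mean. The function name `classify_articles`, the one-standard-deviation band, and the sample scores are all assumptions made for illustration, not part of any actual rating tool.

```python
from statistics import mean, pstdev

def classify_articles(article_means, band=1.0):
    """Label each article relative to the cumulative rating of all articles.

    article_means: dict mapping article title -> that article's mean rating.
    band: half-width of the "average" zone in standard deviations (assumed).
    """
    scores = list(article_means.values())
    overall = mean(scores)    # the rating of the "average article"
    spread = pstdev(scores)   # population standard deviation across articles

    labels = {}
    for title, score in article_means.items():
        if score > overall + band * spread:
            labels[title] = "above average"
        elif score < overall - band * spread:
            labels[title] = "below average"
        else:
            labels[title] = "average"
    return overall, spread, labels

# Made-up ratings on a 5-point scale, purely for demonstration.
sample = {"A": 3.0, "B": 3.2, "C": 2.8, "D": 4.9, "E": 1.1}
overall, spread, labels = classify_articles(sample)
print(f"overall mean {overall:.2f}, deviation {spread:.2f}")
for title, label in labels.items():
    print(title, label)
```

A finer scale, such as the 10- or 100-point scale mentioned above, would change only the range of the input scores; the calculation stays the same.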
We also need to learn from one of the lessons of the
where we had a similar rating system. Many of the proposals there had
so few ratings that they were close to being individual views and few
had sufficient responses to be genuinely collective to the point where
one maverick couldn't skew them - even without sockpuppetry. On
average our articles get one or two edits a month, many get far less.
I would not be surprised if 100,000 of the 374k in the trial had fewer
than ten ratings even if trialled for a couple of months.
This isn't a problem either. The number of ratings given is just as
important as what those ratings are. It should be reported right along
with the rating on the article page. Users could then be reminded that a
small number of ratings is just not statistically significant; they
could even be color-coded to that effect. Small samples are also more
volatile. They would easily be driven into the top or bottom decile of
the data, and that alone would bring attention to them.
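One way to picture reporting the count alongside the rating, with a color code for small samples, is sketched below. The thresholds (10 and 100 ratings), the color names, and the function names are hypothetical choices for illustration; nothing in the trial is known to work this way.

```python
def confidence_color(n_ratings, low=10, high=100):
    """Pick a color for the sample size; both thresholds are assumed values."""
    if n_ratings < low:
        return "red"      # too few ratings to mean much
    if n_ratings < high:
        return "amber"    # suggestive, still volatile
    return "green"        # enough ratings to resemble a collective view

def rating_label(average, n_ratings, scale=5):
    """Format the rating together with how many ratings produced it."""
    color = confidence_color(n_ratings)
    return f"{average:.1f}/{scale} ({n_ratings} ratings, {color})"

print(rating_label(4.3, 7))
print(rating_label(3.6, 42))
print(rating_label(3.1, 850))
```

Showing the count next to the score lets a reader discount a rating built on a handful of votes without any ratings having to be hidden or suppressed.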
Lastly we need to be prepared for sockpuppetry,
especially as these
are random unsigned votes with no rationale. Can we have assurances
that something is being built into the scheme to combat this?
This FUD gives undue weight to sockpuppetry or other hostile editing.
Ideally such practices should be marginalised to a point where they
don't matter. Mounting a successful campaign to influence the rating of
an article would take a tremendous amount of sustained effort. I played
with trying to affect the page views of one of the Bomis girl articles
in the early days by going repeatedly to that page; the effects were
minimal. Now, with a much bigger encyclopedic corpus this would be
proportionally more difficult. "Random unsigned votes" are perfectly
consistent with wikiness, and will also trend toward statistical norms.
Building safeguards against agenda-based ratings would be a waste of
time and effort.