I'm not sure the consensus is to reject Erik's idea.
Erik is still for it. And I'd like to see it happen, if we can find a way that's technically feasible and doesn't conflict with Wikipedia's main goals.
Nonetheless, Larry's idea has several advantages:
1. Magnus already wrote some software for it.
2. It doesn't require any change to PediaWiki software.
3. It's troll-proof.
4. We can start testing it any day now.
Erik's idea has my support, plus the support of a few others (sorry, names escape me, it's been a hectic week).
One way to support team certification on Wikipedia is similar to sifter:
* Create certification teams
* Let any team member certify any article
* Display a list of articles, each of which is certified by every member of a given team.
This can be done in software. I've begun sketching out a simple design already. If there's enough interest, we can move discussion to wikitech-l for implementation.
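For the curious, the core logic is tiny. A minimal Python sketch (all names and the data model are hypothetical, not the actual design):

```python
# Sketch of team certification: any member may certify any article;
# an article is listed once *every* member has certified it.
from collections import defaultdict

class CertificationTeam:
    def __init__(self, name, members):
        self.name = name
        self.members = set(members)
        # article title -> set of members who have certified it
        self.certifications = defaultdict(set)

    def certify(self, member, article):
        """Any team member may certify any article."""
        if member not in self.members:
            raise ValueError(f"{member} is not on team {self.name}")
        self.certifications[article].add(member)

    def fully_certified(self):
        """Articles certified by every member of the team."""
        return sorted(a for a, who in self.certifications.items()
                      if who == self.members)

team = CertificationTeam("biology", ["alice", "bob"])
team.certify("alice", "Cell (biology)")
team.certify("bob", "Cell (biology)")
team.certify("alice", "Mitosis")
print(team.fully_certified())  # ['Cell (biology)']
```

The unanimity requirement is just the set comparison in `fully_certified`.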
Ed Poor
"Poor, Edmund W" wrote:
I'm not sure the consensus is to reject Erik's idea.
Erik is still for it. And I'd like to see it happen, if we can find a way that's technically feasible and doesn't conflict with Wikipedia's main goals.
Nonetheless, Larry's idea has several advantages:
- Magnus already wrote some software for it.
- It doesn't require any change to PediaWiki software.
- It's troll-proof.
- We can start testing it any day now.
Erik's idea has my support, plus the support of a few others (sorry, names escape me, it's been a hectic week).
It sounds interesting to me for general use in information wikis but I am not certain I think it is a good idea for the English Wikipedia at the moment.
One way to support team certification on Wikipedia is similar to sifter:
- Create certification teams
- Let any team member certify any article
- Display a list of articles, each of which is certified by every member of a given team.
Unanimity sure slows things down occasionally.
I wonder if implementing a trust metric similar to that used at advogato.org -- but generalized to allow team members to certify/scale each other, and users to set their own evaluation of the team's product -- would work better.
This way, a selection could be reported according to the weight given by a subset of team members, modified by your personal opinion of the team's typical results.
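To make that concrete, here is a hypothetical Python sketch of such a weighting (the 0..1 scales and the simple weighted average are my own assumptions, not Advogato's actual metric):

```python
# Toy weighting: member ratings of an article, weighted by peer-assigned
# trust, then scaled by the reader's personal opinion of the team.
def weighted_score(ratings, peer_trust, team_weight):
    """ratings: member -> rating of the article (0..1)
    peer_trust: member -> trust assigned by the other members (0..1)
    team_weight: reader's personal weighting of this team (0..1)"""
    total = sum(peer_trust[m] for m in ratings)
    if total == 0:
        return 0.0
    avg = sum(r * peer_trust[m] for m, r in ratings.items()) / total
    return avg * team_weight

# alice is fully trusted by her peers, bob only half-trusted;
# the reader rates this team's typical results at 0.8.
score = weighted_score({"alice": 0.9, "bob": 0.6},
                       {"alice": 1.0, "bob": 0.5},
                       team_weight=0.8)
print(score)  # 0.64
```

The point is that a subset of certifiers is enough to produce a ranking, so unanimity is no longer a bottleneck.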
Regards, Mike Irwin
seems we have multiple vandals:
* TMC (though whether it's the same as the TMC we know of old, who knows)
* Wikipedia moderator
* Vandalbot
Recent Changes appears to be swamped with them.
I think the time has come for sysops to be able to ban usernamed users.
tarquin wrote:
seems we have multiple vandals:
- TMC (though whether it's the same as the TMC we know of old, who knows)
- Wikipedia moderator
- Vandalbot
Recent Changes appears to be swamped with them.
I think the time has come for sysops to be able to ban usernamed users.
I have removed two of these usernames, hoping it would break any bot/script running, at least for a while.
Magnus
Tarquin wrote in part:
I think the time has come for sysops to be able to ban usernamed users.
Let me tell y'all how I spent my weekend.
First, Friday morning (15:00 UTC), I got email from lorp@myfonts.com, who identified himself as [[User:Hotlorp]] (which was eventually confirmed when he left a message on my user talk page while signed in). Apparently his IP (194.117.133.198) was blocked. Note that Hotlorp contacted me, not the admins who had blocked him, because I have an email address on my user page, which is advertised on [[Wikipedia:Administrators]]. Since it had been a week and the IP was dynamic, I unblocked it once I got his message, which was around noon (19:00 UTC). I mailed this information to Hotlorp then.
Hotlorp tried again later (around 19:00 UTC), but was on IP 194.117.133.196, which was still blocked. So I unblocked that when I got the message (around 23:00 UTC). Then he came on and made some edits a few hours later (around 4:00 UTC or so), and I went to bed.
While I was sleeping (around 12:00 UTC), a vandal arrived, using bots (apparently) to splatter goatse across Wikipedia. This vandal thwarted our standard blocking mechanisms by using signed-in user names, which were chosen to make fun of us. The vandal made fun of Scipius, who responded to the vandal, but also made fun of people who had nothing to do with responding to the vandalism: Ed Poor, TMC, proposed moderators. These moderators were mentioned only on the mailing lists, and Ed and TMC were also participants in those discussions. Conclusion: the vandal has been reading the mailing lists.
Suspicion of the vandal fell on 194.117.133.196 (and 194.117.133.198), which were the last IPs to be used for goatse and which had (as mentioned) been unblocked less than a day earlier. These IPs had few redeeming edits -- after all, none of Hotlorp's signed-in edits showed up in their user contribs. However, these IPs weren't blocked again right away, and there was no confirmation posted to the VANDALISM page that these IPs were actually being used by the vandal. An hour after arriving, the vandal left. I saw no record of any IPs' being banned at that time.
A couple of hours later (shortly before 15:00 UTC), Koyaanisqatsi blocked these IPs to stop the vandalism. In email with KQ, we've been unable to figure out why the last evidence of the vandalism now on Wikipedia ended hours before the blocking, even though KQ remembers seeing vandalism on Recentchanges just before blocking (and saw that the vandalism stopped after the blocking). Were people no longer reporting the vandal's user names to [[Wikipedia:VANDALISM IN PROGRESS]]? Or was the vandal now creating pages that people were deleting? (But there were no deletions during that time either; the last deletion log entry connected to goatse is 12:55 UTC.) Indeed, AFAICT, there is no particular evidence that this particular vandal was ever using these IPs, only that some goatse vandal (maybe a different person entirely) had used them a week ago.
Whether or not KQ actually blocked the vandal, he did block somebody else. Hotlorp had just returned, and he managed to get in 3 edits before he was blocked again. He emailed me again (he had no contact information for KQ), and I unblocked the IPs when I got the message (just before 24:00 UTC). I watched [[VANDALISM IN PROGRESS]] like a hawk all night, looking for a sign of returning goatse, but there was none, and there has been none since.
The problem, of course, is that we're blocking an innocent user when it's not at all clear that we're even blocking the vandal. And it's a cruel joke to tell the innocent user to contact the admin that blocked them when they have no method of doing so besides editing pages.
Solutions:
* Block more intelligently:
** Let admins see the IP of signed in users. Then we can at least know for sure who to block.
** Let admins whitelist a user name known to use a dynamic IP. (This can always be undone later if abused.)
** Allow admins to see all contributions from a given IP, whether or not they were made anonymously. This will allow us to check for multiple users and give us the opportunity to create the above whitelist at the same time that we block the vandal.
* Give blocked people a way to contact admins:
** At the very least, include a link to [[Wikipedia:Administrators]] in the message telling people that they've been blocked, so that it will be easy for them to get in touch with me.
** Other admins can advertise their email addresses there too. (Risk: I've never yet received inappropriate mail at <toby+wikipedia>, and this is the only case where I've been contacted by a blockee.)
** Set up a mailing list for administrators to take blocking complaints and give blocked people a link to that on the block message. (Same risk as before, and we only need a few admins to sign up.)
* Others?
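To illustrate the last blocking idea, a toy Python sketch (the edit-record format is hypothetical, not the actual database schema):

```python
# Sketch, assuming edits are logged with both the IP and any signed-in
# user name, so admins can query by IP regardless of login state.
edits = [
    {"ip": "194.117.133.198", "user": "Hotlorp", "page": "Typography"},
    {"ip": "194.117.133.198", "user": None,      "page": "Goatse"},
    {"ip": "10.0.0.7",        "user": None,      "page": "Foo"},
]

def contributions_from(ip, log):
    """All edits from an IP, whether or not they were made anonymously."""
    return [e for e in log if e["ip"] == ip]

def users_behind(ip, log):
    """Signed-in accounts seen on this IP -- whitelist candidates."""
    return {e["user"] for e in contributions_from(ip, log) if e["user"]}

print(users_behind("194.117.133.198", edits))  # {'Hotlorp'}
```

With something like this, an admin blocking an IP would see immediately that a good contributor shares it.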
-- Toby
On 11/19/02 2:12 AM, "Toby Bartels" toby+wikipedia@math.ucr.edu wrote:
Solutions:
- Block more intelligently:
** Let admins see the IP of signed in users. Then we can at least know for sure who to block.
** Let admins whitelist a user name known to use a dynamic IP. (This can always be undone later if abused.)
** Allow admins to see all contributions from a given IP, whether or not they were made anonymously. This will allow us to check for multiple users and give us the opportunity to create the above whitelist at the same time that we block the vandal.
- Give blocked people a way to contact admins:
** At the very least, include a link to [[Wikipedia:Administrators]] in the message telling people that they've been blocked, so that it will be easy for them to get in touch with me.
** Other admins can advertise their email addresses there too. (Risk: I've never yet received inappropriate mail at <toby+wikipedia>, and this is the only case where I've been contacted by a blockee.)
** Set up a mailing list for administrators to take blocking complaints and give blocked people a link to that on the block message. (Same risk as before, and we only need a few admins to sign up.)
- Others?
How about not blocking?
The Cunctator wrote:
How about not blocking?
How do you propose we fend off vandals? It sometimes seems there are too many of them and not enough of us to revert their changes.
Granted, the new search feature, which shows only the pages where the user in question was the last to edit, will be extremely useful in cleaning up.
May I suggest another utility to speed up the cleaning process for sysops: one of
* a "revert last edit" link
* a "restore this version" link available when viewing an old version
The Cunctator wrote:
Toby Bartels wrote:
- Block more intelligently:
- Give blocked people a way to contact admins:
- Others?
How about not blocking?
An obvious solution to the problems of this weekend. But I don't think that you'll get the rest of Wikipedia to go along with your suggestion. Do you think that the other ideas will make things better?
-- Toby
Toby Bartels toby+wikipedia@math.ucr.edu wrote:
The Cunctator wrote:
Toby Bartels wrote:
- Block more intelligently:
- Give blocked people a way to contact admins:
- Others?
How about not blocking?
An obvious solution to the problems of this weekend. But I don't think that you'll get the rest of Wikipedia to go along with your suggestion. Do you think that the other ideas will make things better?
How about a sysop being able to prevent more than one save per unit of time (dunno, 2-5 minutes for example) for
- a specific IP
- or a specific user name
- or all non-logged-in IPs
for a given time (dunno again -- say 1 hour)?
These options would let us slow down the vandalism process, in particular *bots*.
They give us more time to (quietly) hunt for the vandals' IPs, and then eventually proceed to a block.
It does not prevent a "good" wikipedian (unjustifiably caught by very quick protection measures) from actually editing a page, though (even if it might be unpleasant to have to wait to save a page... maybe a good way to promote more substantial edits ;-))
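For concreteness, a rough Python sketch of such a throttle (the interface and the numbers are invented for illustration):

```python
import time

# Sketch: at most one save per `interval` seconds for a key (an IP,
# a user name, or a catch-all like "anon"), expiring after `duration`.
class SaveThrottle:
    def __init__(self):
        self.rules = {}      # key -> (interval, expires_at)
        self.last_save = {}  # key -> time of last allowed save

    def restrict(self, key, interval, duration, now=None):
        now = time.time() if now is None else now
        self.rules[key] = (interval, now + duration)

    def may_save(self, key, now=None):
        now = time.time() if now is None else now
        rule = self.rules.get(key)
        if rule is None or now >= rule[1]:       # no rule, or expired
            return True
        last = self.last_save.get(key)
        if last is not None and now - last < rule[0]:
            return False                          # too soon after last save
        self.last_save[key] = now
        return True

t = SaveThrottle()
t.restrict("194.117.133.198", interval=120, duration=3600, now=0)
print(t.may_save("194.117.133.198", now=10))   # True (first save)
print(t.may_save("194.117.133.198", now=60))   # False (within 2 minutes)
print(t.may_save("194.117.133.198", now=200))  # True
```

Note how the rule expires on its own after an hour, so a mistaken restriction heals itself.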
No, seriously, how about not blocking?
If you try to think of solutions to our problems that don't involve blocking, the solutions are generally much cleaner.
For example,
1. more powerful ways of searching and sorting edits,
2. more powerful ways of rolling back edits
3. Bayesian filtering of contributions
I see #3 as an interesting solution that would deal with most of our problems.
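To give a flavor of #3, here is a toy naive-Bayes scorer in Python (a sketch only -- not the spambayes algorithm; the training data and the Laplace smoothing are made up):

```python
import math
from collections import Counter

# Score an edit by comparing token frequencies in known-good and
# known-vandal training edits (toy training sets below).
def train(edit_texts):
    counts = Counter()
    for text in edit_texts:
        counts.update(text.lower().split())
    return counts

def vandalism_probability(text, good, bad, k=1.0):
    """Laplace-smoothed log-odds, mapped back to a probability."""
    log_odds = 0.0
    good_total, bad_total = sum(good.values()), sum(bad.values())
    for tok in text.lower().split():
        p_bad = (bad[tok] + k) / (bad_total + 2 * k)
        p_good = (good[tok] + k) / (good_total + 2 * k)
        log_odds += math.log(p_bad / p_good)
    return 1 / (1 + math.exp(-log_odds))

good = train(["the mitochondria is the powerhouse of the cell"])
bad = train(["goatse goatse visit this site"])
score = vandalism_probability("goatse goatse", good, bad)  # well above 0.5
```

A real filter would train on thousands of reverted vs. kept edits, but the shape of the computation is the same.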
The Cunctator wrote:
No, seriously,
Oh, how boring ;-)
how about not blocking?
IMHO "don't block at all" won't work, but neither does "block everything at once", as we've seen.
If you try to think of solutions to our problems that don't involve blocking, the solutions are generally much cleaner.
For example,
- more powerful ways of searching and sorting edits,
We now have the "top" mark at user contributions, so we can see which edits haven't been reverted yet. What else could we use? I already suggested marking IP contributions on Recent Changes in red (or bold or...) for logged-in users, but was turned down as being "unfair" to the good contributors, who *are* the majority.
- more powerful ways of rolling back edits
How about a link on the user contributions, for sysops only probably, to undo all "top" contributions of this user?
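Roughly, such a mass-revert could look like this Python sketch (the revision-record format is hypothetical; each page's history is newest-last):

```python
# Sketch of "undo all top contributions of this user": where the user
# made the latest edit, re-save the most recent revision by someone else.
histories = {
    "Foo": [("alice", "good text"), ("Vandalbot", "goatse")],
    "Bar": [("Vandalbot", "goatse"), ("bob", "fixed")],
}

def revert_top_edits(user, histories):
    reverted = []
    for page, hist in histories.items():
        if hist and hist[-1][0] == user:
            prior = [rev for rev in hist if rev[0] != user]
            if prior:
                # re-save the prior text as a new revision,
                # so the article history is preserved
                hist.append(prior[-1])
                reverted.append(page)
    return reverted

print(revert_top_edits("Vandalbot", histories))  # ['Foo']
```

Appending the old text as a new revision, rather than deleting anything, keeps the history intact.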
- Bayesian filtering of contributions
That would be similar to the automatic "quality rating" I suggested earlier. Characteristics could include:
* removal of large parts of the text
* insertion of repetetetetitive sequences
* insertion of certain keywords ("f**k")
* insertion of off-site links (?)
Suspicious edits could be highlighted on the Recent Changes page.
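A toy Python sketch of such a rating (the thresholds and keyword list are made-up placeholders):

```python
import re

# Flag edits matching any of the characteristics listed above.
BAD_WORDS = {"goatse"}  # hypothetical keyword list

def suspicious(old_text, new_text):
    reasons = []
    if len(new_text) < 0.2 * len(old_text):            # large removal
        reasons.append("large removal")
    if re.search(r"(.{3,}?)\1{4,}", new_text):          # repeated sequence
        reasons.append("repetitive sequence")
    if BAD_WORDS & set(new_text.lower().split()):       # keyword hit
        reasons.append("keyword")
    if "http://" in new_text and "http://" not in old_text:
        reasons.append("new off-site link")
    return reasons

print(suspicious("A long, established article about cells.", "ha" * 12))
# ['repetitive sequence']
```

Edits with a nonempty reason list would get highlighted on Recent Changes, not blocked outright.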
Magnus
Magnus Manske wrote:
How about a link on the user contributions, for sysops only probably, to undo all "top" contributions of this user?
Good idea. It would certainly mean we no longer feel we're drowning in vandal edits.
I think an alert to a user that appears above all their edit boxes could be useful -- there could be a flag that says "You have been identified by user X as making vandal edits / breaching copyright / etc. Please use the Sandbox to test wiki markup." -- something to let them know (a) we're onto them and (b) where they can respond if there has been a misunderstanding.
Please explain Bayesian filtering? Zoe
The Cunctator cunctator@kband.com wrote:
No, seriously, how about not blocking?
If you try to think of solutions to our problems that don't involve blocking, the solutions are generally much cleaner.
For example,
1. more powerful ways of searching and sorting edits,
2. more powerful ways of rolling back edits
3. Bayesian filtering of contributions
I see #3 as an interesting solution that would deal with most of our problems.
http://spambayes.sourceforge.net/background.html
_______________________________________________ Wikipedia-l mailing list Wikipedia-l@wikipedia.org http://www.wikipedia.org/mailman/listinfo/wikipedia-l
On Tue, 19 Nov 2002, Zoe wrote:
Please explain Bayesian filtering? Zoe
See [[Naive_Bayesian_classification]].
Imran
For example,
- more powerful ways of searching and sorting edits,
- more powerful ways of rolling back edits
I strongly support this, but let's be careful not to lose article histories.
- Bayesian filtering of contributions
I see #3 as an interesting solution that would deal with most of our problems.
I noticed that you're in the list of Wikipedia developers. I think few people would object to such modifications. So go for it.
Regards,
Erik
- Bayesian filtering of contributions
This is an intriguing idea. Where can we work on developing a filter that will be useful?
BTW, I trust that the purpose of this filter will be to highlight changes that may require human intervention -- not to block changes automatically by an algorithm.
-- Toby
On 11/19/02 8:18 PM, "Toby Bartels" toby+wikipedia@math.ucr.edu wrote:
- Bayesian filtering of contributions
This is an intriguing idea. Where can we work on developing a filter that will be useful?
BTW, I trust that the purpose of this filter will be to highlight changes that may require human intervention -- not to block changes automatically by an algorithm.
Well, even automatic Bayesian blocking would be much less clumsy than automatic IP blocking.
But my goal is to come up with strategies that make problems get solved invisibly without crippling necessary flexibility--or adding layers of complexity and hierarchy.
The benefits or harms of any particular technological method depend on the implementation.
The Cunctator wrote:
Toby Bartels wrote:
BTW, I trust that the purpose of this filter will be to highlight changes that may require human intervention -- not to block changes automatically by an algorithm.
Well, even automatic Bayesian blocking would be much less clumsy than automatic IP blocking.
IP blocking is easy to keep track of and correct (so long as the lines of communication are kept open). But perhaps we'll develop a method of Bayesian blocking that has the same good properties -- the important thing is that humans have manual override.
But my goal is to come up with strategies that make problems get solved invisibly without crippling necessary flexibility--or adding layers of complexity and hierarchy.
A Bayesian algorithm won't be hierarchical, but will it be complex? Again, the answer will depend on just what we come up with.
The benefits or harms of any particular technological method depend on the implementation.
Exactly. I look forward to your ideas, if you have any. (I don't have a clue about this myself ^_^.)
-- Toby
** At the very least, include a link to [[Wikipedia:Administrators]] in the message telling people that they've been blocked, so that it will be easy for them to get in touch with me.
I've added this to the CVS version, as soon as Brion or someone else updates the code again, it's live.
As for blocking more intelligently, we definitely need this. Not blocking is not an option -- this may work on less active wikis, where you just let the vandal work and then fix their changes when they get tired, but there are actually people who want to use Wikipedia in the meantime.
The easiest way to stop vandalism, of course, would be to limit new members (automatically by requiring validated email addresses or manually by approving each member individually) and to require signing in to edit pages. I don't like that much either, but it would work.
Regards,
Erik
On Mon, Nov 18, 2002 at 11:12:34PM -0800, Toby Bartels wrote:
[On Saturday]
While I was sleeping (around 12:00 UTC), a vandal arrived, using bots (apparently) to splatter goatse across Wikipedia.
I'm not sure it really was using a bot, despite its claims. If it had been, it could have vandalised many more pages.
The problem, of course, is that we're blocking an innocent user when it's not at all clear that we're even blocking the vandal.
- Block more intelligently:
** Let admins see the IP of signed in users. Then we can at least know for sure who to block.
** Let admins whitelist a user name known to use a dynamic IP. (This can always be undone later if abused.)
** Allow admins to see all contributions from a given IP, whether or not they were made anonymously. This will allow us to check for multiple users and give us the opportunity to create the above whitelist at the same time that we block the vandal.
These are surely good plans. Note that if we're willing to do the work to classify IPs, we can ban on the 'Client-ip' and 'X-forwarded-for' headers instead of the real IPs, for known shared proxies. This doesn't help the case where an innocent user ends up reusing the actual client IP address of a vandal (either because the address was reallocated, or just because they used the same public computer), but it would do something to mitigate problems with shared proxies.
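For illustration, a Python sketch of choosing the ban key from those headers (the proxy list and record format are hypothetical; remember these headers are client-supplied, so they're only trustworthy for proxies we've classified ourselves):

```python
# For known shared proxies, ban on the client address the proxy reports
# (X-Forwarded-For / Client-ip) instead of the proxy's own IP.
KNOWN_PROXIES = {"203.0.113.5"}  # hypothetical shared-proxy list

def ban_key(remote_ip, headers):
    if remote_ip in KNOWN_PROXIES:
        forwarded = headers.get("X-Forwarded-For") or headers.get("Client-ip")
        if forwarded:
            # X-Forwarded-For may be a chain; the first entry is the client
            return forwarded.split(",")[0].strip()
    return remote_ip

print(ban_key("203.0.113.5", {"X-Forwarded-For": "198.51.100.7, 203.0.113.5"}))
# 198.51.100.7
print(ban_key("192.0.2.1", {}))  # 192.0.2.1
```

This way, banning one vandal behind a large shared proxy doesn't ban everyone else behind it.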
But in the long run, nothing based on ip-banning would be able to stop a sufficiently determined vandal. Neither would relying on registered accounts. At present, stealing someone else's account would be quite easy. This doesn't matter much now, as there's currently little incentive to do so. If we relied more strongly on authenticated accounts, that could change.
I think techniques for automatically slowing down bots would be the most valuable place to concentrate our efforts.
-M-
But in the long run, nothing based on ip-banning would be able to stop a sufficiently determined vandal. Neither would relying on registered accounts. At present, stealing someone else's account would be quite easy.
How so? Brute force password attacks? We can catch these by limiting the attempts. What else?
Regards,
Erik
But in the long run, nothing based on ip-banning would be able to stop a sufficiently determined vandal. Neither would relying on registered accounts. At present, stealing someone else's account would be quite easy.
On Wed, Nov 20, 2002 at 10:38:22PM +0100, Erik Moeller wrote:
How so? Brute force password attacks? We can catch these by limiting the attempts. What else?
Stealing the cookie. Non-brute-force password guessing. Compromising a public machine. Compromising a private machine.
-M-
[Moving to <wikitech-l>, since we're now discussing programming, not policy.]
Matthew Woodcraft wrote:
Toby Bartels wrote:
[plans]
These are surely good plans.
Thanks!
Note that if we're willing to do the work to classify IPs, we can ban on the 'Client-ip' and 'X-forwarded-for' headers instead of the real IPs, for known shared proxies.
I don't know what this means. But I hope that it works! ^_^
But in the long run, nothing based on ip-banning would be able to stop a sufficiently determined vandal. Neither would relying on registered accounts. At present, stealing someone else's account would be quite easy.
Right, the passwords and cookies are sent over the Net unencrypted. They just need to sniff our packets (how rude!).
I think techniques for automatically slowing down bots would be the most valuable place to concentrate our efforts.
This sounds promising to me too. What's the fastest rate of saving that a legitimate user is likely to use? What's the fastest rate of saving that we can expect to keep up with if used by a bot? I'm going to make a 0th approximation of 1 minute for each. Too slow? Too fast?
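One way to frame the question: measure the gaps between an account's saves and flag anything consistently faster than the threshold. A Python sketch (the 60-second threshold is just the 0th approximation above):

```python
from statistics import median

# Flag accounts whose median interval between saves falls below a
# threshold; the median ignores one-off bursts by honest users.
def probable_bot(save_times, threshold=60.0):
    """save_times: sorted timestamps (seconds) of one account's saves."""
    if len(save_times) < 3:
        return False          # too little data to judge
    gaps = [b - a for a, b in zip(save_times, save_times[1:])]
    return median(gaps) < threshold

print(probable_bot([0, 5, 9, 14, 20]))    # True (saves every few seconds)
print(probable_bot([0, 300, 700, 1300]))  # False
```

Using the median rather than the minimum gap means a human who happens to make two quick saves in a row isn't flagged.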
-- Toby