I think a rough analysis of user / IP talk pages could give you a number pretty quickly. You would probably want to do it by hand first and then write a script that analyses the Wikipedia dump file. It is doable by hand if you just sub-sample a few hundred pages randomly. And if normalized by the total number of user pages vs. the total number of users, this would already give a rough estimate.
Kind Regards, Dmitry
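The sub-sampling idea above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Dmitry's actual method: `is_flagged` stands in for the by-hand (or scripted) judgment of whether a talk page shows vandalism warnings or a block notice, `total_pages` is the number of user / IP talk pages you scale up to, and both function names are hypothetical. The Wilson score interval is one common way to put error bars on a proportion from a few hundred samples.

```python
import math
import random


def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion (roughly 95% for z=1.96)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = hits / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def estimate_flagged_pages(page_titles, is_flagged, sample_size, total_pages, seed=0):
    """Sample talk pages uniformly at random, classify each one,
    and scale the observed share up to total_pages.

    is_flagged is a caller-supplied (hypothetical) classifier:
    it takes a page title and returns True or False.
    Returns (point_estimate, lower_bound, upper_bound) as page counts.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(list(page_titles), sample_size)
    hits = sum(1 for title in sample if is_flagged(title))
    low, high = wilson_interval(hits, sample_size)
    return hits / sample_size * total_pages, low * total_pages, high * total_pages
```

Scaling to the number of talk pages rather than the number of accounts assumes that accounts without a talk page were never warned or blocked, which is the normalization step mentioned above; that assumption would need checking against real data.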
On Tue, Oct 1, 2013 at 11:00 AM, Ziko van Dijk zvandijk@gmail.com wrote:
So Piotr, if I understand you correctly, the question is how many of the people who count as our "contributors" in the statistics (5 edits a month, or 100 edits a month) are actually vandals? I could imagine that some vandals manage to make 5 edits before being blocked, or lose interest before they are blocked, and so appear in the statistics.
Kind regards, Ziko
2013/9/29 Piotr Konieczny piokon@post.pl
I know of the categories, but the problem is that they do not seem to be comprehensive. I can estimate, based on them, that there are at least 150k or so editors who were banned for vandalism, but it seems many vandals do not make it into those categories, which suggests this number is an underestimate.
Still, we should be able to get some estimates. We know, for example, that something like 5 or 6 million accounts have made at least one edit on English Wikipedia. How many of them were indefinitely blocked? This should give us some idea.
Alternatively, we know how many accounts make an edit to Wikipedia in any given timeframe. About 100,000-120,000 editors make at least one edit to Wikipedia each month. If we knew how many are indefinitely blocked in that period, that would be another useful estimate.
-- Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citation...
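As a rough illustration of the first estimate in the message above, the arithmetic can be written out using only the figures quoted there: at least 150k editors banned for vandalism, and 5 or 6 million accounts with at least one edit. The function name is hypothetical, and because the categories are said to be incomplete, the result is only a lower bound on the share, not an estimate of all indefinitely blocked accounts.

```python
def lower_bound_share(banned_at_least, accounts_with_edits):
    """Share of editing accounts known to be banned; a lower bound,
    since the ban categories are described as incomplete."""
    return banned_at_least / accounts_with_edits


BANNED_AT_LEAST = 150_000  # "at least 150k or so", from the categories

# Try both ends of the quoted "5 or 6 million" range.
for accounts in (5_000_000, 6_000_000):
    share = lower_bound_share(BANNED_AT_LEAST, accounts)
    print(f"{accounts:,} accounts: at least {share:.1%} banned for vandalism")
```

With these inputs the lower bound comes out at about 2.5% to 3% of accounts that have ever edited.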
On 9/30/2013 11:44 AM, Stuart Yeates wrote:
I guess it depends on whether Piotr is looking for an estimate of accounts used for vandalism or an estimate of the people who operate them. One seems straightforward, the other more challenging. Perhaps combining the categories below with sock puppet investigations and some fancy stats?
Cheers Stuart
On 29/09/2013, at 12:13 am, h hanteng@gmail.com wrote:
Hello Piotr,
I believe that in Chinese Wikipedia, "blocked indefinitely" is a user category called Wikipedians that are blocked indefinitely "被永久封禁的維基人" http://zh.wikipedia.org/wiki/Category:%E8%A2%AB%E6%B0%B8%E4%B9%85%E5%B0%81%E...
Its equivalent Wikidata table has the following pages in other language versions: http://www.wikidata.org/wiki/Q4616402#sitelinks-wikipedia

Language (code): linked article

English (enwiki): Category:Blocked historical users
  http://en.wikipedia.org/wiki/Category:Blocked_historical_users
italiano (itwiki): Categoria:Wikipedia:Cloni sospetti
  http://it.wikipedia.org/wiki/Categoria:Wikipedia:Cloni_sospetti
latviešu (lvwiki): Kategorija:Uz nenoteiktu laiku nobloķētie lietotāji
  http://lv.wikipedia.org/wiki/Kategorija:Uz_nenoteiktu_laiku_noblo%C4%B7%C4%93tie_lietot%C4%81ji
slovenčina (skwiki): Kategória:Wikipédia:Natrvalo zablokovaní používatelia
  http://sk.wikipedia.org/wiki/Kateg%C3%B3ria:Wikip%C3%A9dia:Natrvalo_zablokovan%C3%AD_pou%C5%BE%C3%ADvatelia
česky (cswiki): Kategorie:Wikipedie:Natrvalo zablokovaní uživatelé
  http://cs.wikipedia.org/wiki/Kategorie:Wikipedie:Natrvalo_zablokovan%C3%AD_u%C5%BEivatel%C3%A9
български (bgwiki): Категория:Блокирани неприемливи потребителски имена
  http://bg.wikipedia.org/wiki/%D0%9A%D0%B0%D1%82%D0%B5%D0%B3%D0%BE%D1%80%D0%B8%D1%8F:%D0%91%D0%BB%D0%BE%D0%BA%D0%B8%D1%80%D0%B0%D0%BD%D0%B8_%D0%BD%D0%B5%D0%BF%D1%80%D0%B8%D0%B5%D0%BC%D0%BB%D0%B8%D0%B2%D0%B8_%D0%BF%D0%BE%D1%82%D1%80%D0%B5%D0%B1%D0%B8%D1%82%D0%B5%D0%BB%D1%81%D0%BA%D0%B8_%D0%B8%D0%BC%D0%B5%D0%BD%D0%B0
олык марий (mhrwiki): Категорий:Википедий:Йӧн петырыме
  http://mhr.wikipedia.org/wiki/%D0%9A%D0%B0%D1%82%D0%B5%D0%B3%D0%BE%D1%80%D0%B8%D0%B9:%D0%92%D0%B8%D0%BA%D0%B8%D0%BF%D0%B5%D0%B4%D0%B8%D0%B9:%D0%99%D3%A7%D0%BD_%D0%BF%D0%B5%D1%82%D1%8B%D1%80%D1%8B%D0%BC%D0%B5
українська (ukwiki): Категорія:Безстроково заблоковані користувачі
  http://uk.wikipedia.org/wiki/%D0%9A%D0%B0%D1%82%D0%B5%D0%B3%D0%BE%D1%80%D1%96%D1%8F:%D0%91%D0%B5%D0%B7%D1%81%D1%82%D1%80%D0%BE%D0%BA%D0%BE%D0%B2%D0%BE_%D0%B7%D0%B0%D0%B1%D0%BB%D0%BE%D0%BA%D0%BE%D0%B2%D0%B0%D0%BD%D1%96_%D0%BA%D0%BE%D1%80%D0%B8%D1%81%D1%82%D1%83%D0%B2%D0%B0%D1%87%D1%96
中文 (zhwiki): Category:被永久封禁的維基人
  http://zh.wikipedia.org/wiki/Category:%E8%A2%AB%E6%B0%B8%E4%B9%85%E5%B0%81%E7%A6%81%E7%9A%84%E7%B6%AD%E5%9F%BA%E4%BA%BA
日本語 (jawiki): Category:無期限ブロックを受けたユーザー
  http://ja.wikipedia.org/wiki/Category:%E7%84%A1%E6%9C%9F%E9%99%90%E3%83%96%E3%83%AD%E3%83%83%E3%82%AF%E3%82%92%E5%8F%97%E3%81%91%E3%81%9F%E3%83%A6%E3%83%BC%E3%82%B6%E3%83%BC
I hope that it helps.
Best, han-teng liao
2013/9/29 Piotr Konieczny piokon@post.pl
Hi everyone,
Another question: do we have an estimate of the vandal population?
I also asked this at https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#How_man... but so far no good estimates have been provided.
-- Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citations?user=gdV8_AEAAAAJ http://en.wikipedia.org/wiki/User:Piotrus
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l