On May 17, 2012, at 1:50 PM, WereSpielChequers wrote:
Piotr,
I've had a reasonable success rate by filing requests at http://en.wikipedia.org/wiki/Wikipedia:Bot_requests. Several programmers keep an eye on it and if they think the task interesting and useful you may get lucky.
WSC
On 16 May 2012 18:09, Piotr Konieczny <piokon@post.pl> wrote:
Dario,
Thanks, but the last time I looked into this, running queries
required knowing how to code, going way beyond a simple knowledge of
wiki syntax or Excel functions. I think it was at WikiSym a few years
back where we raised that issue - that much of the data Wikimedia
provides is accessible only to the small subset of scholars who can
code in languages with pretty names like Java or Perl and such. I am
pretty sure this is the reason why social sciences have been lagging
in Wikipedia research since day one...
Is there any place where a non-coder can ask a Toolserver coder to run
some of those queries? I'd be happy to trade some of my Wiki skills
(as in, writing a DYK, or reviewing a GA) for such assistance :)
--
Piotr Konieczny
"To be defeated and not submit, is victory; to be victorious and rest on one's laurels, is defeat." --Józef Pilsudski
On 5/10/2012 2:29 PM, Dario Taraborelli wrote:
Piotr,
If you are interested in getting fresh figures about lifetime
edit counts, I recommend you register an account on the
toolserver, where you can run queries against the user table
(which holds cumulative edit counts across all namespaces for a
specific wiki). For namespace-specific counts you will need to
use the revision table, and that's much more time-consuming.
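
A minimal sketch of the kind of query described above, bucketing accounts by lifetime edit count. The column names (user_id, user_name, user_editcount) follow the MediaWiki user table; everything else is an assumption made for illustration. The rows are made-up toy data, the range labels are hypothetical, and an in-memory SQLite database stands in for the toolserver's database replica only so that the example runs on its own.

```python
import sqlite3

# In-memory stand-in for a wiki database; NOT a connection to the toolserver.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user ("
    "user_id INTEGER PRIMARY KEY, user_name TEXT, user_editcount INTEGER)"
)

# Toy accounts invented for this example (name, lifetime edit count).
toy_accounts = [
    ("A", 0), ("B", 0), ("C", 1), ("D", 1), ("E", 1), ("F", 5),
    ("G", 9), ("H", 10), ("I", 50), ("J", 51), ("K", 2000),
]
conn.executemany(
    "INSERT INTO user (user_name, user_editcount) VALUES (?, ?)", toy_accounts
)

# One pass over the user table, grouping accounts into edit-count ranges.
QUERY = """
SELECT CASE
         WHEN user_editcount = 0 THEN '0'
         WHEN user_editcount = 1 THEN '1'
         WHEN user_editcount BETWEEN 2 AND 9 THEN '2-9'
         WHEN user_editcount BETWEEN 10 AND 50 THEN '10-50'
         ELSE '51+'
       END AS edit_range,
       COUNT(*) AS accounts
FROM user
GROUP BY edit_range
"""

accounts_per_range = dict(conn.execute(QUERY).fetchall())
for label in ("0", "1", "2-9", "10-50", "51+"):
    print(label, accounts_per_range.get(label, 0))
```

The ranges live entirely in the CASE expression, so changing them means editing only the WHEN lines. A query like this counts edits in all namespaces together, since that is what the cumulative counter holds.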
On a related note, this real-time dashboard I just uploaded
to the toolserver (representing account registrations and the
fraction of new users clicking on the edit button or passing
the 1 edit threshold) could be of interest: http://toolserver.org/~dartar/reg2/
Best
Dario
On May 10, 2012, at 10:57 AM, WereSpielChequers wrote:
Hi Piotr,
You might assume that the difference between 4 million and
16 million is largely editors who never get out of
userspace, but in my experience such users are relatively
rare, or at least won't dominate that 12 million.
I'm fairly sure that there will be a number of different
groups in that 12 million. Steve Walling, Aaron or
Maryana may be able to help analyse or at least explain
them.
Significant groups in the 12 million will definitely
include:
1 People who registered an account and tried but never
successfully saved an edit because when they looked they
saw a wall of code and they don't do HTML. The WMF is
investing a lot of money in WYSIWYG editing software in
the hope that this will enable good-faith but not very
technical people to edit Wikipedia.
2 Vandals since 2007. We have edit filters that try to
dissuade vandals from saving their first edit when it
triggers one of our tests for probable vandalism. These
filters only came in during the last few years and have
been improved over time - so they are deterring a
significant proportion of recent bad-faith editors from
ever saving an edit.
3 Visitors from other wikis. One of the features of
Single User Login is that if you are logged in and you
click on a link that takes you to another Wikimedia
wiki, your account becomes active at that wiki even if
you never go near the edit button. My account is active
on 92 wikis and I've edited in rather less than half of
them. I won't go into all the reasons why one might
visit other wikis, but if you see that an article you've
written has equivalents in several other languages I
consider it human nature to click on the links and look
at the article. Even if you don't use Google translate,
the choice of image and the size of the paragraphs is
often enough to tell you whether someone has translated
your work or started afresh.
4 Editors whose articles have been deleted. About a
quarter of new editors start by creating a new article
rather than by editing existing articles. A large
majority of such articles get deleted and their authors
depart. If the 4 million is only measured on surviving
edits to article space then there will be many hundreds
of thousands whose only article space edits have been
deleted.
5 Zombie accounts. We now have programs that prevent
people opening accounts that are overly similar to the
names of existing editors, but before these filters came
in many editors would protect themselves from such
impersonation by creating such "zombie accounts"
themselves and marking their userpage with a link to
their main account.
6 Edit conflicts. Breaking news stories attract editors
like moths to flames; our article on Sarah Palin peaked
at 25 edits per minute at one point during the day she
became John McCain's running mate (I don't think anyone
logs the number of edit conflicts). If you are a newbie
trying to edit a trending article by using that edit
button on the top of the page then you are guaranteed to
get frustrated and leave. The regulars have learned that
busy pages are best edited one section at a time, and on
a very busy page there simply isn't time to edit the
whole page before a section edit is saved. Of course
that could be easily resolved by disabling whole page
editing on busy pages, but I'm not expecting that
anytime soon.
Another issue is that I believe that the 4 million are
people who have one undeleted edit to mainspace on the
English Wikipedia since December 2004. If so the 16
million may include those who haven't edited since
December 2004.
I'm probably missing a few other variables. I'm afraid
this is a complex area, but I hope this gives you an
idea of the problem.
WSC
On 10 May 2012 16:35, Piotr Konieczny <piokon@post.pl> wrote:
Thanks for the link. The figure 4,058,477 you cite
(from http://stats.wikimedia.org/EN/TablesWikipediaEN.htm#editdistribution),
as you note, comes with the warning that "Only
article edits are counted, not edits on discussion
pages, etc". I assume this is why the magic word
NUMBEROFUSERS at en Wikipedia returns 16,763,691
(numerous low-activity editors apparently make their
few edits outside article mainspace).
The breakdown I could live with, for a
while, but the fact that this stat covers
only about a quarter of registered accounts
is a problem. Is anybody familiar with a way
to achieve a breakdown of all named accounts
with 1+ edit (for English Wikipedia), no
matter which namespace they edited?
Preferably with more flexible ranges than
the ones in that table?
In other words, the linked page provides
"Distribution of article [namespace] edits
over registered editors", whereas I am
interested in "Distribution of [all]
namespaces edits over registered editors".
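
A rough sketch of what "more flexible ranges" could look like once per-account edit counts (across all namespaces) are in hand. The function name edit_distribution, its parameters, and the sample counts are all hypothetical and made up for illustration; nothing here comes from an existing Wikimedia tool.

```python
from bisect import bisect_right
from collections import Counter


def edit_distribution(edit_counts, lower_bounds):
    """Count accounts per edit range.

    lower_bounds lists the first edit count of each range, e.g.
    [1, 2, 10, 51] means the ranges 1, 2-9, 10-50 and 51 or more.
    Accounts below the smallest bound (e.g. 0 edits) are left out.
    Keys of the result are (low, high) pairs; high is None for the
    open-ended last range.
    """
    bounds = sorted(lower_bounds)
    tally = Counter()
    for n in edit_counts:
        i = bisect_right(bounds, n) - 1  # index of the range containing n
        if i < 0:
            continue  # fewer edits than the smallest lower bound
        low = bounds[i]
        high = bounds[i + 1] - 1 if i + 1 < len(bounds) else None
        tally[(low, high)] += 1
    return dict(tally)


# Made-up per-account edit counts, purely for illustration.
sample_counts = [0, 1, 1, 4, 9, 10, 37, 50, 51, 800]
distribution = edit_distribution(sample_counts, [1, 2, 10, 51])
print(distribution)
```

Because the ranges are passed in as a plain list of lower bounds, trying a different breakdown is a matter of changing that list rather than the code.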
--
Piotr Konieczny
"To be defeated and not submit, is victory; to be victorious and rest on one's laurels, is defeat." --Józef Pilsudski
On 5/10/2012 4:49 AM, WereSpielChequers wrote:
I'm not sure that we have exactly what you're asking for.
For example we have the figure of 4,058,477
but that is for registered accounts on the
English Wikipedia that have made at least
one edit to an article. Different language
versions of Wikipedia are also available,
but of course registered accounts don't
exactly tally with Wikipedians, not least
because IP editors are excluded. Also I
believe that early edits (pre-2004) may not
be available, and I suspect that deleted
edits may not be counted.
That said, we have further stats: 1,614,938
registered accounts with >= 3 article
edits and 772,557 with >= 10.
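
As a worked example, the three cumulative figures quoted in this message can be turned into per-band counts by subtraction. The helper name band_sizes is hypothetical. Note that these bands (1-2, 3-9, 10+) are the only ones recoverable from the quoted thresholds; they do not match the "exactly 1", "2-9" and "10-50" ranges asked about, and they cover article edits only.

```python
# "Registered accounts with at least N article edits", as quoted in the thread.
accounts_with_at_least = {1: 4_058_477, 3: 1_614_938, 10: 772_557}


def band_sizes(cumulative):
    """Convert 'accounts with >= N edits' figures into per-band counts.

    Keys of the result are (low, high) pairs; high is None for the
    open-ended top band.
    """
    thresholds = sorted(cumulative)
    bands = {}
    for low, next_low in zip(thresholds, thresholds[1:]):
        # Accounts with >= low edits, minus those that reach the next band.
        bands[(low, next_low - 1)] = cumulative[low] - cumulative[next_low]
    top = thresholds[-1]
    bands[(top, None)] = cumulative[top]
    return bands


bands = band_sizes(accounts_with_at_least)
for (low, high), count in bands.items():
    label = f"{low}+" if high is None else f"{low}-{high}"
    print(f"{label:>5} article edits: {count:,}")
```

On these figures, roughly 2.4 million of the 4 million accounts stopped at one or two article edits.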
On 9 May 2012 23:42, Piotr Konieczny <pik1@pitt.edu> wrote:
I was looking at official stats, but I
seem to be unable to find an answer
to the following question:
* how many Wikipedia editors have X
edits (or fall within a range of edits)
To be more precise, I am curious how
many Wikipedians have:
* exactly 1 edit
* between 2-9 edits
* between 10-50 edits
I know that the total number of
registered accounts is reported at http://en.wikipedia.org/wiki/Wikipedia:Wikipedians
Can anybody direct me to the right
page/counter that would allow me to
obtain the above information? I hope it
is obtainable without having to download
the dump...