Hi everyone,
I have a question about two properties you can query about a user account: the "editcount" and the "registration". If you look at
http://en.wikipedia.org/w/api.php
for the *list=users* method, you see
editcount - adds the user's edit count
registration - adds the user's registration timestamp
My naive interpretation is that editcount is the total number of edits made by a given user (including deleted edits) and registration is the time when the account was set up. Therefore, if you use the *list=usercontribs* method, two things should be true:
1) the editcount should be greater than or equal to the number of edits (usercontribs) for the given user that I can get from the API. (As an ordinary user of the API, I won't be able to uncover deleted edits, and therefore the editcount might exceed the count of edits from *list=usercontribs*.)
and
2) the timestamp of every edit for a given user account cannot be earlier than the registration timestamp for that account. (That is, how can an account be editing before the account was actually registered?)
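The two assumptions above can be written down as a small check. This is only a sketch: the function names `parse_ts` and `check_assumptions` are made up for illustration, and the inputs are assumed to be an editcount, a registration timestamp, and a list of contribution timestamps already fetched from the API.

```python
from datetime import datetime, timezone


def parse_ts(ts):
    """Parse an API timestamp such as 2010-12-10T08:09:19Z (always UTC)."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def check_assumptions(editcount, registration, contrib_timestamps):
    """Return (count_ok, dates_ok) for assumptions 1) and 2).

    count_ok: editcount >= number of visible contributions.
    dates_ok: no contribution is dated before the registration timestamp.
    """
    reg = parse_ts(registration)
    count_ok = editcount >= len(contrib_timestamps)
    dates_ok = all(parse_ts(ts) >= reg for ts in contrib_timestamps)
    return count_ok, dates_ok
```

An account for which either value comes back False is one of the puzzling cases.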
In computing statistics for many recent accounts, I have found these assumptions to be true for the vast majority of accounts. I'm puzzled, however, by accounts for which one or both assumptions are wrong. An example is User:Katr67
http://en.wikipedia.org/w/api.php?action=query&list=users&ususers=Ka... yields:
<?xml version="1.0"?>
<api>
  <query>
    <users>
      <user name="Katr67" editcount="4" registration="2010-12-10T08:09:19Z" />
    </users>
  </query>
</api>
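For anyone who wants to reproduce a query like this, here is a sketch of how the request URL could be assembled. The URLs in this thread are truncated, so the `usprop` and `format` parameters below are an assumption based on the api.php self-documentation, and `users_query_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

API_URL = "http://en.wikipedia.org/w/api.php"


def users_query_url(usernames, props=("editcount", "registration")):
    """Build a list=users query URL for one or more user names."""
    params = {
        "action": "query",
        "list": "users",
        "ususers": "|".join(usernames),  # several names are joined with |
        "usprop": "|".join(props),       # assumed parameter, see lead-in
        "format": "xml",
    }
    return API_URL + "?" + urlencode(params)


print(users_query_url(["Katr67"]))
```

Note that urlencode percent-encodes the | separator as %7C.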
But if you look at the list of contributions by User:Katr67 at
http://en.wikipedia.org/w/index.php?limit=50&tagfilter=&title=Specia... -- you'll find more than 4 edits (contrary to assumption #1) and edits earlier than Dec 10 (contrary to assumption #2). From
http://toolserver.org/~soxred93/pcount/index.php?name=Katr67&lang=en&...
we can see that there are 504 edits attributed to User:Katr67, something one can verify using
http://en.wikipedia.org/w/api.php?action=query&list=users&ususers=Ka...
or mwclient, e.g.:
len(list(mwclient.Site("en.wikipedia.org").usercontributions(user="Katr67")))
which returned 504.
So, what do *editcount* and *registration* really mean then? (For User:Katr67, it looks like there was some change to the account on Dec 10, which reset the registration timestamp and editcount to the last 4 edits. )
Thanks, -Raymond (User:RaymondYee)
2010/12/14 Raymond Yee raymond.yee@gmail.com:
> In computing statistics for many recent accounts, I have found these assumptions to be true for the vast majority of accounts. I'm puzzled, however, by accounts for which one or both assumptions are wrong. An example is User:Katr67
You probably hit an account rename here. Edits are reattributed for those, but I guess the editcount isn't updated for that, and of course the registration timestamp will be different too.
Roan Kattouw (Catrope)
Thanks, Roan, for responding to my post. It did occur to me that I had hit upon an account renaming, but I think something else is happening here. Let me give an example of what seems to be the most common kind of account renaming:
User:Penrithpanthers was renamed to User:SATidball -- which you can see at
http://en.wikipedia.org/w/index.php?title=Special:Log&type=renameuser&am...
04:53, 3 December 2010 Nihonjoe (talk | contribs) renamed User:Penrithpanthers to "SATidball" (94 edits. Reason: WP:CHU)
If we look up Penrithpanthers and SATidball in the API, using http://en.wikipedia.org/w/api.php?action=query&list=users&ususers=Pe... we get:
<?xml version="1.0"?>
<api>
  <query>
    <users>
      <user name="Penrithpanthers" missing="" />
      <user name="SATidball" editcount="96" registration="2010-09-25T00:30:16Z" gender="male" />
    </users>
  </query>
</api>
The old account name is now "missing" and the new account SATidball picks up the registration date and all the contributions that used to belong to Penrithpanthers.
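A response like the one above can be read programmatically. The following is a minimal sketch using only the standard library; `parse_users` is a hypothetical helper name, and the sample string is the response quoted above.

```python
import xml.etree.ElementTree as ET

SAMPLE = (
    '<?xml version="1.0"?><api><query><users>'
    '<user name="Penrithpanthers" missing="" />'
    '<user name="SATidball" editcount="96" '
    'registration="2010-09-25T00:30:16Z" gender="male" />'
    '</users></query></api>'
)


def parse_users(xml_text):
    """Map each user name to its properties, or to None if marked missing."""
    result = {}
    for user in ET.fromstring(xml_text).iter("user"):
        if "missing" in user.attrib:
            # The API reports names with no account via a missing="" attribute.
            result[user.get("name")] = None
        else:
            result[user.get("name")] = {
                "editcount": int(user.get("editcount")),
                "registration": user.get("registration"),
            }
    return result


print(parse_users(SAMPLE))
```

A renamed-away account such as Penrithpanthers shows up as None, while SATidball carries the editcount and registration.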
What happened with User:Katr67 seems to be a bit different. We can find a renaming event for User:Katr67; http://en.wikipedia.org/w/index.php?title=Special:Log&type=renameuser&am... shows
03:31, 18 September 2010 Nihonjoe (talk | contribs) renamed User:Katr67 to "Valfontis" (53,420 edits. Reason: WP:CHU)
However, the User:Katr67 account is not deleted (as was User:Penrithpanthers). Instead, going to http://en.wikipedia.org/wiki/User:Katr67 redirects to http://en.wikipedia.org/wiki/User:Valfontis. And if we look up both users at http://en.wikipedia.org/w/api.php?action=query&list=users&ususers=Ka... we get
<?xml version="1.0"?>
<api>
  <query>
    <users>
      <user name="Katr67" editcount="4" registration="2010-12-10T08:09:19Z" gender="unknown" />
      <user name="Valfontis" editcount="53627" registration="2006-01-06T10:23:49Z" gender="unknown">
        <groups>
          <g>autoreviewer</g>
          <g>reviewer</g>
          <g>rollbacker</g>
        </groups>
      </user>
    </users>
  </query>
</api>
So what's happening with Katr67/Valfontis ?
Thanks, -Raymond (User:RaymondYee)
I think I know.
Editcount registers the edits made by a user, while contributions are the lines of history with the user's name.
So, when I rename a page I make 1 edit but create 2 lines of history (the one on the redirect will be deleted if someone cancels my renaming).
One case where you can have a history line earlier than your registration is a history import. And, as you didn't make the edit, it is not in the edit count.
That can explain higher or lower editcount.
Regards
Hercule
Mediawiki-api mailing list Mediawiki-api@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-api
On Tue, Dec 14, 2010 at 08:22:15AM +0100, Hercule wrote:
> Editcount registers the edits made by a user, while contributions are the lines of history with the user's name.
> So, when I rename a page I make 1 edit but create 2 lines of history (the one on the redirect will be deleted if someone cancels my renaming).
Many log events that affect pages add a "dummy" revision, which can show up in the page history and the user's contribs but will not be included in editcount. AFAICT the only things that do increment editcount are actually editing a page and uploading a new file (but not overwriting an existing file).
Also, AFAICT editcount is never decremented. Not even if someone moves a page over a redirect you created, which does seem to completely delete the old revision for the redirect beyond any recovery.
> One case where you can have a history line earlier than your registration is a history import. And, as you didn't make the edit, it is not in the edit count.
I was going to suggest that too. It's even possible for a wiki to have edits attributed to a user account that doesn't exist due to imports.
Thanks, Roan, Hercule, Brad, and Alex for your helpful responses.
To summarize, the following factors complicate my simple picture of how registration and editcount work:
1) usercontribs includes more events than recorded in editcount (such as renaming a page)
2) there may be history imports that bring in edits dated before (or after) the registration date
3) sometimes the old user account is recreated even after it has been renamed
4) there is a bug in which "users with large edit counts don't get re-attributed properly in the revision table" https://bugzilla.wikimedia.org/show_bug.cgi?id=17313
-Raymond
2010/12/14 Raymond Yee raymond.yee@gmail.com:
> What happened with User:Katr67 seems to be a bit different. We can find a renaming event for User:Katr67; http://en.wikipedia.org/w/index.php?title=Special:Log&type=renameuser&am... shows
> 03:31, 18 September 2010 Nihonjoe (talk | contribs) renamed User:Katr67 to "Valfontis" (53,420 edits. Reason: WP:CHU)
> However, the User:Katr67 account is not deleted (as was User:Penrithpanthers). Instead, going to http://en.wikipedia.org/wiki/User:Katr67 redirects to http://en.wikipedia.org/wiki/User:Valfontis. And if we look up both users at http://en.wikipedia.org/w/api.php?action=query&list=users&ususers=Ka... we get
This happens sometimes, usually due to the user being logged in while they're being renamed or something, causing the old name to be recreated. That doesn't seem to be the case here (timestamp mismatch) so I'm not sure exactly what happened in this particular case; maybe someone just recreated the account, simple as that.
Roan Kattouw (Catrope)
On 12/14/2010 5:24 AM, Roan Kattouw wrote:
> This happens sometimes, usually due to the user being logged in while they're being renamed or something, causing the old name to be recreated. That doesn't seem to be the case here (timestamp mismatch) so I'm not sure exactly what happened in this particular case; maybe someone just recreated the account, simple as that.
This is probably from the user rename bug where edits from users with large edit counts don't get re-attributed properly in the revision table.
https://bugzilla.wikimedia.org/show_bug.cgi?id=17313