I think it is fair to restrict the list to valid participants, and thus to use the second list. Erik, I hope this helps :)

Thanks both!

Lodewijk

2012/10/2 Platonides <platonides@gmail.com>
On 01/10/12 23:06, Erik Zachte wrote:
> Could you please provide user names, like in earlier years?
>
> Then I can plug the new data file into existing code.
>
> Wikistats works with user names anyway, and later this month, when all
> dumps are in, we also want to find user names that did not occur earlier
> on any wiki.
>
> Cheers,
>
> Erik

I use user IDs to track who each user is, because they sometimes
provide different "pretty names" (I show the name they provided on the
page).

Here are the Commons usernames:
> time sql commonswiki_p "SELECT DISTINCT user_name FROM u_platonides_wlm_p.wlm2012 JOIN user ON (user_id=wlm_author) WHERE wlm_source = 'commons';" > wlmUsernames.txt

http://toolserver.org/~platonides/sandbox/wlmUsernames.txt


Or, if we restrict to valid submissions (182 fewer users):
> time sql commonswiki_p "SELECT DISTINCT user_name FROM u_platonides_wlm_p.wlm2012 JOIN user ON (user_id=wlm_author) WHERE wlm_source = 'commons' AND wlm_status='Participating';" > wlmParticipatingUsernames.txt
http://toolserver.org/~platonides/sandbox/wlmParticipatingUsernames.txt
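
To see which users drop out when restricting to valid submissions, the two files can be compared directly. This is only a minimal sketch: it assumes each file holds one username per line, and `excluded_users` is a hypothetical helper name, not something that exists on the toolserver.

```shell
# Hypothetical helper: print the names that appear in the first file
# but not in the second. Assumes one username per line in each file.
excluded_users() {
    all_sorted=$(mktemp)
    valid_sorted=$(mktemp)
    # comm needs sorted input; the C locale keeps sort and comm consistent.
    LC_ALL=C sort -u "$1" > "$all_sorted"
    LC_ALL=C sort -u "$2" > "$valid_sorted"
    # -23 suppresses lines unique to the second file and lines common to both.
    LC_ALL=C comm -23 "$all_sorted" "$valid_sorted"
    rm -f "$all_sorted" "$valid_sorted"
}

# Usage with the two files produced by the queries above:
# excluded_users wlmUsernames.txt wlmParticipatingUsernames.txt | wc -l
```

Any line that is identical in both files is filtered out, so only the names missing from the participating list remain.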