Dear all,
I am starting this thread in the hope that some of the great Wiki researchers on this list could advise me on a data collection problem.
Here is the question: for each of 120 Wikipedia admins (for whom I have the usernames and unique numeric ids), I would like to reliably count the number of times they (i) deleted a page, (ii) undeleted (i.e. restored) a page, (iii) protected a page, (iv) blocked a user and (v) unblocked a user. Those types of edits all correspond to a specific "action" in the Wikipedia API documentation page (http://en.wikipedia.org/w/api.php): action=delete, action=undelete, action=protect, action=block and action=unblock. I don't know, however, what the best strategy would be for collecting those edits. Does anyone have an idea about which data collection strategy I should adopt in this case? Is there a way to query the Wikipedia API directly, or should I look for some specific markers in the edit summaries?
I would be very grateful for any advice or feedback! Thanks much for your attention and time. :)
Best,
Jérôme.
Hi,
2013/10/10 Jérôme Hergueux jerome.hergueux@gmail.com:
[...] Is there a way to query the Wikipedia API directly, or should I look for some specific markers in the edit summaries?
Not exactly what you're looking for, but have you had a look at https://toolserver.org/~vvv/adminstats.php ?
Best regards,
-- Jérémie
Have you tried the API Sandbox? https://en.wikipedia.org/wiki/Special:ApiSandbox
You can, for example, query admin actions by setting up a query on "list=logevents" and then specifying various parameters.
It's an easy and powerful tool.
-- Haitham
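For illustration, a "list=logevents" query along these lines could be built as shown here. This is only a sketch: the helper names (`logevents_url`, `count_events`, `LOG_ACTIONS`) are made up for this example, and the `leuser`, `leaction`, `lelimit` and `lecontinue` parameters, as well as the `type/action` values, are assumptions that should be checked in the API Sandbox before relying on them. The code only builds request URLs and counts events in already-decoded JSON responses; fetching the pages and following continuation is left out.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"

# Assumed mapping from the five actions of interest to logevents
# "leaction" values (log type / log action); verify in the API Sandbox.
LOG_ACTIONS = {
    "delete": "delete/delete",
    "undelete": "delete/restore",
    "protect": "protect/protect",
    "block": "block/block",
    "unblock": "block/unblock",
}


def logevents_url(username, action, limit=500, continue_token=None):
    """Build a list=logevents request URL for one admin and one action."""
    params = {
        "action": "query",
        "list": "logevents",
        "format": "json",
        "leuser": username,
        "leaction": LOG_ACTIONS[action],
        "lelimit": limit,
    }
    if continue_token is not None:
        # Token taken from the "continue" section of the previous response.
        params["lecontinue"] = continue_token
    return API_URL + "?" + urlencode(params)


def count_events(responses):
    """Count log events across an iterable of decoded JSON responses."""
    return sum(
        len(resp.get("query", {}).get("logevents", []))
        for resp in responses
    )
```

One request per admin and per action keeps the counting trivial, at the cost of more requests than a single query per admin filtered afterwards.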
On Thu, Oct 10, 2013 at 4:10 AM, Jérémie Roquet arkanosis@gmail.com wrote:
[...]
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
Hello Jerome,
I'm not sure this is the best way, but pywikipediabot [1] has a library called pagegenerators.py, which contains a function UserContributionsGenerator(username) (around line 706). That would allow you to iterate through these usernames, and I bet there will be a special marking for deletions/undeletions. If not, and worst comes to worst, you can use a regular expression for those words.
[1] https://meta.wikimedia.org/wiki/pywikipediabot
When you have a pywikibot-hammer, everything looks like a pywikibot-nail!
Maximilian Klein Wikipedian in Residence, OCLC +17074787023
From: wiki-research-l-bounces@lists.wikimedia.org on behalf of Jérôme Hergueux jerome.hergueux@gmail.com
Sent: Thursday, October 10, 2013 3:11 AM
To: wiki-research-l@lists.wikimedia.org
Subject: [Wiki-research-l] How to collect all the admin-specific edits for a subset of Wp admins
[...]
Hi Jérôme,
Most of the actions you refer to are not stored as edits by MediaWiki. They can be accessed via the logging table [1] (for example with log_type 'delete' or 'block'), which is replicated on Tool Labs (you can apply for a Tool Labs account if you don't have one [2]).
HTH
Dario
[1] https://www.mediawiki.org/wiki/Manual:Logging_table
[2] https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help
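As a rough sketch of what counting from the logging table could look like: the query below groups log rows by user, log_type and log_action. The column names (log_user, log_type, log_action) follow the logging table manual [1] as I understand it for this period and are assumptions to verify against the live schema; newer MediaWiki versions may identify the performer differently. The function names and the in-memory SQLite table are hypothetical stand-ins so that the example runs anywhere; on Tool Labs the same SELECT would go to the replicated database, with that driver's own placeholder syntax.

```python
import sqlite3

# Count log entries per admin, log type and log action.
COUNT_QUERY = """
SELECT log_user, log_type, log_action, COUNT(*) AS n
FROM logging
WHERE log_user IN ({placeholders})
  AND log_type IN ('delete', 'protect', 'block')
GROUP BY log_user, log_type, log_action
"""


def count_admin_actions(conn, user_ids):
    """Return {(user_id, log_type, log_action): count} for the given ids."""
    user_ids = list(user_ids)
    if not user_ids:
        return {}
    placeholders = ", ".join("?" for _ in user_ids)
    sql = COUNT_QUERY.format(placeholders=placeholders)
    return {
        (user, log_type, log_action): n
        for user, log_type, log_action, n in conn.execute(sql, user_ids)
    }


def demo_connection():
    """Build a tiny in-memory stand-in for the logging table (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logging (log_user INTEGER, log_type TEXT, log_action TEXT)"
    )
    conn.executemany(
        "INSERT INTO logging VALUES (?, ?, ?)",
        [
            (1, "delete", "delete"),
            (1, "delete", "delete"),
            (1, "delete", "restore"),
            (2, "block", "unblock"),
            (3, "block", "block"),
        ],
    )
    return conn
```

Grouping by log_action as well as log_type matters here because, under this schema, deletions and restorations share log_type 'delete', and blocks and unblocks share log_type 'block'.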
On Oct 10, 2013, at 10:02 AM, "Klein,Max" kleinm@oclc.org wrote:
[...]
Thank you all for your kind help and suggestions!! :)
Let me dig in a little bit and see what works best. I'll let you know how it goes!
Best,
Jérôme.
2013/10/10 Dario Taraborelli dtaraborelli@wikimedia.org
[...]
Hi all,
FYI: the solution Dario proposed (going through the logging table) worked just fine. Thanks Dario! :)
Cheers,
Jérôme.
Hi Jerome,
Just a random note of caution: there are also admin actions such as closing RFCs, altering user rights, and protecting and unprotecting pages. So if you discover that some of your 120 are inactive, you might want to check whether they are active in those areas; most of us are relatively specialised. Perhaps more importantly, the logs won't show how often admins have declined an action. I have declined hundreds of deletion tags; others will have declined hundreds of unblock requests.
Also, I suspect that the logs only go back to December 2004; I know that most prior data is missing.
Jonathan
On 15 November 2013 11:27, Jérôme Hergueux jerome.hergueux@gmail.com wrote:
[...]