Dear folks,
Is there a good way to query a user's edit history, e.g., edit count during a period?
My current solution uses the usercontribs API (https://www.mediawiki.org/wiki/API:Usercontribs).
But the process has stalled, maybe due to some query limit.
Thanks,
Haifeng Zhang
Hi Haifeng,
In my experience, this depends on how many users you're looking to get information about. Is it a few hundred? A few thousand? A million+?
If you are getting the edit history for a limited number of users (say a few hundred to a few thousand), then using the API can work well. One thing to keep in mind when using the API is that your requests might be throttled and/or there might be database lag. Are you using a software library to access the API? If not, I'd consider using one so that throttling/lag doesn't become an issue; it's one of the reasons why I use Pywikibot https://www.mediawiki.org/wiki/Manual:Pywikibot for API requests.
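For the API route, a minimal Python sketch of the pattern described above might look like the following. It is only an illustration under stated assumptions: the endpoint is English Wikipedia's, paging relies on the API's `continue` mechanism, lag is handled by sending `maxlag` and waiting when the API answers with a `maxlag` error, and the helper names (`build_params`, `iter_contribs`, `count_edits`, `http_fetch`) are made up for this example. A library such as Pywikibot takes care of these details for you.

```python
import json
import time
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator

# Assumption: English Wikipedia; swap in the wiki you are studying.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_params(user: str, start: str, end: str) -> Dict[str, str]:
    """Query parameters for list=usercontribs, oldest edit first."""
    return {
        "action": "query",
        "format": "json",
        "list": "usercontribs",
        "ucuser": user,
        "ucdir": "newer",      # with "newer", ucstart must be before ucend
        "ucstart": start,      # e.g. "2019-01-01T00:00:00Z"
        "ucend": end,
        "uclimit": "max",
        "ucprop": "ids|timestamp",
        "maxlag": "5",         # ask the API to refuse work when replicas lag
    }


def http_fetch(params: Dict[str, str]) -> dict:
    """One GET request; the User-Agent value is a placeholder to replace."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "edit-count-sketch/0.1 (you@example.org)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_contribs(fetch: Callable[[Dict[str, str]], dict], user: str,
                  start: str, end: str, max_retries: int = 5,
                  wait_seconds: float = 5.0) -> Iterator[dict]:
    """Yield every contribution in the period, following continuation."""
    params = build_params(user, start, end)
    retries = 0
    while True:
        data = fetch(params)
        if data.get("error", {}).get("code") == "maxlag":
            # The servers are lagged: back off and repeat the same request.
            retries += 1
            if retries > max_retries:
                raise RuntimeError("still lagged after several retries")
            time.sleep(wait_seconds)
            continue
        retries = 0
        yield from data.get("query", {}).get("usercontribs", [])
        if "continue" not in data:
            break
        # Merge the continuation values into the next request.
        params = {**params, **data["continue"]}


def count_edits(fetch: Callable[[Dict[str, str]], dict], user: str,
                start: str, end: str, **options) -> int:
    """Number of edits by `user` between `start` and `end`."""
    return sum(1 for _ in iter_contribs(fetch, user, start, end, **options))
```

Calling `count_edits(http_fetch, "ExampleUser", "2019-01-01T00:00:00Z", "2019-02-01T00:00:00Z")` would run it against the live API; passing the fetch function as an argument also makes it easy to try the paging logic with canned responses first.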
If you're interested in querying a large number of users (say tens of thousands or more), then getting an account on Toolforge https://tools.wmflabs.org so you can run SQL queries against the replicated MediaWiki databases would make sense. I've frequently used that approach for data gathering for research purposes.
Hope that helps! And if not, don't hesitate to ask questions :)
Cheers, Morten
On Wed, 27 Mar 2019 at 07:22, Haifeng Zhang haifeng1@andrew.cmu.edu wrote:
Dear folks,
Is there a good way to query a user's edit history, e.g., edit count during a period?
My current solution uses the usercontribs API (https://www.mediawiki.org/wiki/API:Usercontribs).
But the process has stalled, maybe due to some query limit.
Thanks,
Haifeng Zhang _______________________________________________ Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
I will also note that Quarry https://quarry.wmflabs.org/ is good for querying database replicas – no Toolforge account required.
If you share what kind of query you'd like to run, people on this list might even write you an example Quarry :) See also: https://www.mediawiki.org/wiki/Talk:Quarry People post query requests there and others help them draft the right query for their needs.
On Wed, Mar 27, 2019 at 11:28 AM James Hare jhare@wikimedia.org wrote:
I will also note that Quarry https://quarry.wmflabs.org/ is good for querying database replicas – no Toolforge account required.
-- *James Hare* (he/him) Associate Product Manager Wikimedia Foundation https://wikimediafoundation.org/
On Wed, Mar 27, 2019 at 9:30 AM Morten Wang nettrom@gmail.com wrote:
Hi Haifeng,
In my experience, this depends on how many users you're looking to get information about. Is it a few hundred? A few thousand? A million+?
If you are getting the edit history for a limited number of users (say a few hundred to a few thousand), then using the API can work well. One thing to keep in mind when using the API is that your requests might be throttled and/or there might be database lag. Are you using a software library to access the API? If not, I'd consider using one so that throttling/lag doesn't become an issue; it's one of the reasons why I use Pywikibot https://www.mediawiki.org/wiki/Manual:Pywikibot for API requests.
If you're interested in querying a large number of users (say tens of thousands or more), then getting an account on Toolforge https://tools.wmflabs.org so you can run SQL queries against the replicated MediaWiki databases would make sense. I've frequently used that approach for data gathering for research purposes.
Hope that helps! And if not, don't hesitate to ask questions :)
Cheers, Morten
Here is a sample query I just wrote on Quarry that gets the edit count of a single user (by username) during a period of time: https://quarry.wmflabs.org/query/34716 . The gotcha that might be important for this query is using the table `revision_userindex` rather than the plain revision table, which doesn't have such an index.
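The linked query is not reproduced here. As a rough sketch of the general shape such a count can take, the Python snippet below runs a similar SQL statement against a tiny in-memory SQLite stand-in for the replica tables. The column names (`rev_actor`, `rev_timestamp`, `actor_id`, `actor_name`) are an assumption based on the current replica schema, the `YYYYMMDDHHMMSS` timestamp format is MediaWiki's usual one, and the toy rows and helper names are invented for illustration. On Quarry you would paste only the SQL, with literal values in place of the named placeholders.

```python
import sqlite3

# Count one user's revisions in a half-open period [start, end).
EDIT_COUNT_SQL = """
SELECT COUNT(*) AS edit_count
FROM revision_userindex
JOIN actor ON actor_id = rev_actor
WHERE actor_name = :user
  AND rev_timestamp >= :start
  AND rev_timestamp < :end
"""


def make_toy_db() -> sqlite3.Connection:
    """Build a small local stand-in for the two replica tables (toy data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE actor (
            actor_id INTEGER PRIMARY KEY,
            actor_name TEXT
        );
        CREATE TABLE revision_userindex (
            rev_id INTEGER PRIMARY KEY,
            rev_actor INTEGER,
            rev_timestamp TEXT
        );
    """)
    conn.executemany(
        "INSERT INTO actor VALUES (?, ?)",
        [(1, "ExampleUser"), (2, "OtherUser")],
    )
    conn.executemany(
        "INSERT INTO revision_userindex VALUES (?, ?, ?)",
        [
            (10, 1, "20190105120000"),
            (11, 1, "20190220093000"),
            (12, 1, "20190401000000"),  # exactly at the period end: excluded
            (13, 2, "20190110080000"),
        ],
    )
    return conn


def count_edits(conn: sqlite3.Connection, user: str, start: str, end: str) -> int:
    """Run the count query for one user and one period."""
    row = conn.execute(
        EDIT_COUNT_SQL, {"user": user, "start": start, "end": end}
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    db = make_toy_db()
    # Two of ExampleUser's three toy revisions fall inside Q1 2019.
    print(count_edits(db, "ExampleUser", "20190101000000", "20190401000000"))  # prints 2
```

Filtering on the user first is the access pattern where `revision_userindex` is expected to help, which is why the sketch starts from that table rather than from `revision`.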
Make a great day, Max Klein ‽ http://notconfusing.com/
On Wed, 27 Mar 2019 at 09:42, Aaron Halfaker aaron.halfaker@gmail.com wrote:
If you share what kind of query you'd like to run, people on this list might even write you an example Quarry :) See also: https://www.mediawiki.org/wiki/Talk:Quarry People post query requests there and others help them draft the right query for their needs.
Thanks a lot for answering my questions, guys.
May I save/upload my own table (with user names and time ranges) to Quarry?
It looks like I have to manually enter this information in SQL queries.
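One possible workaround for typing the values by hand is to generate the query text from a local table and paste the result into Quarry. The Python sketch below is only an illustration: it does not upload anything, the function names are hypothetical, the column names (`rev_actor`, `rev_timestamp`, `actor_id`, `actor_name`) are an assumption based on the current replica schema, and the quoting assumes MariaDB's default handling of backslashes in string literals. How well the resulting join performs for long lists is untested.

```python
from typing import Iterable, Tuple


def sql_literal(value: str) -> str:
    """Quote a string as an SQL literal (escapes backslashes and quotes)."""
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def check_timestamp(timestamp: str) -> str:
    """Accept only MediaWiki-style YYYYMMDDHHMMSS timestamps."""
    if len(timestamp) != 14 or not (timestamp.isascii() and timestamp.isdigit()):
        raise ValueError(f"expected YYYYMMDDHHMMSS, got {timestamp!r}")
    return timestamp


def build_query(rows: Iterable[Tuple[str, str, str]]) -> str:
    """Build one query counting edits for each (user, start, end) row."""
    selects = []
    for index, (user, start, end) in enumerate(rows):
        values = (
            sql_literal(user),
            sql_literal(check_timestamp(start)),
            sql_literal(check_timestamp(end)),
        )
        if index == 0:
            # The first SELECT names the columns of the inline table.
            template = "SELECT {} AS user_name, {} AS period_start, {} AS period_end"
        else:
            template = "SELECT {}, {}, {}"
        selects.append(template.format(*values))
    if not selects:
        raise ValueError("need at least one (user, start, end) row")
    inline_table = "\n  UNION ALL ".join(selects)
    return (
        "SELECT t.user_name, t.period_start, t.period_end,\n"
        "       COUNT(rev_id) AS edit_count\n"
        "FROM (\n  " + inline_table + "\n) AS t\n"
        "LEFT JOIN actor ON actor_name = t.user_name\n"
        "LEFT JOIN revision_userindex\n"
        "  ON rev_actor = actor_id\n"
        " AND rev_timestamp >= t.period_start\n"
        " AND rev_timestamp < t.period_end\n"
        "GROUP BY t.user_name, t.period_start, t.period_end"
    )


if __name__ == "__main__":
    # Hypothetical rows; in practice these would be read from a CSV file.
    print(build_query([
        ("ExampleUser", "20190101000000", "20190401000000"),
        ("OtherUser", "20190201000000", "20190301000000"),
    ]))
```

The `LEFT JOIN`s keep users with no matching edits in the output with a count of 0, so every row of the local table appears in the result.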
Also, I tried to get a Toolforge account. When following the step to create a Wikimedia developer account (https://toolsadmin.wikimedia.org/register/), the page showed:
"Developer account creation is currently disabled. We apologise for the inconvenience."
Best,
Haifeng Zhang
________________________________
From: Wiki-research-l wiki-research-l-bounces@lists.wikimedia.org on behalf of Maximilian Klein isalix@gmail.com
Sent: Wednesday, March 27, 2019 12:46:39 PM
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] Query user history edits
Here is a sample query I just wrote on Quarry that gets the edit count of a single user (by username) during a period of time: https://quarry.wmflabs.org/query/34716 . The gotcha that might be important for this query is using the table `revision_userindex` rather than the plain revision table, which doesn't have such an index.
Make a great day, Max Klein ‽ http://notconfusing.com/
Hi Haifeng,
Regarding the Toolforge account, I'm sorry to say that account creation is currently disabled. This is a temporary measure that we hope to be able to undo soon, but for now there is no exact timeline or public Phabricator task I can point you to. However, I've added you to the list of people to ping when it is available again. Thank you for your patience.
On Wed, Mar 27, 2019 at 8:18 PM Haifeng Zhang haifeng1@andrew.cmu.edu wrote:
Thanks a lot for answering my questions, guys.
May I save/upload my own table (with user names and time ranges) to Quarry?
It looks like I have to manually enter this information in SQL queries.
Also, I tried to get a Toolforge account. When following the step to create a Wikimedia developer account (https://toolsadmin.wikimedia.org/register/), the page showed:
"Developer account creation is currently disabled. We apologise for the inconvenience."
Best,
Haifeng Zhang