Hi.
I occasionally get asked about what a reasonable rate for querying the API of a Wikimedia wiki is for a particular script or tool. I don't really know the answer other than "be reasonable" and "specify an informative User-Agent." That's essentially what I said when asked here: https://meta.wikimedia.org/w/index.php?title=Tech&diff=3569658&oldid...
If there's an authoritative answer (on Meta-Wiki or mediawiki.org or even wikitech), that'd obviously be ideal. I'd also settle for a mailing list post. I looked at https://www.mediawiki.org/wiki/API:Main_page to see if it mentioned "limit" or "rate", but nothing came up. It's a fairly common question; it should probably be a bit easier to find.
MZMcBride
According to this, the query limit is 51 titles per query, or 501 with apihighlimits. For lists, the default limit is 10, but it can be as high as 500 for users, and 5000 for users with apihighlimits. I don't know the specific configuration on Wikimedia, though.
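For what it's worth, here is a minimal sketch of how a script might stay under a per-query title limit: split the title list into batches and join each batch with `|`. The function names are hypothetical, and the batch size is passed in rather than hard-coded, since the exact configuration on Wikimedia is not settled above.

```python
from typing import Iterator, List, Sequence


def batch_titles(titles: Sequence[str], limit: int) -> Iterator[List[str]]:
    """Yield successive chunks of at most `limit` titles."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(titles), limit):
        yield list(titles[start:start + limit])


def build_query_params(batch: Sequence[str]) -> dict:
    """Build the parameters for one action=query request covering one batch."""
    return {
        "action": "query",
        "format": "json",
        "titles": "|".join(batch),  # multiple titles are pipe-separated
    }
```

A caller would loop over `batch_titles(titles, limit=50)` (50 being only an illustrative value) and send one request per batch, one after another.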
Techman224
On 2012-03-14, at 11:17 PM, MZMcBride wrote:
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
A rate should almost never be hard-coded anymore. All bots and maintenance tasks should use the maxlag parameter: http://www.mediawiki.org/wiki/Manual:Maxlag_parameter. This allows them to make requests as quickly as they like when the servers have resources to spare, and forces them to back off when they don't. This has been the default behavior for Pywikipediabot and most other major bot frameworks for some time.
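A rough sketch of what that looks like for a script that doesn't use a framework. It assumes a hypothetical `send` callable that performs the HTTP request and returns the parsed JSON body plus the response headers; the helper name and its defaults are made up for illustration. The idea is to add `maxlag` to every request and, when the API answers with a `maxlag` error, wait (honoring `Retry-After` if present) and try again.

```python
import time
from typing import Callable, Dict, Tuple

# (parsed JSON body, response headers), as returned by the hypothetical `send`
Response = Tuple[dict, Dict[str, str]]


def call_with_maxlag(
    send: Callable[[dict], Response],
    params: dict,
    maxlag: int = 5,
    max_retries: int = 5,
    default_wait: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Send one API request, backing off while the servers report lag."""
    request = dict(params, maxlag=maxlag)
    for attempt in range(max_retries + 1):
        body, headers = send(request)
        if body.get("error", {}).get("code") != "maxlag":
            return body  # success, or an unrelated error for the caller
        if attempt == max_retries:
            break
        # Servers are lagged: wait as long as they ask, then retry.
        try:
            wait = float(headers.get("Retry-After", default_wait))
        except ValueError:
            wait = default_wait
        sleep(wait)
    raise RuntimeError("servers still lagged after %d retries" % max_retries)
```

The `sleep` argument is only there so the wait can be swapped out in tests; a real script would leave it at the default.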
Cheers! -Madman
When people ask me that question, I tell them to limit the concurrency, not the request rate. You can't do much damage with a single thread, whether queries complete in 10ms or 10s. But you can certainly do a lot of damage if you send a query once per second that takes 100 seconds to complete, using 100 concurrent clients.
This is especially relevant for people who write toolserver scripts and the like. It's easy to write a server-side script which accidentally allows 100 concurrent connections to Wikimedia, when 100 people happen to use it at once, or if someone decides to try a DoS attack using the toolserver as a proxy.
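As a minimal sketch of capping concurrency in a server-side tool (the class name and the default of one slot are assumptions for illustration, not an official recommendation): route every outbound request through a shared semaphore, so extra callers wait in line instead of opening more connections.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class ConcurrencyLimiter:
    """Allow at most `max_concurrent` outbound requests at a time."""

    def __init__(self, max_concurrent: int = 1) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, request: Callable[[], T]) -> T:
        # Blocks until a slot is free, so 100 simultaneous users of the
        # tool still produce at most `max_concurrent` requests in flight.
        with self._slots:
            return request()
```

A tool would create one limiter at startup and wrap each API call, e.g. `limiter.run(lambda: fetch(params))`, where `fetch` stands in for whatever function actually talks to the wiki. Note that this only covers threads within a single process; a tool that runs as several processes would need a shared lock of some other kind.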
-- Tim Starling
We already have an article on this: http://www.mediawiki.org/wiki/API:Etiquette.
Looks like it captures Tim's point about slamming us with concurrency. For those who regularly get asked about this ... what's missing from it and how can we make it more discoverable? Currently it's linked from the front of the API page.
I changed the text slightly from 'Etiquette' to 'Etiquette & Usage Limits' so that it's a quicker find. Updates as needed.
--tomasz
Tomasz Finc wrote:
We already have an article on this: http://www.mediawiki.org/wiki/API:Etiquette.
Aha! Thanks. :-) I would have never thought to look for the page at that title or with that keyword. It's an awfully clever page title, in my opinion.
Your changes to the navigation sidebar look good. I also just made an edit that basically throws a few more keywords into the front page for people quickly searching the page (with ctrl/cmd-F): https://www.mediawiki.org/w/index.php?diff=511379&oldid=509379.
Thanks again for the link. I figured it had to be somewhere on mediawiki.org already, but it wasn't coming to me last night.
MZMcBride