On Tue, Dec 18, 2012 at 10:03 AM, Yuri Astrakhan <yuriastrakhan@gmail.com> wrote:
I do not think the API should support the case you described with gaplimit=1, because that fundamentally breaks the original API goal of "get data about many pages with lots of elements on them in one request".
Oh? I thought the goal of the API was to provide a machine-usable interface to MediaWiki so people don't have to screen-scrape the HTML pages, which alleviates the worry about whether changes to the user interface are going to break screen-scrapers. I never knew it was all about *bulk* data access *only*.
But even if we do find compelling reasons to include that, for the advanced scenario "skip subquery and follow on with the generator" it might make sense to introduce an appendable "|next" keyword for the value, as in gapcontinue=A|next
How do things decide whether "foocontinue=A|next" is saying "the next foocontinue after A" or really means "A|next"? For example, https://en.wiktionary.org/w/api.php?action=query&titles=secundus&pro... currently returns plcontinue "46486|0|next".
Or are you proposing every module be individually coded to recognize this "|next"?
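To make the ambiguity concrete, here is a minimal sketch. The helper naive_parse is hypothetical (nothing like it exists in the API); it just shows what happens if a trailing "|next" is treated as a keyword, using the plcontinue value quoted above, where "next" is presumably a link title.

```python
NEXT_SUFFIX = "|next"  # hypothetical keyword from the proposal


def naive_parse(value):
    """Split a continue value into (base_value, skip_to_next).

    Assumption: a trailing "|next" is always the keyword, never data.
    """
    if value.endswith(NEXT_SUFFIX):
        return value[:-len(NEXT_SUFFIX)], True
    return value, False


# The literal plcontinue value from the Wiktionary example above.
print(naive_parse("46486|0|next"))  # ('46486|0', True): title lost
```

The real value and the "skip ahead" form are indistinguishable unless each module knows its own value format, or the keyword uses something that cannot appear in a value.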
Ideally all "continue" values should be joined into a single "query-continue = magic-value" of no interesting user-passable properties.
So clients can make absolutely no decisions about processing the data they get back? No thanks.
Why not propose adding something like that as an option, instead of trying to force everyone to do things your way? Say, have a parameter dumbcontinue=1 that replaces query-continue with
<query-dumb-continue>prop=links|categories&plcontinue=...&clcontinue=...&wlstart=...&allmessages=...</query-dumb-continue>
Entirely compatible.
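For a client that wants the "dumb" behaviour, consuming such a value could be as simple as the sketch below. Note that dumbcontinue and query-dumb-continue are only the suggestion made above, not an existing API feature, and the continue values used here are made up for illustration.

```python
from urllib.parse import parse_qsl


def apply_dumb_continue(params, dumb_continue):
    """Return a copy of params with every key from the opaque string applied.

    dumb_continue is assumed to be a plain query string, as in the
    hypothetical <query-dumb-continue> element above.
    """
    merged = dict(params)
    merged.update(parse_qsl(dumb_continue, keep_blank_values=True))
    return merged


first = {"action": "query", "titles": "secundus", "prop": "links|categories"}
# Illustrative values only; a real client would copy the string verbatim.
opaque = "prop=links|categories&plcontinue=46486|0|next&clcontinue=46486|Example"
print(apply_dumb_continue(first, opaque))
```

The client never inspects the individual values, while anyone who does want to make decisions can keep using query-continue as before.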
IMO, if a client wants to ensure it has complete results for any page objects in the result, it should just process all of the prop continuation parameters to completion.
The result set might be huge. It wouldn't be nice to have a 12GB x64 only client lib requirement :)
Then use a smaller limit on your generator. And don't do this for prop=revisions&rvprop=content.
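A rough sketch of "process the prop continuation parameters to completion" with a generator follows. It assumes the query-continue layout of module name mapping to parameter/value pairs, a single prop module (with several, a finished module would also have to be dropped from prop), and a caller-supplied fetch function; query_all, fetch and generator_keys are names invented for this example.

```python
def query_all(fetch, params, generator_keys):
    """Yield every response, finishing prop continuation for the current
    generator batch before advancing the generator itself.

    fetch: callable taking a dict of request parameters and returning the
           decoded response as a dict (assumed, supplied by the caller).
    generator_keys: continue parameters that belong to the generator,
           e.g. {"gapcontinue"}.
    """
    base = dict(params)      # parameters fixing the current generator batch
    request = dict(base)
    pending_gen = {}         # generator continuation seen during this batch
    while True:
        result = fetch(request)
        yield result
        prop_cont = {}
        for module_values in result.get("query-continue", {}).values():
            for key, value in module_values.items():
                if key in generator_keys:
                    pending_gen[key] = value
                else:
                    prop_cont[key] = value
        if prop_cont:
            # Same pages, next chunk of prop data.
            request = {**base, **prop_cont}
        elif pending_gen:
            # Pages in this batch are complete; move the generator on.
            base = {**base, **pending_gen}
            request = dict(base)
            pending_gen = {}
        else:
            return
```

With a small gaplimit, each batch holds only a few pages, so the client needs to keep just one batch in memory until its prop data is complete.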