Assume a wiki has pages A and B with these links and categories: A(l1,l2,l3,l4,l5,c1,c2,c3), B(l1,c1).  This is how the API behaves now:

1 req)  prop=categories|links & generator=allpages & gaplimit=1 & pllimit=2 & cllimit=2
1 res)  A(l1,l2,c1,c2), gapcontinue=B, plcontinue=l3, clcontinue=c3

The client ignores gapcontinue because other continue values are present, and adds the pl and cl continues:
2 req)  initial & plcontinue=l3 & clcontinue=c3
2 res)  A(l3,l4,c3), gapcontinue=B, plcontinue=l5

This is where the potential for a bug lies: the client must understand that, since there is no more clcontinue but there is still a plcontinue, there are no more categories in this set of pages, so it should not ask for prop=categories until it finishes with plcontinue. Once done, it should resume prop=categories and also add gapcontinue=B.

3 bad req)  initial & plcontinue=l5
3 bad res)  A(l5,c1,c2), gapcontinue=B, clcontinue=c3

3 good req)  initial but with prop=links only & plcontinue=l5
3 good res)  A(l5), gapcontinue=B

4 req) initial & gapcontinue=B
4 res) B(l1,c1)  -- done
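
To make the bookkeeping above concrete, here is a minimal sketch of the logic a client needs today. Everything in it is an assumption for illustration: fake_api is a toy stand-in that reproduces the responses from the walk-through (it is not the real MediaWiki API), the "pages" and "continue" response keys are a hypothetical shape, and fetch_all is just one possible way to write the loop.

```python
# Toy model of the wiki from the example above (hypothetical data layout).
WIKI = {
    "A": {"links": ["l1", "l2", "l3", "l4", "l5"], "categories": ["c1", "c2", "c3"]},
    "B": {"links": ["l1"], "categories": ["c1"]},
}
TITLES = sorted(WIKI)
PREFIX = {"links": "pl", "categories": "cl"}
LIMIT = 2  # stands in for pllimit=2 and cllimit=2; gaplimit=1 is hard-coded


def fake_api(params):
    """Stand-in for the current behaviour: one page per request, and
    gapcontinue is always returned while more pages remain."""
    title = params.get("gapcontinue", TITLES[0])
    idx = TITLES.index(title)
    page, cont = {}, {}
    if idx + 1 < len(TITLES):
        cont["gapcontinue"] = TITLES[idx + 1]
    for prop in params["prop"].split("|"):
        items = WIKI[title][prop]
        start = params.get(PREFIX[prop] + "continue")
        pos = items.index(start) if start else 0
        page[prop] = items[pos:pos + LIMIT]
        if pos + LIMIT < len(items):
            cont[PREFIX[prop] + "continue"] = items[pos + LIMIT]
    return {"pages": {title: page}, "continue": cont}


def fetch_all(api, props):
    """Follow prop continues first, dropping finished prop modules from
    prop=, and only then advance the generator with gapcontinue."""
    results = {}
    gen_cont = {}          # generator continue for the current page set
    prop_cont = {}         # pending plcontinue / clcontinue values
    active = list(props)   # prop modules not yet exhausted for this set
    while True:
        params = {"prop": "|".join(active), **gen_cont, **prop_cont}
        res = api(params)
        for title, page in res["pages"].items():
            for prop, items in page.items():
                results.setdefault(title, {}).setdefault(prop, []).extend(items)
        cont = res.get("continue", {})
        prop_cont = {k: v for k, v in cont.items() if k != "gapcontinue"}
        if prop_cont:
            # Keep only modules that still have a continue; re-requesting
            # a finished one would restart it (the "3 bad req" case).
            active = [p for p in active if PREFIX[p] + "continue" in prop_cont]
        elif "gapcontinue" in cont:
            gen_cont = {"gapcontinue": cont["gapcontinue"]}
            active = list(props)
        else:
            return results


if __name__ == "__main__":
    print(fetch_all(fake_api, ["categories", "links"]))
```

The third request this loop sends is prop=links with plcontinue=l5, i.e. the "3 good req" above. The point of the sketch is how much state (active, gen_cont, prop_cont) the client has to carry just to avoid duplicates.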

I think this puts an unnecessary burden on client code to handle these cases correctly. Instead, the API should be simplified to return clcontinue=| in result #2, and results 1 and 2 should have gapcontinue=A.  The client could then simply merge all returned continue values into the following request. That would greatly simplify the code for the most common "get everything I requested" scenario, which is why it should be the default behavior:

1 req)  prop=categories|links & generator=allpages & gaplimit=1 & pllimit=2 & cllimit=2
1 res)  A(l1,l2,c1,c2), gapcontinue=, plcontinue=l3, clcontinue=c3

2 req)  initial & gapcontinue= & plcontinue=l3 & clcontinue=c3
2 res)  A(l3,l4,c3), gapcontinue=, plcontinue=l5, clcontinue=|

3 req)  initial & gapcontinue= & plcontinue=l5 & clcontinue=|
3 res)  A(l5), gapcontinue=B, plcontinue=, clcontinue=

4 req) initial & gapcontinue=B & plcontinue= & clcontinue=
4 res) B(l1,c1)  -- no continue section, done
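
For comparison, a minimal sketch of the client loop under the proposed behavior. The scripted_send function simply replays the four proposed responses listed above; it, the "pages" and "continue" response keys, and the names query_all and collect are all assumptions made for illustration, not an existing API.

```python
INITIAL = {
    "prop": "categories|links",
    "generator": "allpages",
    "gaplimit": 1,
    "pllimit": 2,
    "cllimit": 2,
}

# The four proposed responses, keyed by the (gapcontinue, plcontinue,
# clcontinue) values of the request. None means "parameter not sent".
SCRIPT = {
    (None, None, None): (
        {"A": ["l1", "l2", "c1", "c2"]},
        {"gapcontinue": "", "plcontinue": "l3", "clcontinue": "c3"},
    ),
    ("", "l3", "c3"): (
        {"A": ["l3", "l4", "c3"]},
        {"gapcontinue": "", "plcontinue": "l5", "clcontinue": "|"},
    ),
    ("", "l5", "|"): (
        {"A": ["l5"]},
        {"gapcontinue": "B", "plcontinue": "", "clcontinue": ""},
    ),
    ("B", "", ""): ({"B": ["l1", "c1"]}, None),
}


def scripted_send(params):
    """Hypothetical transport that replays the proposed responses."""
    key = tuple(params.get(k) for k in ("gapcontinue", "plcontinue", "clcontinue"))
    pages, cont = SCRIPT[key]
    result = {"pages": pages}
    if cont is not None:
        result["continue"] = cont
    return result


def query_all(send, initial):
    """Yield every response, blindly merging each returned continue
    section into the next request. No per-module logic is needed."""
    params = dict(initial)
    while True:
        result = send(params)
        yield result
        cont = result.get("continue")
        if not cont:
            return
        params = {**initial, **cont}


def collect(send, initial):
    """Accumulate all items per page across the continued requests."""
    merged = {}
    for result in query_all(send, initial):
        for title, items in result["pages"].items():
            merged.setdefault(title, []).extend(items)
    return merged


if __name__ == "__main__":
    print(collect(scripted_send, INITIAL))
```

The whole continuation logic is the single line that merges the continue section into the initial parameters. Because query_all is a generator, a caller can also process each response as it arrives instead of accumulating everything.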


That would be quite a change. It would mean the API wouldn't return
gapcontinue at all until plcontinue and clcontinue are both exhausted,
and then would keep returning the *old* gapcontinue until plcontinue
and clcontinue are both exhausted again.

Correct, the API would return an empty gapcontinue until it finishes with the first set, then it would return the beginning of the next set until that is exhausted as well, and so on.
 
This would break some possible use cases which I'm not entirely sure
we should break. For example, I can imagine a bot that would use
generator=foo&gfoolimit=1&prop=revisions, follow rvcontinue until it
finds whichever revision it is looking for, and then ignore rvcontinue
in favor of gfoocontinue to move on to the next page. With "dumb
continue", it wouldn't be able to do that.


I do not think the API should support the case you described with gaplimit=1, because that fundamentally breaks the original API goal of "get data about many pages with lots of elements on them in one request". I would prefer the client do two separate kinds of query: 1) list pages, then 2) many "list revisions for page X" queries. Using a generator with gaplimit=1 does not improve server performance or reduce traffic.

But even if we do find compelling reasons to include that, for the advanced scenario "skip the subquery and carry on with the generator" it might make sense to introduce an appendable "|next" keyword (gapcontinue=A|next) or a gcommand=skipcurrent parameter. I am not sure it is the cleanest solution, but it is certainly cleaner than forcing every client out there to carry the complex logic from above for all the common cases.

1 req)  prop=categories|links & generator=allpages & gaplimit=1 & pllimit=2 & cllimit=2
1 res)  A(l1,l2,c1,c2), gapcontinue=, plcontinue=l3, clcontinue=c3

The client decides it does not need anything else from A, so it appends |next to gapcontinue. The API then ignores all other property continues.
2 req)  initial & gapcontinue=|next & plcontinue=l3 & clcontinue=c3
2 res)   B(l1,c1) -- done
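
On the client side, the "|next" idea would amount to a one-line tweak to the continue values before they are merged into the next request. A minimal sketch, assuming the continue section is a plain dict; skip_current_set is a hypothetical helper name and "|next" is only the keyword suggested above, not something the API supports today.

```python
def skip_current_set(cont):
    """Return a copy of the continue values with "|next" appended to
    gapcontinue. The prop continues are left in place, since under the
    suggestion above the API would ignore them when it sees "|next"."""
    skipped = dict(cont)
    skipped["gapcontinue"] = cont.get("gapcontinue", "") + "|next"
    return skipped


if __name__ == "__main__":
    last = {"gapcontinue": "", "plcontinue": "l3", "clcontinue": "c3"}
    print(skip_current_set(last))
```

The client never has to decide which prop continues to drop; it only flags that it is done with the current set.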

The client would still have to know how to manipulate
list=/meta=/generator=/prop=, particularly when using more than one of
these in the same query. But the rules are simpler, it wouldn't have
to know that gclcontinue is for generator=categories while clcontinue
is for prop=categories, and it would be easy to know what exactly to
include in prop= when continuing to avoid repeated results.

Complex client logic is exactly what I am trying to avoid. Ideally all "continue" values would be joined into a single "query-continue = magic-value" that the client passes back as-is, with no parts the user needs to inspect or set.
 
You can't get away with changing the generator's continue like that
and still get correct results, because you can't assume the generator
generates pages in the same order every prop module processes them.
Nor can you assume each prop module will process pages in the same
order. For example, many prop modules order by page_id but may be ASC
or DESC on their "dir" parameter.

Totally agree, I forgot about the sub-ordering. So we simply keep the same gapcontinue until the set is exhausted. The key here is that if we do not let the client manipulate the continue parameters, the server could later be optimized to return fewer results when they cannot yet be populated.

 
IMO, if a client wants to ensure it has complete results for any page
objects in the result, it should just process all of the prop
continuation parameters to completion.

The result set might be huge. It would not be nice to require a 12 GB, x64-only client library :)