On Tue, Dec 18, 2012 at 11:10 PM, Yuri Astrakhan
<yuriastrakhan(a)gmail.com> wrote:
> The linkTitle foreach causes 18 more API calls to start getting the links,
> all with plcontinue, before it yields even a single link.
Yeah, it's certainly possible there will be many plcontinue calls just
to get the first link.
But that doesn't mean you have to get all plcontinues when you want
only some links.
On Tue, Dec 18, 2012 at 11:16 PM, Yuri Astrakhan
<yuriastrakhan(a)gmail.com> wrote:
> Petr, when you say you have two nested foreach(), the outer foreach does not
> iterate through the blocks, it iterates through pages. Which means you still
> must iterate through every plcontinue in the set before issuing the next
> gapcontinue.
It doesn't mean that.
For example, in the extreme case where you don't want to know any
links from this page
(say, because you want to filter the articles in a way that cannot be
expressed directly by the API),
you don't have to use plcontinue for this page at all.
A specific example might be changing your code into
(yeah, built specifically to make my point):
    foreach (var page in source
        .Where(p => p.Info.title.Contains("\u2014"))
        .Take(2000))
In this case, the link for "—All You Zombies—" will be retrieved from
the first call,
so no plcontinue is needed.
The link for "—And He Built a Crooked House—" will be retrieved using
one plcontinue call.
But there are no more articles with that character in the title in the first
page of results, so no more plcontinue calls are necessary, and gapcontinue
can be used now.
The second page of results contains no articles with that character at all,
so gapcontinue will be used right away, without any plcontinue calls.
With your “dumb query-continue”, doing this would require many more calls.
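
To make the counting concrete, here is a minimal sketch that only simulates the requests; it does not call the real API and it is not the actual library code. Everything in it is an assumption made for illustration: the Block, Page and Demo types, the limit of two links per request, and the made-up pages and link counts. It assumes links arrive in page order, so a page's links are available once enough batches have been delivered to cover it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one block of generator results.
class Block
{
    public const int LinksPerCall = 2;  // assumed per-request link limit
    public static int Calls;            // simulated requests so far

    readonly (string Title, string[] Links)[] pages;
    int delivered = LinksPerCall;       // the first request already carries some links

    public Block((string Title, string[] Links)[] pages)
    {
        this.pages = pages;
        Calls++;                        // initial request, or gapcontinue for later blocks
    }

    // Issues plcontinue requests only until this page's links are covered.
    public string[] LinksAt(int index)
    {
        int end = pages.Take(index + 1).Sum(p => p.Links.Length);
        while (delivered < end) { Calls++; delivered += LinksPerCall; }
        return pages[index].Links;
    }
}

class Page
{
    readonly Block block;
    readonly int index;
    public string Title { get; }

    public Page(Block block, int index, string title)
    {
        this.block = block;
        this.index = index;
        Title = title;
    }

    public string[] Links => block.LinksAt(index);  // may trigger plcontinue
}

static class Demo
{
    // Builds a made-up page with n made-up links.
    static (string Title, string[] Links) P(string title, int n) =>
        (title, Enumerable.Range(1, n).Select(i => title + "#" + i).ToArray());

    static IEnumerable<Page> Source()
    {
        var blocks = new[]
        {
            new[] { P("\u2014All You Zombies\u2014", 1), P("A", 2),
                    P("\u2014And He Built a Crooked House\u2014", 1), P("B", 6) },
            new[] { P("C", 4), P("D", 2) },
        };
        foreach (var raw in blocks)
        {
            var block = new Block(raw);  // runs only when the consumer reaches this block
            for (int i = 0; i < raw.Length; i++)
                yield return new Page(block, i, raw[i].Title);
        }
    }

    internal static int Requests(Func<Page, bool> filter)
    {
        Block.Calls = 0;
        // Reading Links is what drives the plcontinue requests.
        int links = Source().Where(filter).Sum(p => p.Links.Length);
        Console.WriteLine($"{links} links, {Block.Calls} requests");
        return Block.Calls;
    }

    static void Main()
    {
        Requests(p => p.Title.Contains("\u2014"));  // only the two matching titles
        Requests(p => true);                        // every page, links included
    }
}
```

With this made-up data, the filtered query should report 3 requests (the initial one, one plcontinue, one gapcontinue), while reading the links of every page should report 8.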
Petr Onderka
[[en:User:Svick]]