To me, the implementation depends on what alkamid actually wants to do.
For keeping some of SuggestBot's data sources up to date I use the
site object's recentchanges() generator to grab data (and although one
can only get a limited amount at each step, I've never had trouble
exhausting the generator); it's easy to check the edit timestamp
and stop iterating when necessary. I then store page titles in a
set(), which can be fed to a PagesFromTitlesGenerator, and I chain
that generator with a PreloadingGenerator to get the latest revisions.
In my experience only a minority of a Wikipedia edition's articles are
updated in any given week, so using allpages() results in a lot of
unnecessary fetching.
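The pipeline described above (iterate recent changes newest-first, stop at a timestamp cutoff, deduplicate titles in a set) can be sketched in plain Python. Here `fetch_recent_changes` is a hypothetical stand-in for the pywikibot site object's recentchanges() generator, not its real API:

```python
from datetime import datetime, timedelta
from itertools import takewhile

def fetch_recent_changes():
    """Hypothetical stand-in for site.recentchanges(): yields
    (title, timestamp) pairs in reverse-chronological order."""
    base = datetime(2012, 2, 5, 12, 0)
    for hours_ago, title in [(1, "Alpha"), (5, "Beta"),
                             (30, "Alpha"), (200, "Gamma")]:
        yield (title, base - timedelta(hours=hours_ago))

def titles_changed_since(cutoff):
    """Walk recent changes, stop once edits are older than the cutoff,
    and deduplicate titles with a set (one page may have many edits)."""
    recent = takewhile(lambda rc: rc[1] >= cutoff, fetch_recent_changes())
    return {title for title, _ in recent}

cutoff = datetime(2012, 2, 5, 12, 0) - timedelta(days=7)
print(sorted(titles_changed_since(cutoff)))  # prints ['Alpha', 'Beta']
```

The resulting set of titles is exactly what one would then feed to a PagesFromTitlesGenerator chained with a PreloadingGenerator.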
On 5 February 2012 17:28, Dr. Trigon <dr.trigon(a)surfeu.ch> wrote:
past week? I thought of using the
executing editTime() on each page, but this method gives me only
zeros if the page has not been read before (i.e. I have to call
page.get() first in order for editTime() to work properly). Is
there any edit-time-related piece of information I can get from a
generated list of pages? Or maybe there is another page generator
suitable for me?
Everything using 'getall' from 'wikipedia.py' (imported as
…) does give you the first history entry WITHOUT having to trigger
page.get(), e.g. the 'PreloadingGenerator'. And since you can chain
generators, you can first set up your generator as 'gen1' and then pass
'gen1' to a 'PreloadingGenerator' (…)
in order to get the first history entry of every page. In
'sum_disc.py' of the DrTrigonBot repo there is an example of this.
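The chaining idea (set up 'gen1', then wrap it in a preloader so pages arrive already loaded) can be sketched without pywikibot. `preload` and `fake_fetch` below are hypothetical stand-ins for PreloadingGenerator and the bulk API request it issues per batch:

```python
from itertools import islice

def preload(titles, fetch_batch, batch_size=60):
    """Wrap any iterable of titles: pull them in batches, fetch each
    batch with one bulk call, and yield loaded pages one by one."""
    it = iter(titles)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        # One request per batch instead of one page.get() per page.
        yield from fetch_batch(batch)

def fake_fetch(batch):
    """Hypothetical bulk fetcher standing in for a single API request
    that returns pages with their latest-revision metadata attached."""
    return [{"title": t, "last_edit": "2012-02-05T12:00:00Z"} for t in batch]

gen1 = iter(["Alpha", "Beta", "Gamma"])   # e.g. a title generator
for page in preload(gen1, fake_fetch, batch_size=2):
    print(page["title"], page["last_edit"])
```

Because `preload` accepts any iterable, the same wrapper works whether 'gen1' comes from recent changes, a title list, or another chained generator.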
Pywikipedia-l mailing list