[Toolserver-l] Extracting basic revision data
Platonides
platonides at gmail.com
Mon Nov 29 14:09:39 UTC 2010
Михајло Анђелковић wrote:
> Hello,
>
> I would like to ask for permission to run a query that can be
> resource-intensive if not properly scaled:
>
> SELECT page.page_title AS title, rev_user_text AS user,
>        rev_timestamp AS timestamp, rev_len AS len
> FROM revision JOIN page ON page.page_id = rev_page
> WHERE rev_id > 0 AND rev_id < [...] AND rev_deleted = 0;
>
> This is intended to extract basic data about all publicly visible
> revisions from 1 to [...]. The info about each revision would be a
> 4-tuple: title, user name, time, length. I need this data to start
> generating a timeline of editing on srwiki, so the query is intended to
> be run only once for each revision.
>
> If this is generally allowed, my question is how large a chunk of
> data I can take at once, and how long I should wait between two
> requests.
>
> M
Have you considered generating the early timeline from dumps?
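
If the live database is queried instead of a dump, the chunking asked about in the quoted message could be sketched roughly as follows. This is only an illustration under assumptions: the cursor is taken to be a Python DB-API cursor that uses the %s parameter style (as MySQLdb does), the names rev_id_chunks and fetch_all are hypothetical, and the chunk size and pause are placeholders to be agreed with the Toolserver admins, not recommended values. The ranges are half-open, so the first chunk should start at rev_id 1.

```python
import time
from typing import Iterator, Tuple

# Same columns and filter as the query in the quoted message, but bounded
# on both sides so that each request touches only one slice of rev_id.
QUERY = (
    "SELECT page.page_title AS title, rev_user_text AS user, "
    "rev_timestamp AS timestamp, rev_len AS len "
    "FROM revision JOIN page ON page.page_id = rev_page "
    "WHERE rev_id >= %s AND rev_id < %s AND rev_deleted = 0"
)


def rev_id_chunks(start: int, stop: int, size: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (lo, hi) rev_id ranges that together cover [start, stop)."""
    if size <= 0:
        raise ValueError("size must be positive")
    lo = start
    while lo < stop:
        hi = min(lo + size, stop)
        yield lo, hi
        lo = hi


def fetch_all(cursor, start: int, stop: int, size: int, pause_seconds: float):
    """Run QUERY once per chunk, sleeping between chunks.

    `cursor` is assumed to be a DB-API cursor with %s placeholders;
    `size` and `pause_seconds` are hypothetical tuning knobs.
    """
    for lo, hi in rev_id_chunks(start, stop, size):
        cursor.execute(QUERY, (lo, hi))
        yield from cursor.fetchall()
        time.sleep(pause_seconds)  # give the server a break between requests
```

Because every revision falls into exactly one range, each row is read only once, and an interrupted run can be resumed from the last completed lo value.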