[Mediawiki-api] Revisions since certain date / wiki mirror

sl contrib sl.contrib at googlemail.com
Wed Jun 3 21:46:26 UTC 2009


Hi Roan,


> > Would it somehow be possible to build an intermediate solution? E.g. would
> > it be feasible to build a dedicated
> > action=query&prop=allchanges&start=...&end=...
> > that just solved that problem?
> For revisions, possibly. It wouldn't include log events, though.


I've had a go at modifying the code for allpages.

Basically if this is made conditional:
  $this->addWhereFld('page_namespace', $params['namespace']);

then all pages can be searched (irrespective of namespace). Does this have a
massive impact on efficiency? The maximum number of entries returned is
limited anyway, and it shouldn't really matter which namespace they come
from. (Of course some things like apfrom no longer work as expected, but for
my use case it would be OK for them to be disabled.)

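You then introduce new parameters: startid and endid (bounds on the revision
id), and start and end (bounds on the last-touched timestamp), and amend the
query:
  if (isset($params['start'])) {
    // Quote the user-supplied value instead of concatenating it raw.
    $this->addWhere('page_touched >= ' .
      $this->getDB()->addQuotes($params['start']));
  }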
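
To make the conditional filters concrete, here is a minimal standalone
sketch of how the WHERE conditions could be assembled. It is not MediaWiki
code: buildAllpagesConditions() and the toy $quote function are hypothetical
stand-ins, and a real module would use the API base class helpers and the
database layer's own quoting instead.

```php
<?php
// Hypothetical stand-in for the module's query building; not MediaWiki code.
// $quote is whatever escaping function the database layer provides.
function buildAllpagesConditions(array $params, callable $quote): array {
    $conds = [];
    // The namespace filter is applied only when a namespace was requested.
    if (isset($params['namespace'])) {
        $conds[] = 'page_namespace = ' . (int)$params['namespace'];
    }
    // Timestamp bounds on page_touched.
    if (isset($params['start'])) {
        $conds[] = 'page_touched >= ' . $quote($params['start']);
    }
    if (isset($params['end'])) {
        $conds[] = 'page_touched <= ' . $quote($params['end']);
    }
    // Revision-id bounds on page_latest.
    if (isset($params['startid'])) {
        $conds[] = 'page_latest >= ' . (int)$params['startid'];
    }
    if (isset($params['endid'])) {
        $conds[] = 'page_latest <= ' . (int)$params['endid'];
    }
    return $conds;
}

// Toy quoting function, for this demonstration only.
$quote = function (string $v): string {
    return "'" . addslashes($v) . "'";
};

$conds = buildAllpagesConditions(
    ['start' => '20090601000000', 'endid' => 5000],
    $quote
);
// Prints: page_touched >= '20090601000000' AND page_latest <= 5000
echo implode(' AND ', $conds), "\n";
```

Leaving out a parameter simply drops the corresponding condition, which is
the "made conditional" behaviour described above for the namespace filter.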

Finally you need something like:

  $this->addOption('ORDER BY', 'page_touched');
and
  // Continue from the same column the result is ordered by.
  $this->setContinueEnumParameter('start', $row->page_touched);

With those changes (and a few conditionals) 'allpages' can produce a list of
pages that were touched between two dates, or a set of pages that have new
revisions between two revision numbers. Not sure yet whether last touched
will work as well as the revision timestamp, but at least from the revision
number you could easily update an offline set of wiki pages.
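
As a rough illustration of the offline-update idea, the sketch below shows
how a client might remember the highest revision id it has seen and build
the next request from it. The parameter name apstartid and the per-page
'latest' field are assumptions (the proposed startid with the usual 'ap'
prefix, and a revision id returned for each page); neither exists in the
current API.

```php
<?php
// Builds the query string for the next incremental request.
// 'apstartid' is hypothetical: it assumes the proposed startid parameter
// is exposed with the 'ap' prefix used by list=allpages.
function nextUpdateQuery(int $lastSeenRevId, int $limit = 500): string {
    return http_build_query([
        'action' => 'query',
        'list' => 'allpages',
        'apstartid' => $lastSeenRevId + 1,
        'aplimit' => $limit,
        'format' => 'json',
    ]);
}

// Returns the highest revision id in a batch of pages, or the previous
// value if the batch is empty. The 'latest' key is an assumed field name.
function highestRevId(array $pages, int $lastSeenRevId): int {
    foreach ($pages as $page) {
        $lastSeenRevId = max($lastSeenRevId, (int)$page['latest']);
    }
    return $lastSeenRevId;
}

// Example: the mirror was current up to revision 100 and one batch arrived.
$batch = [
    ['title' => 'A', 'latest' => 120],
    ['title' => 'B', 'latest' => 97],
];
$last = highestRevId($batch, 100);
// Prints: action=query&list=allpages&apstartid=121&aplimit=500&format=json
echo nextUpdateQuery($last), "\n";
```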

Do you think this looks good so far? Should I post the code somewhere so
that people can have a look?

Cheers,
Bjoern