Hi all,

First post to the list. I've got a bunch of questions, and I hope this is the right place to ask them.

I'm interested in the idea of wiki 'mirroring': updating a second wiki ('B') periodically with content from wiki A. (There's of course some discussion of this on the web, so I'm aware that there's been quite a bit of thinking on this already, but I couldn't quite find the solution I was looking for.)

A first stab at mirroring would be to do a Special:Export on the whole of A, and then do a Special:Import on B. But this becomes impractical for larger wikis; ideally, I only want to transfer what has changed since the last update.

The best way to do this would probably be something like list=recentchanges (going back to the date of the last transfer). Unfortunately this doesn't work in general, because recentchanges entries are periodically purged, so the list cannot be used between arbitrary dates. The log doesn't seem to record edits (is this correct?), so it can't be used to get a list of changes between two arbitrary dates either.

So, question 1: Is it possible to get a list of all changes (including edits) between two dates (in a single query)? 
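For reference, this is roughly the kind of recentchanges query I have in mind, as long as the dates fall inside the retained window. It is only a sketch: the endpoint URL and the helper name are made up for illustration, and I'm assuming rcstart, rcend, rcdir, rcprop and rclimit behave the way I understand them from the API help.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for wiki A; replace with the real api.php URL.
API_URL = "https://wiki-a.example.org/w/api.php"


def recentchanges_url(start, end, limit=500):
    """Build a list=recentchanges query for changes between two timestamps."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcdir": "newer",  # oldest first, so rcstart is the earlier date
        "rcstart": start,
        "rcend": end,
        "rcprop": "title|ids|timestamp",
        "rclimit": limit,  # assumed per-request cap for ordinary users
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


print(recentchanges_url("20090521000000", "20090528000000"))
```

A long date range would presumably still need several requests, following whatever continuation value the API returns.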

If one wanted the complete version history, then another way to do this would be to get all revisions since the last transfer made, i.e. something like:
action=query&prop=revisions&revids=1450|1451|1452|...&rvprop=content
(then transform the XML to the Special:Import format, and upload). Together with a query of the log, this would give you all changes.
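Since a long revids list won't fit into one request, I imagine the ids would have to be sent in batches. Here is a minimal sketch of that, assuming a cap of about 50 revids per request for ordinary users; the endpoint URL and the function name are placeholders of mine.

```python
from urllib.parse import urlencode

# Hypothetical endpoint for wiki A; replace with the real api.php URL.
API_URL = "https://wiki-a.example.org/w/api.php"


def revision_urls(revids, batch_size=50):
    """Split revision ids into batches and build one query URL per batch."""
    urls = []
    for i in range(0, len(revids), batch_size):
        batch = revids[i:i + batch_size]
        params = {
            "action": "query",
            "prop": "revisions",
            "revids": "|".join(str(r) for r in batch),
            "rvprop": "content",
            "format": "xml",
        }
        urls.append(API_URL + "?" + urlencode(params))
    return urls


# Revisions 1450..1569 (120 ids) end up in three requests.
for url in revision_urls(list(range(1450, 1570))):
    print(url[:100])
```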

But suppose the wiki is very active or you don't have much bandwidth or you simply don't want the whole version history, but just the latest versions (since the last transfer). The only way I can see is to do something like this:
  1. Fetch the list of namespaces
  2. Get the list of revisions in each namespace (action=query&prop=revisions&generator=allpages for each namespace)
  3. See what needs updating, and then fetch all the changed pages.
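To make step 3 concrete, here is a small sketch of the comparison I have in mind. It assumes the mirror keeps its own record of which revision id on A each page was last copied from (B's own revision ids will differ after an import); the function name and the dictionary layout are just for illustration.

```python
def plan_update(latest_on_a, recorded_on_b):
    """Compare page -> latest revid on A with the revids B last copied.

    Returns (changed, removed): titles to fetch again from A, and titles
    that B still has on record but that no longer exist on A.
    """
    changed = sorted(
        title
        for title, revid in latest_on_a.items()
        if recorded_on_b.get(title) != revid
    )
    removed = sorted(
        title for title in recorded_on_b if title not in latest_on_a
    )
    return changed, removed


latest_on_a = {"Main Page": 1452, "Help:Contents": 1400, "New page": 1460}
recorded_on_b = {"Main Page": 1450, "Help:Contents": 1400, "Old page": 900}
changed, removed = plan_update(latest_on_a, recorded_on_b)
print(changed, removed)
```

Only the titles in the first list would then need to be fetched with their content; page moves and deletions would probably still have to come from the log.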

Question 2: Can you see a better way of doing this? Also, why won't generator=allpages work across namespaces? (I guess there may be a reason why that isn't easy to do.)

One way would be to try something like:

action=query&prop=revisions&generator=allpages&rvstart=20090521000000

but this doesn't work.

So, my question 3: Do you know why this doesn't work? I assume there isn't an efficient MySQL query to accomplish this, or are there other reasons?
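The fallback I can think of is to request only rvprop=timestamp for every page and filter on the client. A rough sketch follows, assuming the JSON result has the usual query -> pages layout with one 'revisions' entry per page and ISO-style timestamps; the function name is made up.

```python
from datetime import datetime


def changed_since(pages, cutoff):
    """Return titles whose latest revision is at or after the cutoff.

    pages:  the 'pages' dict from a prop=revisions&rvprop=timestamp result
    cutoff: a timestamp in the API's compact form, e.g. '20090521000000'
    """
    cutoff_dt = datetime.strptime(cutoff, "%Y%m%d%H%M%S")
    titles = []
    for page in pages.values():
        stamp = page["revisions"][0]["timestamp"]
        if datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ") >= cutoff_dt:
            titles.append(page["title"])
    return sorted(titles)


pages = {
    "1": {"title": "Foo", "revisions": [{"timestamp": "2009-05-22T10:00:00Z"}]},
    "2": {"title": "Bar", "revisions": [{"timestamp": "2009-05-01T00:00:00Z"}]},
}
print(changed_since(pages, "20090521000000"))
```

This still means listing every page on A at each transfer, which is exactly the cost I was hoping to avoid.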

Finally, I guess I am wondering whether there are people actively interested in discussing issues around wiki mirroring/synchronisation more. If so, what's the best mailing list for this?

Sorry, the post got a bit longer than I expected - thanks for considering this!

All the best,
Bjoern