Marco Schuster wrote:
Fetch them from the toolserver (there's a tool by duesentrieb for that).
It will catch almost all of them from the toolserver cluster, and make a
request to wikipedia only if needed.
I highly doubt this is a "legal" use of the toolserver, and I'd guess
that fetching 800k revisions would be a huge resource load.
Thanks, Marco
PS: CC-ing toolserver list.
It's a legal use, the only problem is that the tool i wrote for it is quite
slow. You shouldn't hit it at full speed. So it might actually be better to
query the main server cluster, they can distribute the load more nicely.
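
To illustrate what "not at full speed" could look like on the client side, here is a minimal Python sketch of a throttled fetch loop. The names fetch_throttled, fetch and min_interval are hypothetical and not part of WikiProxy or any toolserver API; the caller supplies whatever function actually retrieves a revision, and a sensible interval would have to be agreed with the server admins.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def fetch_throttled(
    rev_ids: Iterable[int],
    fetch: Callable[[int], str],          # hypothetical: returns the text of one revision
    min_interval: float = 1.0,            # assumed pause between request starts, in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[int, str]]:
    """Yield (rev_id, text) pairs, starting at most one fetch per min_interval."""
    last_start = None
    for rev_id in rev_ids:
        if last_start is not None:
            # Wait out whatever remains of the interval since the last request began.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        yield rev_id, fetch(rev_id)
```

The sleep and clock parameters exist only so the pacing can be checked without real waiting; in normal use the defaults are enough.
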
One day i'll rewrite WikiProxy and everything will be better :)
But by then, i do hope we have revision flags in the dumps, because that would
be The Right Thing to use.
-- daniel