-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Wed, Jan 28, 2009 at 1:13 AM, Daniel Kinzler wrote:
Marco Schuster wrote:
Fetch them from the toolserver (there's a tool by duesentrieb for that).
It will catch almost all of them from the toolserver cluster, and make a
request to Wikipedia only if needed.
I highly doubt this is a "legal" use of the toolserver, and I would guess
that fetching 800k revisions would be a huge resource load.
Thanks, Marco
PS: CC-ing toolserver list.
It's a legal use; the only problem is that the tool I wrote for it is quite
slow. You shouldn't hit it at full speed. So it might actually be better to
query the main server cluster, since they can distribute the load more nicely.
What is the best speed, actually? 2 requests per second? Or can I go up to 4?
One day I'll rewrite WikiProxy and everything will be better :)
:)
But by then, I do hope we have revision flags in the dumps, because that
would be The Right Thing to use.
Still, using the dumps would require me to get the full history dump,
because I only want flagged revisions and not current revisions without
the flag.
Marco
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Use GnuPG with Firefox : http://getfiregpg.org (Version: 0.7.2)
iD8DBQFJgAIpW6S2GapJUuQRAuY/AJ47eppKPbBqjz0l4HllCPolMWz9KACfRurR
Lod/wkd4ZM0ee+cPTfaO7yg=
=zB26
-----END PGP SIGNATURE-----