Steve Sanbeg wrote:
On Sat, 27 Jan 2007 19:24:13 -0500, Simetrical wrote:
On 1/27/07, Luiz Augusto lugusto@gmail.com wrote:
Tiny question: isn't it possible to block all non-Wikimedia sites from live mirroring, instead of blocking them one by one?
I suppose we could try to detect unusually high request rates from a single IP, but then you'd probably catch some large-scale proxies too. So not really: a live mirror just requests a page like anyone else, and on the surface that request is impossible to distinguish reliably from one made by an ordinary user.
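[A minimal sketch of the per-IP rate-tracking idea, purely for illustration; the sliding-window length and threshold below are assumptions, not anything Wikimedia actually runs:]

import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60            # assumption: look at the last minute of traffic
MAX_REQUESTS_PER_WINDOW = 600  # assumption: ~10 req/s sustained counts as "massive"

requests_by_ip = defaultdict(deque)  # ip -> timestamps of recent requests

def record_request(ip):
    """Record one request and return True if this IP exceeds the rate limit."""
    now = time.time()
    window = requests_by_ip[ip]
    window.append(now)
    # Drop timestamps that have fallen out of the sliding window.
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) > MAX_REQUESTS_PER_WINDOW

[As noted above, a busy shared proxy could trip the same threshold, which is exactly why this signal is not reliable on its own.]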
There is one difference, in that proxy traffic would be generated by humans who use the proxies, while mirror traffic would presumably come from the web crawlers that are indexing the mirrors, and should thus be much more predictable.
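[To illustrate the "more predictable" point, a hedged sketch: crawler-driven mirror traffic tends to arrive at very regular intervals, while human traffic through a proxy is bursty. The cutoff and minimum sample size here are made-up values for illustration only:]

import statistics

def looks_like_crawler(timestamps, min_samples=50):
    """Guess whether a request stream is machine-generated, based on how
    regular the gaps between successive requests are."""
    if len(timestamps) < min_samples:
        return False
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        return True  # many requests in the same instant: certainly automated
    cv = statistics.stdev(gaps) / mean_gap  # coefficient of variation
    return cv < 0.5  # assumption: very regular spacing suggests a crawler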
Some remote loaders use a rotating open proxy list, for example tramadol.tfres.net (warning: the site contains some nasty javascript). This makes them slow, but apparently still fast enough for the search engines to index them. What can we do about that, besides DoSing the website?
-- Tim Starling