On Thu, May 20, 2004 at 11:49:26PM -0700, Brion Vibber wrote:
On Tomk32's request I've blocked www.themensuche.de (217.172.181.98) from access to our servers. It was "hosting" German Wikipedia pages by sending every request on to de.wikipedia.org and slurping the content, but using it solely for the purpose of bringing in search hits. The text is never displayed; instead every hit uses a JavaScript redirect through an intermediary or two to amazon.de.
I wrote to three other sites that are obviously running proxies without a cache and that use more than 500 MB of traffic per month. There may be a few others, surely not as evil as themensuche.de, but they still cost a lot of traffic that could be avoided by using the database dumps instead. Is anybody up for hunting them down and contacting them? We would need more stats, because the webalizer stats by kbyte only show the top 10. Also, many hits/files but only a few visits is a good indicator.
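For anyone who wants to go hunting, here is a rough sketch of the "many hits, few visits" check. It assumes you already have per-host totals (hits, visits, kbytes) for the month from whatever stats tool you use. The HostStats record, the likely_proxies function and the hits-per-visit cutoff are made up for illustration. Only the 500 MB figure comes from the numbers above.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    """Monthly totals for one remote host (hypothetical record layout)."""
    host: str
    hits: int
    visits: int
    kbytes: int


def likely_proxies(stats, min_kbytes=500 * 1024, min_hits_per_visit=1000):
    """Return hosts with heavy traffic and many hits per visit.

    min_kbytes defaults to 500 MB per month; min_hits_per_visit is an
    arbitrary assumed cutoff that would need tuning against real data.
    """
    suspects = []
    for s in stats:
        # Guard against hosts with zero recorded visits.
        ratio = s.hits / max(s.visits, 1)
        if s.kbytes >= min_kbytes and ratio >= min_hits_per_visit:
            suspects.append((s.host, s.kbytes, ratio))
    # Biggest traffic consumers first.
    return sorted(suspects, key=lambda t: t[1], reverse=True)
```

A host that passes both checks is only a candidate. Somebody still has to look at the site by hand before writing to its owner.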
which if you go there, sends you on to the amazon.de search page. This script is executed automatically before any content is shown (and the markup is invalid and nothing at all shows in some browsers even if JS is disabled). There is no way to read Wikipedia content at that site short of turning off JavaScript and using 'view source'.
There are Firefox plugins for turning off CSS too.