Hello Eric,
The trick is to run MediaWiki with short URLs, so that pages live at http://domain.tld/wiki/Page_Name while the history and diff views stay at http://domain.tld/w/index.php?title=Page_Name&diff=next&oldid=4879.
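For reference, here is a sketch of the usual short-URL setup (this assumes MediaWiki is installed under /w/ on Apache with mod_rewrite enabled -- adjust the paths to match your install):

```
# LocalSettings.php -- tell MediaWiki where scripts and articles live
$wgScriptPath  = "/w";         # index.php stays under /w/
$wgArticlePath = "/wiki/$1";   # pretty article URLs under /wiki/

# Apache config (or .htaccess) -- rewrite /wiki/Page_Name to index.php
RewriteEngine On
RewriteRule ^/?wiki/(.*)$ /w/index.php?title=$1 [L,QSA]
```

With this in place, article links that Google follows all start with /wiki/, while diffs, histories, and edit links all start with /w/index.php, so they are easy to separate in robots.txt.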
Then, in robots.txt, all you need to do is disallow /w/index.php; Google should then crawl and cache all of your content but skip the history and edit pages.
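Concretely, the robots.txt would look something like this (assuming the short-URL layout above, with index.php under /w/):

```
# robots.txt -- let crawlers index articles under /wiki/,
# but keep them out of diffs, histories, and edit pages
User-agent: *
Disallow: /w/index.php
```

Note that well-behaved crawlers like Googlebot honor this, but robots.txt is advisory only; it does not actually block requests at the server.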
I hope that this helps,
Kasimir Gabert
On 11/23/06, Eric K ek79501@yahoo.com wrote:
Google is crawling all of the diff links and the page histories, which it does not need to do; this puts unnecessary load on the server. Is there any way to stop it? I'm not sure what to specify in the robots.txt file. Thanks, Eric
Server Logs:
66.249.72.163 www.my-site.com - - [30/Oct/2006:18:26:54 -0600] "GET /index.php?title=MyPage&diff=next&oldid=4879 HTTP/1.1" 200 7673 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.72.235 www.my-site.com - - [23/Nov/2006:05:13:50 -0600] "GET /index.php?title=Special:Recentchanges&limit=50&days=14&hideminor=1&hideliu=1&from=20061121121419 HTTP/1.1" 200 2795 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
_______________________________________________
MediaWiki-l mailing list
MediaWiki-l@Wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/mediawiki-l