If you are using the Wikipedia URL scheme (/wiki/Page_title and
/w/index.php?title=Page_title&foo=bar), you can just ban bots from /w/ in
robots.txt.
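A minimal robots.txt sketch along those lines, assuming the default script path of /w/ (adjust the paths to your wiki's actual $wgScriptPath; this is an illustration, not a drop-in file):

```
# Block crawlers from all index.php URLs, including
# action=edit and &redlink=1 links, while leaving
# the pretty /wiki/Page_title URLs crawlable.
User-agent: *
Disallow: /w/
```

Note that this only works if article views and index.php really live under separate path prefixes, as in the standard /wiki/ + /w/ setup.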
On Fri, Mar 13, 2015 at 7:26 AM, Al <alj62888(a)yahoo.com> wrote:
Hi,
In my Google Webmaster Tools account, there are a lot of crawl 404 errors
for non-existent pages. It appears that it will request a page that does
not exist (with &redlink=1), get a 200 status, of course, and the "create"
page, and then somehow they derive a link to the same url but without the
&redlink or &edit parameters (probably from the menu links on the page),
which they then try to crawl and receive 404.
Does anyone know how to deal with this so that the Google crawler stops
doing this?
It looks like Google first discovers the redlinks mostly from a previous
spam page that was subsequently deleted. But once Google has seen a URL,
it remembers it for quite some time.
Thanks,
Al
_______________________________________________
MediaWiki-l mailing list
To unsubscribe, go to:
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
--
Best regards,
Max Semenik ([[User:MaxSem]])