Hi, can anyone help me with my robot.txt? The contents of the page read as follows:
User-agent: *
Disallow: /Help
Disallow: /MediaWiki
Disallow: /Template
Disallow: /skins/
But it is not blocking pages like:
- http://www.dummipedia.org/Special:Protectedpages
- http://dummipedia.org/Special:Allpages
and external pages like:
- http://www.stumbleupon.com/
- http://www.searchtheweb.com/
As you can see, my robot.txt did not block these pages. Also, should I block the print version to prevent what Google calls "duplicate content"? If so, how?
A response will be very much appreciated.
PM Poon
ekompute wrote:
Hi, can anyone help me with my robot.txt?
The name is 'robots.txt'
The contents of the page read as follows:
User-agent: *
Disallow: /Help
Disallow: /MediaWiki
Disallow: /Template
Disallow: /skins/
But it is not blocking pages like:
Special pages try to autoprotect themselves. See how they have '<meta name="robots" content="noindex,nofollow" />'. A crawler traversing Special:Allpages would likely produce too much load anyway.
and external pages like:
$wgNoFollowLinks = false;
http://www.mediawiki.org/wiki/Manual:$wgNoFollowLinks
http://www.mediawiki.org/wiki/Manual:$wgNoFollowDomainExceptions
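Roughly, in LocalSettings.php that looks something like this (a sketch only; 'dummipedia.org' is just an example domain taken from the URLs above):

// LocalSettings.php, a sketch of the two options mentioned above
$wgNoFollowLinks = false;    // remove rel="nofollow" from external links entirely
// or, leaving $wgNoFollowLinks at its default of true, exempt particular domains:
// $wgNoFollowDomainExceptions = array( 'dummipedia.org' );    // example domain only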
As you can see, my robot.txt did not block these pages. Also, should I block the print version to prevent what Google calls "duplicate content"? If so, how?
Disallow /index.php in robots.txt (printable, edit, ...).
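For example, something like this should do it, assuming your articles are not served through /index.php themselves (yours appear not to be, judging by the URLs above):

User-agent: *
Disallow: /index.php
Disallow: /Help
Disallow: /MediaWiki
Disallow: /Template
Disallow: /skins/

The printable and edit views all go through index.php (e.g. ?printable=yes, ?action=edit), so blocking it keeps Google from indexing those duplicates while the normal page URLs stay crawlable.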
A response will be very much appreciated.
PM Poon
Thank you very much, Platonides. Your reply is clear and easy to follow.
PM Poon