Kasimir Gabert wrote:
Hello,
Excluding index.php using robots.txt should work if an article link
on your page is http://mysite.tld/My_Page. The robots would then not
crawl http://mysite.tld/index.php?title=My_Page&action=edit, etc.
Kasimir, I believe you have written above a beautiful solution for my
need. The article links on my site (http://wikigogy.org) are indeed
done without reference to index.php, but the 'edit', 'history' and
other action pages that I wish to exclude are done with that
reference. I had not realized this simple, elegant solution. I will
try it. It should look like this in my-wiki/robots.txt, right?
User-agent: *
Disallow: index.php*
Is the asterisk on index.php* correct and needed?
I think I should NOT have the asterisk in the URL prefix. I think the
asterisk is only for the User-agent line, meaning all robots. I think
it should look like this in my-site/robots.txt:
User-agent: *
Disallow: index.php
and it will disallow robots from everything that is, or starts
with, "index.php", which all the action page URLs do start with on my
site but not article names because I am using pretty URLs.
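
One way to check the prefix idea before deploying it is to feed the rules to
Python's standard urllib.robotparser. This is only a rough sketch: the
is_allowed helper is a made-up name, mysite.tld is the placeholder host from
Kasimir's message, and the leading slash in "/index.php" is my assumption,
following the usual convention that robots.txt paths start with "/". Real
crawlers may interpret rules differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: str, url: str, agent: str = "*") -> bool:
    """Hypothetical helper: parse robots.txt text and ask if url may be fetched."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Assumed variant with a leading slash (not what was written in the post).
WITH_SLASH = "User-agent: *\nDisallow: /index.php\n"
# The variant as written in the post, without a leading slash.
NO_SLASH = "User-agent: *\nDisallow: index.php\n"

ARTICLE = "http://mysite.tld/My_Page"
EDIT = "http://mysite.tld/index.php?title=My_Page&action=edit"

if __name__ == "__main__":
    for name, rules in (("with slash", WITH_SLASH), ("no slash", NO_SLASH)):
        print(name, "article allowed:", is_allowed(rules, ARTICLE))
        print(name, "edit allowed:   ", is_allowed(rules, EDIT))
```

In this particular parser, matching is a plain "starts with" test against the
URL path, so "/index.php" blocks the action URLs and leaves the pretty article
URLs alone, while the rule without the leading slash matches nothing.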
I read up on robots.txt here:
*