On Thu, 2003-04-10 at 19:58, Nick Reinking wrote:
> I think the misunderstanding is not on Google's part. As far as I can tell, Google isn't indexing that page.
> A quick search on Google for "wibrator wikipedia" shows a subsection for the edit link. Note that it doesn't have any 'Cached' link. This means that Google saw a link to the edit page in a page that could be indexed.
I didn't say it was being cached, that its content could be word-searched, or that it had been spidered through to other pages. I said it was *indexed*. Now, maybe Google uses some word other than "indexed" to mean "contained in a database of links which are shown to users when they search for words contained in the link". I'll buy that. Maybe the word they use is "florble". In that case, the page is being florbled despite our best efforts to stop it from being florbled.
Is there any way we can tell Google not to florble pages that are explicitly excluded by our robots.txt file, so that people will stop complaining to *us* about Google's overzealous florbling?
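For reference, here is a rough sketch of what a robots.txt exclusion actually answers: whether a crawler may *fetch* a URL. It says nothing about whether a link to that URL, found on some other page, may be listed. The rules, URLs, and the `may_fetch` helper below are made up for illustration; our real robots.txt is not reproduced here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the site's actual robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /w/
"""

# Hypothetical URLs: one edit link under the excluded path, one article page.
EDIT_URL = "http://example.org/w/wiki.phtml?title=Example&action=edit"
ARTICLE_URL = "http://example.org/wiki/Example"


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The edit URL is excluded from fetching; the article page is not.
    print("edit page fetchable:", may_fetch("Googlebot", EDIT_URL))
    print("article fetchable:", may_fetch("Googlebot", ARTICLE_URL))
```

So a crawler that honors the exclusion never reads the edit page, but it can still have seen the link on the article page, which is allowed.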
Hypothetically we could jimmy the page to not produce edit links if the user agent is Googlebot, but that would be very annoying for several reasons:
1) The Google-cached page would be missing those links.
2) This would screw with page caching. Google hits a lot of pages, and we'd have to either not cache any of its hits or be very careful in coding around it.
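To make the caching problem concrete, here is a minimal sketch of the "careful coding" option, assuming a simple in-memory page cache. Everything in it (`is_googlebot`, `render_page`, `PageCache`, the URL layout) is hypothetical and is not the actual wiki code. The point is only that once the output depends on the user agent, the cache has to store two variants of every page, keyed on whether the request came from the crawler.

```python
def is_googlebot(user_agent: str) -> bool:
    """Crude user-agent sniffing; an assumption, not a robust check."""
    return "googlebot" in user_agent.lower()


def render_page(title: str, user_agent: str) -> str:
    """Render a page, leaving out the edit link for the crawler."""
    html = f"<h1>{title}</h1>"
    if not is_googlebot(user_agent):
        html += f'<a href="/w/wiki.phtml?title={title}&action=edit">Edit this page</a>'
    return html


class PageCache:
    """Hypothetical page cache keyed on (title, crawler flag)."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, bool], str] = {}

    def get(self, title: str, user_agent: str) -> str:
        # Keying on title alone would serve whichever variant was rendered
        # first to everyone, crawler or not.
        key = (title, is_googlebot(user_agent))
        if key not in self._store:
            self._store[key] = render_page(title, user_agent)
        return self._store[key]


if __name__ == "__main__":
    cache = PageCache()
    print(cache.get("Example", "Googlebot/2.1"))
    print(cache.get("Example", "Mozilla/5.0"))
```

That doubles the cache entries for every page the crawler touches, which is the cost I'd rather not pay.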
-- brion vibber (brion @ pobox.com)