Re: [Wikitech-l] URL change? (was test.wikipedia.com indexed...)

11 Apr 2003

On Thu, Apr 10, 2003 at 08:46:21PM -0700, Brion Vibber wrote:
...
On Thu, 2003-04-10 at 19:58, Nick Reinking wrote:
...
I think the misunderstanding is not on Google's part.  As far as I can
tell, Google isn't indexing that page.
A quick search on google for "wibrator wikipedia" shows a subsection for
the edit link.  Note that it doesn't have any 'Cached' link.  This means
that google saw a link to the edit page in a page that could be indexed.
I didn't say it was being cached, that its content could be
word-searched, or that it had been spidered through to other pages. I
said it was *indexed*. Now, maybe Google uses some word other than
"indexed" to mean "contained in a database of links which are shown to
users when they search for words contained in the link". I'll buy that.
Maybe the word they use is "florble". In that case, the page is being
florbled despite our best efforts to stop it from being florbled.
Is there any way we can tell google not to florble pages that are
explicitly excluded by our robots.txt file so that people will stop
complaining to *us* about google's overzealous florbling?
Hypothetically we could jimmy the page to not produce edit links if the
user agent is googlebot, but that would be very annoying for several
reasons:

* The google-cached page would be missing those links.
* This would screw with page caching. Google hits a lot of pages, and
  we'd have to either not cache any of its hits or be very careful in
  coding around it.
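
To make the caching concern concrete, here is a rough sketch of what user-agent-dependent rendering might look like. Everything in it is hypothetical: the function names, the cache-key scheme, and the edit URL layout are assumptions for illustration, not the wiki's actual code.

```python
def is_googlebot(user_agent: str) -> bool:
    # Assumption: a simple substring match is enough to spot the crawler.
    return "googlebot" in user_agent.lower()


def cache_key(title: str, user_agent: str) -> str:
    # The rendered page now differs by visitor type, so the cache has to
    # keep two variants of every page instead of one.
    variant = "bot" if is_googlebot(user_agent) else "user"
    return f"{title}:{variant}"


def render_page(title: str, user_agent: str) -> str:
    body = f"<h1>{title}</h1>"
    if not is_googlebot(user_agent):
        # Hypothetical edit URL; only non-crawler visitors get the link.
        body += f'<a href="/w/wiki.phtml?title={title}&action=edit">Edit this page</a>'
    return body


bot_html = render_page("Wibrator", "Googlebot/2.1")
user_html = render_page("Wibrator", "Mozilla/5.0")
```

Both drawbacks show up here: the copy served to the crawler (and therefore the google-cached copy) has no edit link, and every cached page needs a separate entry per variant.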
-- brion vibber (brion @ pobox.com)
I've always understood 'indexed' to mean 'downloaded the entire page and
added its contents to a searchable database.'  As far as I know,
robots.txt just tells google (and everybody else) not to download the
page; it doesn't say they can't link to it.  Since Masturbacja says to
follow links, but robots.txt says not to index edit links, Google
does the sensible thing: creates the link in its database, but doesn't
index the content.  Go figure; the Google engineers would probably
cooperate with you if you asked them nicely.  :)
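
A small sketch of that distinction, using Python's standard urllib.robotparser. The rules and URLs below are made up for illustration (the actual robots.txt is not quoted in this thread); the point is only that robots.txt answers "may I download this URL?", not "may I list this URL?".

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep all crawlers out of the script path used for edits.
rules = """\
User-agent: *
Disallow: /w/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Hypothetical URLs on a placeholder host.
edit_url = "http://example.org/w/wiki.phtml?title=Wibrator&action=edit"
article_url = "http://example.org/wiki/Wibrator"

edit_allowed = parser.can_fetch("Googlebot", edit_url)        # download blocked
article_allowed = parser.can_fetch("Googlebot", article_url)  # download permitted

print(edit_allowed, article_allowed)
```

A crawler that obeys these rules never downloads the edit page, but it still sees the edit URL as a link inside the article it is allowed to fetch, and nothing in robots.txt stops it from recording that link.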
-- 
Nick Reinking -- eschewing obfuscation since 1981 -- Minneapolis, MN
