[Foundation-l] Request to allow Google to search list archives again

Michael Bimmler mbimmler at gmail.com
Sun Apr 27 17:30:19 UTC 2008


[courtesy copy to foundation-l, though I suggest that discussion, if any, be
centralised on wikitech-l]

Hi all,
the search index for the mailinglist archives was last rebuilt in January.
Now, after having made quite a few queries about this here and at other
places, I learnt (and obviously had to accept) that rebuilding the search
index is quite a resource-intensive process, which resulted in crashes.

To put it bluntly, I dare suggest from a non-technical POV that the "htdig"
(that's the name, isn't it?) experiment has failed. If we can only update
our search index every 6 months or so, it is pointless to have it.

Instead, I suggest that http://lists.wikimedia.org/robots.txt be modified so as
to allow Google (and other search engines) to crawl /pipermail/ again. I do
not really see the privacy issues with this: Nabble, Gmane etc. are
Google-searchable as well, and I really don't see the point in barring Google
from our own archive.
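
As a rough illustration of what such a change would mean for a crawler, the
sketch below checks two sets of robots.txt rules against an archive URL using
Python's standard urllib.robotparser module. Both rule sets are assumptions
made up for this example (the actual contents of the live robots.txt are not
quoted here), and the helper name can_crawl is likewise hypothetical.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# ASSUMPTION: a rule set that bars every crawler from the archives.
BLOCKING = """\
User-agent: *
Disallow: /pipermail/
"""

# ASSUMPTION: a rule set after the proposed change; an empty Disallow
# value means nothing is disallowed.
ALLOWING = """\
User-agent: *
Disallow:
"""

ARCHIVE_URL = "http://lists.wikimedia.org/pipermail/foundation-l/"

if __name__ == "__main__":
    for label, rules in (("blocking", BLOCKING), ("allowing", ALLOWING)):
        allowed = can_crawl(rules, "Googlebot", ARCHIVE_URL)
        print(f"{label}: Googlebot may crawl archive = {allowed}")
```

Checking the rules offline like this is only a sanity test of the syntax; how
a particular search engine actually treats the file is up to that engine.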

To be honest, I do not even remember why we decided to bar
Google from http://lists.wikimedia.org/pipermail.
Was it due to privacy concerns? If so, which ones, and why is
lists.wikimedia.org as an archive different from Nabble/Gmane?

Thanks,
Michael


-- 
Michael Bimmler
mbimmler at gmail.com

