On Sun, Mar 10, 2013 at 8:53 PM, Platonides <Platonides(a)gmail.com> wrote:
I'm not convinced about [[en:MediaWiki_talk:*]] and
[[en:Template_talk:*]], they can bring quite a bit of noise (similarly
for [[en:Wikipedia:Village_pump_(technical)]]). I see how interesting
discussions could be happening there, though.
The tabs in the search results page (sorry I didn't mention them in the
previous email) can be used to filter results to more relevant content, if
desired. I think that might help cope with the noise.
Besides feedback on whether the engine works as you'd expect, I would like
to start some discussion about the ability for Google's bots to crawl some
of the resources that are currently included in the URL filters, but return
no results. For example, the IRC logs at bots.wmflabs.org/~wm-bot/logs/.
Some workarounds are used (e.g. using github for code search since gitweb
isn't crawlable) but that isn't possible for all resources. What can we do
to improve the situation?
Do we really want Google to index them?
Why log them publicly if we don't make them searchable? Either we're
committed to being open or we're not... having a public but hard-to-use
archive seems somewhat contradictory to me.
--Waldir