[WikiEN-l] No-indexing of project-space pages

Newyorkbrad (Wikipedia) newyorkbrad at gmail.com
Wed Jul 23 15:21:03 UTC 2008


On 7/23/08, Stephen Bain <stephen.bain at gmail.com> wrote:
>
> On Wed, Jul 23, 2008 at 10:47 AM, Newyorkbrad (Wikipedia)
> <newyorkbrad at gmail.com> wrote:
> > A couple of months ago, I raised on this list the issue of "no-indexing"
> > Wikipedia pages outside the mainspace, principally including project-space
> > pages such as XfDs, AN/ANI, RfA's, RfAr's, and the like, but possibly
> > including userspace as well.  By no-indexing, I refer to coding these pages
> > such that they will not be picked up by Google or other search engines.
>
> Note that much of this is already done, see our robots file:
>
> http://en.wikipedia.org/robots.txt
>
> Currently all AFD, RFA, RFC and RFAR subpages (but not the main AFD
> page, the main RFA page etc) are blocked from indexing. Of your
> examples, the admin noticeboard and userspace are probably the main
> pages that are still indexed and that we might not want indexed.
>
> Note that the robots file can easily be updated by a request on
> bugzilla [1] if there is consensus for it.
>
> > - That Wikipedia currently lacks a top-quality internal search capability,
> > and therefore we need to be able to use external search engines such as
> > Google to perform administrator functions and the like.  There is some merit
>
> On this point, there's been great improvement in MediaWiki's search
> capabilities this year with the MWSearch backend coming online.
>
> ----
> [1] Like this request, for example:
> https://bugzilla.wikimedia.org/show_bug.cgi?id=10288
>
> --
> Stephen Bain
> stephen.bain at gmail.com



Thank you for this update.  I think there may have been progress that I have
missed in the past couple of months.  When I posted on this topic a few
months ago, either some of these types of pages were not yet no-indexed, or
no one mentioned the fact, or if they did I overlooked it.

Other pages that should be excluded from indexing (if they aren't already)
include SSP, RfCU, the old PAIN archives, WQA, and I'm sure people can put
together a list of a few more.
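
For anyone who wants to check whether a given page type is already excluded,
here is a minimal sketch using Python's standard urllib.robotparser.  The
robots.txt fragment and the helper names (build_parser, is_crawlable) are
illustrative assumptions, not the contents of the live file; substitute the
actual rules from http://en.wikipedia.org/robots.txt before relying on it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt fragment for illustration only; the real file
# may list different paths or use different user-agent sections.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Wikipedia:Articles_for_deletion/
Disallow: /wiki/Wikipedia:Requests_for_adminship/
"""


def build_parser(robots_txt):
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser, path, agent="*"):
    """Return True if the rules allow `agent` to fetch `path`."""
    return parser.can_fetch(agent, "http://en.wikipedia.org" + path)


parser = build_parser(ROBOTS_TXT)
for path in (
    "/wiki/Wikipedia:Articles_for_deletion/Example",  # a subpage
    "/wiki/Wikipedia:Articles_for_deletion",          # the main page
    "/wiki/Wikipedia:Wikiquette_alerts",              # not listed above
):
    print(path, "crawlable" if is_crawlable(parser, path) else "blocked")
```

Because each Disallow path in this sketch ends in a slash, only subpages match
the rule, which is consistent with the main AFD and RFA pages remaining
indexed while their subpages are not.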

As for userspace, I think the optimal solution would be to allow the
individual user to opt in or out of indexing, if that is doable without too
much fuss.  (And indefblocked or banned users would automatically be
no-indexed, to give those with identifiable usernames one fewer grievance to
pursue after they have left us.)  Query whether "in" or "out" would be the
better default.
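
To make the userspace proposal concrete, here is a rough sketch of the
decision rule in Python.  Everything in it (the UserIndexingState record, its
field names, and the two functions) is a hypothetical illustration of the idea
above, not existing MediaWiki code; the only real-world piece is the standard
"robots" meta tag values that search engines honor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserIndexingState:
    """Hypothetical per-user record; field names are assumptions."""
    indef_blocked_or_banned: bool = False
    # True = opted in, False = opted out, None = no choice recorded.
    preference: Optional[bool] = None


def userspace_indexed(user, default_indexed):
    """Decide whether a user's pages should be visible to search engines."""
    if user.indef_blocked_or_banned:
        return False                  # always no-indexed, regardless of choice
    if user.preference is not None:
        return user.preference        # an explicit opt-in or opt-out wins
    return default_indexed            # otherwise fall back to the site default


def robots_meta(indexed):
    """Value for a <meta name="robots"> tag on the rendered page."""
    return "index,follow" if indexed else "noindex,nofollow"


# A banned user who had opted in is still no-indexed.
banned = UserIndexingState(indef_blocked_or_banned=True, preference=True)
print(robots_meta(userspace_indexed(banned, default_indexed=True)))
```

The open question of the default is isolated in the default_indexed argument,
so switching between "in" and "out" would not touch the rest of the rule.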

Newyorkbrad

