[Foundation-l] Hiding namespaces from search engines

Brian McNeil brian.mcneil at wikinewsie.org
Wed Mar 19 21:21:04 UTC 2008


If the search back-end knows which results match entries in robots.txt, it
can hide those results, offer an option to repeat the search with hidden
results included, and, for the paranoid, explain that these are pages which
search engines are requested not to index. I mean this should happen even
when results are returned, not just as part of the set of checkboxes you get
when there are no results.

Where I see this applying on English Wikinews as a simple example is:

/wiki/Story_preparation/
/wiki/Portal:Prepared_stories/

One is in the main namespace, the other in Portal. Both are searched by
default.
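
As a rough sketch of the behaviour I mean (this is not how MWSearch or the
Lucene back-end actually works; the prefix list, the Hit type and the
function names are all made up for illustration, and real robots.txt
matching also involves Allow rules, wildcards and per-agent groups):

```python
from dataclasses import dataclass

# Hypothetical Disallow prefixes, copied from the two example paths above.
DISALLOWED_PREFIXES = (
    "/wiki/Story_preparation/",
    "/wiki/Portal:Prepared_stories/",
)


@dataclass
class Hit:
    """One search result: page path plus a relevance score (both invented)."""
    path: str
    score: float


def is_hidden(path, prefixes=DISALLOWED_PREFIXES):
    """True if the path starts with any disallowed prefix (simplified match)."""
    return any(path.startswith(prefix) for prefix in prefixes)


def filter_hits(hits, include_hidden=False):
    """Return (hits to display, number of hits that match robots.txt).

    The count lets the results page say how many results were hidden and
    offer the option to search again with hidden results included.
    """
    visible = [h for h in hits if include_hidden or not is_hidden(h.path)]
    hidden_count = sum(1 for h in hits if is_hidden(h.path))
    return visible, hidden_count


# Illustrative data only; these page names are placeholders.
hits = [
    Hit("/wiki/Some_published_story", 0.9),
    Hit("/wiki/Story_preparation/Some_obituary", 0.8),
    Hit("/wiki/Portal:Prepared_stories/Index", 0.5),
]
visible, hidden_count = filter_hits(hits)
```

With the default call only the published story is shown and hidden_count
is 2; calling filter_hits(hits, include_hidden=True) returns all three.
This filters at the point of returning results rather than at indexing
time, so the pages stay in the index.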

Try going to Wikinews and searching for "Carter": the 6th result is the
prepared obituary. With Shimon Peres it was the 3rd or 4th result until I
made some changes recently, and it will likely pop up again in a day or so.

The question is: on the above basis, do people think this is worth filing
in Bugzilla? Would it benefit other projects in any way?


Brian McNeil
-----Original Message-----
From: foundation-l-bounces at lists.wikimedia.org
[mailto:foundation-l-bounces at lists.wikimedia.org] On Behalf Of Chad
Sent: 19 March 2008 21:42
To: Wikimedia Foundation Mailing List
Subject: Re: [Foundation-l] Hiding namespaces from search engines

I guess the question is:

Would we hide it from indexing or only from returning the results? The
latter seems easier than the former (and was where I was going with it).

-Chad

On Wed, Mar 19, 2008 at 3:24 PM, Bryan Tong Minh
<bryan.tongminh at gmail.com> wrote:
>
> On Wed, Mar 19, 2008 at 8:12 PM, Chad <innocentkiller at gmail.com> wrote:
>  > Likewise. After I said that, I started looking
>  >  at the code in my local MW install, not entirely
>  >  sure where it would go. I'll keep looking around,
>  >  as this would be a great extension to have.
>  >
>  >  -Chad
>  >
>  >
>  >
>  >  On Wed, Mar 19, 2008 at 3:03 PM, Bryan Tong Minh
>  >  <bryan.tongminh at gmail.com> wrote:
>  >  > On Wed, Mar 19, 2008 at 4:10 PM, Chad <innocentkiller at gmail.com> wrote:
>  >  >  > Not currently, no.
>  >  >  >
>  >  >  >  Although, an extension could easily be written I
>  >  >  >  would think.
>  >  >  >
>  >  >  >  -Chad
>  >  >  >
>  >  >  >  On Wed, Mar 19, 2008 at 10:40 AM, Brian McNeil
>  >  >  >  <brian.mcneil at wikinewsie.org> wrote:
>  >  >  >  > Which leads to the question...
>  >  >  >  >
>  >  >  >  >  Is there any way to get the internal search to honour some
>  >  >  >  >  sort of ranking to put stuff in robots.txt at the very bottom?
>  >  >  >  >
>  >  >  >  >
>  >  >  I doubt it would be that easy.
>  >  >
>  >  >
>  >  >
>  >
>  >
>  >  >  _______________________________________________
>  >  >  foundation-l mailing list
>  >  >  foundation-l at lists.wikimedia.org
>  >  >  Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
>  >  >
>  >
>  You should look in the MWSearch extension. However, this frontend
>  relies on the Lucene backend. The current version is 2.0, but the
>  2.1 version is on track in a separate branch. That's where you
>  should look (it's Java).
>
>  Bryan
>




