"David Gerard" wrote
2008/5/1 Charles Matthews charles.r.matthews@ntlworld.com:
Not for that reason - but David's suggestion is a Very Bad Idea. Tools for site development (e.g. redlink lists) naturally reside in userspace, and we should care about editors finding them.
ooh, ouch, yes.
(How findable are these in practice in Google at present?)
Oh, very.
We need a different question, really. If MediaWiki were deliberately designed to have some namespaces that were for search engines to find, and others not, how should that be set up?
Something like the article space, Wikipedia: space and some Development: space might do. All discussion namespaces are inherently dodgy. But one more namespace that was non-article, non-policy might do; and one more namespace on the other side, so that AfDs didn't have to be in the Wikipedia: space - now this is sounding more reasonable to me.
Charles
----------------------------------------- Email sent from www.virginmedia.com/email Virus-checked using McAfee(R) Software and scanned for spam
On Thu, May 1, 2008 at 3:35 PM, Charles Matthews charles.r.matthews@ntlworld.com wrote:
Oh, very.
We need a different question, really. If MediaWiki were deliberately designed to have some namespaces that were for search engines to find, and others not, how should that be set up?
Something like the article space, Wikipedia: space and some Development: space might do. All discussion namespaces are inherently dodgy. But one more namespace that was non-article, non-policy might do; and one more namespace on the other side, so that AfDs didn't have to be in the Wikipedia: space - now this is sounding more reasonable to me.
I think the closest we're going to get to a good technical solution would be to have some magic words __INDEX__ and __NOINDEX__ that will force indexing on or off for a document. The default could be set per-namespace, and I'm assuming we'd want articlespace to be the only one indexed by default. This is not unlike how tables of contents are handled: with fewer than 4 sections there is no TOC, and with 4 or more there is one before the first section, but __TOC__ and __NOTOC__ can be used to override this.
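To make the proposed resolution order concrete, here is a minimal Python sketch, not MediaWiki code. The names `NAMESPACE_DEFAULTS`, `should_index` and `robots_meta` are hypothetical, the default table simply encodes the "only articlespace indexed by default" assumption from the paragraph above, and "last magic word on the page wins" is an assumption made for illustration.

```python
import re

# Hypothetical per-namespace defaults. "" stands for the article namespace.
# Assumption from the proposal: only article space is indexed by default.
NAMESPACE_DEFAULTS = {"": True, "Wikipedia": False, "User": False, "Help": False}

# Matches either proposed magic word; the captured group is "INDEX" or "NOINDEX".
MAGIC_WORD = re.compile(r"__(NOINDEX|INDEX)__")


def should_index(namespace, wikitext, defaults=NAMESPACE_DEFAULTS):
    """Return True if the page should be visible to search engines."""
    matches = MAGIC_WORD.findall(wikitext)
    if matches:
        # Assumption: if several magic words appear, the last one wins.
        return matches[-1] == "INDEX"
    # No override on the page: fall back to the namespace default.
    # Unknown namespaces are treated as not indexed.
    return defaults.get(namespace, False)


def robots_meta(namespace, wikitext):
    """Build the content of a robots meta tag for the page."""
    return "index,follow" if should_index(namespace, wikitext) else "noindex,follow"
```

Under these assumptions a plain userspace page resolves to "noindex,follow", while the same page containing __INDEX__ resolves to "index,follow". Swapping the default table to all-True would give the "index everything, opt out as needed" variant without touching the override logic.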
And whatever the magic words are, there should be a way to specify in a template "index what transcludes this, but *don't index this template*" for things like {{policy}}. "<includeonly>__INDEX__</includeonly>" would take care of that nicely. (This assumes the keywords actually have effect when transcluded, which IMO they should.) Same for {{essay}} and the like. That way much of the userspace content that people will want to search for will be in Google. People would be advised to avoid using __INDEX__ on their userpage to uphold [[WP:NOTMYSPACE]], or on userfied articles so substandard articles don't show up in Google, etc.
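The "index what transcludes this, but don't index this template" behaviour can be sketched as follows. This is a toy model in Python rather than MediaWiki's parser: it handles only parameterless `{{name}}` transclusions plus `<includeonly>` and `<noinclude>`, and it assumes, as the paragraph above does, that a magic word arriving via transclusion takes effect. All function names are hypothetical.

```python
import re

INCLUDEONLY = re.compile(r"<includeonly>(.*?)</includeonly>", re.DOTALL)
NOINCLUDE = re.compile(r"<noinclude>(.*?)</noinclude>", re.DOTALL)
TRANSCLUSION = re.compile(r"\{\{([^{}|]+)\}\}")  # parameterless {{name}} only
MAGIC_WORD = re.compile(r"__(NOINDEX|INDEX)__")


def render_own_page(template_text):
    """Text seen when the template page itself is viewed."""
    text = INCLUDEONLY.sub("", template_text)  # includeonly content is dropped
    return NOINCLUDE.sub(r"\1", text)          # noinclude content is kept


def render_transcluded(template_text):
    """Text contributed to a page that transcludes the template."""
    text = NOINCLUDE.sub("", template_text)    # noinclude content is dropped
    return INCLUDEONLY.sub(r"\1", text)        # includeonly content is kept


def expand(page_text, templates):
    """Replace each known {{name}} with the template's transcluded text."""
    def replace(match):
        name = match.group(1).strip()
        if name not in templates:
            return match.group(0)  # leave unknown templates untouched
        return render_transcluded(templates[name])
    return TRANSCLUSION.sub(replace, page_text)


def forced_index(text):
    """True/False if a magic word forces indexing on/off, None if absent."""
    matches = MAGIC_WORD.findall(text)
    return None if not matches else matches[-1] == "INDEX"


# Hypothetical stand-in for {{policy}}.
templates = {"policy": "This page is policy.<includeonly>__INDEX__</includeonly>"}
policy_page = expand("{{policy}}\nBe civil.", templates)
```

Here `forced_index(policy_page)` is True, so the transcluding policy page is forced into the index, while `forced_index(render_own_page(templates["policy"]))` is None, so the template page itself just follows its namespace default.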
My $0.02, kind of rambling, but hopefully there are some salvageable ideas in there...
On Fri, May 2, 2008 at 11:22 AM, Chris Howie cdhowie@gmail.com wrote:
I think the closest we're going to get to a good technical solution would be to have some magic words __INDEX__ and __NOINDEX__ that will force indexing on or off for a document. The default could be set per-namespace, and I'm assuming we'd want articlespace to be the only one indexed by default.
I think everything should be indexed by default, and we can address problems as they appear (with {{NOINDEX}}) rather than hiding valuable information from the web. Obviously when something is sure to cause problems a template could be used preemptively. Better to make small organic changes than giant sweeping ones in my opinion.
On Fri, May 2, 2008 at 4:24 PM, Judson Dunn cohesion@sleepyhead.org wrote:
On Fri, May 2, 2008 at 11:22 AM, Chris Howie cdhowie@gmail.com wrote:
I think the closest we're going to get to a good technical solution would be to have some magic words __INDEX__ and __NOINDEX__ that will force indexing on or off for a document. The default could be set per-namespace, and I'm assuming we'd want articlespace to be the only one indexed by default.
I think everything should be indexed by default, and we can address problems as they appear (with {{NOINDEX}}) rather than hiding valuable information from the web. Obviously when something is sure to cause problems a template could be used preemptively. Better to make small organic changes than giant sweeping ones in my opinion.
I'm not particularly sold on either preemptive or as-needed yet, just tossing out ideas. There are certain areas that have already been preemptively excluded from search results (like AfD). I'm surprised that ANI hasn't received the same treatment yet.
Even if we use the preemptive approach, a rather large chunk of WP: would probably wind up in Google anyway, if we forced indexing in templates like {{policy}}, {{guideline}}, {{essay}}, etc. A lot of other stuff would probably be useful too but I think the majority of stuff people are searching for would still be there.
Oh, and we'd probably also want to allow indexing of Help: by default, if we wind up going that route.
I'm still puzzled: I'd like someone to explain once again why we're doing this instead of having a discussion about when to apply courtesy blanking, which leaves the decision more solidly in the hands of the community, and is less of a blunt instrument than this.
RR