On 8/5/08, Anthony wikimail@inbox.org wrote:
On Tue, Aug 5, 2008 at 9:03 AM, Jonathan Hughes lifebaka@gmail.com wrote:
One thing no-indexing user and user talk namespaces would help with is to curb the recent trend of userpage spam. I see half a dozen or more userpages a day which are spam or masquerading as articles. If userspace wasn't indexed, pretty soon the companies/persons who attempt this sort of advertising will figure out it doesn't work; no one ever finds their "article" from Google or Yahoo.
The most recent example that springs into my mind is [[User:Kliff Hanger Dot Com]] (whose page I didn't think spamish enough to delete, though I still blanked it), http://en.wikipedia.org/wiki/User:Kliff_Hanger_Dot_Com . There's no question that page should not be sitting around in userspace where people can Google it.
Do you really think noindexing of user pages would make any difference there? The page is obviously self-promotional, but I would think the purpose is more to promote oneself to Wikipedians, not to random Google searchers.
I don't even agree with you that "there's no question" that user page shouldn't be in Google. Wikipedians seem to have chosen to not allow certain types of self-promotion on user pages, but that's by no means a policy which is beyond question. Most websites *allow* blatant self-promotion on people's user page equivalents - for example, Google Knol certainly wouldn't delete a page like this, and they wouldn't noindex it either. Unless there's some other info I'm missing, I'd assume good faith here and give the person who created that page the benefit of the doubt; assume that they weren't aware of the rules and thought there was nothing wrong with what they were doing. (As an aside, had the user chosen the username "Kliff Hanger", and made a few contributions to articles, I don't even see a rules violation, though I admit I'm not up to date on the current !rules. I can think of lots of user pages which are self-promotional.)
Of course, this brings up another issue, which I think is the real problem with indexing user and user talk pages (as well as project and project talk pages). User pages probably *should* be in search engines, they just shouldn't be ranked nearly as highly as they tend to be.
In hindsight, these pages probably should have been put on a different domain name, and probably a single domain name for all users in all projects. That's probably a long way off if it ever gets implemented at all (with SUL now mostly? complete it's a possibility), but one thing that can be done today is that nofollow can be applied to links to these pages. Then at least the search engines will give them a lower rank. Maybe I should submit a couple bug reports.
My initial concerns when I started this thread related primarily to project-space pages, not userspace, and I would propose that the former be addressed first to avoid the typical situation where the discussion meanders in various directions and therefore comparatively little actually gets done. There is absolutely no reason that, after several months of discussion, DRV, AN/ANI/AN3, SSP, RfCU, WQA, and the former PAIN and CSN archives should, to the best of my knowledge, still be searchable.
My own view with respect to userspace is that the individual user should probably have the ability to decide whether he or she wants his or her pages indexed, subject to override where necessary (e.g., an indefblocked user's page should not be indexed). I don't have a strong view on whether userpages should be presumptively indexed with the user having the ability to opt out, or presumptively no-indexed with the user having the ability to opt in.
Newyorkbrad