Robots can be useful, since Google probably contributes a lot to bringing new visitors to the site. For a community, that is surely a useful thing. One issue worth considering is banning robots from special pages (Recent Changes, Talk, User, etc.) with an appropriate META tag. IMHO, robots and spiders should only be allowed to crawl the articles.
<snip>
regards, [[user:WojPob]]
I COMPLETELY disagree with this. Let the robots crawl everything. It's better that someone finds one of our Talk or User pages and cruises on over to our main site than to simply find a completely different website!
Chuck
===== Come to my homepage! http://amuzulo.babil.komputilo.org/ ==== Come to the free, open Esperanto online encyclopedia! http://eo.wikipedia.com/
On 5/17/02 3:26 PM, "Chuck Smith" msochuck@yahoo.com wrote:
I COMPLETELY disagree with this. Let the robots crawl everything. It's better that someone finds one of our Talk or User pages and cruises on over to our main site than to simply find a completely different website!
Chuck
I agree with Chuck, strongly.
The Cunctator wrote:
I agree with Chuck, strongly.
[Wikipedia-l] To manage your subscription to this list, please go here: http://www.nupedia.com/mailman/listinfo/wikipedia-l
I have to wonder, though: if a spider goes to Recent Changes and then to "Last 5000 changes" (and last 90 days, and last 30 days, and last 2500 changes, and last 1000 changes, and every such combination), it seems to me the server load could get pretty high. Perhaps talk pages should be spidered, but not Recent Changes or the history (diff/changes).
On Fri, 2002-05-17 at 13:04, General Wesc (LKBM) wrote:
I have to wonder, though: if a spider goes to Recent Changes and then to "Last 5000 changes" (and last 90 days, and last 30 days, and last 2500 changes, and last 1000 changes, and every such combination), it seems to me the server load could get pretty high.
Does anyone _really_ ever want to look at the last 5000 changes?
As for the higher day values, they'll be useful for the less active other-language wikipedias once they're converted, though it might be good to have "intelligent" scaling on that bar. 90 days' worth of changes on the English 'pedia would go well over even the 5000-change limit... (At the moment, 5000 only gets us back to about April 27.)
Perhaps talk pages should be spidered, but not recent changes or the history (diff/changes).
A robots.txt could easily be set up to disallow /wiki/special%3ARecentChanges (and various case variations). That only stops _nice_ spiders, of course.
History links would need to be changed so that they are clearly distinguishable from article links, for instance by using /wiki.phtml?title=Foo&action=history etc.; then ban /wiki.phtml.
-- brion vibber (brion @ pobox.com)
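For anyone who wants to try the robots.txt idea before touching the server, here is a rough sketch using Python's standard urllib.robotparser. The rule set is only my reading of the suggestion above: the host example.org, the agent name ExampleBot, and the particular case variations listed are assumptions, not the site's actual configuration. It also shows why the case variations matter, since path matching is case-sensitive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt based on the suggestion above. The exact set of
# case variations is an assumption; the real list would need to cover
# whatever spellings the wiki software actually accepts.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/special%3ARecentChanges
Disallow: /wiki/Special%3ARecentChanges
Disallow: /wiki.phtml
"""

# Placeholder host, not the real site.
BASE_URL = "http://example.org"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def allowed(parser: RobotFileParser, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a well-behaved spider may fetch the given path."""
    return parser.can_fetch(agent, BASE_URL + path)


PARSER = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    for path in (
        "/wiki/Foo",                            # ordinary article
        "/wiki/special%3ARecentChanges",        # listed spelling
        "/wiki/SPECIAL%3ARECENTCHANGES",        # spelling not listed
        "/wiki.phtml?title=Foo&action=history", # history link under /wiki.phtml
    ):
        print(path, "->", "allowed" if allowed(PARSER, path) else "blocked")
```

As noted above, this only describes what a _nice_ spider would do; nothing here stops a crawler that ignores robots.txt.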
wikipedia-l@lists.wikimedia.org