On Wednesday 03 December 2003 14:55, Jimmy Wales wrote:
I wrote:
mmm, yummy. When will we get up the nerve to turn full-text searching back on?
Andrew Alder wrote:
Is this even a good idea? I know everyone has assumed we will, but the current use of Google has its advantages too (see the Village Pump).
Or has this been fully discussed here already, long ago?
As for me, I always just assumed it. There are some big drawbacks to Google, namely that it isn't real-time, which makes doing certain kinds of study difficult. Also, Michael Hardy has reported to me that one page he used to find via Google can no longer be found there, presumably due to the vagaries of Google indexing.
I have stumbled upon quite a few Wikipedia pages that were not being indexed by Google, while the same pages on one of the many mirror-type sites (nationmaster, etc.) were being indexed. My take on the whole situation is that Google is treating en.wikipedia.org and en2.wikipedia.org as two different entities. Search for 'Rivers of France wikipedia' (http://www.google.com/search?hl=en&lr=&ie=UTF-8&oe=utf-8&q=R...) to see an example of both en and en2 competing for the top spot.
An example of knock-offs scoring higher than Wikipedia: search for 'Napoleonic code' (http://www.google.com/search?q=Napoleonic+code&sourceid=mozilla-search&a...), where both sciencedaily.com (IIRC a rather new mirror) and nationmaster.com rank higher than Wikipedia. This is hard to explain by Google's PageRank algorithm alone, because Wikipedia surely gets far more links, in both quality and quantity, than the mirrors do - but it would make sense if Wikipedia's ranking essentially gets divided in two between the hostnames.
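The "divided in two" guess can be illustrated with a toy power-iteration PageRank. This is a simplified sketch, not Google's actual ranking: the link graphs, the damping factor, and the hostnames-as-nodes model are all invented for illustration. The point is only that when the same external links are split across two hostnames, each hostname ends up with roughly half the rank a single hostname would get.

```python
# Toy power-iteration PageRank (damping factor 0.85).
# Nodes stand in for hostnames; the graphs are made up.

DAMPING = 0.85

def pagerank(links, iters=50):
    """links: dict mapping each node to the list of nodes it links to."""
    nodes = list(links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every node starts each round with the "teleport" share.
        new = {v: (1 - DAMPING) / n for v in nodes}
        for src, outs in links.items():
            if not outs:
                continue  # dangling node: its mass is simply dropped in this toy
            share = DAMPING * rank[src] / len(outs)
            for dst in outs:
                new[dst] += share
        rank = new
    return rank

# Case 1: four external sites all link to one hostname.
merged = pagerank({
    "a": ["en"], "b": ["en"], "c": ["en"], "d": ["en"],
    "en": [],
})

# Case 2: the same four sites split their links across two hostnames.
split = pagerank({
    "a": ["en"], "b": ["en"], "c": ["en2"], "d": ["en2"],
    "en": [], "en2": [],
})
```

Running this, each of `en` and `en2` in the split case ends up with about half the rank that the single `en` gets in the merged case, which matches the suspicion in the mail.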
This would seem to reduce traffic to Wikipedia, which would obviously be a bad thing. Is there some different load-balancing scheme that could be implemented that would be transparent to Google, i.e. one that exposes only a single public hostname?
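One such scheme would be round-robin rotation behind a single canonical hostname, which is roughly what round-robin DNS does. A minimal sketch, assuming the goal is simply that Google only ever sees en.wikipedia.org while requests still spread across servers; the backend IPs and the `resolve` helper are invented for illustration:

```python
from itertools import cycle

# Hypothetical backend servers sitting behind the one public
# hostname en.wikipedia.org (addresses are made up).
BACKENDS = ["10.0.0.1", "10.0.0.2"]

_rotation = cycle(BACKENDS)

def resolve(hostname):
    """Round-robin sketch: every lookup of the single canonical
    hostname hands back the next backend in rotation, so load is
    spread without a second hostname ever appearing in links."""
    assert hostname == "en.wikipedia.org"
    return next(_rotation)
```

In a real deployment the rotation would live in DNS (multiple A records for one name) or in a reverse proxy in front of the servers; either way, crawlers and external links would see only one hostname, so the ranking would not be split.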
Best, Sascha Noyes