Because in many cases Google does return the page in question when the search is restricted to some other language.
According to Google, the Hungarian Wikipedia has 3870 articles written "in English": http://www.google.com/search?hl=en&lr=lang_en&as_q=h%C3%A1bor%C3%BA&...
Over 52,000 articles "in Czech": http://www.google.com/search?hl=en&lr=lang_cs&as_q=h%C3%A1bor%C3%BA&...
Two articles which Google thinks are in Chinese (simplified): http://www.google.com/search?lr=lang_zh-CN&as_sitesearch=hu.wikipedia.or...
The Hungarian article about the National anthem of Russia is supposedly in traditional Chinese: http://www.google.com/search?lr=lang_zh-CN&as_sitesearch=hu.wikipedia.or...
And so on.
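For anyone who wants to repeat the check for other languages, here is a minimal sketch of how such language-restricted query URLs could be built. It reuses only the parameters visible in the links above (hl, lr, as_q, as_sitesearch). The links are truncated, so combining as_q with as_sitesearch in a single query is my assumption, and the function name is made up for illustration.

```python
from urllib.parse import urlencode


def language_restricted_search_url(term, site, lang,
                                   base="http://www.google.com/search"):
    """Build a search URL restricted to one language and one site.

    Hypothetical helper: the parameter names are copied from the links
    above; which further parameters the truncated links carried is unknown.
    """
    params = {
        "hl": "en",             # interface language
        "lr": f"lang_{lang}",   # restrict results to this language
        "as_q": term,           # search term
        "as_sitesearch": site,  # restrict results to this site
    }
    # urlencode percent-encodes non-ASCII characters as UTF-8
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    for lang in ("en", "cs", "zh-CN"):
        print(language_restricted_search_url("háború", "hu.wikipedia.org", lang))
```

Comparing the result counts across the lr values gives a rough picture of how many pages are filed under each language.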
Regards, Endre (KovacsUr@huwiki)
----- Original Message -----
From: "Angela" beesley@gmail.com
To: "Wikimedia developers" wikitech-l@wikimedia.org
Sent: Friday, September 02, 2005 10:22 PM
Subject: Re: [Wikitech-l] Assisting Google's language recognition?
Google miscategorizes the language of some of the Hungarian Wikipedia pages. E.g. it thinks that our Adolf Hitler article is in Czech.
How do you know they are miscategorising the language?
http://www.google.com/search?q=inurl%3A%22Adolf+Hitler%22+site%3Ahu.wikipedia.org
This makes it seem like they haven't indexed the page at all, not that they've marked it as the wrong language.
Angela.
_______________________________________________
Wikitech-l mailing list
Wikitech-l@wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikitech-l