On 29/10/12 21:55, Jane Darnell wrote:
Thanks for clearing that up! I was wondering how it was possible (and of
course was worried that maybe others were seeing undesirable languages).
I'm happy to do that. Things should "just work", but if they don't, it's
worth digging to find out the reason for the alleged misbehavior.
In this case your language preferences were a bit illogical, since you
preferred US English to Dutch but ranked any other kind of English below
Dutch. As a general rule, I don't think it makes much sense to have a
completely different language with a weight which places it in the midst
of the preference for different variants of one language family.
There are some cases where it could make sense, if the first variant is
very different from the language specified by the generic tag. But en-us
and en are not such a case.
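
To make the ordering concrete, here is a minimal Python sketch of how an
Accept-Language header is ranked by its q weights. The header value and
the function name are made up for illustration (the real header isn't
quoted in this thread), and this is not the survey's actual code.

```python
def parse_accept_language(header):
    """Return (tag, q) pairs sorted by descending q.

    Tags with equal q keep the order they had in the header.
    """
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # a tag without an explicit weight defaults to q=1
        for param in params.split(";"):
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not wanted"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so ties keep header order
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


# Hypothetical header with the shape described above:
# US English first, then Dutch, then generic English last.
ranked = parse_accept_language("en-us,nl;q=0.8,en;q=0.5")
```

With a header like that, a site that has Dutch and generic English but no
en-us entry will pick Dutch on an exact-match basis, since "nl" outranks
"en".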
I started logging accept-language pairs this morning, as well as the
language in which the survey was shown to each user. In most cases it is
what should have been chosen, judging from the whole accept-language
header. I haven't seen other "shuffled" requests, but I see a number of
requests made for a language tag (en-za, fr-FR, es-ES, es-CL, es-mx...)
which don't include the parent tag. So the language choice algorithm,
not being able to match any accepted language, ends up showing the
survey in English as a last resort.
In most cases, the link followed from Wikimedia Commons based on the
local language has fixed this problem.
I will add another check of partial strings to cover those cases. But
it's the User Agent's task to instruct the user to make sane decisions,
not the application's.
From section 14.4 of RFC 2616:
Note: When making the choice of linguistic preference available to
the user, we remind implementors of the fact that users are not
familiar with the details of language matching as described above,
and should provide appropriate guidance. As an example, users
might assume that on selecting "en-gb", they will be served any
kind of English document if British English is not available. A
user agent might suggest in such a case to add "en" to get the
best matching behavior.
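
For illustration, a partial-string check of the kind mentioned above
could look roughly like the following Python sketch. The function name,
its parameters and the set of available languages are all hypothetical;
the survey's real code isn't shown in this thread, and trying the parent
tag per preference (rather than after all exact matches) is just one
possible design.

```python
def choose_language(prefs, available, default="en"):
    """Pick a language for a list of tags already sorted by preference.

    For each requested tag, try an exact match first, then fall back to
    its primary subtag (the part before the first "-"). If nothing
    matches, return the default language.
    """
    available = {tag.lower() for tag in available}
    for tag in prefs:
        tag = tag.lower()
        if tag in available:
            return tag
        primary = tag.split("-", 1)[0]  # "es-mx" -> "es"
        if primary in available:
            return primary
    return default


# Hypothetical set of survey translations.
survey_languages = {"en", "es", "fr", "nl"}

# A request for "es-mx" alone no longer falls through to English.
picked = choose_language(["es-mx"], survey_languages)
```

A request that lists only "es-mx" would then get the Spanish survey
instead of the English last resort, while a tag with no related
translation at all still ends up with the default.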
There must be some buggy browser out there in this regard: it is very
unlikely that so many users misconfigured their browsers in this way.
I have added some more logging code and I'll try to catch the guilty UAs.
PS: We are at 1254 answers!