2008/11/1 Erik Moeller <erik(a)wikimedia.org>:
2008/11/1 Thomas Dalton <thomas.dalton(a)gmail.com>:
Well, you may end up with a disproportionate representation of people
speaking languages near the beginning of the alphabet.
Putting some countries first protects against such selection bias in
common cases, though it could potentially introduce other biases
(countries not in the top list may be underrepresented). The only way
to truly protect against selection biases of any kind is to randomize
the list, which obviously is much more cumbersome.
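
A minimal sketch of what randomizing the list per respondent could look like. The function name, the respondent ID, and the sample options are hypothetical and not taken from the actual survey software. Seeding on the respondent ID is an assumption made here so that a reloaded page shows the same order.

```python
import random

def shuffled_options(options, respondent_id):
    """Return a copy of `options` in an order that is random across
    respondents but stable for any one respondent (hypothetical helper)."""
    rng = random.Random(respondent_id)  # seed on the respondent, not globally
    shuffled = list(options)            # copy so the master list is untouched
    rng.shuffle(shuffled)
    return shuffled

# Illustrative options only, not the survey's real list.
options = ["Arabic", "English", "German", "Japanese", "Spanish", "Zulu"]
first = shuffled_options(options, respondent_id=1)
again = shuffled_options(options, respondent_id=1)
```

Every option is equally likely to appear in every position, so no language benefits from alphabetical placement, but a respondent can no longer scan the list alphabetically to find an entry.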
Indeed, there is no ideal solution.
We'll have to see the actual data to assess how large these potential
distortions might be. For example, if 95% of respondents completed the
country/languages questions, then the selection bias of not finding
your country is probably relatively small.
It's the people that stopped answering questions completely just
before the language questions that are the problem - there is no way
to know if they gave up because they couldn't find their language or
because they'd just had enough. Obviously, if very few people stopped
at that point then it doesn't matter, but a significant number would
probably have stopped there anyway, purely by chance, which makes the
data difficult to interpret.
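
One rough way to put a number on this, as a sketch only: compare the dropout rate just before the language question with the average dropout rate between the other questions. The counts below are made up for illustration and are not survey data, and `LANGUAGE_STEP` is a hypothetical position for the language question. An excess over the baseline only hints at a problem in aggregate; it still cannot say why any individual respondent stopped.

```python
def dropout_rates(reached):
    """reached[i] is the number of respondents who answered question i,
    in survey order. Returns the fraction lost between each question
    and the next one."""
    return [
        (reached[i] - reached[i + 1]) / reached[i]
        for i in range(len(reached) - 1)
    ]

# Hypothetical counts, NOT real survey data.
reached = [1000, 950, 900, 720, 700]
# Hypothetical: step 2 is the transition into the language question.
LANGUAGE_STEP = 2

rates = dropout_rates(reached)
# Baseline: average dropout at every transition except the language one.
baseline = [r for i, r in enumerate(rates) if i != LANGUAGE_STEP]
baseline_rate = sum(baseline) / len(baseline)
# Dropout at the language question beyond what the baseline would predict.
excess = rates[LANGUAGE_STEP] - baseline_rate

print(f"language-step dropout: {rates[LANGUAGE_STEP]:.1%}")
print(f"baseline dropout:      {baseline_rate:.1%}")
print(f"excess:                {excess:.1%}")
```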