On Wed, Jun 15, 2011 at 8:46 AM, Alec Conroy alecmconroy@gmail.com wrote:
> We could directly ask them to tell us, but upon reflection, the information is already hidden in our database. A multilingual user is one that actively edits two projects of different languages.
That doesn't follow. Perhaps someone speaks a language, but doesn't edit the corresponding wiki. For instance, I know a decent amount of Hebrew, although I wouldn't call myself fluent in Modern Hebrew. But I'm a native English speaker, and English Wikipedia articles are almost always better than the corresponding Hebrew ones (often even on Judaism-related topics). So I have no reason to read the Hebrew Wikipedia, when it takes more effort for me and the content isn't usually as good. Likewise, some people edit exclusively or almost exclusively on multilingual projects like Commons.
On the other hand, people might edit on projects in languages they don't understand. For instance, they might be running scripts that automatically fix interwiki links or the like. This is less likely, though, once you exclude bot accounts.
If you want this info, toolserver queries are the right way to do it. It should be pretty easy to pull this kind of info out of the revision or recentchanges tables, although it would require reading a lot of data. The simplest way would be to get, for each wiki, a list of the usernames that have edited in the last X days, then use a script to invert those lists so that you get a list of languages for each user. You'd probably want to only include unified accounts here, since otherwise the same username on two wikis might belong to two different people. (How many accounts still aren't unified?)
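
To make the "reverse the lists" step concrete, here's a rough Python sketch. It assumes you've already pulled the per-wiki username lists (from recentchanges or wherever), that bots have been filtered out, and that the accounts are unified so a matching username means the same person. The function name, the min_languages parameter, and the sample data are all made up for illustration.

```python
from collections import defaultdict


def languages_per_user(editors_by_wiki, min_languages=2):
    """Invert {language: [usernames]} into {username: [languages]}.

    Only users seen on at least `min_languages` wikis are returned.
    Assumes unified accounts, i.e. one username is one person everywhere.
    """
    langs = defaultdict(set)
    for lang, usernames in editors_by_wiki.items():
        for name in usernames:
            langs[name].add(lang)
    return {
        name: sorted(seen)
        for name, seen in langs.items()
        if len(seen) >= min_languages
    }


# Hypothetical sample data: recent editors per language edition.
editors_by_wiki = {
    "en": ["Alice", "Bob", "Carol"],
    "he": ["Alice", "Dan"],
    "de": ["Bob", "Alice"],
}

multilingual = languages_per_user(editors_by_wiki)
for name, seen in sorted(multilingual.items()):
    print(name, ", ".join(seen))
```

This only tells you who edits in more than one language, of course, not who speaks more than one, so the caveats above still apply.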