The recent elections showed us that language issues and translation are something we have to take very seriously from now on. As a first step towards improving communication, it seems we should get an idea of which users speak which languages.
We could ask them directly, but on reflection, the information is already sitting in our database. A multilingual user is one who actively edits projects in at least two different languages.
In devising a comprehensive translation strategy, we need to know how interconnected any two given projects are. We also need to know how connected any given project is to English, since it's our working language.
We need to pay special attention to languages that are very 'distant' from English, distant in the sense of having few members who are fluent in both English and the language in question.
Could someone aid me in getting this data, or explaining why I don't need it or why we already have it, etc?
Specifically, I'm looking for:
# For each non-English-language project, how many of its active users are ALSO active on an English-language project? (The answer should be a single whole number for each project.)
# For any two projects, how many users are active on both? (The answer is a square matrix, roughly 750x750.)
# For any two languages, how many users appear to speak both languages? (The answer is a square matrix, roughly 750x750.)
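To make the three counts above concrete, here is a rough sketch of the arithmetic I have in mind. It does not touch the real database: it assumes someone has already extracted one (user, project, language) row for every project a user is "active" on, and the row layout, the sample names, and the definition of "active" are all placeholders of mine. Treat it as an illustration of the counting, not a finished query.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical extract: one row per (user, project) where the user counts
# as "active". Names, language codes, and the activity rule are assumptions.
activity = [
    ("alice", "en.wikipedia", "en"),
    ("alice", "fr.wikipedia", "fr"),
    ("bob", "fr.wikipedia", "fr"),
    ("bob", "fr.wiktionary", "fr"),
    ("carol", "en.wikipedia", "en"),
    ("carol", "de.wikipedia", "de"),
    ("carol", "fr.wikipedia", "fr"),
]


def overlap_counts(rows, working_language="en"):
    """Return the three counts as dictionaries keyed by project or by pair."""
    projects_by_user = defaultdict(set)
    language_of = {}
    for user, project, lang in rows:
        projects_by_user[user].add(project)
        language_of[project] = lang

    english_overlap = defaultdict(int)  # item 1: per non-English project
    project_pairs = defaultdict(int)    # item 2: per pair of projects
    language_pairs = defaultdict(int)   # item 3: per pair of languages

    for projects in projects_by_user.values():
        langs = {language_of[p] for p in projects}

        # Item 1: the user is active on some English-language project,
        # so credit every non-English project they are also active on.
        if working_language in langs:
            for p in projects:
                if language_of[p] != working_language:
                    english_overlap[p] += 1

        # Items 2 and 3: count each unordered pair once per user.
        # Sorting gives every pair a single canonical key.
        for pair in combinations(sorted(projects), 2):
            project_pairs[pair] += 1
        for pair in combinations(sorted(langs), 2):
            language_pairs[pair] += 1

    return dict(english_overlap), dict(project_pairs), dict(language_pairs)


english_overlap, project_pairs, language_pairs = overlap_counts(activity)
```

Both matrices are symmetric, so the sketch stores only one key per unordered pair, and a pair with no shared users is simply absent (read it as zero). The diagonal, which would be each project's or language's own active-user count, is left out.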
Does anyone know how to pull this out of the database? It's an important question for us to recruit translators and really just assess "where we are" in terms of inter-project language capabilities.
Alec