2018-06-18 2:12 GMT+03:00 Olya Irzak <oirzak@gmail.com>:

Dear Wikidata community,

We're working on a project called Wikibabel to machine-translate parts of Wikipedia into underserved languages, starting with Swahili.

In hopes that some of our ideas can be helpful to machine translation projects, we wrote a blogpost about how we prioritized which pages to translate, and what categories need a human in the loop:
https://medium.com/@oirzak/wikibabel-equalizing-information-access-on-a-budget-4038f750e90e

Rumor has it that the Wikidata community has thought deeply about information access. We'd love your feedback on our work. Please let us know about past / ongoing machine translation related projects so we can learn from & collaborate with them.

I'm not sure how deeply the Wikidata community has thought about it.

One project that does something related to what you're doing is GapFinder ( https://www.mediawiki.org/wiki/GapFinder ). As far as I know, the GapFinder frontend is not actively developed, but the recommendation API behind it is actively maintained and developed. You should ask the Research team for more info (see https://www.mediawiki.org/wiki/Wikimedia_Research ).

Project Tiger is also doing something similar: https://meta.wikimedia.org/wiki/Project_Tiger_Editathon_2018

As a general comment, displaying machine-translated text in a way that makes it appear to have been written by humans is misleading and damaging. I don't know any Swahili, but in the languages that I can read (Russian, Hebrew, Catalan, Spanish, French, German), machine translation is at best good as an aid to a human writing a translation, and it's never good enough for actual reading. I also don't understand why you invest credits in pre-machine-translating articles that people can machine-translate for free, but maybe I'm missing something about how your project works.