Hi Irene, Wikibabel, Gerd, and Wikidatans,

How does Wikidata's new lexicographical project work with regard to Swahili (since it is a Wikipedia / Wikidata language) and Google Translate / GNMT, given your statement that "Our approach leverages Google Translate to make English Wikipedia articles accessible to underserved communities" (re: https://medium.com/@oirzak/wikibabel-equalizing-information-access-on-a-budget-4038f750e90e)?

Will a Wikibabel team you help create add Swahili lexemes to the lexicographical project (https://www.wikidata.org/wiki/Wikidata:Lexicographical_data)? And would Google GNMT, which is end-to-end translation software (see https://1.bp.blogspot.com/-jwgtcgkgG2o/WDSBrwu9jeI/AAAAAAAABbM/2Eobq-N9_nYeAdeH-sB_NZGbhyoSWgReACLcB/s1600/image01.gif, from https://ai.googleblog.com/2016/11/zero-shot-translation-with-googles.html), then use this new Swahili lexicographical data by processing it through its algorithms?

(WUaS seeks to facilitate machine translation in all 7,097 living languages by growing out of Google GNMT; WUaS donated itself to Wikidata for co-development in 2015.)


On Sun, Jun 17, 2018 at 10:28 PM, Gerard Meijssen <gerard.meijssen@gmail.com> wrote:
I am giving a lot of attention to content that deals with Africa. In doing so I also target the Swahili Wikipedia [1] (I have not filled in all the red links yet). At this moment I am adding information in Wikidata about Tanzanian wards based on sw.wikipedia categories and templates.

Many of the African-language Wikipedias are struggling. By making the lists as complete as possible based on categories and lists, the information becomes more useful and better; it can be, and is, used in the same manner on multiple Wikipedias. At this moment these are the zu, yo, en and sw Wikipedias. As the information is made available using Listeria lists, it gets updated as and when new information becomes available.

Another notion of mine is that it will help with individual info boxes, e.g. for politicians, or indeed Tanzanian wards .. :)

NB I am a big fan of providing information using machine translation. However, PLEASE consider the lessons learned from the Cebuano Wikipedia and make the texts available in a cached way; not in the final form as saved text.

PS when there is something where we can collaborate, please let me know.

[1] https://sw.wikipedia.org/wiki/Mtumiaji:GerardM

On 18 June 2018 at 01:12, Olya Irzak <oirzak@gmail.com> wrote:
Dear Wikidata community,

We're working on a project called Wikibabel to machine-translate parts of Wikipedia into underserved languages, starting with Swahili.

In hopes that some of our ideas can be helpful to machine translation projects, we wrote a blogpost about how we prioritized which pages to translate, and what categories need a human in the loop:

Rumor has it that the Wikidata community has thought deeply about information access. We'd love your feedback on our work. Please let us know about past / ongoing machine translation related projects so we can learn from & collaborate with them.

Best regards,
Olya & the Wikibabel crew 

Wikidata mailing list



- Scott MacLeod - Founder & President  
- World University and School

- CC World University and School - like CC Wikipedia with best STEM-centric CC OpenCourseWare - is incorporated as a nonprofit university and school in California, and is a U.S. 501(c)(3) tax-exempt educational organization.