Over the years, Wikimedia as a movement has given consideration to small-language Wikipedias. I would like to point you to a recent study that I have been pursuing together with Hady Elsahar of the Université de Lyon and Pavlos Vougiouklis of the University of Southampton, and which has recently resulted in accepted publications.
My research focuses mainly on underserved languages on Wikidata and Wikipedia, and on how we can support them better.
One of the ways to support small Wikipedias has been the ArticlePlaceholder [1]. The idea is to take the existing multilingual information in Wikidata [2] and display it in a reader-friendly way on the Wikipedia of the respective language (if a Wikidata label exists in that language).
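To make the idea concrete, here is a minimal Python sketch of that selection step. It is not the extension's actual code: the LABELS and STATEMENTS dictionaries are small hand-made stand-ins for Wikidata, the function names are hypothetical, and skipping any statement that lacks a label in the target language is my simplifying assumption.

```python
from typing import Dict, List, Optional, Tuple

# Toy stand-in for Wikidata labels (illustrative data, not a live query).
LABELS: Dict[str, Dict[str, str]] = {
    "Q64": {"en": "Berlin", "eo": "Berlino"},
    "P17": {"en": "country", "eo": "lando"},
    "Q183": {"en": "Germany", "eo": "Germanio"},
    "P1376": {"en": "capital of"},  # deliberately no Esperanto label here
}

# Toy statements: item id -> list of (property id, value item id).
STATEMENTS: Dict[str, List[Tuple[str, str]]] = {
    "Q64": [("P17", "Q183"), ("P1376", "Q183")],
}


def label(entity_id: str, lang: str) -> Optional[str]:
    """Return the label of an item or property in `lang`, or None."""
    return LABELS.get(entity_id, {}).get(lang)


def placeholder_rows(item_id: str, lang: str) -> List[Tuple[str, str]]:
    """Build the (property, value) rows that can be shown in `lang`.

    Assumption: no placeholder is shown if the item itself has no label
    in `lang`, and statements without labels in `lang` are skipped.
    """
    if label(item_id, lang) is None:
        return []
    rows = []
    for prop_id, value_id in STATEMENTS.get(item_id, []):
        prop_label = label(prop_id, lang)
        value_label = label(value_id, lang)
        if prop_label is not None and value_label is not None:
            rows.append((prop_label, value_label))
    return rows


if __name__ == "__main__":
    for prop_label, value_label in placeholder_rows("Q64", "eo"):
        print(f"{prop_label}: {value_label}")
```

With this toy data, the Esperanto view contains only the row for which every label exists, while a language with no label for the item yields no rows at all.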
However, at the moment the data is shown only in tabular form, which is not very reader-friendly and might not be the ideal way to encourage editors to work on the articles.
Therefore, we worked on producing sentences in the given language from the information on Wikidata. We trained a neural network model; the details can be found in the preprint of the NAACL paper here: https://arxiv.org/abs/1803.07116 Given the promising results of this neural approach, we extended the work to see how the text generation could fit into the existing ArticlePlaceholder, and tested it with the Esperanto and Arabic Wikipedia communities. The ESWC paper preprint for this work can be found here: https://2018.eswc-conferences.org/wp-content/uploads/2018/02/ESWC2018_paper_...
We show that our approach is feasible for generating text from Wikidata for Wikipedia. Editors tend to reuse the generated sentences, which suggests that these summaries can be a good encouragement to create full articles.
We would like to deploy this work on a test Wikipedia to see whether communities are interested in adopting the technology on a large scale in their Wikipedias.
Furthermore, we would love to hear your input: Do you believe one-sentence summaries are enough, or could we serve the communities' needs better with more than one sentence? Would this still be true if longer abstracts were of lower text quality? What other interesting use cases for such a technology in the Wikimedia world can you imagine? And especially if you are part of an underserved-language Wikipedia community, what is your opinion of the project?
[1] https://www.mediawiki.org/wiki/Extension:ArticlePlaceholder and https://commons.wikimedia.org/wiki/File:Generating_Article_Placeholders_from...
[2] https://eprints.soton.ac.uk/413433/1/Open_Sym_Short_Paper_Wikidata_Multiling...