Hi Peter,

From my experience with the Russian Wikipedia (admittedly more than ten years old), articles are often translated from other-language Wikipedias, with references simply carried over and not independently checked (perhaps someone checks that an online reference is still available, but not its content). I also see my articles on the English Wikipedia being translated into other languages, even when they have Dutch or Russian references which I do not expect the translators to be able to read. This is anecdotal evidence, though.

In addition, there are very few projects where the main population is monolingual; in almost all cases the bulk of the editors also speak a major language that serves as a lingua franca (like Russian for the Chuvash Wikipedia, or perhaps Spanish for the Quechua Wikipedia). This makes the problem less acute.

Best
Yaroslav

On Wed, Jun 8, 2022 at 8:44 PM Peter Southwood <peter.southwood@telkomsa.net> wrote:

Interesting research. Maybe I just missed it, but I didn’t notice any discussion of how the availability of reliable sources relates to coverage in different languages. In English Wikipedia we are not allowed to write about topics which are not covered by suitable sources, but there may also be more sources, and a wider range of them, available in English. English Wikipedia is also written by people who between them read a wider range of languages, which makes more non-English sources usable. Is there any research comparing this tendency in other languages? If there is no-one editing a Wikipedia who can read a source, no-one can write about its content. It can be very difficult to find sources for some topics, and it would be unsurprising if geographical topics in an area where a given language is not spoken are not covered in that language.

Cheers,

Peter

 

From: Emily Lescak [mailto:elescak@wikimedia.org]
Sent: 08 June 2022 15:13
To: wikimedia-l@lists.wikimedia.org; analytics@lists.wikimedia.org; wiki-research-l@lists.wikimedia.org
Subject: [Wikimedia-l] Wikimedia Research Showcase June 15

 

Hi all,

The next Research Showcase, Wikipedia's Languages, will be live-streamed Wednesday, June 15, at 4:00 AM PST/11:00 AM UTC.

YouTube stream: https://www.youtube.com/watch?v=AZQM1dtn3g0

You are welcome to ask questions via YouTube chat or on IRC at #wikimedia-research. 

This month's presentations: 

Quantifying knowledge synchronisation in the 21st century

By Jisung Yoon (Pohang University of Science and Technology)

Humans acquire and accumulate knowledge through language usage and eagerly exchange their knowledge for advancement. Although geographical barriers had previously limited communication, the emergence of information technology has opened new avenues for knowledge exchange. However, it is unclear which communication pathway is dominant in the 21st century. Here, we explore the dominant path of knowledge diffusion in the 21st century using Wikipedia, the largest communal dataset. We evaluate the similarity of shared knowledge between population groups, distinguished based on their language usage. When population groups are more engaged with each other, their knowledge structure is more similar, where engagement is indicated by socio-economic connections, such as cultural, linguistic, and historical features. Moreover, geographical proximity is no longer a critical requirement for knowledge dissemination. Furthermore, we integrate our data into a mechanistic model to better understand the underlying mechanism and suggest that the knowledge "Silk Road" of the 21st century is based online.


The Language Geography of Wikipedia

By Martin Dittus

Every language is a system of being, doing, knowing, and imagining. With over 7,000 active languages in the world, how many languages are fully represented online? To answer this question, digital non-profit Whose Knowledge? initiated the first ever report on the State of the Internet's Languages. As part of this report, Martin Dittus and Mark Graham have investigated the languages of Wikipedia. Wikipedia began with a single English-language edition more than two decades ago, and now offers more than 300 language editions, which places it at the forefront of digital language support. However, this does not mean that speakers of these languages get access to the same content: Wikipedia’s language editions vary widely in scale. We further find that this inequality is also reflected in Wikipedia’s geographic coverage: not all places are captured in every language. Wikipedia's coverage often follows the global distribution of speakers of the respective language. Yet even when we account for the distribution of language populations, certain language communities are much more strongly represented on Wikipedia than others. As a consequence, we find that for many countries in Africa, Central and South America, and South Asia, most of the content about those countries is in a foreign language, often a European-colonial language. In other words, in many of these places, people may need to be able to speak a second (possibly foreign) language in order to access Wikipedia information about their own places. Why do we see these differences? And what can be done to improve things?

You can also watch our past research showcases here: https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

 

Emily, on behalf of the Research team

 

--

Emily Lescak (she / her)

Senior Research Community Officer

The Wikimedia Foundation

 


 

_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/7TWLX36C5PSHFFSQGCXGMVR35QB7LRRV/
To unsubscribe send an email to wikimedia-l-leave@lists.wikimedia.org