> Why do you think different language Wikipedias have different sizes, outside of the popularity of a given language?
Piotr, if you model organic editing production with a Poisson distribution, which is reasonable for a first approximation, 3x+ disparities are just natural for the same population sizes:
https://en.wikipedia.org/wiki/Poisson_distribution
I'm not sure the images in that article capture the wide platykurtosis of large Poisson distributions.
Best regards, Jim
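One way to probe the Poisson suggestion above is to compute how wide the central 95% range of a Poisson count is relative to its mean. The sketch below is only an illustration: the function names and the three mean values are assumptions chosen for the example, not figures from this thread. For a Poisson distribution with mean lam the standard deviation is sqrt(lam), so the relative spread is 1/sqrt(lam), and how large a disparity looks "natural" under this model depends strongly on how small the mean count is.

```python
import math

def poisson_quantile(lam, q):
    """Return the smallest k with P(X <= k) >= q for X ~ Poisson(lam)."""
    k = 0
    log_pmf = -lam                    # log P(X = 0)
    cdf = math.exp(log_pmf)
    while cdf < q:
        k += 1
        log_pmf += math.log(lam / k)  # P(X = k) = P(X = k - 1) * lam / k
        cdf += math.exp(log_pmf)
    return k

def central_range(lam, coverage=0.95):
    """Lower and upper quantiles bounding the central `coverage` mass."""
    tail = (1.0 - coverage) / 2.0
    return poisson_quantile(lam, tail), poisson_quantile(lam, 1.0 - tail)

# Hypothetical mean counts, chosen only for illustration.
for lam in (5, 100, 10_000):
    lo, hi = central_range(lam)
    rel_std = 1 / math.sqrt(lam)
    print(f"mean={lam}: central 95% range [{lo}, {hi}], relative std={rel_std:.3f}")
```

Comparing the printed ranges across the three means shows how quickly the relative spread narrows as the mean grows.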
Interesting topic! Here is a useful analogy regarding the distribution of sizes. There have been studies of how big cities are within countries or worldwide, and there are recurring patterns in the ratio of the largest to the second largest, the second largest to the third, and so forth.
Without getting into this too deeply, you might at least check whether the size relations among Wikipedias are like those of cities, that is, whether they have a similar-looking distribution. If they do, the underlying forces and dynamics for city sizes might also apply to Wikipedias or other sites.
The math is described by Zipf’s law (https://en.wikipedia.org/wiki/Zipf%27s_law) and/or Gibrat’s distribution (https://en.wikipedia.org/wiki/Gibrat%27s_law). The work by Xavier Gabaix, cited there, was my introduction to it.
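A minimal sketch of the check suggested above, assuming Zipf's law in its rank-size form (size roughly proportional to 1/rank), so that log(size) plotted against log(rank) is close to a straight line with slope near -1. The function name rank_size_slope and the synthetic sizes are illustrative assumptions, not real Wikipedia article counts; actual counts per language edition would have to be substituted.

```python
import math

def rank_size_slope(sizes):
    """Least-squares slope of log(size) against log(rank), largest size first."""
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic sizes that follow the rank-size rule exactly (size = C / rank).
# These are NOT real data; replace them with article counts per Wikipedia.
synthetic_sizes = [1_000_000 / rank for rank in range(1, 51)]

slope = rank_size_slope(synthetic_sizes)
print(f"fitted rank-size slope: {slope:.3f}")
```

A slope close to -1 on real counts would be consistent with a city-like rank-size pattern. A clearly different slope, or a curved log-log plot, would suggest that other dynamics are at work.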
Like the choice of what city to move to, the relevant Wikipedias for a user will usually need to be “close”: geographically for a city, or in terms of the languages the user knows for a Wikipedia. There are other factors driving a user’s choice, if we think of the user as choosing. If the user wishes to study an obscure academic subject, they may have to use a large Wikipedia, and that drives them to also participate there. If the user is focused on a geographically local subject, that drives the choice. A larger Wikipedia is more useful than a small one, so the distribution of Wikipedia sizes would be more unequal than the distribution of the languages people speak.
It sounds like, based on Poland and Korea, you can show that Internet availability is not driving all the difference. Good to know. — peter meyer
On Jul 24, 2018, at 11:30 AM, James Salsman jsalsman@gmail.com wrote:
>> Why do you think different language Wikipedias have different sizes, outside of the popularity of a given language?
> Piotr, if you model organic editing production with a Poisson distribution, which is reasonable for a first approximation, 3x+ disparities are just natural for the same population sizes:
> https://en.wikipedia.org/wiki/Poisson_distribution
> I'm not sure the images in that article capture the wide platykurtosis of large Poisson distributions.
> Best regards, Jim
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
On a slightly related note, I analyzed cultural preferences for the saturation of images, references, links, word count, etc. in good and featured articles on 8 wikis and found significant cultural variation:
http://crow.kozminski.edu.pl/papers/cultures%20of%20wikipedias.pdf
best,
dj
On Tue, Jul 24, 2018 at 7:17 PM, Peter Meyer econterms@gmail.com wrote:
> Interesting topic! Here is a useful analogy regarding the distribution of sizes. There have been studies of how big cities are within countries or worldwide, and there are recurring patterns in the ratio of the largest to the second largest, the second largest to the third, and so forth.
> Without getting into this too deeply, you might at least check whether the size relations among Wikipedias are like those of cities, that is, whether they have a similar-looking distribution. If they do, the underlying forces and dynamics for city sizes might also apply to Wikipedias or other sites.
> The math is described by Zipf’s law (https://en.wikipedia.org/wiki/Zipf%27s_law) and/or Gibrat’s distribution (https://en.wikipedia.org/wiki/Gibrat%27s_law). The work by Xavier Gabaix, cited there, was my introduction to it.
> Like the choice of what city to move to, the relevant Wikipedias for a user will usually need to be “close”: geographically for a city, or in terms of the languages the user knows for a Wikipedia. There are other factors driving a user’s choice, if we think of the user as choosing. If the user wishes to study an obscure academic subject, they may have to use a large Wikipedia, and that drives them to also participate there. If the user is focused on a geographically local subject, that drives the choice. A larger Wikipedia is more useful than a small one, so the distribution of Wikipedia sizes would be more unequal than the distribution of the languages people speak.
> It sounds like, based on Poland and Korea, you can show that Internet availability is not driving all the difference. Good to know. — peter meyer
Regarding featured articles, I conducted a small study (should be out in Oct.) on the Portuguese Wikipedia, looking at featured articles related to Ancient History. Although the sample was obviously small, my findings were clear and were later confirmed by many admins: most articles are translations or new material produced by a very small group of frequent editors, who use their stats to legitimize their power as admins. Here again, cultural issues go hand in hand with specific community behavior.
Great material, Dariusz, thanks for sharing!
Juliana
On Tue, Jul 24, 2018 at 7:17 PM, Dariusz Jemielniak darekj@alk.edu.pl wrote:
> On a slightly related note, I analyzed cultural preferences for the saturation of images, references, links, word count, etc. in good and featured articles on 8 wikis and found significant cultural variation:
> http://crow.kozminski.edu.pl/papers/cultures%20of%20wikipedias.pdf
> best,
> dj
-- 
prof. dr hab. Dariusz Jemielniak
Head of the MINDS (Management in Networked and Digital Societies) department
Akademia Leona Koźmińskiego
http://NeRDS.kozminski.edu.pl
*Recent articles:*
- Dariusz Jemielniak, Maciej Wilamowski (2017) Cultural Diversity of Quality of Information on Wikipedias http://crow.kozminski.edu.pl/papers/cultures%20of%20wikipedias.pdf *Journal of the Association for Information Science and Technology* 68: 10. 2460-2470.
- Dariusz Jemielniak (2016) Wikimedia Movement Governance: The Limits of A-Hierarchical Organization http://www.crow.kozminski.edu.pl/papers/wikimedia_governance.pdf *Journal of Organizational Change Management* 29: 3. 361-378.
- Dariusz Jemielniak, Eduard Aibar (2016) Bridging the Gap Between Wikipedia and Academia http://www.crow.kozminski.edu.pl/papers/bridging.pdf *Journal of the Association for Information Science and Technology* 67: 7. 1773-1776.
- Dariusz Jemielniak (2016) Breaking the Glass Ceiling on Wikipedia http://www.crow.kozminski.edu.pl/papers/glass-ceiling.pdf *Feminist Review* 113: 1. 103-108.
- Tadeusz Chełkowski, Peter Gloor, Dariusz Jemielniak (2016) Inequalities in Open Source Software Development: Analysis of Contributor’s Commits in Apache Software Foundation Projects http://journals.plos.org/plosone/article/asset?id=10.1371%2Fjournal.pone.0152976.PDF *PLoS ONE* 11: 4. e0152976.