On a slightly related note, I analyzed cultural preferences for the saturation of images, references, links, word count, etc. in good and featured articles on 8 wikis and found significant cultural variation:
http://crow.kozminski.edu.pl/papers/cultures%20of%20wikipedias.pdf
best,
dj
On Tue, Jul 24, 2018 at 7:17 PM, Peter Meyer econterms@gmail.com wrote:
Interesting topic! Here is a useful analogy regarding the distribution of sizes. There has been study of how big cities are within countries or worldwide, and there are recurring patterns of the scale of the largest to the second largest, and the second-largest to the third, and so forth.
Without getting into this too deeply you might at least check if the size relations among Wikipedias are like those of cities, that is, if they have a similar-looking distribution. If they do, the underlying forces and dynamics for city sizes might also apply to wikipediae or other sites.
The math is described by Zipf’s law (https://en.wikipedia.org/wiki/Zipf%27s_law) and/or Gibrat’s distribution (https://en.wikipedia.org/wiki/Gibrat%27s_law). The work by Xavier Gabaix, cited there, was my introduction to it.
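A minimal sketch of the check suggested above, assuming you have a list of Wikipedia sizes (for example, article counts). It fits a least-squares line to log(size) against log(rank); a slope near -1 is the classic Zipf pattern reported for city sizes. The function name and the example numbers are hypothetical, and a log-log regression is only a rough first look, not a formal test of the distribution.

```python
import math

def rank_size_slope(sizes):
    """Least-squares slope of log(size) on log(rank), largest size ranked 1.

    Needs at least two strictly positive sizes.
    """
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Hypothetical sizes that follow an exact Zipf pattern (size proportional to 1/rank).
example_sizes = [1_000_000 / rank for rank in range(1, 11)]
slope = rank_size_slope(example_sizes)
```

Replacing `example_sizes` with real per-language article counts would show whether the Wikipedias look roughly city-like: a slope well below -1 would mean the largest editions dominate more than a Zipf pattern predicts.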
Like the choice of what city to move to, the relevant Wikipedias for a user will usually need to be “close”: geographically for a city, or in terms of the languages the user knows for a Wikipedia. There are other factors driving a user’s choice, if we think of the user as choosing. If the user wishes to study an obscure academic subject, they may have to use a large Wikipedia, and that drives them to also participate there. If the user is focused on a geographically local subject, that drives the choice. A larger Wikipedia is more useful than a small one, so the distribution of Wikipedia sizes would be more unequal than the distribution of personal languages.
It sounds like, based on Poland and Korea, you can show that Internet availability is not driving all the difference. Good to know. — peter meyer
On Jul 24, 2018, at 11:30 AM, James Salsman jsalsman@gmail.com wrote:
Why do you think different language Wikipedias have different sizes, outside of the popularity of a given language?
Piotr, if you model organic editing production with a Poisson distribution, which is reasonable for a first approximation, 3x+ disparities are just natural for the same population sizes:
https://en.wikipedia.org/wiki/Poisson_distribution
I'm not sure the images in that article capture the wide platykurtosis of large Poisson distributions.
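One way to explore this is by simulation. The sketch below draws Poisson counts for several equally sized populations and reports the smallest and largest draw. It uses Knuth's multiplication method as the sampler, which is only practical for modest means; the function names and parameter values are hypothetical. Keep in mind that a Poisson variable's standard deviation is the square root of its mean, so the spread relative to the mean depends strongly on the mean you assume.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Suitable for modest lam only, since exp(-lam) underflows for large lam.
    """
    threshold = math.exp(-lam)
    count = 0
    product = 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1

def poisson_spread(lam, n_populations, rng):
    """Return (smallest, largest) count across equally sized populations."""
    draws = [poisson_sample(lam, rng) for _ in range(n_populations)]
    return min(draws), max(draws)

# Hypothetical setup: 20 populations, each with the same expected output.
rng = random.Random(0)
low, high = poisson_spread(4.0, 20, rng)
```

Rerunning `poisson_spread` with different values of `lam` shows how the gap between the smallest and largest draw changes with the assumed mean.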
Best regards, Jim
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l