Dear all,
I am working on a paper on why/whether people in different countries contribute differently (or not) to collective intelligence projects. The paper was inspired, partially, by several discussions I had with various people on why different language Wikipedias have different sizes, besides (doh) the popularity of the language (and yes, English is biggest because it is international; and yes, I am aware a few Wikipedias are outliers because of bots creating machine translations or auto-populating villages or such). But for example, Poland and South Korea have roughly similar population/speakers and development status, yet Polish Wikipedia is over 3x the size of the Korean one and no bot can account for that. So, there's more to it. I am already feeding dozens of parameters to a spreadsheet for some modelling, but I a) wonder what I might have missed - before a reviewer asks 'why didn't you check for xyz' - and b) would like to have a few nice sentences about how things that people expect to matter do not (or vice versa). Hence, my question to you all, in the form of this open-question mini survey:
Why do you think different language Wikipedias have different sizes, outside of the popularity of a given language?
For reference, list of Wikipedias by size and language: https://meta.wikimedia.org/wiki/List_of_Wikipedias
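TIA!
-- Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citations?user=gdV8_AEAAAAJ http://en.wikipedia.org/wiki/User:Piotrus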
Very interesting and much-needed research. Thanks for doing this. I'd love to see the results and even the process.
Some things to consider:
1. How long is the tradition of having published encyclopedias in that culture?
2. Alphabet: Using a common alphabet may make it somewhat easier to translate information between languages that use it, especially for things like towns and biographies. The Korean alphabet is used only by one language, but the Latin and the Cyrillic alphabets are used by many (with variations).
3. How long is the tradition of *actually* having public education for everybody: rich and poor, cities and villages? By "actually" I mean "not just by law, but in practice".
4. How long is the tradition of mostly-universal literacy? ("Literacy" is one of the most fuzzily defined concepts. Here I refer to something like "being able to read a newspaper and to write a one-page letter in one's own native language".)
5. How long is the tradition of having public libraries in most towns and villages?
6. How common is it to know other languages?
7. How isolated or open is the society that speaks this language in terms of access to media from other countries, translation of literature from other languages, travel to other countries?
8. How widespread are basic computer literacy skills: using a web browser; sending an email; copying, down/uploading, and deleting files?
9. How long is the tradition of having language resources, such as dictionaries, spelling standards, thesauri, style guides?
10. Is the language used completely in public education for teaching, textbooks, and homework? Or is the education mostly done in a foreign language? (This, roughly, is the situation in the Philippines and in many African countries.)
11. When did the language become an official language of a country? (If at all.)
12. Are there political, cultural, or government-supported movements for language development or preservation?
13. When did it become universally possible to fully write this language on a computer, with complete keyboard and font support? E.g., English has been easy to use on any computer for as long as there have been computers; Polish, German, Russian and many other languages have been supported for a long time, but still struggled with encodings and diacritics in the 1990s; the languages of India and Burma are still struggling; I'm not sure about Korea.
These are the immediate things I can think about. There are probably many more criteria that could be considered.
The economics around a country are probably very important (poverty, access to infrastructure, healthcare, etc.), and you mentioned in your first email that you accounted for it, although I don't know in how much detail, so I trust you on that :)
On 24 July 2018 at 12:04, "Piotr Konieczny" piokon@post.pl wrote:
Why do you think different language Wikipedias have different sizes, outside of the popularity of a given language?
One other thing to consider is the specifics of how a language group/culture deals with collaborative work. I have no idea how to tackle this, though I've seen some studies in that direction.
I'm sure some of you here have heard about the conflict-ridden Portuguese Wikipedia and what an absolute mess it is. It's packed with hard deletionists, very hostile to newcomers, and split into groups constantly fighting for power. I'm sure that's part of why PT:WP isn't bigger.
Juliana
On Tue, Jul 24, 2018 at 10:53 AM, Amir E. Aharoni <amir.aharoni@mail.huji.ac.il> wrote:
Very interesting and much-needed research. Thanks for doing this. I'd love to see the results and even the process.
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
This is a very interesting project.
Just a short remark in line with Juliana’s observation: the hardest part would be to account for the specific "inner" culture developed by each Wikimedian community. Since most of them started on a relatively small scale, numerous norms and lasting social dynamics can be explained by the initial choices / tastes of a limited set of individuals. Of course, these may in turn result from a wider cultural background, but they may also be simply idiosyncratic.
I guess isolating this factor would be quite hard. Perhaps using contribution data (where it exists) in the dumps and the archives of mailing lists would help at least to get a general idea of the initial social environment.
Alexander Doria / PCL
On 24 Jul 2018, at 12:04, Juliana Bastos Marques domusaurea@gmail.com wrote:
One other thing to consider is the specifics of how a language group/culture deals with collaborative work. I have no idea how to tackle this, though I've seen some studies in that direction.
Along this line I saw a terrific study recently looking at patent coauthors. Patents can be filed by individuals or by multiple individuals, and if people work together on patents in different groups this builds “networks” among inventors, in which they have previous coauthorship links. If patents are filed only by single individuals there might be just as many inventions, but the networks are not built together as much.
The study looked at patents in Sweden and Spain in the 19th century. It is by David Andersson and Patricio Saiz who are experts in the patent data from these countries. They found the Swedish patents were likely to be coauthored, and the Spanish ones were not. They looked at the resulting network links. They argue that it led to more industrialization and growth in the Swedish case than in the Spanish case.
I thought this was very helpful and insightful. It was kind of gripping because they make a connection over the course of 100 years, in which the individuals from the early period are no longer relevant in the later period; it is an assertion about a long-lasting property.
Is this from a more cooperative culture in one place, and the opportunity for such networks to industrialize using later technologies? Or is it a result of different industries naturally springing up in the different countries? Not entirely clear.
However, a link to fundamentally flexible cooperative cultures that existed before Wikipedia could explain the differences in growth. This is one paper to analogize to. Maybe the places where patents are most coauthored also generate larger decentralized/cooperative works.
On Jul 24, 2018, at 8:19 AM, Pierre-Carl Langlais pierrecarl.langlais@gmail.com wrote:
This is a very interesting project.
Very interesting project indeed!
There is a study presented at Hypertext 2015, in which the authors compared the behaviour of Yahoo Answers users across several countries. To perform their comparison, they used cultural metrics from previous studies, which you may find useful. Here’s the paper: http://www.cse.usf.edu/dsg/data/publications/papers/culture_ht.pdf
Hope this can be useful. Alessandro
––– Alessandro Piscopo Web and Internet Science Group School of Electronics and Computer Science University of Southampton email: A.Piscopo@soton.ac.uk
On 24 Jul 2018, at 18:27, Peter Meyer <econterms@gmail.com> wrote:
Along this line I saw a terrific study recently looking at patent coauthors. Patents can be filed by individuals or by multiple individuals, and if people work together on patents in different groups this builds “networks” among inventors, in which they have previous coauthorship links. If patents are filed only by single individuals there might be just as many inventions, but the networks are not built together as much.
The study looked at patents in Sweden and Spain in the 19th century. It is by David Andersson and Patricio Saiz who are experts in the patent data from these countries. They found the Swedish patents were likely to be coauthored, and the Spanish ones were not. They looked at the resulting network links. They argue that it led to more industrialization and growth in the Swedish case than in the Spanish case.
This is very helpful and insightful I thought. it was kind of gripping because they make a connection over the course of 100 years, in which the individuals from the early period are no longer relevant in the later period; it is an assertion about a long-lasting property.
Is this from a more cooperative culture in one place, and the opportunity for such networks to industrialize using later technologies? Or, is it a result of different industries naturally springing up in the different countries? Not entirely clear.
However the link to a fundamentally flexible cooperative cultures that exist before wikipedia could explain the differences in growth. This is one paper to analogize to. Maybe the places where patents are most coauthored also generate larger decentralized/cooperative works.
On Jul 24, 2018, at 8:19 AM, Pierre-Carl Langlais <pierrecarl.langlais@gmail.commailto:pierrecarl.langlais@gmail.com> wrote:
This is a very interesting project.
Just in short remark in line with Juliana’s observation: the hardest part would be to account for the specific "inner" culture developed by each wikimedian communities. Since most of them has started on a relatively small scale, numerous norms and lasting social dynamics can be explained by the initial choices / tastes of a limited set of individuals. Of course, they may in turn result from a wider cultural background but also may be simply idiosyncratic.
I guess discriminating this factor would be quite hard. Perhaps using contributing data (when they exist) in the dumps and the archives of mailing lists would help at least to get a general idea of the initial social environment.
Alexander Doria / PCL
Le 24 juil. 2018 à 12:04, Juliana Bastos Marques <domusaurea@gmail.commailto:domusaurea@gmail.com> a écrit :
One other thing to consider is the specifics of how a language group/culture deals with collaborative work. I have no idea how to tackle this, though I've seen some studies in that direction.
I'm sure some of you here have heard about the absolute mess and conflict-ridden Portuguese Wikipedia. It's packed with hard deletionists, very hostile to newcomers and split into groups constantly fighting for power. I'm sure that's part of why PT:WP isn't bigger.
Juliana
On Tue, Jul 24, 2018 at 10:53 AM, Amir E. Aharoni < amir.aharoni@mail.huji.ac.ilmailto:amir.aharoni@mail.huji.ac.il> wrote:
Very interesting and much-needee research. Thanks for doing this. I'd love to see the results and even the process.
Some things to consider: 1. How long is the tradition of having published encyclopedias in that culture? 2. Alphabet: Using a common alphabet may make it somewhat easier to translate information between languages that use it, especially for things like towns and biographies. The Korean alphabet is used only by one language, but the Latin and the Cyrillic alphabets are used by many (with variations). 3. How long is the tradition of *actually* having public education for everybody: rich and poor, cities and villages? By "actually" I mean "not just by law, but in practice". 4. How long is the tradition of mostly-universal literacy? ("Literacy" is one of the most fuzzily defined concepts. Here I refer to something like "being able to read a newspaper and to write a one-page letter in one's own native language".) 5. How long is the tradition of having public libraries in most towns and villages? 6. How common is it to know other languages? 7. How isolated or open is the society that speaks this language in terms of access to media from other countries, translation of literature from other languages, travel to other countries? 8. How widespread are basic computer literacy skills: using a web browser; sending an email; copying, down/uploading, and deleting files. 9. How long is the tradition of having language resources, such as dictionaries, spelling standards, thesauri, style guides? 10. Is the language used completely in public education for teaching, textbooks, and homework? Or is the education mostly done in a foreign language? (This, roughly, is the situation in the Philippines and in many African countries.) 11. When did the language become an official language of a country? (If at all.) 12. Are there political, cultural, or government-suported movements for language development or preservation? 13. When did it become universally possible to fully write this language on a computer, with complete keyboards and fonts support? 
E.g., English has been easy to use on any computer for as long as there are computers; Polish, German, Russian and many other languages have been supported for a long time, but still struggled with encodings and diacritics in the 1990s; India and Burma are still struggling; I'm not sure about Korea.
These are the immediate things I can think about. There are probably many more criteria that could be considered.
The economics around a country are probably very important (poverty, access to infrastructure, healthcare, etc.), and you mentioned in your first email that you accounted for it, although I don't know in how much detail, so I trust you on that :)
בתאריך 24 ביולי 2018 12:04, "Piotr Konieczny" <piokon@post.plmailto:piokon@post.pl> כתב:
-- Piotr Konieczny, PhD http://hanyang.academia.edu/PiotrKonieczny http://scholar.google.com/citations?user=gdV8_AEAAAAJ http://en.wikipedia.org/wiki/User:Piotrus
_______________________________________________ Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
-- www.domusaurea.org
Hi, the second most obvious factor is going to be the availability of internet access, but also the type of internet access, and how long people have had internet access.
The unproven assumption is that Wikipedia is written by people with internet experience and leisure time access to the internet via the desktop environment.
Over a decade ago, when I was working in a marketing company, there was a rule of thumb that people only started shopping on the internet after two years of internet experience. I don’t know if that was ever scientifically tested, or what the equivalent would be for editing Wikipedia, but I’m pretty sure that editing Wikipedia is not an entry-level experience on the internet.
We do know, both from experience of training people to edit Wikipedia and from looking at recent changes, that Wikipedia is almost a broadcast medium in the mobile environment. There are some people who edit on tablets and even smartphones, but the editing community mostly works via the desktop environment. Just to confuse things, "desktop" doesn't just include laptops in this context; there are even people using tablets who opt for the desktop environment rather than the mobile one.
So two languages with similar populations on the internet could have radically different Wikipedia sizes because in one culture access is fairly new and mostly smartphone based whilst in the other it is a longstanding thing with a large proportion of experienced Internet users with PC access.
The biggest difference, though, is going to be the policy of each Wikipedia community on bot creation of articles, with Cebuano, Swedish and Waray at one extreme. Such policies change over time: the English Wikipedia went through one of its early growth surges when a bot was used to start articles on all populated places in the USA, so it would be an oversimplification to simply list English as one of the Wikipedias that is currently chary about bot creation of articles. A very simplistic way to look at this is to order Wikipedias not by number of articles but by number of edits. On that basis Polish, with 53 million edits, would drop behind the rather smaller Japanese Wikipedia, which has 69 million edits. Cebuano, with 5.3 million articles but only 23 million edits, would drop a long way from second place.
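As a rough illustration of that reordering, here is a minimal Python sketch that uses only the figures quoted in this message (edit counts for Polish, Japanese and Cebuano, and the article count for Cebuano). The variable and function names are made up for the example, and the numbers are the approximate mid-2018 values mentioned above, not fresh data.

```python
# Edit counts in millions, as quoted in the message (approximate, mid-2018).
edits_millions = {"Polish": 53, "Japanese": 69, "Cebuano": 23}

# Article counts in millions; only Cebuano's figure was quoted in the message.
articles_millions = {"Cebuano": 5.3}


def rank_by_edits(edits):
    """Return the language names sorted from most to fewest edits."""
    return sorted(edits, key=edits.get, reverse=True)


def edits_per_article(language):
    """Edits per article: a crude hint at how much revision an average article received."""
    return edits_millions[language] / articles_millions[language]


ranking = rank_by_edits(edits_millions)
cebuano_ratio = round(edits_per_article("Cebuano"), 1)

print(ranking)
print(cebuano_ratio)
```

With more rows from the list of Wikipedias, the same ratio could be computed for every language and used as one more spreadsheet column alongside raw article counts.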
Other theories about differences between the sizes of Wikipedias concern multilingual people, and phenomena such as the tendency of Indian editors to edit in English rather than Indic languages. One theory is that people edit in a language that they perceive as “higher status”; another is that Wikipedians have multiple motivations and that some people edit in a language they are not fully fluent in, in order to practice that language; a third is that Wikipedia is written in the correct alphabet for each language, but many people only have access to Latin keyboards. I am familiar with this from Georgia, where a large proportion of Georgians communicate on sites such as Facebook writing Georgian in the Latin script, but last I heard Wikipedia editing is restricted to those who can switch to Georgian script. Obviously this last issue is changing over time as particular scripts become available on the internet or as options in Wikipedia editing.
I would be very interested to see your paper, thanks for picking this topic.
________________________________
Sent: Tuesday, July 24, 2018 10:03 am
To: Research into Wikimedia content and communities
Subject: [Wiki-research-l] Country (culture...) as a factor in contributing to collective intelligence projects
Another issue in terms of the choice of language to contribute in could relate to contributors' motivation to add the content and the presumed audience for it. A multilingual person might decide to write about (say) magnetism in English (or another widely spoken language) in the belief that magnetism is of worldwide interest, but might choose to write about a local folk story in a more local language in the belief that it is likely to be of interest only to local people.
Also given that there are different policies on different Wikipedias, it may be that a topic might not pass notability on English Wikipedia but be entirely acceptable on another Wikipedia.
Also, my observation of English Wikipedia is that regular contributors tend to divide into article-starters (a smaller group) and article-expanders (a much larger group). If there are cultural reasons (or Wikipedia policy reasons) why people fluent in one language are less likely to be article starters, this may limit the range of topics for the article-expanders to work on and hence the growth of the encyclopedia overall. There may also be cultural reasons why certain types of article are not started in some Wikipedias, e.g. popular culture articles (e.g. Pokemon characters) might not be seen as "encyclopedic" in some cultures.
As to the specific difference between Polish Wikipedia and South Korean Wikipedia, I would observe that South Korea is a nation obsessed with computer gaming, from personal leisure through to professional sport, and it is a very time-consuming passion.
https://en.wikipedia.org/wiki/Video_gaming_in_South_Korea
So maybe gaming takes away the time from those who might otherwise contribute to Wikipedia.
Kerry