Great question!
The high-level answer is: many more than most folks assume. The main
challenge in calculating this number is missing interlanguage links.
In 2010, for instance, we found that the English Wikipedia covered
only about 41% of the concepts in the Japanese Wikipedia, with a
missing interlanguage link rate of only 2% (as determined by two
bilingual coders going through 150 sample articles). The equivalent
number for Italian was 65%, with an 8% missing interlanguage link rate.
We have more detailed and up-to-date results (which are also more
complicated), but this gets the general idea across.
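To make the arithmetic concrete, here is a rough sketch of how a missing-link rate could widen a link-based coverage estimate. It is not the method from our papers. The function name and the interpretation of the rate are assumptions: the sketch treats the quoted coverage figures as unadjusted, and treats the missing-link rate as the share of seemingly uncovered articles that do in fact have a counterpart in the other edition.

```python
def coverage_bounds(observed_coverage, missing_link_rate):
    """Return (lower, upper) bounds on the true coverage fraction.

    Hypothetical helper, not taken from the cited papers.
    Assumption: missing_link_rate is the fraction of articles WITHOUT an
    interlanguage link that nevertheless have a counterpart article.
    """
    for value in (observed_coverage, missing_link_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be fractions between 0 and 1")
    # Articles with a link are counted as covered, so the observed value
    # is the lower bound.
    lower = observed_coverage
    # Some unlinked articles are covered after all; add that share back.
    upper = observed_coverage + (1.0 - observed_coverage) * missing_link_rate
    return lower, min(1.0, upper)


if __name__ == "__main__":
    # Figures quoted above: Japanese (41%, 2%) and Italian (65%, 8%).
    for name, cov, miss in [("ja", 0.41, 0.02), ("it", 0.65, 0.08)]:
        low, high = coverage_bounds(cov, miss)
        print(f"{name}: coverage between {low:.1%} and {high:.1%}")
```

Under these assumptions a low missing-link rate moves the estimate only slightly, which is why the link-based numbers are still informative.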
There is also the matter of article-level diversity (e.g., what gets
covered about a given concept in different language editions), but this
is an issue for another day.
On 4/1/2012 2:21 PM, emijrp wrote:
Hi Brent. How many articles exist in other Wikipedias and don't have
an English translation at English Wikipedia? Any estimate?
2012/3/31 Brent Hecht <brent@u.northwestern.edu>
Hello Wikidata Folks,
My name is Brent Hecht, and I've done a great deal of research on
the differences and similarities between the language editions. I
was really excited to hear about the Wikidata project moving
forward, and I think some of my research might be of assistance.
I'd enjoy being able to help the community make this important
transition.
In particular, my experience navigating interlanguage link
conflicts might be able to help in Phase 1. Please let me know if
there's anything I can do over the short term or long term!
Some of my relevant papers:
[1] Bao, P., Hecht, B., Carton, S., Quaderi, M., Horn, M. and
Gergle, D. 2012. Omnipedia: Bridging the Wikipedia Language Gap.
CHI '12: 30th International Conference on Human Factors in
Computing Systems (2012).
[2] Hecht, B. and Gergle, D. 2010. The Tower of Babel Meets Web
2.0: User-Generated Content and Its Applications in a Multilingual
Context. CHI '10: 28th International Conference on Human Factors
in Computing Systems (Atlanta, GA, 2010), 291--300.
[3] Hecht, B. and Gergle, D. 2009. Measuring Self-Focus Bias in
Community-Maintained Knowledge Repositories. Communities and
Technologies 2009: 4th International Conference on Communities and
Technologies (State College, PA, 2009), 11--19.
- Brent
Brent Hecht
Ph.D. Candidate in Computer Science
CollabLab: The Collaborative Technology Laboratory
Northwestern University
w: http://www.brenthecht.com
e: brent@u.northwestern.edu
_______________________________________________
Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l