I am not sure how easy it would be to determine this, but your
examples suggest:
Number of unique articles started on xx-WP that are subsequently
translated on other WPs. This highlights the particular virtues of a
world-wide collaboration.
DGG David Goodman
On 5/5/07, Craig Franklin <craig(a)halo-17.net> wrote:
Hi Lars,
I should have said, trying to make a qualitative comparison using
quantitative measures is quite difficult given the nature of Wikipedias.
To illustrate using the "article count" used on www.wikipedia.org, I had a
looksie at two Wikis, Quechua (2,166 articles), and Friulian (2,041
articles). Using this quantitative comparison, they should be about equal
in comprehensiveness.
However, after hitting "Random" five times on each, I got the following
pages:
Quechua:
http://qu.wikipedia.org/wiki/Chhukruna (stub)
http://qu.wikipedia.org/wiki/Minsk (stub)
http://qu.wikipedia.org/wiki/Yatawaki (stubby, most of the article is just a
bullet-point list)
http://qu.wikipedia.org/wiki/Kiru_ismu (stub)
http://qu.wikipedia.org/wiki/T'aklla (stub)
Friulian:
http://fur.wikipedia.org/wiki/1622 (stub)
http://fur.wikipedia.org/wiki/Timp_coordenât_universâl (short, maybe a Start
class on en:)
http://fur.wikipedia.org/wiki/Lauc (full article, longer by word count than
version on en:, about the same size as that on it: if you remove the large
bar graph from the it: version)
http://fur.wikipedia.org/wiki/Toponims_Talian_Furlan_D (list)
http://fur.wikipedia.org/wiki/Islam (full article)
Anecdotal maybe, but clearly once you eyeball it, the Friulian Wikipedia has
the edge in terms of comprehensiveness and quality over Quechua. Trouble
is, I don't see how this can be algorithmically determined using an
automated process. You're going to need humans to look at these things and
make the determination, and really, who has time to do that for 200+
wikipedias? Not to mention the accusations of bias, poor article selection,
and other such things that will be made.
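The sampling step, at least, can be scripted, even if the judgement cannot. A minimal sketch, assuming the MediaWiki API's list=random query and a byte-length cutoff for "stub" that is purely illustrative; the function names random_articles_url and stub_share and the 1500-byte threshold are hypothetical, not anything the wikis themselves use:

```python
from urllib.parse import urlencode


def random_articles_url(lang: str, limit: int = 5) -> str:
    """Build a MediaWiki API URL asking for `limit` random main-namespace pages."""
    params = {
        "action": "query",
        "list": "random",
        "rnnamespace": 0,   # namespace 0 = articles only
        "rnlimit": limit,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


def stub_share(lengths_in_bytes, stub_threshold=1500):
    """Fraction of sampled pages shorter than an (assumed) stub threshold."""
    if not lengths_in_bytes:
        raise ValueError("need at least one sampled page length")
    stubs = sum(1 for n in lengths_in_bytes if n < stub_threshold)
    return stubs / len(lengths_in_bytes)


if __name__ == "__main__":
    # Print the query URLs for the two wikis compared above.
    for lang in ("qu", "fur"):
        print(random_articles_url(lang))
```

A length cutoff only separates short pages from long ones; it says nothing about whether a long page is a full article or a bare list, so somebody still has to read the sample.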
Using word counts is also going to be a problem: languages like Norfuk
that have lots of small particle words and the like will show inflated word
counts compared to languages like Mandarin that don't have written words in
the Western sense, languages like Gaeilge or Cymraeg where the very notion
of "word" is a pretty nebulous one, or languages like Kalaallisut which
compress many meanings and affixes into each and every word.
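The word-count problem is easy to see with a naive counter. A rough sketch, assuming "word" means a whitespace-separated token; the two sample sentences are my own illustration (both mean roughly "I do not want to go home") and the helper names are hypothetical:

```python
def whitespace_word_count(text: str) -> int:
    """Count "words" the naive way: runs of non-whitespace characters."""
    return len(text.split())


def char_count(text: str) -> int:
    """Count non-whitespace characters as an alternative size measure."""
    return sum(1 for ch in text if not ch.isspace())


# Illustrative sentences only, not taken from any Wikipedia article.
samples = {
    "English": "I do not want to go home",
    "Mandarin": "我不想回家",
}

if __name__ == "__main__":
    for language, sentence in samples.items():
        print(language, whitespace_word_count(sentence), char_count(sentence))
```

The same sentence counts as seven words in English and one in Mandarin, and switching to character counts merely tilts the bias the other way.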
Cheers,
Craig Franklin
Date: Thu, 3 May 2007 11:30:53 +0200 (CEST)
From: Lars Aronsson <lars(a)aronsson.se>
Subject: Re: [Wikipedia-l] Quality vs Quantity
To: wikipedia-l(a)lists.wikimedia.org
Message-ID: <Pine.LNX.4.64.0705031128240.15940(a)localhost.localdomain>
Content-Type: TEXT/PLAIN; charset=US-ASCII
Craig Franklin wrote:
You show me a way of quantitatively comparing two Wikis using an
automated process, and I'll show you a language or Wiki that
will break it.
Today the front page www.wikipedia.org measures and compares the
number of articles. You can begin to "break" that method. And
then you can figure out some method that might perhaps be slightly
better. I've suggested two already.
--
Lars Aronsson (lars(a)aronsson.se)
Aronsson Datateknik - http://aronsson.se