Hi,
You show me a way of quantitatively comparing two Wikis using an automated process, and I'll show you a language or Wiki that will break it.
Cheers, Craig Franklin
Message: 10
Date: Thu, 3 May 2007 08:02:17 +0200 (CEST)
From: Lars Aronsson lars@aronsson.se
Subject: Re: [Wikipedia-l] Quality vs Quantity
To: wikipedia-l@lists.wikimedia.org
Message-ID: Pine.LNX.4.64.0705030758430.12001@localhost.localdomain
Content-Type: TEXT/PLAIN; charset=US-ASCII
Berto 'd Sera wrote:
100% true. Compound words in German alone can make a great difference compared with English. In Piemontese we have thousands of 'L L' n' 'n that would count as words but are only pronominal particles, plus we usually say everything twice (double subject, double locatives, etc).
The size of the compressed article dumps would be a better comparison then, because the same content would still occupy the same space after all redundancies have been removed.
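
As a rough illustration of the idea, here is a minimal Python sketch that compares a naive word count with the compressed size of the same text. The helper names (word_count, compressed_size) and the sample text are made up for this example, and bz2 is just one possible compressor. How much a real dump shrinks depends on the compressor and on the text itself.

```python
import bz2


def word_count(text: str) -> int:
    """Naive whitespace word count, the kind of metric that varies by language."""
    return len(text.split())


def compressed_size(text: str) -> int:
    """Size in bytes of the bz2-compressed UTF-8 encoding of the text."""
    return len(bz2.compress(text.encode("utf-8")))


# Hypothetical sample: a deliberately redundant text (one sentence repeated).
redundant = "the cat sat on the mat " * 200

raw_bytes = len(redundant.encode("utf-8"))
print("words:           ", word_count(redundant))
print("raw bytes:       ", raw_bytes)
print("compressed bytes:", compressed_size(redundant))
```

The word count and the raw size both grow with every repetition, while the compressed size grows far more slowly, which is the point made above. Two Wikis could be compared by running the same measurement over their article dumps.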