Hi,
You show me a way of quantitatively comparing two Wikis using an
automated process, and I'll show you a language or Wiki that will break
it.
Cheers,
Craig Franklin
> Message: 10
> Date: Thu, 3 May 2007 08:02:17 +0200 (CEST)
> From: Lars Aronsson <lars(a)aronsson.se>
> Subject: Re: [Wikipedia-l] Quality vs Quantity
> To: wikipedia-l(a)lists.wikimedia.org
> Message-ID: <Pine.LNX.4.64.0705030758430.12001(a)localhost.localdomain>
> Content-Type: TEXT/PLAIN; charset=US-ASCII
>
> Berto 'd Sera wrote:
>
> > 100% true. Just compound words in German may make a great
> > difference compared with English; in Piedmontese we have
> > thousands of 'L L' n' 'n that would count as words and are but
> > pronominal particles, plus we usually say everything twice
> > (double subject, double locatives, etc).
>
> The size of the compressed article dumps would be a better
> comparison then, because the same content would still occupy the
> same space after all redundancies have been removed.
>
>
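
As a rough illustration of the compressed-size idea quoted above, here is
a minimal Python sketch. It assumes zlib as a stand-in for whatever
compressor the real dumps use, and the helper name compressed_size and the
sample sentence are hypothetical. It only shows that plain repetition
inflates a raw byte count far more than a compressed one; it does not show
that the measure is fair across languages.

```python
import zlib


def compressed_size(text: str) -> int:
    """Return the size in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


# Hypothetical sample: one sentence, and the same sentence repeated 50 times.
once = "The size of the compressed dump approximates its information content. "
repeated = once * 50

# Raw size grows linearly with repetition; compressed size grows much less.
raw_ratio = len(repeated.encode("utf-8")) / len(once.encode("utf-8"))
compressed_ratio = compressed_size(repeated) / compressed_size(once)

print(f"raw size ratio:        {raw_ratio:.1f}")
print(f"compressed size ratio: {compressed_ratio:.1f}")
```

Comparing two real wikis this way would mean running the same compressor
with the same settings over both article dumps, which this sketch does not
attempt.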