[Foundation-l] Tragical dynamics: that run for the number of articles

Mark Williamson node.ue at gmail.com
Sat Jun 28 23:18:21 UTC 2008


First of all, that figure is for Arabic and Chinese, for which Google
Translate probably has the worst quality.

Second of all, Google consistently fared better than almost every
other system, a surprising feat for a very recently developed system.

Even if machine translation isn't completely accurate, it's often
enough to get an idea of the content of the page, and I know I have
learned about several topics through reading translated articles from
pl.wp.

Mark

On 28/06/2008, Tomasz Ganicz <polimerek at gmail.com> wrote:
> 2008/6/28 Ziko van Dijk <zvandijk at googlemail.com>:
>
> > I have discussed my study with many people (one had similar results),
>  > but no one was so aggressive, Tomasz.
>  >
>  >> b) your own subconscious attitude toward various nations and Wikipedias
>  >
>  > ? Is this an accusation?
>  >
>
>
> No, I am just a scientist, so I tend to be sceptical and have basic
>  knowledge of the typical mistakes made in statistical research: too
>  small a sample, no clear criteria for evaluating it, and you did not
>  test the experimental error or the reproducibility of your method by
>  comparing results from several experiments in which other people
>  apply your definition of what a "real" article is.
>
>  A 50-article sample tested by one person, who certainly has their
>  own attitudes, is not enough to say that this or that Wikipedia is
>  better or worse. Everyone has their own attitudes towards one nation
>  or another. It is a very natural thing. And if there is no clear
>  definition of what a "real" article is and what is not, and Google
>  machine translation was used to evaluate this (which, according to
>  the NIST survey from 2006, was found to be OK in only around 49% of
>  cases), then I am quite sure that your results cannot be taken
>  seriously. You could have a statistical error of at least around
>  15-20% (if not more), so results of 0.60 or 0.80 are within the
>  experimental error range.
>
>  Anyway, it would be interesting to run better-planned experiments to
>  evaluate the quality of Wikipedia articles, but it certainly has to
>  be done on a larger sample, with some sort of "hard" criteria, or
>  with a group of at least 10 researchers speaking different languages
>  and having different cultural backgrounds when "soft, human-based"
>  criteria are used.
>
>
>  --
>  Tomek "Polimerek" Ganicz
>  http://pl.wikimedia.org/wiki/User:Polimerek
>  http://www.ganicz.pl/poli/
>  http://www.ptchem.lodz.pl/en/TomaszGanicz.html
>
>  _______________________________________________
>
> foundation-l mailing list
>  foundation-l at lists.wikimedia.org
>  Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
>
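
The sample-size concern in the quoted message can be checked with a
quick calculation. The sketch below is only an illustration, not part
of either study: it assumes each of the 50 articles is a simple random
draw classified as "real" or not, and uses the normal approximation for
a 95% interval. The function name margin_of_error is hypothetical.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Uses the normal approximation z * sqrt(p * (1 - p) / n).
    Assumes a simple random sample with a yes/no outcome per article.
    """
    return z * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    n = 50  # sample size mentioned in the thread
    for p in (0.60, 0.80):  # the two scores mentioned in the thread
        moe = margin_of_error(p, n)
        low, high = p - moe, p + moe
        print(f"p={p:.2f}, n={n}: +/- {moe:.3f} -> [{low:.3f}, {high:.3f}]")
```

Under these assumptions the sampling error alone comes to roughly
0.11 to 0.14 at n = 50. It does not include any error from a single
rater's judgement or from machine translation, which would have to be
estimated separately.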


