[Foundation-l] Tragical dynamics: that run for the number of articles

Ziko van Dijk zvandijk at googlemail.com
Sun Jun 29 09:55:49 UTC 2008


Tomasz,
My impression is that you do not like the results because pl.WP has a
poor ratio; that is what you initially complained about. I know, and
have never denied, that 50 is a small sample; I did it for 53 Wikipedias.
I do have criteria, even if I did not list them for you; you have
not read my paper. Yet you immediately accuse me of judging
purely on feelings and attitudes towards nations.

For example, for geo stubs I want at least two pieces of information
that are not bot-created.
http://pl.wikipedia.org/wiki/Abisynia_(powiat_bialski)
I checked the first few in Kategoria:Zalążki artykułów o polskich wsiach,
and only Abramy would I count as real, because it has two pieces of
information about history (1599 and 1676). I suppose that many of the
other 50,159 articles in that category are pseudo-articles (they all
have that part about the administrative division of 1975-1998).
The same can be said about Kategoria:Zalążek artykułu o
miejscowości francuskiej, with 35,066 cities in France.
Schematic planetoid articles are not real articles either, such as the
14,444 in Kategoria:Planetoidy pasa głównego.

So, I can imagine why pl.WP has only 64% real articles according to my sample.
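
As a rough check on how much a sample of 50 can say by itself, here is a
minimal sketch of the sampling error of such a ratio. It assumes a simple
random sample, the normal approximation, and a 95% level (z = 1.96); the
function name margin_of_error is made up for this illustration and is not
taken from the paper.

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    p_hat: observed share of "real" articles in the sample (0..1)
    n:     number of sampled articles
    z:     normal quantile; 1.96 corresponds to roughly 95% (an assumption)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie between 0 and 1")
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


if __name__ == "__main__":
    n = 50  # sample size per Wikipedia, as in the thread
    for p_hat in (0.60, 0.64, 0.80):  # ratios mentioned in the thread
        m = margin_of_error(p_hat, n)
        print(f"share={p_hat:.2f}  n={n}  interval=[{p_hat - m:.3f}, {p_hat + m:.3f}]")
```

For a share of 0.64 and n = 50 this gives a half-width of about 0.13. It
covers sampling error only, not disagreement between raters or errors from
machine translation.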

Ziko

2008/6/28 Tomasz Ganicz <polimerek at gmail.com>:
> 2008/6/28 Ziko van Dijk <zvandijk at googlemail.com>:
>> I have discussed my study with many people (one had similar results),
>> but no one was so aggressive, Tomasz.
>>
>>> b) your own subconscious attitude toward various nations and Wikipedias
>>
>> ? Is this an accusation?
>>
>
> No, I am just a scientist, so I have a tendency to be sceptical and
> have basic knowledge of the typical mistakes made in statistical
> research: too small a sample, no clear criteria for evaluating it, and
> you did not test the experimental error or the reproducibility of your
> method by comparing results from several experiments in which other
> people apply your definition of what a "real" article is.
>
> A sample of 50 articles tested by one person, who surely has his own
> attitudes, is not enough to say that one Wikipedia or another is
> better or worse. Everyone has their own attitudes towards one nation
> or another. It is a very natural thing. And since there is no clear
> definition of what is a "real" article and what is not, and Google
> machine translation was used to evaluate this (which, according to a
> NIST survey from 2006, was found to be OK in only around 49% of cases),
> I am quite sure that your results cannot be taken seriously. You could
> have a statistical error of at least around 15-20% (if not more), so
> results of 0.60 or 0.80 are within the experimental error range.
>
> Anyway, it would be interesting to run better-planned experiments to
> evaluate the quality of Wikipedia articles, but it would certainly have
> to be done on a larger sample, with some sort of "hard" criteria, or,
> when "soft, human-based" criteria are used, with a group of at least 10
> researchers speaking different languages and having different cultural
> backgrounds.
>
> --
> Tomek "Polimerek" Ganicz
> http://pl.wikimedia.org/wiki/User:Polimerek
> http://www.ganicz.pl/poli/
> http://www.ptchem.lodz.pl/en/TomaszGanicz.html



-- 
Ziko van Dijk
NL-Silvolde

