[Foundation-l] Tragical dynamics: that run for the number of articles

Tomasz Ganicz polimerek at gmail.com
Sat Jun 28 09:20:55 UTC 2008


2008/6/28 Andre Engels <andreengels at gmail.com>:
> On Sat, Jun 28, 2008 at 9:47 AM, Tomasz Ganicz <polimerek at gmail.com> wrote:
>
>> Can you explain how this evaluation was done? How do you distinguish
>> between "real" and other articles? In particular, I don't believe the
>> statistics shown for en Wikipedia. I have a feeling that there are many
>> more bot-created articles in en Wikipedia than your statistics show.
>
> That is described in his first mail: He did 'random article' 50 times
> and used that as a sample.
>

Well, it is not described - I mean that no clear evaluation criteria
are mentioned.
Does he speak Japanese or Polish? Is it possible to recognize "real"
and "unreal" articles without understanding them?

Compare:

http://he.wikipedia.org/wiki/%D7%9C%D7%95%D7%93%D7%96%27

Is it a "real" or an "unreal" article, and why? I have a feeling that it is
bot-created, but I am not sure about it, as I don't speak Hebrew :-)

And what about this:

http://uk.wikipedia.org/wiki/%D0%A4%D1%96%D0%B3%D1%83%D0%BB%D1%81_%D1%96_%D0%90%D0%BB%D1%96%D0%BD%D1%8C%D1%8F

It is quite long, but I am almost sure that it is bot-created and
untouched by any human, because it contains only statistical data and
sentences that look machine-generated. I don't speak Ukrainian well,
but I understand it a little. Still, this is just my impression...

It is funny that this article is longer than its counterpart in
es-Wikipedia, although the Spanish one was certainly edited by humans :-)

http://es.wikipedia.org/wiki/F%C3%ADgols_y_Ali%C3%B1%C3%A1

Moreover, if you check all the Wikipedias that have an article about
Fígols i Alinyà, only the Spanish one looks as if it was edited by a
human (but this is just my impression; I may be wrong).

And this:

http://ta.wikipedia.org/wiki/%E0%AE%B5%E0%AE%BE%E0%AE%B0%E0%AF%8D%E0%AE%9A%E0%AE%BE

Real or not real? I really don't know; probably bot-created :-)

I think that if we want to perform a serious evaluation of "real" and
"unreal" articles, it should be based on clear criteria rather than on
"feelings", done on larger samples (at least 500 articles), and carried
out by people who understand what they are reading.
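
To give a rough idea of why the sample size matters, here is a minimal
sketch of the sampling error for an estimated share of "real" articles.
It assumes a simple random sample and the usual normal approximation at
a 95% level; the function name and the worst-case share of 0.5 are my
own illustrative choices, not anything taken from the original survey.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p estimated from a simple random sample of size n.
    z=1.96 corresponds to a 95% confidence level (an assumption here)."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst case p = 0.5, comparing 50 sampled articles with 500.
for n in (50, 500):
    print(n, round(margin_of_error(0.5, n), 3))
# prints:
# 50 0.139
# 500 0.044
```

So with 50 articles the estimate can be off by roughly 14 percentage
points either way, while 500 articles bring that down to about 4. This
covers only the sampling error; it says nothing about whether each
article was classified correctly in the first place.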


-- 
Tomek "Polimerek" Ganicz
http://pl.wikimedia.org/wiki/User:Polimerek
http://www.ganicz.pl/poli/
http://www.ptchem.lodz.pl/en/TomaszGanicz.html


