[Foundation-l] Tragical dynamics: that run for the number of articles

Ziko van Dijk zvandijk at googlemail.com
Sat Jun 28 12:34:25 UTC 2008


There is Google Translate, and the interwiki links help as well. I
would count that he.WP article about Lodz as a real article, because
it contains more information than a database entry would (links to
Holocaust-related articles, something about the 19th century, the
textile economy). Indeed, I would like to develop a more scientific
scheme and apply it to a larger sample; maybe a research group will
form around it. I believe that my method does give a reasonable
picture; of course, whether my results say "50,000" real articles or
"52,000" is not really a measurable difference.
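
To illustrate why a difference of about 2,000 articles cannot be
resolved with a sample of 50 random pages, here is a minimal Python
sketch. It uses the Wilson score interval, which is my choice for the
illustration and not part of the method described in this thread. The
figures (25 of 50 pages judged "real", 100,000 articles in total) are
hypothetical and were not measured on any Wikipedia.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def estimated_range(successes, n, total_articles):
    """Scale the interval for the 'real' share up to a count of articles."""
    low, high = wilson_interval(successes, n)
    return round(low * total_articles), round(high * total_articles)


# Hypothetical inputs, for illustration only.
TOTAL = 100_000
low_50, high_50 = estimated_range(25, 50, TOTAL)      # sample of 50 pages
low_500, high_500 = estimated_range(250, 500, TOTAL)  # sample of 500 pages

print("n=50: ", low_50, "to", high_50)
print("n=500:", low_500, "to", high_500)
```

Under these assumptions the range for 50 pages is tens of thousands of
articles wide, and ten times as many pages narrow it considerably.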
Ziko

PS: By the way, it is fun to browse a foreign-language Wikipedia with
the help of Google Translate - not perfect, but it is interesting to
see what others write about.


2008/6/28 Tomasz Ganicz <polimerek at gmail.com>:
> 2008/6/28 Andre Engels <andreengels at gmail.com>:
>> On Sat, Jun 28, 2008 at 9:47 AM, Tomasz Ganicz <polimerek at gmail.com> wrote:
>>
>>> Can you explain how this evaluation was done? How do you distinguish
>>> between "real" and other articles? In particular, I don't believe the
>>> statistics shown for en Wikipedia. I have a feeling that there are many
>>> more bot-created articles in en Wikipedia than your statistics show.
>>
>> That is described in his first mail: He did 'random article' 50 times
>> and used that as a sample.
>>
>
> Well, it is not described - I mean that no clear evaluation criteria
> are mentioned.
> Does he speak Japanese or Polish? Is it possible to recognize "real"
> and "unreal" articles without understanding them?
>
> Compare:
>
> http://he.wikipedia.org/wiki/%D7%9C%D7%95%D7%93%D7%96%27
>
> Is it a "real" or an "unreal" article, and why? I have a feeling that
> it is bot-created, but I am not sure about it, as I don't speak Hebrew :-)
>
> And what about this:
>
> http://uk.wikipedia.org/wiki/%D0%A4%D1%96%D0%B3%D1%83%D0%BB%D1%81_%D1%96_%D0%90%D0%BB%D1%96%D0%BD%D1%8C%D1%8F
>
> It is quite long, but I am almost sure that it is bot-created and
> untouched by any human, because it contains only statistical data and
> sentences that look machine-generated. I don't speak Ukrainian well,
> but I understand it a little bit. Still, that is just my
> feeling...
>
> It is funny that this article is longer than its counterpart in
> es-Wikipedia, although the Spanish one was surely edited by humans :-)
>
> http://es.wikipedia.org/wiki/F%C3%ADgols_y_Ali%C3%B1%C3%A1
>
> Moreover, if you check all the Wikipedias that have an article about
> Fígols i Alinyà, only the Spanish one looks as if it was edited by a
> human (but that is just my feeling; I may be wrong).
>
> And this:
>
> http://ta.wikipedia.org/wiki/%E0%AE%B5%E0%AE%BE%E0%AE%B0%E0%AF%8D%E0%AE%9A%E0%AE%BE
>
> real or not real? I really don't know, probably bot-created :-)
>
> I think that if we want to perform a serious evaluation of "real" and
> "unreal" articles, it should be based on clear criteria rather than on
> "feelings", done on larger samples (at least 500 articles), and by
> people who understand what they are reading.
>
>
> --
> Tomek "Polimerek" Ganicz
> http://pl.wikimedia.org/wiki/User:Polimerek
> http://www.ganicz.pl/poli/
> http://www.ptchem.lodz.pl/en/TomaszGanicz.html
>



-- 
Ziko van Dijk
NL-Silvolde


