Among the big Wikipedias, pl.WP has one of the lowest proportions of real articles:
WP   Articles (official)   Real articles (est.)   Quota (real/official)
EN   1400000               1344000                0.96
DE   696000                668160                 0.96
FR   613000                514920                 0.84
JA   466000                466000                 1.00
IT   408000                301920                 0.74
PL   467000                298880                 0.64
ES   326000                293400                 0.90
NL   404000                274720                 0.68
SV   272000                217600                 0.80
PT   338000                209560                 0.62
RU   233000                195720                 0.84
ZH   164000                144320                 0.88

(Most numbers are from Jan. 2008; en, de and pt are older. The estimates should really be rounded.)
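The estimated figures are simply the official count multiplied by the quota. A minimal sketch of that arithmetic for three of the rows (the names OFFICIAL, QUOTA and real_articles are only illustrative, not part of any Wikimedia tool):

```python
# Official article counts and quotas copied from the table above
# (three rows only, to keep the example short).
OFFICIAL = {"EN": 1_400_000, "PL": 467_000, "SV": 272_000}
QUOTA = {"EN": 0.96, "PL": 0.64, "SV": 0.80}


def real_articles(official: int, quota: float) -> int:
    """Estimate the number of real articles as official count times quota."""
    return round(official * quota)


# Estimated real articles per language edition.
ESTIMATES = {wp: real_articles(OFFICIAL[wp], QUOTA[wp]) for wp in OFFICIAL}

for wp, estimate in ESTIMATES.items():
    print(wp, estimate)
```

The results are unrounded on purpose so that they can be checked against the table; given the small sample behind each quota, only the first two digits or so are meaningful.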
Only 64% of the articles in pl.WP are real articles, while the much-criticized sv.WP has 80%. But this is not about blaming particular Wikipedians; it is about finding out how to compare WPs more effectively. The average size (bytes per article) does not work either. Take the article "Berlin" in Upper Sorbian (hsb). It has 3740 bytes. That sounds good, but only 454 bytes (six short sentences) are actual text; 1823 bytes are for the interwikis alone. This is not manipulation, but it shows the difficulties of reading Wikimedia statistics. Even a "geographical stub" with infoboxes, categories and interwikis produces a lot of bytes. It takes a human to evaluate.

Ziko
2008/6/27 Tomasz Ganicz polimerek@gmail.com:
2008/6/27 Ziko van Dijk zvandijk@googlemail.com:
Maybe this is not the most popular topic, but I would like to comment on the news about the Japanese and Polish Wikipedias and their 500,000 articles each. In fact, ja.WP actually has 500,000, but pl.WP does not. In an attempt to compare Wikipedia language editions, I clicked the "Random article" button 50 times in each edition and, from that sample, calculated how many articles a language edition really has once all the pseudo articles are subtracted.
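A sample of 50 clicks gives only a rough estimate, and the uncertainty can be sketched with a simple normal-approximation interval. The function name and the example input (32 real articles in 50 clicks, i.e. 64%) are assumptions made for illustration; no per-click tallies are given in this thread.

```python
import math


def real_share_interval(real_hits: int, clicks: int, z: float = 1.96):
    """Approximate 95% interval for the share of real articles.

    Uses the normal approximation to the binomial distribution, which is
    crude for a sample of 50 but good enough to show the order of magnitude.
    """
    p = real_hits / clicks
    margin = z * math.sqrt(p * (1 - p) / clicks)
    return max(0.0, p - margin), min(1.0, p + margin)


# Hypothetical sample: 32 of 50 random clicks landed on a real article.
low, high = real_share_interval(32, 50)
print(f"share of real articles: {low:.2f} to {high:.2f}")
```

With these assumed inputs the interval runs from roughly 0.51 to 0.77, so any article count derived from 50 clicks should be read as a ballpark figure rather than a measurement.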
A pseudo article is, e.g.:
http://pdc.wikipedia.org/wiki/Bikini
http://co.wikipedia.org/wiki/191
http://ksh.wikipedia.org/wiki/Varsseveld
http://pl.wikipedia.org/wiki/Tandil
http://vo.wikipedia.org/wiki/Poplar_Bluff
Many Wikipedias lose, by my calculation, quite a large percentage of their articles. There is one honourable exception: the Japanese Wikipedia, which in 50 clicks showed no pseudo articles at all. If the Japanese Wikipedia had as sloppy a policy on new articles as many others have, ja.WP would already be close to one million "articles". Pl.WP has about 300,000 real articles, which is very respectable, but not what it seems to be.
Since the beginning, Wikipedians have reported the number of articles, needing something to tell the media and wanting to be proud of their achievements. They rank Wikipedia language editions by the number of articles. This has caused a tragic dynamic: many Wikipedians and Wikipedias are so obsessed with this number that they produce rubbish articles to show off. The Volapük Wikipedia, with more than 100,000 pseudo articles created by a single bot-using user, is only the tip of the iceberg, and when someone called for closing vo.WP, it was supported by an amazing number of users from many language editions: così fan tutte. Wikipedians could and should use their time for more useful article work.
Well... Bear in mind that the English Wikipedia also contains quite a lot of bot-created articles, and in fact the English Wikipedia was the first one to produce them. The others just followed the idea and started doing it in order to artificially increase their number of articles. The Polish Wikipedia started doing it when our rank went down due to the mass production of bot-created articles in the Swedish, Italian, French and other Wikipedias.
Compare:
http://pl.wikipedia.org/wiki/Aignerville
and
http://en.wikipedia.org/wiki/Aignerville
or
http://pl.wikipedia.org/wiki/Is%C3%B2vol
and
http://it.wikipedia.org/wiki/Is%C3%B2vol
or
http://nl.wikipedia.org/wiki/Eksj%C3%B6_(stad)
and
http://pl.wikipedia.org/wiki/Eksj%C3%B6
or
http://pl.wikipedia.org/wiki/Dystrykt_Set%C3%BAbal
and
http://nn.wikipedia.org/wiki/Set%C3%BAbal
etc...
There is nothing really special about the Polish Wikipedia - many others do exactly the same, including the English one. We simply had more active coders who knew how to feed bots. But, as you can see by comparing with other Wikipedias, they sometimes did a really good job, in the sense that many bot-created stubs in the Polish Wikipedia contain more data than their equivalents in, for example, the Swedish or French Wikipedia.
http://fr.wikipedia.org/wiki/Gr%C3%B3dek
http://fr.wikipedia.org/wiki/Drzewica
http://fr.wikipedia.org/wiki/Pszczyna
http://fr.wikipedia.org/wiki/Jas%C5%82o
etc...
--
Tomek "Polimerek" Ganicz
http://pl.wikimedia.org/wiki/User:Polimerek
http://www.ganicz.pl/poli/
http://www.ptchem.lodz.pl/en/TomaszGanicz.html
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l