From Wikipedia, the free encyclopedia: "The Persian Wikipedia is the Persian language version of Wikipedia. As the Persian language is written right to left, so is the website."
I just realized that the English Wikipedia has articles on quite a number of Wikipedia language editions: http://en.wikipedia.org/wiki/Category:Wikipedias_by_language
But hey, isn't there more to say about them than just outdated article counts or the writing direction? Doesn't each Wikipedia have something unique, a special culture, notable events which are not milestones?
Spend a while polishing the article about your Wikipedia to spare it the fate of rotting as a "wikipedia-related stub" in the cleanup department ;-)
greetings, elian
PS: In the German Wikipedia these articles would probably have been deleted as irrelevant... ehemm... that's our specific culture
This stub-related article is a stub.
_______________________________________________
Wikipedia-l mailing list
Wikipedia-l@Wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
On Tue, Sep 06, 2005 at 05:49:47AM +0200, Elisabeth Bauer wrote:
PS: In the German Wikipedia these articles would probably have been deleted as irrelevant... ehemm... that's our specific culture
Are you sure they would? Major Wikipedias are among the most important websites on the net, with an interesting history and a lot of influence over internet culture at large.
You wouldn't want the paper 'pedias to write about Wikipedia before Wikipedia does, would you?
On Tue, 06 Sep 2005 06:34:53 +0200, Tomasz Wegrzanowski wrote:
Are you sure they would ? Major wikipedias are among the most important websites of the net, with interesting history and a lot of influence over the internet culture at large.
It's a pretty safe assumption, yes. The German WP doesn't keep articles just because the subject is notable. If an article is too short, it will get deleted unless someone expands it (IIRC a time limit of a few days is quite common).
Roger
It seems to me that Swedish Wikipedia is quite the opposite - they have over 100,000 articles mostly because of the huge amount of substubs...
The Poles are on their way to 100,000 as well, by way of bots that create a lot of stubs. So are the Italian and Portuguese Wikipedias, which to my surprise are gaining on nl: very rapidly ... probably also by bot-stub-adding ... I wish them good luck, and I hope that having more info available in their own languages will attract more contributors :)
Waerth/Walter
Well, the bot articles are often better than man-made stubs, and they are great as templates for further expansion. It would be a good idea to make a project at meta for collecting such data to be used by bots.
On Tue, 6 Sep 2005, Walter van Kalken wrote:
The Poles are on their way to 100,000 as well, by way of bots that create a lot of stubs. So are the Italian and Portuguese Wikipedias, which to my surprise are gaining on nl: very rapidly ... probably also by bot-stub-adding ... I wish them good luck, and I hope that having more info available in their own languages will attract more contributors :)
The Italian Wikipedia will go from 60,000 to 100,000 articles in about a month, mostly through bots adding stubs about geographical places. There is already a project to expand those stubs to reasonable quality, translating whatever info is available in the Wikipedia of the town's native language, and from other languages, of course. In the meantime, the usual rate of 4,000 new articles per month has not changed much.
The popularity of bots, due to the rapid increase, has also promoted the development of many bots dedicated to fixing orthography and layout issues, so existing pages are rapidly improving.
Alfio
And, also, on it.wikipedia we are proposing for deletion a lot of articles that we consider sub-stubs (one line or less); the nice thing is that most of them get expanded, so the quality of it.wiki is increasing. And a few other wikis have asked for our databases of geographical places, so the project is not just ours.
Cruccone
Paweł Dembowski wrote:
It seems to me that Swedish Wikipedia is quite the opposite - they have over 100,000 articles mostly because of the huge amount of substubs...
I agree that this is embarrassing and should be addressed. I think that the Danish Wikipedia, with 30,000 articles, has an even higher percentage of (sub-)stubs than the Swedish one, but this is just a feeling and I have no numbers to prove it. We need a statistic for the share of (sub-)stubs, so we can talk about verifiable numbers (and set goals) instead of guesstimates. How do we define that? Is the ">200 ch" count ("alternative" article count, [1]) in Erik Zachte's Wikistats a good metric? Or the percentage of articles longer than 0.5 kilobytes [2]? I think 200 characters is an OK stub, but perhaps a substub is less than 70 characters? This leaves us with the Special:Shortpages page. That page has the advantage of being instantly updated, which Wikistats is not.
The Swedish Wikipedia has 421 articles (0.4% of 102K) shorter than 70 bytes and the Danish has 351 (1.1% of 31K). As a comparison, the Dutch Wikipedia has 79 (0.08% of 89K) and the Polish has 387 (0.4% of 93K). This makes the Polish look just as bad as the Swedish, since both have 0.4% of articles shorter than 70 bytes. But perhaps a substub should be defined at 50 bytes instead? Or 100 bytes or 150?
[1] Article count (alternate), longer than 200 bytes, http://en.wikipedia.org/wikistats/EN/TablesArticlesTotalAlt.htm
[2] Articles over 0.5 Kb or 500 bytes, http://en.wikipedia.org/wikistats/EN/TablesArticlesGt500Bytes.htm
As of July 2005:

Language         [1]      [2]   Note
All languages    1.6 M    62%
English          595 K    73%
Japanese         129 K    52%
French           122 K    72%
Dutch             75 K    74%   75K is more than Swedish's 68K
Polish            68 K    65%
Italian           47 K    76%
Swedish           68 K    42%   Low percentage
Spanish           53 K    70%
Portuguese        48 K    52%
Chinese           33 K    38%   Even lower percentage
Hebrew            20 K    75%
Norwegian         25 K    52%
Finnish           24 K    64%
Russian           20 K    58%
Esperanto         22 K    51%
Danish            20 K    45%   Almost as low percentage
Does this include disambiguation pages? If so, it's not a very good estimate, since they *should* be short...
Yes, I think [[Special:Shortpages]] includes disambiguation pages. And also redirect pages where #redirect is spelled in lower case. I was able to remove entries from [[Special:Shortpages]] by changing the spelling to upper case #REDIRECT.
So, we still need a better tool to measure the level of stubs.
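As a rough illustration of what such a tool might compute, here is a minimal Python sketch. It is not an existing Wikistats or MediaWiki feature. The function names, the sample pages, and the assumption that disambiguation pages can be recognised by a {{disambig}} template are all hypothetical; the byte thresholds are the ones floated above. It skips redirects regardless of how #redirect is capitalised, skips disambiguation pages, and reports the share of the remaining pages below each threshold.

```python
import re

# Byte cut-offs suggested in this thread as possible substub definitions.
THRESHOLDS = (50, 70, 100, 150, 200)

# Match redirects case-insensitively, so "#redirect" and "#REDIRECT" are both skipped.
REDIRECT_RE = re.compile(r"^\s*#redirect\b", re.IGNORECASE)
# Assumption: disambiguation pages carry a {{disambig}} template.
DISAMBIG_RE = re.compile(r"\{\{\s*disambig\s*\}\}", re.IGNORECASE)


def is_countable(wikitext):
    """True for pages that are neither redirects nor disambiguation pages."""
    return not (REDIRECT_RE.search(wikitext) or DISAMBIG_RE.search(wikitext))


def substub_shares(pages, thresholds=THRESHOLDS):
    """Return {threshold: percent of countable pages shorter than threshold bytes}."""
    sizes = [len(text.encode("utf-8")) for text in pages if is_countable(text)]
    if not sizes:
        return {t: 0.0 for t in thresholds}
    return {t: 100.0 * sum(s < t for s in sizes) / len(sizes) for t in thresholds}


if __name__ == "__main__":
    # Hypothetical sample; a real run would read page texts from a database dump.
    sample = [
        "#redirect [[Foo]]",
        "#REDIRECT [[Bar]]",
        "{{disambig}}\n* [[A]]\n* [[B]]",
        "Short.",
        "x" * 120,
        "y" * 500,
    ]
    for threshold, share in substub_shares(sample).items():
        print(f"< {threshold} bytes: {share:.1f}%")
```

Reporting several thresholds side by side avoids having to settle the 50/70/100/150 question up front, and it makes it easy to see how sensitive a comparison between two wikis is to the cut-off.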
On Tue, Sep 06, 2005 at 11:52:21PM +0200, Lars Aronsson wrote:
The Swedish Wikipedia has 421 articles (0.4% of 102K) shorter than 70 bytes and the Danish has 351 (1.1% of 31K). As a comparison, the Dutch Wikipedia has 79 (0.08% of 89K) and the Polish has 387 (0.4% of 93K). This makes the Polish look just as bad as the Swedish, since both have 0.4% of articles shorter than 70 bytes. But perhaps a substub should be defined at 50 bytes instead? Or 100 bytes or 150?
Numbers like 0.4% of articles tell more about the effectiveness of the wiki-cleaning process than about the typical article. (And by the way, Special:Shortpages is not updated live on Wikimedia servers.)
Just take a look at the list of shortest pages on Polish Wikipedia - they're almost all:
* Redirects (what are they doing on the list?)
* Disambiguation pages without descriptions for the links. Sometimes articles have titles so obvious that {{disambig}} + list of the links is enough.
* A few cases of things that look like leftovers of past technical problems
* A few cases of things that should be deleted immediately, but have been missed or are simply too recent and will be deleted soon
I think that the problem is how much value is placed on article count.
Rather, we should place the value on size in bytes -- obviously, some languages take up more or less space than others, but it does seem to work better: some Wikis with high article counts but low amounts of content appear lower or on the same level with Wikis with low article counts but relatively high amounts of content.
For example, see br.wiki, scn.wiki, li.wiki, compare them with bn.wiki, sa.wiki (much of the size of sa.wiki is artificial as well due to whole sections of the Rgveda being copied verbatim when they really belong in Wikisource), and gd.wiki.
In fact, you can tell just how nasty so many of the articles on sa.wiki are by taking a look at this image: http://en.wikipedia.org/wikistats/EN/PlotDatabaseSize7.png
It's the only wiki of such a size to have the vast majority of its growth in giant leaps like that, which is indicative of a bot or some other fast, low-quality article adding technique.
Mark
Well, bot-made town articles are often better than man-made stubs...
And what's with "don't kill the stub"?
Traroth
--- Roger Luethi collector@hellgate.ch wrote:
It's a pretty safe assumption, yes. The German WP doesn't keep articles just because the subject is notable. If an article is too short, it will get deleted unless someone expands it (IIRC a time limit of a few days is quite common).
Roger
Ahhhh Elian, I see: we need your article on the stub-gnomes in English too ... well, I am trying to translate it into Italian. It is fun, but really quite a difficult task, as it has many of those "fine meanings" that can't simply be translated: you need to re-write the whole text so that it carries the same meanings in the other language (I already wrote it once and then re-wrote it .... but it still lacks something) .... so I suppose the literal translation "stub gnome" does not work well in English ... maybe we should think about Snow White and her "dwarfs" - so these could become the "stub-dwarfs" ;-)
So who is going to be the English-writing stub-dwarfs who make these articles a nice present by "giving" them a bit more information? :-)
Well: I am not contributing much to the Wikipedias (I am concentrating more on Wiktionary) ... simply because my home is "languages" ;-) and this mail is about them :-)
Ciao!
Sabine
wikipedia-l@lists.wikimedia.org