For months and months we've talked about revamping the article count system, but nothing's changed. The article count is still an extension of the "comma count" used to filter out empty articles in a search back in the UseMod days.
Currently, a page is counted as an "article" for "we have X articles" purposes if it is:
- in the article namespace (so excludes talk pages, user pages, Wikipedia: help and utility pages)
- not a redirect
- contains a comma (!)
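As a rough illustration only (this is not the wiki software's actual code), the three conditions above could be sketched as follows. The `Page` record, its field names, the value `0` for the article namespace, and the `#REDIRECT` prefix test are all assumptions made for the example.

```python
from dataclasses import dataclass

ARTICLE_NAMESPACE = 0  # assumption: 0 stands for the article namespace


@dataclass
class Page:
    namespace: int  # hypothetical field: namespace the page lives in
    text: str       # hypothetical field: raw wikitext of the page


def is_redirect(page: Page) -> bool:
    # Assumption: a redirect is a page whose text starts with "#REDIRECT".
    return page.text.lstrip().upper().startswith("#REDIRECT")


def counts_as_article(page: Page) -> bool:
    """Comma-count rule: article namespace, not a redirect, contains a comma."""
    return (
        page.namespace == ARTICLE_NAMESPACE
        and not is_redirect(page)
        and "," in page.text
    )


# Made-up sample pages, one per case.
pages = [
    Page(0, "Paris is the capital of France, on the Seine."),
    Page(0, "#REDIRECT [[Paris]]"),
    Page(1, "Talk page comment, with a comma."),
    Page(0, "A short stub without the magic character."),
]
article_count = sum(counts_as_article(p) for p in pages)
```

Only the first sample page passes all three tests; the last one is a real (if short) page that is left out purely because it has no comma.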
Now, we are well aware that page-count fever has gripped Wikipedia for some time. The obsession with breaking the 100,000-page barrier on the English wiki stifled any implementation of reforms for fear of reducing the count. Concerns about languages which don't use the ASCII comma character have been shrugged off. Well, today I've seen enough.
While the English wiki has galumphed along for ages, secure in its place as The World's Largest Damn Wiki, the smaller languages are in intense (though friendly) competition with one another for runner-up positions. "In real life," Youssefsan tells me, "people look for economic growth; here for page growth. Both use 'creative accounting.'"
On the francophone Wikipedia, we have been exposed as the slaves to the comma count that we all are but are ashamed to admit. See: http://fr.wikipedia.org/w/wiki.phtml?title=CULTe&action=edit&oldid=3...
(Those who have trouble with my PGP-signed mail, go to fr.wikipedia.org, look up article 'CULTe', and hit 'Modifier cette page'.)
Yes, that's right: people have started adding commas as hidden comments just to increase the stupid comma count. NO MORE, I say! They shall not pass!
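To make the trick concrete: a plain substring test cannot tell a comma in visible prose from one tucked inside a comment. The sketch below assumes the hidden comments use HTML-style `<!-- ... -->` syntax; the function names and the sample stub are made up for illustration, and none of this is the wiki's own code.

```python
import re

# Assumption: hidden comments use HTML comment syntax.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)


def has_comma(text: str) -> bool:
    # Naive comma check: any comma anywhere in the raw text counts.
    return "," in text


def has_visible_comma(text: str) -> bool:
    # Strip comments first, so only commas a reader can see are counted.
    return "," in COMMENT_RE.sub("", text)


padded_stub = "A very short stub <!-- , -->"
naive = has_comma(padded_stub)
visible = has_visible_comma(padded_stub)
```

Here `naive` is true and `visible` is false. Stripping comments closes only this one loophole; a visible but pointless comma would still pass.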
Unless a better count system is proposed, I will replace the comma check with a greater-than-zero-size check within twelve hours.
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
Unless a better count system is proposed, I will replace the comma check with a greater-than-zero-size check within twelve hours.
I agree with the proposal, but would make the size threshold a little bigger to deal with those articles which were essentially deleted but have really been left with residual spaces or other junk. Although the shortest-articles list for English is not available at the moment, I did look at this function for the other languages. The longest "article" without real content that I found that way was at "Brakiopodo" in the Esperanto Wikipedia. It contains only 4 hyphens. The shortest article that I found with what could even remotely be accepted as content had a size of 7, in more than one language. This suggests that the greater-than-size check might be set at 4, 5, or 6.
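For illustration, the difference between a plain greater-than-zero check and a slightly higher threshold can be sketched like this. The function name, the sample strings, and measuring size as UTF-8 bytes are assumptions of the sketch, not a statement of how the software measures page size.

```python
def passes_size_check(text: str, min_bytes: int = 0) -> bool:
    """True if the page text is strictly larger than min_bytes."""
    # Assumption: size is the byte length of the UTF-8 encoded text.
    return len(text.encode("utf-8")) > min_bytes


# Made-up samples: empty, residual spaces, four hyphens, seven characters.
samples = ["", "   ", "----", "1234567"]

above_zero = [passes_size_check(s) for s in samples]
above_four = [passes_size_check(s, min_bytes=4) for s in samples]
```

With a threshold of zero, the residual spaces and the four hyphens still count; with a threshold of 4, only the seven-character sample does.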
BTW the Esperanto Wikipedia seems to have the greatest number of zero length articles.
Eclecticology
--- Brion Vibber brion@pobox.com wrote:
Unless a better count system is proposed, I will replace the comma check with a greater-than-zero-size check within twelve hours.
-- brion vibber (brion @ pobox.com)
Yes! The French Wikipedia has become a hunting ground for commas! Even in articles less than 10 words long...
Why not a mix of the comma method and a certain number of words? (At least people will know how to count ;-))
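One possible reading of such a mixed rule, sketched under assumptions: a page counts only if it contains a comma and also reaches some minimum number of words. The threshold of 10 words and the whitespace-based word split are placeholders chosen for the example, not values anyone in the thread has agreed on.

```python
def passes_mixed_check(text: str, min_words: int = 10) -> bool:
    """Comma test combined with a minimum word count."""
    # Assumption: words are whatever str.split() yields on whitespace.
    word_count = len(text.split())
    return "," in text and word_count >= min_words


# Made-up samples: 4 words and 13 words, both containing a comma.
short_with_comma = "Tiny stub, nothing more."
longer_with_comma = (
    "This stub says a little more, enough to reach the word threshold easily."
)

short_ok = passes_mixed_check(short_with_comma)
longer_ok = passes_mixed_check(longer_with_comma)
```

The four-word stub fails despite its comma, while the thirteen-word one passes.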
__________________________________________________ Do you Yahoo!? Yahoo! Tax Center - forms, calculators, tips, more http://taxes.yahoo.com/
On Sun 09/03/2003 at 13:28, Anthere wrote:
Yes! The French Wikipedia has become a hunting ground for commas! Even in articles less than 10 words long...
Ooh la la... Comma hunters, beware! Anthere is back!!!
Anthere, take a look at [[SHA-1]] and tell me how many commas there are. There is only one, and it is not needed from a stylistic point of view. This article is a very good example: some articles may have no commas even though they are long and complete enough to be considered articles.
That's why I have added commas to every article which doesn't have one... It's as simple as that.
Why not a mix of the comma method and a certain number of words? (At least people will know how to count ;-))
Greater-than-zero-size check is simple and efficient enough...
Yes! The French Wikipedia has become a hunting ground for commas! Even in articles less than 10 words long...
Ooh la la... Comma hunters, beware! Anthere is back!!!
Ah, Tim! If you didn't exist, we would have to invent you!
By the way, you will have to explain to us how we are supposed to recognize you, because frankly, knowing your key doesn't get me anywhere at all, at all, at all.
And as long as I don't understand, Tim is Tim, and Tim is you.
Anthere, take a look at [[SHA-1]] and tell me how many commas there are. There is only one, and it is not needed from a stylistic point of view. This article is a very good example: some articles may have no commas even though they are long and complete enough to be considered articles.
I never said the comma technique was the right way to decide whether an article counts as a "good" article. Your example is an extreme case. I see no problem with you adding a comma here.
But all the articles you did this morning, with commas in comments, are much poorer than SHA-1, and IMHO hardly deserve the name of encyclopedic articles. Dictionary articles, yes, but not encyclopedia articles.
This is a dictionary article, with commas in a comment:
http://fr.wikipedia.org/w/wiki.phtml?title=Aiguchi&diff=0&oldid=3434...
That's why I have added commas to every article which doesn't have one... It's as simple as that.
Why not a mix of the comma method and a certain number of words? (At least people will know how to count ;-))
Greater-than-zero-size check is simple and efficient enough...
It is simple. But it is slightly ridiculous. Certainly not the best way to convince newcomers we are a great resource.
By the way, I am happy that Tim took some time to improve some of Tim's articles. Have you seen?
http://fr.wikipedia.org/wiki/Banane
Athymik (athymik@ifrance.com) Public key: http://athymik.devveb.org/public_key.asc
Watch out, public key: we are dealing with a certified Tim!
Ant
Yes! The French Wikipedia has become a hunting ground for commas! Even in articles less than 10 words long...
Ooh la la... Comma hunters, beware! Anthere is back!!!
Ah, Tim! If you didn't exist, we would have to invent you!
By the way, you will have to explain to us how we are supposed to recognize you, because frankly, knowing your key doesn't get me anywhere at all, at all, at all.
Download the key and import it into PGP/GnuPG. Then, with GnuPG/PGP, you verify the signature of the message. The principle is that only the holder of the secret key corresponding to the public key can generate a correct signature for the message. Understood?
And as long as I don't understand, Tim is Tim, and Tim is you.
Wrong, Anthere, wrong! Do you know (yes, you ought to) that there are nearly ten Tims on Wikipedia? There are also Athyrance, Athrystant, Athypique, etc... So I hope you won't refuse to understand...
Anthere, take a look at [[SHA-1]] and tell me how many commas there are. There is only one, and it is not needed from a stylistic point of view. This article is a very good example: some articles may have no commas even though they are long and complete enough to be considered articles.
I never said the comma technique was the right way to decide whether an article counts as a "good" article. Your example is an extreme case. I see no problem with you adding a comma here.
But all the articles you did this morning, with commas in comments, are much poorer than SHA-1, and IMHO hardly deserve the name of encyclopedic articles. Dictionary articles, yes, but not encyclopedia articles.
Dictionary, encyclopedia articles... That's another problem.
This is a dictionary article, with commas in a comment:
http://fr.wikipedia.org/w/wiki.phtml?title=Aiguchi&diff=0&oldid=... 43
I think that the being who did this just cannot understand French... ;->
That's why I have added commas to every article which doesn't have one... It's as simple as that.
Why not a mix of the comma method and a certain number of words? (At least people will know how to count ;-))
Greater-than-zero-size check is simple and efficient enough...
It is simple. But it is slightly ridiculous. Certainly not the best way to convince newcomers we are a great resource.
Another "brand image" story, Anthere... Yet another disagreement between us on that subject.
By the way, I am happy that Tim took some time to improve some of Tim's articles. Have you seen?
Tim is *very* nice! But who are you talking about? Athymik, Athyrance, Athrystant, Athypique, Athyrail?
Athymik (athymik@ifrance.com) Public key: http://athymik.devveb.org/public_key.asc
Watch out, public key: we are dealing with a certified Tim!
Ant
Do you Yahoo!? Yahoo! Tax Center - forms, calculators, tips, more http://taxes.yahoo.com/
Say, Anthere, are you running a little survey on us here? Do we Yahoo? And if we answer, do we get the "Tax Center" as well?
--- Athymik athymik@ifrance.com wrote:
Yes! The French Wikipedia has become a hunting ground for commas! Even in articles less than 10 words long...
Ooh la la... Comma hunters, beware! Anthere is back!!!
Ah, Tim! If you didn't exist, we would have to invent you!
By the way, you will have to explain to us how we are supposed to recognize you, because frankly, knowing your key doesn't get me anywhere at all, at all, at all.
Download the key and import it into PGP/GnuPG. Then, with GnuPG/PGP, you verify the signature of the message. The principle is that only the holder of the secret key corresponding to the public key can generate a correct signature for the message. Understood?
Well.... I hardly dare say it... but.... uhhhh.... no.
(clears throat) I see the principle, but in practice, I don't.
What is GnuPG/PGP?
And as long as I don't understand, Tim is Tim, and Tim is you.
Wrong, Anthere, wrong! Do you know (yes, you ought to) that there are nearly ten Tims on Wikipedia? There are also Athyrance, Athrystant, Athypique, etc... So I hope you won't refuse to understand...
I understand very well.
And Athybot? Isn't that you? Or Iala?
Very confused.
Greater-than-zero-size check is simple and efficient enough...
It is simple. But it is slightly ridiculous. Certainly not the best way to convince newcomers we are a great resource.
Another "brand image" story, Anthere... Yet another disagreement between us on that subject.
All of that is a bit negative, Tim...
A pity.
On Thu, 13 Mar 2003, Anthere wrote:
(clears throat) I see the principle, but in practice, I don't.
What is GnuPG/PGP?
http://fr.wikipedia.org/wiki/GNU_Privacy_Guard
http://fr.wikipedia.org/wiki/Pretty_Good_Privacy
-- brion vibber (brion @ pobox.com)
--- Brion Vibber vibber@aludra.usc.edu wrote:
On Thu, 13 Mar 2003, Anthere wrote:
(clears throat) I see the principle, but in practice, I don't.
What is GnuPG/PGP?
http://fr.wikipedia.org/wiki/GNU_Privacy_Guard
http://fr.wikipedia.org/wiki/Pretty_Good_Privacy
-- brion vibber (brion @ pobox.com)
Wow, it's amazing what we can find on Wikipedia! Thanks, Brion.
Please check out one of this morning's edits (by an anon):
http://fr.wikipedia.org/w/wiki.phtml?title=Arabe&diff=0&oldid=34344
Hi,
The comma count doesn't make sense to me. I would suggest size with a threshold of 100 bytes, or even 200 bytes.
Yann
On 8 Mar 2003 at 14:15, Brion Vibber wrote:
Unless a better count system is proposed, I will replace the comma check with a greater-than-zero-size check within twelve hours.
If I may propose something in this heated discussion... Just thoughts for the purpose of http://www.wikipedia.org/wiki/Wikipedia:Multilingual_statistics and not for the new type of main page counter (comma counting, or any other method, will sooner or later be hacked, and again we will need to improve our security system :-)
Count the total length of all the pages (uncompressed wiki!), excluding the Wikipedia: and Talk: namespaces etc. of course, then divide it by
1) the number of articles, or
2) 1800 (DTP's standard length of an A4 page in Latin characters).
Present both numbers: the total length and one of the ratios. I think that:
- the total length can be quite easily compared to a paper encyclopedia's size
- one can very easily estimate the length of one paper page
- the ratio roughly gives the amount of work spent on creating articles (but not quality; perhaps one should divide by the total number of edits?)
If we need to compare Latin-based wikis with non-Latin-based wikis, perhaps multiplying the ratio by the average Latin word length would be sufficient.
Hmm... whether or not the idea is accepted by the whole community, I will try to track the Polish Wikipedia's progress that way on my own.
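The arithmetic proposed here could be sketched as below. This is only an illustration: the function name and sample data are made up, length is taken as the number of characters in each text, and the caller is assumed to have already left out non-article namespaces.

```python
CHARS_PER_PAGE = 1800  # figure quoted above for one A4 page of Latin text


def length_stats(article_texts: list[str]) -> dict[str, float]:
    """Total length, mean length per article, and A4-page equivalent."""
    total = sum(len(t) for t in article_texts)
    count = len(article_texts)
    return {
        "total_chars": total,
        # Guard against an empty wiki to avoid dividing by zero.
        "chars_per_article": total / count if count else 0.0,
        "a4_pages": total / CHARS_PER_PAGE,
    }


# Made-up sample: three "articles" of 900, 1800 and 2700 characters.
sample = ["a" * 900, "b" * 1800, "c" * 2700]
stats = length_stats(sample)
```

For this sample the total is 5400 characters, which works out to 1800 characters per article and the equivalent of 3 A4 pages.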
Regards Youandme
On 9 Mar 2003 at 21:54, Youandme wrote:
Count the total length of all the pages (uncompressed wiki!) then divide it by
1) the number of articles
or 2) 1800 (DTP's standard length of A4 page in latin chars)
Oops... forget about the second one, as it's just a scaling of the total length.
Youandme
On Sun, 09 Mar 2003 22:24:12 +0100, Youandme wikipedia@wp.pl wrote:
On 9 Mar 2003 at 21:54, Youandme wrote:
Count the total length of all the pages (uncompressed wiki!) then divide it by
1) the number of articles
or 2) 1800 (DTP's standard length of A4 page in latin chars)
Oops... forget about the second one, as it's just a scaling of the total length.
But it would give an equivalence for the amount of text compared to a traditional encyclopedia. The average Britannica page has more than 1800 chars, I'd wager!
Unless a better count system is proposed, I will replace the comma check with a greater-than-zero-size check within twelve hours.
Good idea -- basically if something non-blank exists in the article namespace, it should be an article, even if it is a stub. If we add categories, we may want to exclude all articles with [[Category:Stub]] from the count (or from an alternative count), but that can wait.
We should not count #REDIRECTs. We should trim whitespace (blanks, tabs, newlines) before doing the bytecount.
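A minimal sketch of the check described in the two paragraphs above, assuming the page is already known to be in the article namespace and that a redirect is recognizable by a leading `#REDIRECT`. The function name and samples are hypothetical, and this is not the software's actual implementation.

```python
def is_nonblank_article(text: str) -> bool:
    """Non-blank after trimming whitespace, and not a redirect."""
    trimmed = text.strip()  # drops blanks, tabs and newlines at both ends
    # Assumption: redirects start with "#REDIRECT" (case-insensitive).
    if trimmed.upper().startswith("#REDIRECT"):
        return False
    # Byte count of what is left; anything above zero counts, even a stub.
    return len(trimmed.encode("utf-8")) > 0


# Made-up samples: empty, whitespace only, a redirect, a stub.
samples = ["", " \t\n", "#REDIRECT [[Paris]]", "A stub."]
results = [is_nonblank_article(s) for s in samples]
```

Only the stub passes; the whitespace-only page is rejected because the trim happens before the byte count.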
Regards,
Erik
wikipedia-l@lists.wikimedia.org