Ray Saintonge wrote:
BTW the Esperanto Wikipedia seems to have the greatest number of zero-length articles.
A while ago we had a guy come in and add a zillion pages consisting of externally-linked images with little or no text, and no reference as to copyright status or permission for use. After some yakking, we got him to peacefully retract them. Those that never had any text are thus left empty until they get either filled out or deleted outright.
-- brion vibber (brion @ pobox.com)
I totally support Brion's fervor to change this. But I'm not so sure about the "greater than zero" criterion.
The "comma" trick was a good kludge to get at the idea that random junk is not an article. Pretty much anything with a sentence or two will have a comma (in English) and thus will constitute an article, though perhaps just a stub.
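For anyone who wants to play with it, here is a rough sketch of the comma heuristic next to the "greater than zero" criterion. It assumes each page is available as a plain string; the function names are made up for illustration and are not the actual wiki software's code.

```python
def is_article_by_comma(text: str) -> bool:
    """Comma heuristic: a page counts as an article if its text contains a comma."""
    return "," in text


def is_article_by_length(text: str) -> bool:
    """'Greater than zero' criterion: any non-empty page counts as an article."""
    return len(text) > 0


def count_articles(pages: dict[str, str], criterion) -> int:
    """Count pages (hypothetical title -> text mapping) that satisfy a criterion."""
    return sum(1 for text in pages.values() if criterion(text))
```

The two criteria disagree on pages like a bare "asdf": non-empty, so it passes the length test, but it has no comma, so the comma test rejects it.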
It would be interesting to see some quick statistics, if that's possible, on various counting methods and how each one affects the article count.
zero bytes = ? articles
100 bytes = ? articles
500 bytes = ? articles
1000 bytes = ? articles
Any single statistic is going to be limited in the information that it conveys. It might also be fun to look at: total bytes in all "articles" (defined in different ways), average bytes per page in the article namespace, a histogram of the number of articles of various lengths, etc.
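A minimal sketch of how those numbers could be computed, assuming the pages in the article namespace have been dumped into a title -> text mapping. The function names, the strictly-greater-than reading of each threshold, and the 500-byte histogram bucket are all assumptions for illustration, not anything the wiki software does today.

```python
from collections import Counter


def page_sizes(pages: dict[str, str]) -> list[int]:
    """Size of each page in bytes, assuming UTF-8 storage."""
    return [len(text.encode("utf-8")) for text in pages.values()]


def count_above(sizes: list[int], threshold: int) -> int:
    """Number of pages strictly larger than `threshold` bytes."""
    return sum(1 for size in sizes if size > threshold)


def length_histogram(sizes: list[int], bucket: int = 500) -> dict[int, int]:
    """Map the lower bound of each size bucket to the number of pages in it."""
    counts = Counter((size // bucket) * bucket for size in sizes)
    return dict(sorted(counts.items()))


def summarize(pages: dict[str, str], thresholds=(0, 100, 500, 1000)) -> dict:
    """Collect article counts per threshold, total and average bytes, and a histogram."""
    sizes = page_sizes(pages)
    return {
        "counts": {t: count_above(sizes, t) for t in thresholds},
        "total_bytes": sum(sizes),
        "average_bytes": sum(sizes) / len(sizes) if sizes else 0.0,
        "histogram": length_histogram(sizes),
    }
```

Calling summarize() on the dump gives one row per threshold, which makes it easy to see how much the headline article count moves when the cutoff changes.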
Of course, it's easy for me to sit here and type up a dream list of statistics. :-)
--Jimbo
wikipedia-l@lists.wikimedia.org