I totally support Brion's fervor to change this. But I'm not so sure about the "greater than zero" criterion.
The "comma" trick was a good kludge to get at the idea that random junk is not an article. Pretty much anything with a sentence or two will have a comma (in English) and thus will constitute an article, though perhaps just a stub.
It would be interesting to see some quick statistics, if that's possible, on a few different counting methods and how each one affects the count.
zero bytes = ? articles
100 bytes = ? articles
500 bytes = ? articles
1000 bytes = ? articles
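A rough sketch of how that comparison might be run, assuming the page texts are already available as a list of strings. The function name, its parameters, and the "strictly greater than the threshold" reading are my assumptions for illustration, not how the wiki software actually counts.

```python
def count_articles(pages, min_bytes=0, require_comma=False):
    """Count pages that qualify as articles under one criterion.

    Hypothetical helper: a page qualifies when its UTF-8 size is strictly
    greater than min_bytes and, optionally, when it contains a comma.
    """
    count = 0
    for text in pages:
        size = len(text.encode("utf-8"))  # size in bytes, not characters
        if size <= min_bytes:
            continue
        if require_comma and "," not in text:
            continue
        count += 1
    return count


def compare_thresholds(pages, thresholds=(0, 100, 500, 1000)):
    """Return {threshold: article count} for each byte threshold."""
    return {t: count_articles(pages, min_bytes=t) for t in thresholds}
```

Calling compare_thresholds(pages) gives one count per threshold, and count_articles(pages, require_comma=True) gives the comma-based count for comparison.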
Any single statistic is going to be limited in the information that it conveys. It might also be fun to look at: total bytes in all "articles" (defined in different ways), average bytes per page in the article namespace, a histogram of the number of articles of various lengths, etc.
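For what it's worth, those extra numbers could be sketched along these lines, again assuming the pages are a plain list of strings. The function name and the 500-byte bucket width are arbitrary choices of mine, just to show the shape of it.

```python
from collections import Counter


def size_summary(pages, bucket=500):
    """Return (total bytes, average bytes per page, histogram).

    Hypothetical helper: the histogram maps the lower bound of each
    bucket (0, 500, 1000, ...) to the number of pages in that bucket.
    """
    sizes = [len(text.encode("utf-8")) for text in pages]
    total = sum(sizes)
    average = total / len(sizes) if sizes else 0.0
    # Round each size down to the start of its bucket, then tally.
    histogram = Counter((size // bucket) * bucket for size in sizes)
    return total, average, dict(sorted(histogram.items()))
```

Running it once per definition of "article" (after filtering the list accordingly) would show how much each definition shifts the totals.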
Of course, it's easy for me to sit here and type up a dream list of statistics. :-)
--Jimbo