On Tue, Jun 30, 2009 at 11:20 AM, Robert Rohde <rarohde(a)gmail.com> wrote:
> However,
> given the nastiness of template syntax, I would expect no end of wiki
> authors willing to help convert the commonly used stuff.
> -Robert Rohde
I was curious just how big a task the conversion can be expected
to be. This is only a rough heuristic I came up with: count the "{{"
openers in each dump, then subtract the "{{#" parser-function openers.
# Simple English parser functions
$ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | grep -o '{{#' | wc -l
22,211
# Simple English templates
$ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | grep -o '{{' | wc -l
416,126 - 22,211 = 393,915
# English parser functions
$ bunzip2 -c enwiki-20090618-pages-articles.xml.bz2 | grep -o '{{#' | wc -l
430,980
# English templates
$ bunzip2 -c enwiki-20090618-pages-articles.xml.bz2 | grep -o '{{' | wc -l
44,928,358 - 430,980 = 44,497,378
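
For what it's worth, both counts can probably be had in a single pass, so the dump only gets decompressed once. This is a rough sketch, not something I ran against the full dumps: the function name count_templates is made up for illustration, and it assumes a POSIX awk is available. Like the grep version, it counts non-overlapping "{{" openers, so "{{{" parameter openers are still included in the total.

```shell
# count_templates (hypothetical helper): read wikitext on stdin and print
# "<parser functions> <templates>", where templates = all "{{" minus "{{#".
count_templates() {
    awk '
        {
            # Replacing each match with itself ("&") leaves the line intact;
            # gsub returns how many non-overlapping matches it found.
            funcs += gsub(/[{][{]#/, "&")   # "{{#" openers: parser functions
            total += gsub(/[{][{]/, "&")    # every "{{" opener
        }
        END { printf "%d %d\n", funcs, total - funcs }
    '
}
```

Usage would look like this:

$ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | count_templates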