On Tue, Jun 30, 2009 at 11:20 AM, Robert Rohde rarohde@gmail.com wrote:
However, given the nastiness of template syntax, I would expect no end of wiki authors willing to help convert the commonly used stuff.
-Robert Rohde
I was curious just how bad a task the conversion could be expected to be. This is just a rough heuristic I came up with:
# Simple English parser functions
$ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | grep -o '{{#' | wc -l
22,211

# Simple English templates
$ bunzip2 -c simplewiki-20090623-pages-articles.xml.bz2 | grep -o '{{' | wc -l
416,126 - 22,211 = 393,915

# English parser functions
$ bunzip2 -c enwiki-20090618-pages-articles.xml.bz2 | grep -o '{{#' | wc -l
430,980

# English templates
$ bunzip2 -c enwiki-20090618-pages-articles.xml.bz2 | grep -o '{{' | wc -l
44,928,358 - 430,980 = 44,497,378
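
For anyone who would rather get both numbers in a single pass over the dump, here is a rough Python sketch of the same heuristic. The function names count_markers and count_dump are made up for illustration, and it assumes the dump decodes as UTF-8. Like grep -o, it counts non-overlapping matches line by line, then subtracts the '{{#' count from the '{{' count.

```python
import bz2


def count_markers(lines):
    """Return (parser_functions, templates) for an iterable of text lines.

    Mirrors the grep heuristic: count every '{{#', count every '{{',
    and treat the difference as the number of template calls.
    """
    parser_functions = 0
    double_braces = 0
    for line in lines:
        # str.count counts non-overlapping occurrences, like grep -o.
        parser_functions += line.count("{{#")
        double_braces += line.count("{{")
    return parser_functions, double_braces - parser_functions


def count_dump(path):
    """Stream a .bz2 dump from disk without decompressing it first."""
    # Assumption: the dump is UTF-8; undecodable bytes are replaced.
    with bz2.open(path, "rt", encoding="utf-8", errors="replace") as dump:
        return count_markers(dump)
```

Something like count_dump("simplewiki-20090623-pages-articles.xml.bz2") should then give both figures at once. It inherits the same roughness as the grep version: anything that opens with two braces, such as a {{{parameter}}} inside a template definition, is counted as a template.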