Domas Mituzas wrote:
Anyway, we have to ensure that most wikis (at least the top 20) have gotten rid of curly braces and any other expensive parser constructs in these messages, as those cost up to 10 milliseconds per pageview (if anyone writes a bot to do this automatically, I'd gladly run it with my global super duper privileges :)):
1) Copy that list
2) Prepend the MediaWiki: namespace
3) Post to Special:Export
4) Automate it:
sed 's/wiki$/wikipedia/' all.dblist > all.domains
sed -i 's/metawikipedia/metawikimedia/' all.domains
sed -i 's/commonswikipedia/commonswikimedia/' all.domains
sed -i 's/wik/.wik/' all.domains
sed -ri 's/\.wikimania([0-9]+)wikipedia/wikimania\1.wikimedia/' all.domains
sed -i 's/\.wikimaniateamwikipedia/wikimaniateam.wikimedia/' all.domains
sed -i 's/foundation\.wikipedia/wikimediafoundation/' all.domains
sed -ri 's/(strategy|usability|collab|advisory|grants|board|incubator|internal|chair|quality|exec|wikimaniateam|office|.*com)\.wikipedia/\1.wikimedia/' all.domains
sed -i 's/_/-/g' all.domains
sed -i 's/arbcom-/arbcom./' all.domains
sed -i 's/-labs/.labs/' all.domains
sed -i 's/wg-en\.wikipedia/wg.en.wikipedia/' all.domains
sed -i 's/media\.wikiwikipedia/www.mediawiki/' all.domains
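The loop below posts postdata.txt to each wiki's Special:Export. A minimal sketch for building that file, assuming the raw message names sit one per line in a hypothetical messages.txt and using Special:Export's usual form fields (pages, curonly); message names are plain ASCII, so only the newline separator needs URL-encoding:

# Build the Special:Export POST body: 'pages' takes newline-separated
# titles (URL-encoded here as %0A), curonly=1 fetches only the current
# revision of each page.
printf 'curonly=1&pages=' > postdata.txt
sed 's/^/MediaWiki:/' messages.txt \
  | awk 'NR > 1 { printf "%%0A" } { printf "%s", $0 }' >> postdata.txt

wget --post-file sends the file verbatim with an application/x-www-form-urlencoded content type, which is why the newlines have to be pre-encoded.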
while read domain; do
  wget "http://$domain.org/wiki/Special:Export" --post-file=postdata.txt -O "$domain.txt"
done < all.domains
6) Profit!!
Wikis using some kind of templating:
grep -l "{{" * | wc -l
255

Total usage:
grep "{{" * | wc -l
732

Using parserfunctions:
grep "{{#" * | wc -l
28

(across 22 wikis: als.wikipedia.org bar.wikipedia.org ca.wikipedia.org commons.wikimedia.org en.labs.wikimedia.org en.wikibooks.org fa.wikipedia.org fa.wikiquote.org gl.wikipedia.org it.wikinews.org it.wikiquote.org meta.wikimedia.org ru.wikipedia.org simple.wikipedia.org sv.wikibooks.org tr.wikibooks.org tr.wikipedia.org tr.wikisource.org zh.wikibooks.org zh.wikipedia.org zh.wikiquote.org zh.wikisource.org)
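These counts only say how many lines match; to see which messages on a given wiki actually carry parser functions, a quick sketch over one of the fetched exports (en.wikipedia.txt is just an example filename from the loop above):

# Remember the last <title> seen in the export XML and print it
# whenever a line of revision text contains a parser function call;
# sort -u collapses messages that match on several lines.
awk -F'[<>]' '/<title>/ { t = $3 } /\{\{#/ { print t }' en.wikipedia.txt | sort -u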
grep "{{PAGENAME}}" *|wc -l 18
Used for namespace name: grep "{{ns:" *|wc -l 226
grep "{{localurl:" *|wc -l 5
grep "{{grammar:" *|wc -l 8
grep "{{plural:" *|wc -l 0
grep "<nowiki" *|wc -l 0
Wikis using all default messages (exports with no <revision> at all, i.e. none of the requested messages were customized):
grep -L "<revision>" * | wc -l
273
Private wikis not read (these returned an HTML login page instead of export XML):
grep "<html" * | wc -l
23