I've just committed some code to CVS which allows internationalisation of "magic words" (my term) such as #REDIRECT, __NOTOC__, {{NUMBEROFARTICLES}}, etc. I've done it so that each word can have a number of synonyms. I expect we will implement it so that the English words work everywhere: this eliminates conversion cost, avoids annoying people by forcing them to change their habits, and makes it easier for people to contribute to many different language wikis. We can add localised versions as soon as someone from a given wiki translates them. Some languages may want more than one localised synonym, for example with and without accents.
In Language.php, there is a new global variable $wgMagicWordsEn, and an accessor function which can be overridden on a language-by-language basis. The variable is an array of arrays, where the first element of each row indicates whether or not the word is case-sensitive, and the rest of the elements are synonyms.
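To make the shape concrete, here's a sketch of what a few entries might look like based on the description above. The constant values and the exact flags chosen for each word are made up for illustration; the real list lives in Language.php.

```php
<?php
// Hypothetical constant values, for illustration only -- the real ones
// are defined elsewhere in the MediaWiki source.
define( 'MAG_REDIRECT', 0 );
define( 'MAG_NOTOC', 1 );
define( 'MAG_NUMBEROFARTICLES', 2 );

// First element of each row: 1 if the word is case-sensitive, 0 if not.
// Remaining elements: synonyms, all recognised in wikitext.
$wgMagicWordsEn = array(
    MAG_REDIRECT         => array( 0, '#REDIRECT' ),
    MAG_NOTOC            => array( 0, '__NOTOC__' ),
    MAG_NUMBEROFARTICLES => array( 0, 'NUMBEROFARTICLES' ),
);

// A localised language file would override the accessor and append its
// own synonyms after the English ones, e.g. a translated '#REDIRECT'
// with and without accents.
```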
This data is loaded into a MagicWord object on demand, using the syntax:
$mw =& MagicWord::get( MAG_REDIRECT );
That's right folks, in a Wikipedia first I've actually used constants! And references! For convenience, there is a $wgMwRedir variable, where the get( MAG_REDIRECT ) has been done for you in Setup.php. The MagicWord object has lots of handy searching and replacing functions, which means that you only very rarely have to deal with actual regular expressions.
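The core idea can be sketched in a few lines. This is not the actual MediaWiki class (the method names and internals here are illustrative, written in modern PHP syntax): the synonyms and the case-sensitivity flag are compiled into one regex once, so callers just say "match" or "remove" and never touch a pattern themselves.

```php
<?php
// Minimal, self-contained sketch of the MagicWord idea -- not the real
// MediaWiki implementation.
class MagicWordSketch {
    public $mRegex;

    public function __construct( $caseSensitive, $synonyms ) {
        // Escape each synonym and join them into one alternation.
        $escaped = array();
        foreach ( $synonyms as $syn ) {
            $escaped[] = preg_quote( $syn, '/' );
        }
        $this->mRegex = '/' . implode( '|', $escaped ) . '/'
            . ( $caseSensitive ? '' : 'i' );
    }

    // Does the text contain the magic word (or any synonym)?
    public function match( $text ) {
        return (bool)preg_match( $this->mRegex, $text );
    }

    // Strip the magic word from the text, reporting whether it was found.
    public function matchAndRemove( &$text ) {
        $found = $this->match( $text );
        $text = preg_replace( $this->mRegex, '', $text );
        return $found;
    }
}

$notoc = new MagicWordSketch( false, array( '__NOTOC__' ) );
$text = "__NOTOC__\nSome article text.";
$suppressToc = $notoc->matchAndRemove( $text );
// $suppressToc is now true and the marker is gone from $text.
```

The parser-facing win is that adding a localised synonym is just another element in the array; nothing at the call sites changes.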
According to the PHP manual, strstr is much faster than preg_match for equivalent tasks, but I did a quick benchmark and found them to be pretty much the same (at least in the large-string limit). So I used regular expressions throughout.
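For anyone who wants to repeat the comparison, a rough micro-benchmark in the same spirit looks like the following (the original numbers aren't reproduced here, and results will vary by PHP version and string size):

```php
<?php
// Time strstr() against preg_match() searching a large string for a
// fixed magic word. Illustrative only; tune $iterations to taste.
$haystack   = str_repeat( 'Lorem ipsum dolor sit amet. ', 50000 ) . '__NOTOC__';
$iterations = 100;

$start = microtime( true );
for ( $i = 0; $i < $iterations; $i++ ) {
    $foundStr = ( strstr( $haystack, '__NOTOC__' ) !== false );
}
$strTime = microtime( true ) - $start;

$start = microtime( true );
for ( $i = 0; $i < $iterations; $i++ ) {
    $foundPreg = (bool)preg_match( '/__NOTOC__/', $haystack );
}
$pregTime = microtime( true ) - $start;

printf( "strstr: %.4fs  preg_match: %.4fs\n", $strTime, $pregTime );
```

In the large-string limit both are dominated by the scan over the haystack, which is consistent with the "pretty much the same" result.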
-- Tim Starling
wikitech-l@lists.wikimedia.org