The parser has to recognise and handle magic words like __TOC__. These words are defined in languages/messages/MessagesXx.php (and possibly overridden). In theory, that means *anything* (like "a" or even " ") could be a magic word. That makes it hard to write a fast parser: you would basically have to process the input one character at a time, checking for a match at each position before moving on to the next.
So, two questions: 1) Is it possible/feasible to restrict what could be a magic word in some way, e.g. by requiring that they start with __, or that they use only some range of characters? 2) Is it possible to get a complete list of all the magic words currently used across all the languages of Wikipedia? Do the contents of the languages/messages directory already represent that?
I realise that the term "magic word" is somewhat ambiguous: I'm primarily referring to words like __TOC__ that can appear virtually anywhere, rather than words like "subst:" that require a special context, or magic variables like PAGENAME, which (afaik) have to be wrapped in {{..}}.
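To make question 1 concrete, here is a rough sketch (in Python, purely for illustration) of the kind of scan a "__" prefix restriction would allow. Everything in it is an assumption: the word set is a hypothetical sample rather than the real list from the messages files, find_magic_words is a made-up name, and matching is case-sensitive for simplicity.

```python
# Hypothetical sample set; the real words would come from
# languages/messages/MessagesXx.php (plus any overrides).
MAGIC_WORDS = {"__TOC__", "__NOTOC__", "__NOEDITSECTION__"}


def find_magic_words(text, words=MAGIC_WORDS):
    """Return (offset, word) pairs, ASSUMING every magic word starts with '__'.

    Because of that assumption, the scanner can jump straight to the next
    '__' with str.find instead of testing every character position.
    """
    # Try longer words first so a longer word wins over a shorter prefix.
    by_length = sorted(words, key=len, reverse=True)
    hits = []
    pos = text.find("__")
    while pos != -1:
        for word in by_length:
            if text.startswith(word, pos):
                hits.append((pos, word))
                pos += len(word)  # skip past the matched word
                break
        else:
            pos += 1  # '__' here did not start a known word
        pos = text.find("__", pos)
    return hits


print(find_magic_words("intro __TOC__ body __NOTOC__"))
```

Without the prefix restriction, the outer loop would have to try every word at every offset (or build a more elaborate matcher), which is the cost I would like to avoid.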
Thanks, Steve
wikitech-l@lists.wikimedia.org