Sorry for the late reply.
Rowan Collins wrote:
On Thu, 23 Sep 2004 20:09:21 +0100, Timwi <timwi@gmx.net> wrote:
Not really. We can still recognise redirects with a regexp (or anything else in PHP) before passing the page to the parser.
But why make that a special case? Why say "before using the nice efficient real parser, use a not-a-parser to check for the #REDIRECT directive, and have it do some voodoo"? Far better to just have the parser recognise "#REDIRECT" (and any variants anyone wants) and output a parse tree with a special redirect node.
Why is that "better"? I prefer my suggestion because: * it might be more efficient because it means that we don't have to invoke the external parser just to find out whether what the user just submitted is a redirect or not * it means the parser needn't be programmed to recognise redirects (makes the code simpler) * it means we can assume that parse trees will be articles. Otherwise all output code would have to consider this special case. How should a class that is supposed to output LaTeX code react when you give it a redirect?
First of all, even in the current system there is no way for server admins to customise the magic words without modifying actual source code.
Well, technically, no, but Language*.php and LocalSettings.php are more like configuration files that happen to be executable for convenience. Editing the declaration of $wgMagicWordsEn in Language.php is no more difficult or involved than, say, editing a .ini file.
True. As I said, we can make it work quite the same way using #defines, except of course that the module would need to be recompiled.
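For reference, the bit of Language.php we are talking about looks roughly like this (I'm quoting the format from memory, so treat the details as approximate); the C module would carry an equivalent list at compile time:

  $wgMagicWordsEn = array(
      # ID           => array( case-sensitive?, synonyms... )
      MAG_REDIRECT   => array( 0, '#REDIRECT' ),
      MAG_NOTOC      => array( 0, '__NOTOC__' ),
  );

Adding a synonym means adding one more string to the relevant array; with #defines the equivalent edit would be just as small, it would only cost a recompile.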
Actually, I have to admit I had no idea how difficult it would be, but I assumed it would mean having at least a compiler, if not a compiler-compiler and a whole load of other tools. Editing PHP doesn't need that kind of thing, and the way it's designed now, you needn't notice you're editing code.
That is also true. But I really don't see why having a compiler available should be such a hurdle.
If it were possible to only require a C compiler, it would certainly be a favour to other admins running MediaWiki. It's going to be annoying enough for some of them to have to deal with a binary part as well as PHP.
As I mentioned before, it is *not* necessary for anyone to "deal with" anything. People can continue to use the old not-a-parser if they want!
I think, considering all of these problems we have discussed, it makes a lot of sense to formulate a "rule" that the design of the parser should fulfill: the parser must know in advance how to parse everything; the resulting parse tree must not depend on anything other than the input wiki text.
Yep, I think you're probably right on that one. And as you say, the more things that can be done inside the parser, the better, since outside means PHP, and is likely to be less efficient.
I had an alternative idea. Currently I'm passing the wiki text as a string parameter to the function that does the actual parsing. I could have it accept a second parameter, a language code, which would determine which set of magic words it recognises.
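Roughly like this (the function name parse_wikitext is made up, and whether the second argument is a language code or a full list of magic words is still open):

  # today: wiki text in, parse tree out
  $tree = parse_wikitext( $wikiText );

  # proposed: a second argument selecting which language's magic words
  # (e.g. '#WEITERLEITUNG' as a synonym for '#REDIRECT') are recognised
  $tree = parse_wikitext( $wikiText, 'de' );

That would keep the rule above intact: the parse tree depends only on the input text plus that one explicit parameter, nothing hidden.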
Just an idea.
Timwi