[Wikitech-l] Re: Test my lex/yacc parser!

29 Sep 2004

Sorry for the late reply.

Rowan Collins wrote:

...
  On Thu, 23 Sep 2004 20:09:21 +0100, Timwi <timwi(a)gmx.net> wrote:

  Not really. We can still recognise redirects with a regexp (or anything
  else in PHP) before passing the page to the parser.
 But why make that a special case? Why say "before using the nice
 efficient real parser, use a not-a-parser to check for the #REDIRECT
 directive, and have it do some voodoo"? Far better to just have the
 parser recognise "#REDIRECT" (and any variants anyone wants) and
 output a parse tree with a special redirect node.
Why is that "better"? I prefer my suggestion because:
* it might be more efficient because it means that we don't have to
   invoke the external parser just to find out whether what the user just
   submitted is a redirect or not
* it means the parser needn't be programmed to recognise redirects
   (makes the code simpler)
* it means we can assume that parse trees will be articles. Otherwise
   all output code would have to consider this special case. How should a
   class that is supposed to output LaTeX code react when you give it a
   redirect?
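
To make the first and third points concrete, here is a minimal sketch of such a pre-check, written in Python rather than PHP purely for brevity. The pattern, the function names and the `parse` callback are all assumptions made up for illustration; the real check would also have to cover whatever localised variants of the directive are configured.

```python
import re

# Hypothetical pattern: "#REDIRECT" (case-insensitive), optional whitespace,
# then a [[target]] link with an optional |label part that is ignored.
REDIRECT_RE = re.compile(
    r"^\s*#REDIRECT\s*\[\[([^\]|]+)(?:\|[^\]]*)?\]\]", re.IGNORECASE
)

def redirect_target(wikitext):
    """Return the redirect target, or None if the text is an ordinary article."""
    match = REDIRECT_RE.match(wikitext)
    return match.group(1).strip() if match else None

def handle_submission(wikitext, parse):
    """Invoke the (external) parser only for pages that are not redirects.

    `parse` is a stand-in for the real parser call.
    """
    target = redirect_target(wikitext)
    if target is not None:
        return ("redirect", target)
    return ("article", parse(wikitext))
```

With this arrangement the parser is never called for a redirect, so anything that consumes a parse tree (a LaTeX outputter, say) only ever sees articles.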

...
  First of all, even in the current system there is no way for server
  admins to customise the magic words without modifying actual source
  code.
 Well, technically, no, but Language*.php and LocalSettings.php are
 more like configuration files that happen to be executable for
 convenience. Editing the declaration of $wgMagicWordsEn in
 Language.php is no more difficult or involved than, say, editing a
 .ini file. 
True. As I said, we can make it work quite the same way using #defines, 
except of course that the module would need to be recompiled.

...
 Actually, I have to admit I had no idea how difficult it would be, but
 I assumed it would mean having at least a compiler, if not a
 compiler-compiler and a whole load of other tools. Editing PHP doesn't
 need that kind of thing, and the way it's designed now, you needn't
 notice you're editing code.
That is also true. But I really don't see why it's so hard to have a 
compiler?

...
 If it were possible to only require a C compiler, it would certainly
 be a favour to other admins running MediaWiki. It's going to be
 annoying enough for some of them to have to deal with a binary part as
 well as PHP.
As I mentioned before, it is *not* necessary for anyone to "deal with" 
anything. People can continue to use the old not-a-parser if they want!

...
  I think, considering all of these problems we have discussed, it makes a
  whole lot of sense to formulate a "rule" that the design of the parser
  should fulfill: The parser must know in advance how to parse everything.
  The resulting parse tree must not depend on anything other than the
  input wiki text.
 Yep, I think you're probably right on that one. And as you say, the
 more things that can be done inside the parser, the better, since
 outside means PHP, and is likely to be less efficient.
I had an alternative idea. Currently I'm passing the wiki text as a 
string parameter to the function that does the actual parsing. I could 
have it accept a second parameter, a language code, which would 
influence the magic words.

Just an idea.
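
Roughly what I have in mind, as a toy sketch in Python (the real thing would be the C entry point of the lex/yacc parser). The table contents, the node shapes and the whitespace tokenising are all made up for illustration; only the extra `lang` parameter is the point.

```python
# Hypothetical magic-word tables keyed by language code. The real lists
# live in the Language*.php files; these entries are illustrative only.
MAGIC_WORDS = {
    "en": {"__NOTOC__": "notoc"},
    "de": {"__NOTOC__": "notoc", "__KEIN_INHALTSVERZEICHNIS__": "notoc"},
}

def parse(wikitext, lang="en"):
    """Toy parser: the result depends only on the text and the language code."""
    # Unknown language codes fall back to the English table (an assumption).
    words = MAGIC_WORDS.get(lang, MAGIC_WORDS["en"])
    nodes = []
    for token in wikitext.split():
        if token in words:
            nodes.append(("magic", words[token]))
        else:
            nodes.append(("text", token))
    return nodes
```

The parse tree is then a function of the wiki text plus the language code, which bends the "rule" above slightly but still keeps the parser free of any other outside state.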

Timwi
