Revision: 4439
Author: cosoleto
Date: 2007-10-10 09:39:51 +0000 (Wed, 10 Oct 2007)

Log Message:
-----------
Disabled debug messages by default. Added __init__.py file.

Modified Paths:
--------------
    trunk/pywikiparser/Parser.py

Added Paths:
-----------
    trunk/pywikiparser/__init__.py

Modified: trunk/pywikiparser/Parser.py
===================================================================
--- trunk/pywikiparser/Parser.py 2007-10-10 09:34:00 UTC (rev 4438)
+++ trunk/pywikiparser/Parser.py 2007-10-10 09:39:51 UTC (rev 4439)
@@ -1,4 +1,4 @@
-# -*- coding: utf-8 -*-
+# -*- coding: utf-8 -*-
 """ Mediawiki wikitext parser """
 #
 # (C) 2007 Merlijn 'valhallasw' van Deen
@@ -15,11 +15,20 @@

 from Lexer import Lexer, Tokens

+_debug = False
+
+def dbgmsg(text):
+    if _debug:
+        print 'debug> ' + text
+
 class ParseError(Exception):
     """ Parsing Error """

 class Parser:
-    def __init__(self, data):
+    def __init__(self, data, debug = False):
+        global _debug
+        _debug = debug
+
         self.lex = BufferedReader(Lexer(data).lexer())

     def expect(self, tokens):
@@ -55,7 +64,7 @@
                 break

             node = self.parsetoken(token, restore)
-            print "Adding %r (was %r)" % (node,token)
+            dbgmsg("Adding %r (was %r)" % (node,token))
             self.par.extend(node)
             restore = self.lex.commit(restore)

@@ -105,7 +114,7 @@
                 newitalic = not self.italic
                 newbold = not self.bold

-            print 'bold: %r>%r italic: %r>%r' % (self.bold, newbold, self.italic, newitalic)
+            dbgmsg('bold: %r>%r italic: %r>%r' % (self.bold, newbold, self.italic, newitalic))
             if self.italic and not newitalic:
                 if self.par.name == 'i' or not newbold:
                     self.par = self.par.parent
@@ -283,7 +292,7 @@
         self.expect(Tokens.CURL_OPEN)
         self.expect(Tokens.CURL_OPEN)
         pre = self.eat(Tokens.CURL_OPEN)
-        print 'pre: ' + pre
+        dbgmsg('pre: ' + pre)
         if pre:
             retval.append(pre)

@@ -302,7 +311,7 @@
         raise ParseError("Needs implementation")

     def parseWikitable(self):
-        raise ParseError("Needs implementation")
+        raise ParseError("Needs implementation")

     titlere = re.compile(r"[^^]<>[|{}\n]*$")
     def parseTitle(self, closetoken):
@@ -314,7 +323,7 @@
             elif next[0] == Tokens.CURL_OPEN: # allow templates to expand
                 restore = self.lex.getrestore()
                 data = self.parseCURL_OPEN(restore)
-                print 'Parsed template: %r' % (data,)
+                dbgmsg('Parsed template: %r' % (data,))
                 for item in data:
                     if isinstance(item, basestring):
                         if not self.titlere.match(item):
@@ -326,5 +335,3 @@
                     raise ParseError('illegal wiki link')
                 title.append(next[1])
         return title
-
-
\ No newline at end of file

Added: trunk/pywikiparser/__init__.py
===================================================================
--- trunk/pywikiparser/__init__.py (rev 0)
+++ trunk/pywikiparser/__init__.py 2007-10-10 09:39:51 UTC (rev 4439)
@@ -0,0 +1,10 @@
+# -*- coding: utf-8 -*-
+""" Mediawiki wikitext parser """
+#
+# (C) 2007 Merlijn 'valhallasw' van Deen
+#
+# Distributed under the terms of the MIT license.
+#
+__version__ = u'$Id$'
+
+from Parser import Parser

Why was this change committed? The parser is nowhere near usable so disabling debug messages by default is just plain useless. Yes, it creates some sort of parse tree, but using it is a whole different thing. Secondly, I do *not* want to guarantee any backwards compatibility. This means that using the current parser is just plain stupid, because the input and output formats can change, which would mean you would have to rewrite your code.
Ping Yeh was working on a full C-based parser, but I have not yet seen a response. I will mail wikitech-l about that parser later, hoping that he will respond then.
--Merlijn (valhallasw)
Mmm... Thank you for your feedback. I have used pywikiparser to fix a problem in copyright.py, which sometimes needs to recognize italic text. This is only a small use of pywikiparser. I tested the change extensively and concluded that the modified function works better; I had no other solution available. Certainly, I don't think your code is usable for other purposes: I see it as INCOMPLETE, which it is. :) I do hope to see more fixes soon, though.
It is no problem if you want to enable debug output by default; I can change my code... May I suggest that you write these (and more) warnings into the pywikiparser code? In my opinion, it would be a good thing to move the 'pywikiparser' directory into 'pywikipedia': the more people see the code, the more chance of improvements (besides, copyright.py tries to import it). It is very nice to hear about a C implementation too.
Francesco Cosoleto
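
The suggestion above, to write the stability warnings into the code itself, could look roughly like the following. It is only a sketch in Python 3 using the standard `warnings` module; the function name `warn_unstable` and the message text are assumptions for illustration and are not part of pywikiparser.

```python
import warnings


def warn_unstable(module_name):
    """Tell callers that the module's interface may change without notice."""
    warnings.warn(
        "%s is incomplete; its input and output formats may change "
        "without backwards compatibility" % module_name,
        FutureWarning,   # displayed by default under the standard filters
        stacklevel=2,    # attribute the warning to the caller
    )


# Hypothetical use: call once when the package is imported.
# The warning is captured here so that it can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_unstable("pywikiparser")

print(caught[0].category.__name__)
```

A caller who accepts the risk can silence the message with `warnings.filterwarnings`, so the warning stays visible by default without blocking use of the module.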