https://bugzilla.wikimedia.org/show_bug.cgi?id=54574
--- Comment #4 from Kunal Mehta (Legoktm) <legoktm.wikipedia(a)gmail.com> ---
Wow, that's a big patch =\)
\* codecs is fine with me
\* can you avoid lines > 80 characters? I know that this is not something we
do everywhere, but that's bad looking code. Same goes for if foo: bar. Please
skip a line.
\* can you document thoroughly what's being done? parameters in the generators?
In replace.py ? I find it really hard to understand the "choice" table in the
docstring explaining -scanonce & others.
\* What's this:
\+ f = codecs.open\(filename, 'r', 'utf-8'\)
\+ f.close\(\)
??
I am also not convinced by the fact that after each page, FilterFileAppend is
called, and \#1 path is computed, \#2 a file is opened, written in, and closed.
I'm thinking that a possible cleaner way to do this would be to have a Filter
object: put everything you need in it \(an opened file descriptor, a list of
titles to ignore if you need to use this, etc...\) and keep a reference to it
from the replace & disambig bots. How does that sound to you?
I also know that Daniel wanted first to keep the same file format, but... a
couple of things are wrong here:
\* if you output titles with page.urlname\(\) it will not be possible to read
the file with TextfilePageGenerator afaik. Think of special characters, being
url encoded, and not decoded.
\* if you want to use a Page title for a filename, you want
Page.titleforFilename, not Page.urlname
Thank you\!
--
You are receiving this mail because:
You are the assignee for the bug.