Patches item #2790445, was opened at 2009-05-12 06:30
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2790445...
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: None
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: sigmaoctantis (sigmaoctantis)
Assigned to: Nobody/Anonymous (nobody)
Summary: Re 1843798: Add capability to remember pages to replace.py
Initial Comment:
A new patch to implement toobaz's function with the changes suggested by wikipedian.
https://sourceforge.net/tracker/?func=detail&aid=1843798&group_id=93...
- solve_disambiguation.py and pagegenerators.py:
1. Generator and logging function for -primary option moved from solve_disambiguation.py to pagegenerators.py
2. TODO in solve_disambiguation.py done: generator now starts yielding before all referring pages have been found
3. makes use of new TextfilePageGenerator
4. code is a few lines shorter
- replace.py:
5. "-exclude" option from toobaz's patch implemented. It filters the generator through a list of previously edited pages. New pages are appended to the filter file based on the choices made:
   -exclude: logs choice "N" to the filter
6. additional command line options for other settings:
   -editonce: logs choices "Y", "A" to the filter
   -treatonce: logs choices "Y", "A", "N" to the filter
   -scanonce: logs choices "Y", "A", "N" to the filter; no change
7. uses generator and file format from solve_disambiguation.py (suggested by wikipedian below)
8. default filter filename is the name of the fix. Files are placed in a subdirectory "replace".
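For readers trying to follow items 5 and 6, the option-to-choice mapping could be sketched roughly as below. This is only an illustration of the table described above, not code from the patch; the names LOGGED_CHOICES and should_log are hypothetical.

```python
# Hypothetical sketch of the option/choice table from items 5 and 6.
# Keys are the command line options; values are the user choices that
# cause the page title to be appended to the filter file.
LOGGED_CHOICES = {
    "-exclude": {"N"},
    "-editonce": {"Y", "A"},
    "-treatonce": {"Y", "A", "N"},
    "-scanonce": {"Y", "A", "N"},
}


def should_log(option, choice):
    """Return True if `choice` under `option` should be logged to the filter."""
    return choice.upper() in LOGGED_CHOICES.get(option, set())


if __name__ == "__main__":
    for option in sorted(LOGGED_CHOICES):
        logged = [c for c in "YAN" if should_log(option, c)]
        print(option, "logs", logged)
```

Keeping the table in one dictionary would also make it easier to document in the docstring than a hand-drawn choice table.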
----------------------------------------------------------------------
Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2009-05-12 12:47
Message: Wow, that's a big patch =)
* codecs is fine with me
* can you avoid lines > 80 characters? I know that this is not something we do everywhere, but that's bad looking code. Same goes for if foo: bar. Please skip a line.
* can you document thoroughly what's being done? parameters in the generators? In replace.py ? I find it really hard to understand the "choice" table in the docstring explaining -scanonce & others.
* What's this:
+ f = codecs.open(filename, 'r', 'utf-8')
+ f.close()
??
I am also not convinced by the fact that after each page, FilterFileAppend is called, and #1 path is computed, #2 a file is opened, written in, and closed. I'm thinking that a possible cleaner way to do this would be to have a Filter object: put everything you need in it (an opened file descriptor, a list of titles to ignore if you need to use this, etc...) and keep a reference to it from the replace & disambig bots. How does that sound to you?
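To make the suggestion concrete, here is a minimal sketch of what such a Filter object might look like. It assumes a plain UTF-8 file with one title per line; the class name PageFilter and its methods are hypothetical and not part of pywikipedia or of the patch.

```python
import codecs
import os


class PageFilter(object):
    """Hypothetical filter object: loads known titles once, keeps the
    file open, and appends new titles without reopening it each time."""

    def __init__(self, filename):
        self.seen = set()
        # Read any previously logged titles (one per line).
        if os.path.exists(filename):
            with codecs.open(filename, 'r', 'utf-8') as f:
                for line in f:
                    title = line.strip()
                    if title:
                        self.seen.add(title)
        # Open once for appending; reused for every page.
        self._file = codecs.open(filename, 'a', 'utf-8')

    def __contains__(self, title):
        return title in self.seen

    def add(self, title):
        """Remember a title and append it to the file if it is new."""
        if title not in self.seen:
            self.seen.add(title)
            self._file.write(title + u'\n')
            self._file.flush()

    def close(self):
        self._file.close()
```

A bot would create one PageFilter at startup, test `title in page_filter` while iterating over the generator, call `add()` after the relevant choices, and call `close()` when it finishes. The path is computed once by the caller, and the file is opened once.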
I also know that Daniel wanted first to keep the same file format, but... a couple of things are wrong here:
* if you output titles with page.urlname() it will not be possible to read the file with TextfilePageGenerator afaik. Think of special characters, being url encoded, and not decoded.
* if you want to use a Page title for a filename, you want Page.titleforFilename, not Page.urlname
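The encoding concern can be illustrated with the standard library alone. The sketch below does not call pywikipedia; url_encoded_title is a hypothetical stand-in for a URL-encoded title, and the example title is made up. It only shows that a title written URL-encoded and read back verbatim no longer matches the original unless something decodes it.

```python
from urllib.parse import quote, unquote


def url_encoded_title(title):
    # Hypothetical stand-in for a URL-encoded page title:
    # spaces become underscores, everything else is percent-encoded.
    return quote(title.replace(" ", "_"), safe="")


title = "Café (disambiguation)"
encoded = url_encoded_title(title)

# What a reader gets if the file content is taken as a plain title.
read_back = encoded.replace("_", " ")

# What a reader gets if it explicitly decodes the percent-escapes.
decoded = unquote(encoded).replace("_", " ")

print(encoded)
print(read_back == title)   # the verbatim title does not round-trip
print(decoded == title)     # decoding restores the original title
```

So either the file should store plain titles, or the reader has to decode them.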
Thank you!
----------------------------------------------------------------------