[Pywikipedia-bugs] [ pywikipediabot-Patches-2790445 ] Re 1843798: Add capability to remember pages to replace.py

12 May 2009


Patches item #2790445, was opened at 2009-05-12 06:30
Message generated for change (Comment added) made by nicdumz
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2790445...
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: sigmaoctantis (sigmaoctantis)
Assigned to: Nobody/Anonymous (nobody)
Summary: Re 1843798: Add capability to remember pages to replace.py
Initial Comment:
A new patch to implement toobaz's function with the changes suggested by wikipedian.
https://sourceforge.net/tracker/?func=detail&aid=1843798&group_id=93...
- solve_disambiguation.py  and  pagegenerators.py:
1.  Generator and logging function for -primary option moved
    from solve_disambiguation.py to pagegenerators.py
2.  TODO in solve_disambiguation.py done:
    generator now starts yielding before all referring pages have been found
3.  makes use of new TextfilePageGenerator
4.  code is a few lines shorter
- replace.py:
5.  "-exclude" option from toobaz's patch implemented.
    Allows filtering the generator through a list of previously edited pages.
    New pages are appended to the filter file based on choices made:
    -exclude:   logs to filter choice "N"
6.  additional command line options for other settings:
    -editonce:  logs to filter choices "Y", "A"
    -treatonce: logs to filter choices "Y", "A", "N"
    -scanonce:  logs to filter choices "Y", "A", "N"; no change
7.  uses generator and file format from solve_disambiguation.py
    (suggested by wikipedian below)
8.  default filter filename is the name of the fix. Files are placed
    in a subdirectory "replace".
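A minimal sketch of how the option/choice table in items 5 and 6 could be read, for illustration only. The option names and the choice letters "Y", "A", "N" are taken from the list above; the names LOGGED_CHOICES and should_log are hypothetical and are not part of the patch.

```python
# Hypothetical lookup table: which answers cause a page to be appended
# to the filter file, per command line option (as listed above).
LOGGED_CHOICES = {
    "-exclude": {"N"},
    "-editonce": {"Y", "A"},
    "-treatonce": {"Y", "A", "N"},
    # Same choices as -treatonce; per the list, -scanonce makes no change.
    "-scanonce": {"Y", "A", "N"},
}


def should_log(option, choice):
    """Return True if a page answered with `choice` is added to the filter."""
    return choice.upper() in LOGGED_CHOICES.get(option, set())
```

With this reading, should_log("-exclude", "N") is true while should_log("-exclude", "Y") is false, i.e. -exclude only remembers pages that were declined.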
----------------------------------------------------------------------
...
Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2009-05-12 12:47
Message:
Wow, that's a big patch =)
* codecs is fine with me
* can you avoid lines > 80 characters? I know that this is not something
we do everywhere, but it makes for bad-looking code. Same goes for
"if foo: bar" on one line. Please put the body on its own line.
* can you document thoroughly what's being done? parameters in the
generators? In replace.py ? I find it really hard to understand the
"choice" table in the docstring explaining -scanonce & others. 
* What's this:
+        f = codecs.open(filename, 'r', 'utf-8')
+        f.close()
??
I am also not convinced by the fact that FilterFileAppend is called after
each page, so that every time #1 the path is computed and #2 a file is
opened, written to, and closed.
I'm thinking that a possible cleaner way to do this would be to have a
Filter object: put everything you need in it (an opened file descriptor, a
list of titles to ignore if you need to use this, etc...) and keep a
reference to it from the replace & disambig bots. How does that sound to
you?
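To make the Filter object idea concrete, here is a rough sketch under stated assumptions. The class name PageFilter, its methods, and the one-title-per-line file format are all hypothetical; this is not code from the patch or from pywikipedia. It only shows the shape: read the known titles once, keep one file handle open, and append as pages are treated.

```python
import codecs
import os


class PageFilter(object):
    """Hypothetical helper: remembers titles and appends new ones to a file."""

    def __init__(self, filename):
        self.filename = filename
        self.titles = set()
        # Load previously remembered titles once, at start-up.
        if os.path.exists(filename):
            with codecs.open(filename, 'r', 'utf-8') as f:
                for line in f:
                    title = line.strip()
                    if title:
                        self.titles.add(title)
        # Keep a single append handle open for the whole bot run.
        self._file = codecs.open(filename, 'a', 'utf-8')

    def __contains__(self, title):
        return title in self.titles

    def add(self, title):
        """Remember a title; write it out only if it is new."""
        if title not in self.titles:
            self.titles.add(title)
            self._file.write(title + u'\n')
            self._file.flush()

    def close(self):
        self._file.close()
```

A bot would create one such object, test "title in page_filter" before treating a page, call add() afterwards, and close() when done, so the path is computed and the file opened only once.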
I also know that Daniel wanted first to keep the same file format, but...
a couple of things are wrong here:
* if you output titles with page.urlname() it will not be possible to read
the file with TextfilePageGenerator afaik. Think of special characters,
which are URL-encoded when written and never decoded when read back.
* if you want to use a Page title for a filename, you want
Page.titleforFilename, not Page.urlname
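A small illustration of the encoding concern, using Python 3's urllib.parse as a stand-in. This is not pywikipedia's urlname() implementation, and the example title is made up; it only shows that a percent-encoded name differs from the title unless something decodes it again.

```python
from urllib.parse import quote, unquote

title = u"Zürich"

# Stand-in for a URL-style page name: non-ASCII characters are
# percent-encoded as UTF-8 bytes.
encoded = quote(title)

# A reader that takes each line as a plain title would see the encoded
# form, which is not the original title.
matches_without_decoding = (encoded == title)

# Only an explicit decoding step restores the title.
decoded = unquote(encoded)
```

Here encoded is "Z%C3%BCrich", so a file written this way would have to be decoded on the reading side before the titles match again.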
Thank you!
----------------------------------------------------------------------