Patches item #1843798, was opened at 2007-12-03 21:45
Message generated for change (Comment added) made by sigmaoctantis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1843798&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Pietro Battiston (toobaz)
Assigned to: Nobody/Anonymous (nobody)
Summary: Add capability to remember pages to replace.py
Initial Comment:
During very long semi-automatic replacement runs, you may have to kill the bot and start it again. You then have to answer "no" again to every unwanted replacement. It is even worse if you are using an XML dump: it can be several weeks old, so the bot downloads a lot of pages that were ALREADY corrected.
This patch consists of two parts:
1) a patch to replace.py that adds a new parameter, "-exclude", which takes the path to a file used both for:
-> knowing which articles to exclude from substitution
-> logging the pages whose replacements were declined and the pages already known not to need replacements
2) a patch to pagegenerators.py that adds a generator filter that yields only pages not appearing in a given list
The only doubt I have is: should replace.py log in some other way? XML? The wikipedia module's predefined functions? Logging to a given Wikipedia user page (so that logs can easily be shared)?
As I've done it, it needs to import the os and codecs modules... I don't know if that's a problem.
Anyway, a patch like this is really needed; if necessary, I can try to improve it.
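For illustration only, the two parts described above could be sketched roughly as follows. This is not the attached patch: the function names (load_excluded, exclude_filter, log_excluded) and the plain "one title per line" file layout are assumptions made for the example, and it uses Python 3's built-in open with an encoding rather than the codecs module mentioned above.

```python
import os


def load_excluded(path):
    """Return the set of titles stored in `path` (empty set if the file is missing)."""
    if not os.path.exists(path):
        return set()
    with open(path, "r", encoding="utf-8") as f:
        return set(line.strip() for line in f if line.strip())


def exclude_filter(titles, excluded):
    """Generator filter: yield only the titles that do not appear in `excluded`."""
    for title in titles:
        if title not in excluded:
            yield title


def log_excluded(path, title):
    """Append one title to the exclusion file so later runs skip it."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(title + "\n")
```

A run would load the file once, wrap its page source in exclude_filter, and call log_excluded whenever the user declines a replacement.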
----------------------------------------------------------------------
Comment By: sigmaoctantis (sigmaoctantis)
Date: 2009-05-12 00:32
Message:
see patch ID: 2790445
https://sourceforge.net/tracker/?func=detail&aid=2790445&group_id=93107&ati…
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-06-23 16:22
Message:
Logged In: NO
closed this patch?
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2008-01-16 09:20
Message:
Logged In: NO
replace.py already has the option -xmlstart:page when using an xml dump,
to skip all entries before "page".
----------------------------------------------------------------------
Comment By: Daniel Herding (wikipedian)
Date: 2008-01-16 07:35
Message:
Logged In: YES
user_id=880694
Originator: NO
We already have something very similar for solve_disambiguation.py. When
you run it with the -primary parameter, e.g. on [[en:London]], it saves all
page titles where the user pressed 'N' to the 'disambiguations' directory,
and skips these pages when you run the same command later.
It saves the URL-encoded titles into a text file, one title per line,
without [[brackets]].
It would be nice if some code could be shared, although I'm not sure if
that's possible (I haven't yet looked at your code, but
solve_disambiguation.py is a bit complicated). But we should keep
solve_disambiguation's format because there are probably people who want to
keep using their logs.
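A minimal sketch of the file format described in this comment (URL-encoded titles, one per line, no brackets). The helper names are hypothetical, and the exact encoding rules used by solve_disambiguation.py are not shown in this thread, so urllib.parse.quote with its default settings is an assumption.

```python
from urllib.parse import quote, unquote


def write_titles(path, titles):
    """Append URL-encoded titles to `path`, one title per line."""
    with open(path, "a", encoding="ascii") as f:
        for title in titles:
            # quote() percent-encodes non-ASCII characters and spaces,
            # so the file stays pure ASCII.
            f.write(quote(title) + "\n")


def read_titles(path):
    """Read the file back and return the decoded titles in order."""
    with open(path, "r", encoding="ascii") as f:
        return [unquote(line.strip()) for line in f if line.strip()]
```

Keeping one encoded title per line means existing log files stay readable with the same reader, which is the compatibility concern raised above.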
----------------------------------------------------------------------
Patches item #2790445, was opened at 2009-05-12 00:30
Message generated for change (Settings changed) made by sigmaoctantis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2790445&group_…
Category: None
Group: None
Status: Open
Resolution: None
>Priority: 7
Private: No
Submitted By: sigmaoctantis (sigmaoctantis)
Assigned to: Nobody/Anonymous (nobody)
Summary: Re 1843798: Add capability to remember pages to replace.py
Initial Comment:
A new patch to implement toobaz's function with the changes suggested by wikipedian.
https://sourceforge.net/tracker/?func=detail&aid=1843798&group_id=93107&ati…
- solve_disambiguation.py and pagegenerators.py:
1. Generator and logging function for -primary option moved
from solve_disambiguation.py to pagegenerators.py
2. TODO in solve_disambiguation.py done:
generator now starts yielding before all referring pages have been found
3. makes use of new TextfilePageGenerator
4. code is a few lines shorter
- replace.py:
5. "-exclude" option from toobaz's patch implemented.
Allows filtering the generator through a list of previously edited pages.
New pages are appended to the filter file based on choices made:
-exclude: logs to filter choice "N"
6. additional command line options for other settings:
-editonce: logs to filter choices "Y", "A"
-treatonce: logs to filter choices "Y", "A", "N"
-scanonce: logs to filter choices "Y", "A", "N"; no change
7. uses generator and file format from solve_disambiguation.py
(suggested by wikipedian below)
8. default filter filename is the name of the fix. Files are placed
in a subdirectory "replace".
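As a rough illustration of items 5 and 6, the relation between the command line options and the logged choices can be written as a lookup table. The names LOGGED_CHOICES and should_log are hypothetical and not taken from the patch, and the "no change" qualifier on -scanonce is not modeled here.

```python
# Which user choices cause a page to be appended to the filter file,
# per option, as listed in items 5 and 6 above.
LOGGED_CHOICES = {
    "-exclude": {"N"},
    "-editonce": {"Y", "A"},
    "-treatonce": {"Y", "A", "N"},
    "-scanonce": {"Y", "A", "N"},
}


def should_log(option, choice):
    """Return True if `choice` should be logged to the filter file under `option`."""
    return choice.upper() in LOGGED_CHOICES.get(option, set())
```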
----------------------------------------------------------------------
Patches item #2790445, was opened at 2009-05-12 00:30
Message generated for change (Tracker Item Submitted) made by sigmaoctantis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2790445&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: sigmaoctantis (sigmaoctantis)
Assigned to: Nobody/Anonymous (nobody)
Summary: Re 1843798: Add capability to remember pages to replace.py
----------------------------------------------------------------------
Bugs item #2790339, was opened at 2009-05-12 00:26
Message generated for change (Tracker Item Submitted) made by platonides
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2790339&group_…
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Platonides (platonides)
Assigned to: Nobody/Anonymous (nobody)
Summary: Problems with revdelete
Initial Comment:
Wikipedia is now using revdelete in everyday use.
This means that content, editor or summaries may be selectively deleted.
pywikipediabot isn't prepared to handle that, and will break by throwing an exception, even if that field isn't needed for the specific action being performed.
I encountered it with instances of <contributor deleted="deleted"/>, but it will likely happen with all deleted fields.
If you use a -page: generator on a page where the contributor is deleted, it'll throw from
xmlreader.py, line 180, in endElement
text, self.username,
AttributeError: MediaWikiXmlHandler instance has no attribute 'username'
MediaWikiXmlHandler instance has no attribute 'username'
If you were using a -xml: generator, it's much more cryptic:
xmlreader.py", line 64, in __init__
self.username = username.strip()
AttributeError: 'bool' object has no attribute 'strip'
'bool' object has no attribute 'strip'
Using r6870.
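A minimal sketch of reading the contributor defensively, assuming a revision element shaped like the MediaWiki export format. It is not the xmlreader.py code from the tracebacks above: it uses xml.etree.ElementTree for brevity, and the function name contributor_name is hypothetical.

```python
import xml.etree.ElementTree as ET


def contributor_name(revision_xml):
    """Return the contributor's username or IP, or None if it was revdeleted."""
    revision = ET.fromstring(revision_xml)
    contributor = revision.find("contributor")
    # <contributor deleted="deleted"/> carries no username or ip child.
    if contributor is None or contributor.get("deleted") is not None:
        return None
    name = contributor.findtext("username")
    if name is None:
        name = contributor.findtext("ip")
    return name.strip() if name else None
```

Returning None instead of assuming a string avoids both failures reported above: the missing attribute and the call to strip() on a non-string value.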
----------------------------------------------------------------------
Bugs item #2790121, was opened at 2009-05-11 16:46
Message generated for change (Comment added) made by nicdumz
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2790121&group_…
Category: interwiki
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: David Crochet (crochet_david)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki.py crash
Initial Comment:
Up to r6870, interwiki.py crashes:
dcrochet@linux-fxgk:~/pywikipedia> python version.py
Pywikipedia (r6751 (wikipedia.py), avr 29 2009, 16:21:41)
Python 2.4.2 (#1, Jan 10 2008, 17:45:02)
[GCC 4.1.2 20070115 (prerelease) (SUSE Linux)]
dcrochet@linux-fxgk:~/pywikipedia>
dcrochet@linux-fxgk:~/pywikipedia> python interwiki.py -prefixindex:"user:crochet.david"
File "interwiki.py", line 2129
finally:
^
SyntaxError: invalid syntax
dcrochet@linux-fxgk:~/pywikipedia>
dcrochet@linux-fxgk:~/pywikipedia> python version.py
Pywikipedia (r6858 (wikipedia.py), mai 08 2009, 15:23:29)
Python 2.4.2 (#1, Jan 10 2008, 17:45:02)
[GCC 4.1.2 20070115 (prerelease) (SUSE Linux)]
dcrochet@linux-fxgk:~/pywikipedia>
dcrochet@linux-fxgk:~/pywikipedia> python interwiki.py -prefixindex:"user:crochet.david"
File "interwiki.py", line 2150
finally:
^
SyntaxError: invalid syntax
dcrochet@linux-fxgk:~/pywikipedia>
----------------------------------------------------------------------
>Comment By: NicDumZ — Nicolas Dumazet (nicdumz)
Date: 2009-05-11 18:01
Message:
correct; try...except... finally is Python >= 2.5 only.
I fixed this in r6871, thanks :)
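For reference, a sketch of the usual portable form: before Python 2.5, a single try statement could not combine except and finally, so the try...except is nested inside a try...finally. This is not necessarily the change made in r6871, and the names run, action and cleanup are hypothetical.

```python
def run(action, cleanup):
    """Call action(), swallow ValueError, and always call cleanup()."""
    try:
        # Inner block handles the exception ...
        try:
            return action()
        except ValueError:
            return None
    finally:
        # ... outer block guarantees the cleanup, even on Python 2.4.
        cleanup()
```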
----------------------------------------------------------------------
Bugs item #2790121, was opened at 2009-05-11 16:46
Message generated for change (Tracker Item Submitted) made by crochet_david
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2790121&group_…
Category: interwiki
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: David Crochet (crochet_david)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki.py crash
----------------------------------------------------------------------
Patches item #2729464, was opened at 2009-04-03 21:54
Message generated for change (Comment added) made by tieump
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2729464&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Tieum P (tieump)
Assigned to: Nobody/Anonymous (nobody)
Summary: Proposed patch for getting new pages from ar wiki
Initial Comment:
The attached patch to wikipedia.py allows retrieving newpages from the ar wiki (and possibly other "right to left" wikis).
----------------------------------------------------------------------
Comment By: Tieum P (tieump)
Date: 2009-05-10 20:20
Message:
Done.
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-05-07 05:37
Message:
Patch is outdated or does not apply. Please provide an updated patch based
on the current trunk.
----------------------------------------------------------------------
Patches item #2729464, was opened at 2009-04-03 21:54
Message generated for change (Settings changed) made by tieump
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2729464&group_…
Category: None
Group: None
>Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Tieum P (tieump)
Assigned to: Nobody/Anonymous (nobody)
Summary: Proposed patch for getting new pages from ar wiki
----------------------------------------------------------------------