Patches item #3108310, was opened at 2010-11-12 21:42
Message generated for change (Comment added) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3108310&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: lankier (lankier)
Assigned to: Nobody/Anonymous (nobody)
Summary: parameter expandtemplates for Page.linkedPages
Initial Comment:
Added parameter expandtemplates for Page.linkedPages. I think it is a useful parameter.
----------------------------------------------------------------------
>Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-03-21 09:53
Message:
I think so too, but I'm not sure if 'expandtemplates' is the clearest term
to use for this. Maybe 'includetranscluded'?
In any case, there should be some documentation on the parameter added.
Could you do that? Thanks!
(unfortunately there is no option for 'accepted, but waiting for updated
patch...')
----------------------------------------------------------------------
Patches item #3092870, was opened at 2010-10-22 03:26
Message generated for change (Comment added) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3092870&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: lankier (lankier)
Assigned to: xqt (xqt)
Summary: non ascii in system messages and max retry
Initial Comment:
This patch fixes two issues:
1. Ubuntu has non-ASCII characters in system messages.
Test:
$ sudo ifconfig eth0 down
$ cat test.py
import wikipedia
site = wikipedia.getSite()
page = wikipedia.Page(site, 'S')
text = page.get()
$ LANG=ru_RU.utf8 python test.py
Error downloading data: 'ascii' codec can't decode byte 0xd0 in position 27: ordinal not in range(128)
Request ru:/w/api.php?inprop=protection%7Ctalkid%7Csubjectid%7Curl%7Creadable&format=json&rvprop=content%7Cids%7Cflags%7Ctimestamp%7Cuser%7Ccomment%7Csize&prop=revisions%7Cinfo&titles=S&rvlimit=1&action=query
Retrying in 1 minutes...
^C
After fix (added "e = unicode(str(e), locale.getpreferredencoding())"):
$ LANG=ru_RU.utf8 python test.py
<urlopen error [Errno 101] Сеть недоступна>
WARNING: Could not open [...]
2. Added a raise of MaxTriesExceededError when the maximum number of tries is exceeded.
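The second fix can be pictured as a bounded retry loop. This is only a minimal sketch, not the submitted patch: MaxTriesExceededError is the name used in the report, while fetch_with_retries and its fetch, max_tries, delay and sleep parameters are hypothetical names chosen for illustration.

```python
import time


class MaxTriesExceededError(Exception):
    """Raised when a request still fails after the allowed number of tries."""


def fetch_with_retries(fetch, max_tries=3, delay=0.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_tries attempts have failed.

    All parameter names here are assumptions made for this sketch.
    """
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except IOError as error:  # covers network errors such as URLError
            last_error = error
            if attempt < max_tries:
                sleep(delay)  # wait before the next attempt
    # Every attempt failed: stop retrying and tell the caller why.
    raise MaxTriesExceededError(
        "giving up after %d tries: %s" % (max_tries, last_error))
```

Raising a dedicated exception lets a calling script decide whether to skip the page or abort, instead of retrying forever.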
----------------------------------------------------------------------
>Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-03-21 09:49
Message:
I think we should either
a) skip the entire output() machinery and use traceback.print_exc()
instead
or
b) write a wrapper that does what you propose here (but which can also
be used for traceback.format_exc).
and replace all exception printing with one of those two options.
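Option b) might look roughly like the sketch below. The function name format_current_exception is hypothetical; only traceback.format_exc and locale.getpreferredencoding are real library calls. On Python 3 the traceback is already text, so the decoding branch only matters on Python 2, where format_exc returns a byte string.

```python
import locale
import traceback


def format_current_exception():
    """Return the active exception's traceback as text."""
    text = traceback.format_exc()
    if isinstance(text, bytes):  # Python 2: byte string in the locale encoding
        text = text.decode(locale.getpreferredencoding(), "replace")
    return text
```

Because the decoding happens in one place, every caller that prints an exception gets the same handling of localized system messages.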
----------------------------------------------------------------------
Comment By: lankier (lankier)
Date: 2010-11-07 12:44
Message:
We can't fix it in output() because the exception is raised before we
enter output().
What about just replacing output(u'%s' % e) with output(str(e))? It works.
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2010-11-07 08:52
Message:
output should be fixed in output method. Would you please check the
following fix in output method:
def output(...):
    ...
    try:
        text = unicode(text, 'utf-8')
    except UnicodeDecodeError:
        text = unicode(text, 'iso8859-1')
replace it with
    try:
        text = unicode(text, 'utf-8')
    except UnicodeDecodeError:
        text = unicode(text, locale.getpreferredencoding())
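The same fallback chain, written so that it also runs on Python 3, could be sketched as follows. The helper name to_text and its fallback parameter are assumptions for illustration and are not part of the framework.

```python
import locale


def to_text(raw, fallback=None):
    """Decode bytes as UTF-8, falling back to the locale's preferred encoding."""
    if not isinstance(raw, bytes):
        return raw  # already text, nothing to decode
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        encoding = fallback or locale.getpreferredencoding()
        # "replace" keeps output usable even if the guess is wrong.
        return raw.decode(encoding, "replace")
```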
----------------------------------------------------------------------
Patches item #3017517, was opened at 2010-06-17 02:38
Message generated for change (Comment added) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3017517&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: BalaSundaraRaman L (lsundar)
Assigned to: Nobody/Anonymous (nobody)
Summary: cosmetic_changes.py to remove bad wikilinks
Initial Comment:
Translated articles created using http://translate.google.com/toolkit?hl=en suffer from one complex issue. It creates links to impossible pages in the target wiki. Let's take the example below:
( Excerpt from http://en.wikipedia.org/wiki/Corporate_governance )
A related but separate thread of discussions focuses on the impact of a corporate governance system in [[economic efficiency]], with a strong emphasis on shareholders' welfare.
When this is translated to Tamil, for example, "in economic efficiency" becomes a single word, and the tool wrongly links that whole phrase. Since an article title can't be of the form "in economic efficiency", it will remain a red link forever. Articles littered with such red links are hard to read.
In view of the large-scale http://wikimania2010.wikimedia.org/wiki/Submissions/Google_translation project and the problems we faced ( http://wikimania2010.wikimedia.org/wiki/Submissions/A_Review_of_Google_Tran… ), I've developed a patch for cosmetic_changes.py that removes red links of the form [[some phrase]], leaving alone cases where the label is different from the target. I've attached the patch as well. The changes made by my bot running the modified code are at http://ta.wikipedia.org/wiki/Special:Contributions/SundarBot
If approved, I can give it to a dedicated bot operator with the translation team.
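The unlinking rule described above can be illustrated with a small sketch. It is not the attached patch: unlink_missing and the page_exists callback are hypothetical names, and the sketch does not special-case category, file or interwiki links.

```python
import re

# Matches [[target]] only; piped links such as [[target|label]] are skipped.
PLAIN_LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def unlink_missing(text, page_exists):
    """Replace [[phrase]] with phrase when page_exists(phrase) is false."""
    def replace(match):
        title = match.group(1)
        # Keep links to existing pages untouched; unlink the rest.
        return match.group(0) if page_exists(title) else title
    return PLAIN_LINK.sub(replace, text)
```

In a real bot, page_exists would have to query the wiki; here it is any callable that returns True for existing titles.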
----------------------------------------------------------------------
>Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-03-21 09:45
Message:
Shouldn't this be in fixes.py?
In any case, I'm having some trouble understanding the goal of the added
parameters in this patch.
Last but not least, what is the /generic/ use for this? The problem you're
solving (which is, as I understand it, unlinking all links, or all red
links) sounds awfully specific for tawiki, so I'm not sure if adding this
to pwb is useful.
----------------------------------------------------------------------
Comment By: BalaSundaraRaman L (lsundar)
Date: 2010-06-17 23:00
Message:
The changes will be visible when run in the following manner:
python cosmetic_changes.py -fewerlinks -keepblue -file:listofarticles.txt
For a diff of its changes, please check http://is.gd/cTJWr
----------------------------------------------------------------------
Patches item #3007742, was opened at 2010-05-26 20:19
Message generated for change (Comment added) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3007742&group_…
Category: rewrite
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Pyr0 ()
Assigned to: Nobody/Anonymous (nobody)
Summary: rvdiffto parameter implementation
Initial Comment:
The framework has no function for loading the diff text between revisions. Here is one:
Changelog:
Modified site.loadrevisions() method to support rvdiffto parameter.
Added a Page.Revision.Diff class for storing the diff text and revto id.
Modified api.update_page() to save the new diff information.
A method from Page.py is still missing to get diffs just like you get a revision now. But you can get the diff text from page._revision[id].diff.text directly for now.
----------------------------------------------------------------------
>Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-03-21 09:38
Message:
Sorry for the slow slow slow sloooow response.
I think diffs are a useful thing to have, but I'm not quite sure what the
goal here is - what is the advantage of using rvdiffto instead of getting
both revisions and comparing them with a python diff function?
I can see the use case for, for instance, an antivandalism bot, but I'm not
quite sure how you would use it with this.
Then on the implementation - I can imagine it makes sense to store diffs
for a certain revision, but I'd expect, for instance, a dict with revid's
such that
page, revid=10001
diffs = {10000: <diff object between 10000 and 10001>, 9000: <diff object
between 9000 and 10001>}, and storing e.g. revision.prev = 10000.
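The alternative mentioned above (fetching both revisions and comparing them locally), together with the suggested dict keyed by revision id, might be sketched like this. The revisions store and the diff_revisions helper are made up for illustration; only difflib.unified_diff is a real library call.

```python
import difflib


def diff_revisions(old_text, new_text, old_id, new_id):
    """Return a unified diff between two revision texts as one string."""
    lines = difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="rev %d" % old_id, tofile="rev %d" % new_id, lineterm="")
    return "\n".join(lines)


# Hypothetical revision store: revid -> wikitext.
revisions = {9000: "Hello", 10000: "Hello world", 10001: "Hello world!"}

# Diffs for revision 10001, keyed by the revision they are taken against.
diffs = {
    old_id: diff_revisions(revisions[old_id], revisions[10001], old_id, 10001)
    for old_id in (10000, 9000)
}
```

A local diff needs both full texts, whereas rvdiffto asks the server to compute the diff; which one is cheaper depends on how many revisions are already loaded.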
----------------------------------------------------------------------
Patches item #2835479, was opened at 2009-08-11 03:12
Message generated for change (Comment added) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2835479&group_…
Category: None
Group: None
>Status: Closed
Resolution: None
Priority: 5
Private: No
Submitted By: Jean-Daniel Fekete (jdfekte)
Assigned to: Nobody/Anonymous (nobody)
Summary: Allowing xmlreader to read from stdin
Initial Comment:
XML dumps are huge and distributed in 7zip format now. This very small patch allows dumps to be read from the standard input using '-' as file name.
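The idea can be sketched in a few lines. This is an assumption about the shape of such a change, not the submitted patch: open_dump and its stdin parameter are hypothetical names, and sys.stdin.buffer is the Python 3 spelling of the binary input stream.

```python
import io
import sys


def open_dump(filename, stdin=None):
    """Open an XML dump for reading; '-' selects standard input."""
    if filename == "-":
        # Read raw bytes so the XML parser can handle the encoding itself.
        return stdin if stdin is not None else sys.stdin.buffer
    return open(filename, "rb")


# Example with an in-memory stream standing in for a pipe.
demo = open_dump("-", stdin=io.BytesIO(b"<mediawiki/>"))
```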
----------------------------------------------------------------------
>Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-03-21 09:27
Message:
Closing this due to old age.
----------------------------------------------------------------------
Comment By: Russell Blau (russblau)
Date: 2010-01-06 11:07
Message:
This does not belong in the 'rewrite' category, as the rewrite branch does
not yet support reading xml files at all
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-11-30 03:48
Message:
@valhallasw: Do you suggest that the patch should be rejected based on your
reasoning?
----------------------------------------------------------------------
Comment By: Jean-Daniel Fekete (jdfekte)
Date: 2009-10-08 03:12
Message:
Decompressing on the fly is important to avoid creating huge files. I need
to process the full dump with revisions of the French wikipedia (for now)
and decompressing it is unreasonable. Using pipes is a standard practice
and they are not slow compared to the parsing and processing time of
Python.
I understand the aesthetic objection but pragmatically, the "-" syntax is
quite convenient and the changes are quite minimal.
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2009-10-02 02:43
Message:
Submitter, please address comment dated 2009-08-11 13:45 by valhallasw.
Otherwise this patch will be rejected for certain after 2 weeks.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2009-08-11 04:45
Message:
I never see the point of programs adding '-' as magic filename. Unix has
/dev/stdin, dos/windows have CON. Secondly, it is probably better to use an
internal 7zip decompressor, as pipes tend to be slow.
----------------------------------------------------------------------
Patches item #2985564, was opened at 2010-04-11 12:44
Message generated for change (Settings changed) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=2985564&group_…
Category: None
Group: None
>Status: Closed
Resolution: Postponed
Priority: 5
Private: No
Submitted By: masti (masti01)
Assigned to: xqt (xqt)
Summary: cosmetic_changes.py
Initial Comment:
If the link and description are the same except for capitalisation, use the link alone and skip the description. Useful for cleaning up after fixing capitalisation-related redirects. Example: http://pl.wikipedia.org/w/index.php?diff=21129378&oldid=21129374&rcid=21691…
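One possible reading of this rule is sketched below: a piped link whose label differs from its target only in letter case is collapsed to the bare target. The function name drop_case_only_labels is hypothetical and the attached patch may behave differently.

```python
import re

# Matches piped links of the form [[target|label]].
PIPED_LINK = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]")


def drop_case_only_labels(text):
    """Turn [[Target|label]] into [[Target]] when they differ only in case."""
    def replace(match):
        target, label = match.group(1), match.group(2)
        if target != label and target.lower() == label.lower():
            return "[[%s]]" % target
        return match.group(0)  # a genuinely different label stays as written
    return PIPED_LINK.sub(replace, text)
```

Note that this changes the visible text of the page, which is the concern raised in the comments on this item.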
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-03-21 09:23
Message:
Closing this for now, due to no response and because I'm not sure if
correcting spelling should be in cosmetic changes -- I think
cosmetic_changes is not supposed to change meaning, but just the way the
wikitext looks.
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2010-04-12 07:20
Message:
Wouldn't it be a better way to fix that behavior in the main script? For
example: I use a modified solve_disambiguation.py which always uses the
page link without the description for disambig pages, which is recommended
in de-wiki. But it could cause problems with articles. The point is: you
could not have any influence on cc but you may have it on fixing_redirects
or solve_disambiguation as an option if using it in non-autonomous mode.
And I couldn't say that this sort of spelling-correction always works well.
----------------------------------------------------------------------
Comment By: masti (masti01)
Date: 2010-04-12 06:39
Message:
I use that in case we are moving pages due to the fact of misspelling or
wrong capitalisation of article title. Then when fixing_redirects or using
solve_disambiguation we have a proper link but the description stays wrong.
----------------------------------------------------------------------
Comment By: xqt (xqt)
Date: 2010-04-12 04:12
Message:
I am not sure whether this is a good idea, because this makes spelling
changes that overrule human edits. If running this in autonomous mode we must
ensure the result is always right. That's why I would like to wait for
other comments.
----------------------------------------------------------------------
Support Requests item #3428321, was opened at 2011-10-25 09:33
Message generated for change (Comment added) made by valhallasw
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603139&aid=3428321&group_…
Category: None
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: reza (reza1615)
Assigned to: Nobody/Anonymous (nobody)
Summary: add_text.py has problems in windows CMD
Initial Comment:
The add_text.py arguments (-text, -exception, -summary, -newimages) don't support percent-encoded characters, especially in CMD on Windows.
For example, this command doesn't work correctly in Windows CMD:
python add_text.py -links%3A%22%D8%A7%D9%84%DA%AF%D9%88%3A%D9%81%D9%84%D8%A7%D9%86%22 -text%3A%7B%7B%D8%A7%D9%84%DA%AF%D9%88%3A%D9%81%D9%84%D8%A7%D9%86%7D%7D
Is it possible to change their Unicode handling to match the pagegenerator.py arguments?
P.S. For percent-encoding I use http://meyerweb.com/eric/tools/dencoder/
----------------------------------------------------------------------
>Comment By: Merlijn S. van Deen (valhallasw)
Date: 2012-03-21 09:19
Message:
First of all, sorry for the slow response. Yes, you are right, you cannot
do that with -text. I'm also not quite sure whether the 'correct' behaviour
is to allow urlencoded text for -text:.
Is creating a page with an ascii name (i.e. user:reza/template) and
subst-ing that (i.e. -text:{{subst:user:reza/template}} ) a workable
solution for you?
Last, but not least, what happens if you just paste arabic text into the
console? i.e. -text:"{{فلان}}"? This should work on windows, but I'm
not sure what will be placed on the page...
----------------------------------------------------------------------
Comment By: reza (reza1615)
Date: 2011-10-26 04:28
Message:
The -text: argument isn't decoded to Unicode text, so it puts the percent-encoded
text on pages instead of {{فلان}}!
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2011-10-25 10:36
Message:
You shouldn't encode the :, only the page title. So the correct command
would be:
python add_text.py
-links:%22%D8%A7%D9%84%DA%AF%D9%88%3A%D9%81%D9%84%D8%A7%D9%86%22
-text:%7B%7B%D8%A7%D9%84%DA%AF%D9%88%3A%D9%81%D9%84%D8%A7%D9%86%7D%7D
which should work.
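The rule above (leave the option name and its colon alone, decode only the value) can be sketched as follows. decode_argument is a hypothetical helper, not a function of add_text.py, and whether -text: should accept percent-encoded input at all is still an open question in this thread. The import is the Python 3 location of unquote; on Python 2 it lives in the urllib module.

```python
from urllib.parse import unquote


def decode_argument(arg):
    """Percent-decode the value of an '-option:value' argument, not its name."""
    name, sep, value = arg.partition(":")
    if not sep:
        return arg  # flag without a value, e.g. -always
    # Only the first colon separates name and value; the rest is decoded.
    return name + sep + unquote(value)
```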
----------------------------------------------------------------------
Patches item #3508665, was opened at 2012-03-19 06:25
Message generated for change (Comment added) made by xqt
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=3508665&group_…
Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Bertrand Grondin (grondin85)
>Assigned to: xqt (xqt)
Summary: syntax error on pywikibot/textlib.py
Initial Comment:
I saw a syntax error in pywikibot/textlib.py, line 181.
I am attaching a patch to fix it (a missing comma).
----------------------------------------------------------------------
>Comment By: xqt (xqt)
Date: 2012-03-19 06:38
Message:
done in r10029
----------------------------------------------------------------------