Bugs item #1615700, was opened at 2006-12-14 08:03
Message generated for change (Settings changed) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1615700&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Invalid
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: "category.py move [-inplace]" ignores some instances
Initial Comment:
I observed (very few) instances when the command:
python category.py ... move -inplace -from:... -to:...
did not catch all occurrences of the -from category, so that after the run it was not empty, since some pages belonging to it were not altered at all. I believe replace.py did not find the category tag inside these pages.
In all such cases, the following was true, but I cannot tell which of these (if any) is a trigger for replace.py's erroneous behaviour:
- "move -inplace" was used.
- the category tag was formatted like "[[namespace:pagename|sortkey]]", with a non-empty sort key present.
- the "[[namespace:" was
neither "[[Category:",
nor the default local name for the wiki/language (as present in $namespaceNames),
but another name which can be found only in the $namespaceAliases of the wiki language.
(My *guess* is the latter being the cause)
There may have been cases when category.py did find and correctly replaced the category tag in pages despite the conditions above, but I am not aware of any.
----------------------------------------------------------------------
Comment By: Tavernier (tavernier)
Date: 2008-01-26 23:48
Message:
Logged In: YES
user_id=1705732
Originator: NO
Can you give links? Does the problem still occur?
Sometimes it doesn't catch the category because it's transcluded from a
template.
----------------------------------------------------------------------
Comment By: Russell Blau (russblau)
Date: 2007-11-21 10:13
Message:
Logged In: YES
user_id=855050
Originator: NO
I have made many revisions to the "replaceCategoryInPlace()" function
since this bug was opened. If there is still a problem, please identify
specific articles/sites that don't get moved correctly. If not, this bug
can be closed.
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha)
Date: 2007-06-01 06:32
Message:
Logged In: YES
user_id=46450
Originator: YES
The sort key seems of no importance.
Here you can see that there are three ways to write "Category" in the
Wikipedia of Ripuarian languages:
http://ksh.wikipedia.org/w/index.php?title=Betupper_%28Minsch%29&diff=prev&…
- the list of categories below the page shows only one name, and there is
no wikilink of either at the end of the page.
I ran:
python category.py move -inplace -from:Mynsch -to:Minsch
and it responded:
Checked for running processes. 1 processes currently running, including
the current process.
Getting [[Saachjrupp:Mynsch]]...
Getting 1 pages from wikipedia:ksh...
Getting a page to check if we're logged in on wikipedia:ksh
Sleeping for 5.6 seconds, 2007-06-01 10:05:04
Changing page [[ksh:Betupper (Minsch)]]
There are no subcategories in category Saachjrupp:Mynsch
Dumping to category.dump.bz2, please wait...
Here is the change made by the bot:
http://ksh.wikipedia.org/w/index.php?title=Betupper_%28Minsch%29&diff=next&…
1. It got the generic English name (category:Mynsch altered to
Category:Minsch) and normalized it.
2. It got the default localized name (Saachjrupp:Mynsch) and normalized it
to the generic English one. The default localized form is defined in
$namespaceNames.
3. It did NOT catch the alternate localized name (Kategorie:Mynsch, left
unchanged). BUT as you can see from the box at the bottom of the page, the
page is now in two categories. I.e. MediaWiki understands the alternate
localized form. Several alternate localized names can be defined in
$namespaceAliases, see
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/messages/M…
I believe category.py either does not know of $namespaceAliases, or uses
only one of the alternate names listed there.
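The behaviour described above suggests matching the category tag against every name the wiki accepts for the namespace. A minimal sketch, assuming the caller supplies the full list of names (generic English name, default localized name, and aliases); the function names are hypothetical and are not taken from category.py:

```python
import re


def category_tag_pattern(namespace_names, title):
    """Build a regex matching [[<namespace>:<title>|sortkey]] for any listed name."""
    # Namespace names are matched case-insensitively (scoped flag, Python 3.6+).
    names = "|".join(re.escape(name) for name in namespace_names)
    # Simplification: only the first letter of the title is case-insensitive.
    first = "[" + re.escape(title[0].upper()) + re.escape(title[0].lower()) + "]"
    rest = re.escape(title[1:])
    return re.compile(
        r"\[\[\s*(?i:" + names + r")\s*:\s*" + first + rest
        + r"\s*(?P<sortkey>\|[^\]]*)?\]\]"
    )


def replace_category_in_place(text, namespace_names, old, new, canonical="Category"):
    """Rewrite every tag for category `old` to `new`, keeping any sort key."""
    pattern = category_tag_pattern(namespace_names, old)

    def rewrite(match):
        sortkey = match.group("sortkey") or ""
        return "[[" + canonical + ":" + new + sortkey + "]]"

    return pattern.sub(rewrite, text)


# The three spellings reported for ksh.wikipedia.org.
names = ["Category", "Saachjrupp", "Kategorie"]
page = "[[category:Mynsch]]\n[[Saachjrupp:Mynsch|Key]]\n[[Kategorie:Mynsch]]"
updated = replace_category_in_place(page, names, "Mynsch", "Minsch")
```

With all three names in the list, the alias form is rewritten along with the other two, and the sort key (if any) is carried over unchanged.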
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2007-04-26 15:30
Message:
Logged In: YES
user_id=1107255
Originator: NO
Please let us know if this bug report is still applicable to the current
code. If no response is given, the bug report will be closed one month from
now. This message was added in an effort to reduce the number of open
issues on this project. Siebrand
----------------------------------------------------------------------
Comment By: Cyde Weys (cydeweys)
Date: 2007-01-28 15:26
Message:
Logged In: YES
user_id=1506848
Originator: NO
Hrmm, I'm confused. How many different ways are there to write
"Category:"? I thought there was just one, whatever the language's wiki
uses. Can you give a specific example of one that wasn't caught, but that
should have been? Thanks.
----------------------------------------------------------------------
Patches item #1898128, was opened at 2008-02-20 14:57
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1898128&group_…
Category: None
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Roberto Zanon (qualc1)
Assigned to: Nobody/Anonymous (nobody)
Summary: unprotect page doesn't work
Initial Comment:
Method
protect(self, edit='sysop', move='sysop', create='sysop', unprotect=False, reason=None, prompt=True, throttle=True)
in the Page class (in wikipedia.py) doesn't work when it is used to unprotect a page.
For example:
page.protect(unprotect=True) doesn't unprotect the page (it actually protects it).
When unprotect is True, the function uses the "unprotect" action instead of "protect", but unprotect is only an alias for protect (see http://www.mediawiki.org/wiki/Manual:Page_action#Actions).
I've attached a patch.
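The attached patch is not reproduced in this message. As a rough sketch of the idea only, assuming that lifting protection means sending the ordinary "protect" action with empty restriction levels, a request builder could look like this (the function and field names are hypothetical and are not the ones used in wikipedia.py):

```python
def build_protect_request(edit="sysop", move="sysop", unprotect=False, reason=None):
    """Return the fields for a (hypothetical) page protection request."""
    if unprotect:
        # Assumption: an empty restriction level means "no restriction".
        edit = move = ""
    return {
        # Always "protect": per the report, "unprotect" is only an alias for it.
        "action": "protect",
        "edit": edit,
        "move": move,
        "reason": reason or "",
    }


request = build_protect_request(unprotect=True, reason="no longer needed")
```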
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2008-02-21 09:03
Message:
Logged In: YES
user_id=855050
Originator: NO
Applied in r5065.
----------------------------------------------------------------------
Bugs item #1833448, was opened at 2007-11-16 18:55
Message generated for change (Settings changed) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1833448&group_…
Category: None
Group: None
>Status: Closed
>Resolution: Invalid
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: bloodice.spelled(a)gmail.com
Initial Comment:
Hello,
in replace.py there is a problem with exception->title.
When I tried to do a replacement, the bot exited with the following:
<pre>
C:\....pywikipedia>replace.py -xml:dump.xml -fix:spellcheck -namespace:0 -always
Checked for running processes. 1 processes currently running, including the current process.
Reading XML dump...
find
</pre>
I have the same problem when I try predefined fixes that include 'title'. This does not happen with any of the other exceptions options. It is on bg.wikipedia.org (unicode). I tried switching regex on and off, and using r'...' or u'....' in title.
Does anyone have a clue what it could be?
Thanks,
bloodIce
----------------------------------------------------------------------
Comment By: Russell Blau (russblau)
Date: 2007-12-10 12:33
Message:
Logged In: YES
user_id=855050
Originator: NO
This is a bug report, not a patch; and the bug report is not sufficiently
clear to allow fixing. What is in the "fix:spellcheck"?
----------------------------------------------------------------------
Patches item #1773928, was opened at 2007-08-14 11:05
Message generated for change (Settings changed) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1773928&group_…
Category: None
Group: None
>Status: Closed
Resolution: None
Priority: 5
Private: No
Submitted By: Aurimas Fischer (ebola_rulez)
Assigned to: Daniel Herding (wikipedian)
Summary: bug in redirect.py -xml
Initial Comment:
redirect.py double|broken -xml
misses pages containing spaces in their titles.
Attached patch fixes this bug.
----------------------------------------------------------------------
Comment By: Daniel Herding (wikipedian)
Date: 2007-09-13 16:33
Message:
Logged In: YES
user_id=880694
Originator: NO
Fixed. There's still a namespace bug, as entry.title only gives the title
without namespace.
----------------------------------------------------------------------
Comment By: Aurimas Fischer (ebola_rulez)
Date: 2007-09-13 15:14
Message:
Logged In: YES
user_id=959303
Originator: YES
Indeed, there is.
redirect.py double -xml:db.xml:
File "C:\pywikipedia\redirect.py", line 143, in get_redirects_from_dump
print len(pageTitles)
UnboundLocalError: local variable 'pageTitles' referenced before
assignment
redirect.py broken -xml:db.xml:
File "C:\pywikipedia\redirect.py", line 170, in
retrieve_broken_redirects
if value not in pagetitles:
NameError: global name 'pagetitles' is not defined
----------------------------------------------------------------------
Comment By: Daniel Herding (wikipedian)
Date: 2007-09-13 15:12
Message:
Logged In: YES
user_id=880694
Originator: NO
I applied redirect_py.patch and fixed the two references to 'target'. The
script still needs some testing, maybe there are more bugs.
----------------------------------------------------------------------
Comment By: Aurimas Fischer (ebola_rulez)
Date: 2007-09-13 14:50
Message:
Logged In: YES
user_id=959303
Originator: YES
P.P.S.
revision 4275 broke double redirect fixing.
"NameError: global name 'target' is not defined" in
fix_double_redirects(self)
lines ~245, ~257
----------------------------------------------------------------------
Comment By: Aurimas Fischer (ebola_rulez)
Date: 2007-09-13 14:34
Message:
Logged In: YES
user_id=959303
Originator: YES
redirect.py double -xml:db.xml
doesn't find double redirects with spaces in page title.
Updated patch to work with latest revision.
P.S.
redirect.py broken -xml:db.xml
is broken in revision 4275.
File Added: redirect_py.patch
----------------------------------------------------------------------
Patches item #1463133, was opened at 2006-04-02 14:38
Message generated for change (Settings changed) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1463133&group_…
Category: None
Group: None
>Status: Closed
Resolution: None
Priority: 5
Private: No
Submitted By: Rich Morin (rich_morin)
Assigned to: Nobody/Anonymous (nobody)
Summary: misc nits
Initial Comment:
General Comments
================
Many of the files use lines that extend beyond 80
columns. This is awkward for viewing, editing, and
printing.
The readability of the code could benefit from
"Verbose Regular Expressions", which allow white
space, comments, etc. For example, see the RE in
line 533 of wikipedia.py.
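To illustrate the suggestion, here is the same pattern written compactly and with Python's re.VERBOSE flag. The pattern is a made-up wiki-link matcher for demonstration, not the expression at line 533 of wikipedia.py:

```python
import re

# Compact form: hard to read at a glance.
compact = re.compile(r"\[\[([^\]|#]+)(?:#([^\]|]*))?(?:\|([^\]]*))?\]\]")

# Verbose form: whitespace is ignored and comments are allowed.
verbose = re.compile(r"""
    \[\[                           # opening brackets
    (?P<title>[^\]|\#]+)           # page title
    (?:\#(?P<section>[^\]|]*))?    # optional "#section"
    (?:\|(?P<label>[^\]]*))?       # optional "|label"
    \]\]                           # closing brackets
""", re.VERBOSE)

sample = "[[Main Page#History|the history]]"
match = verbose.match(sample)
parts = (match.group("title"), match.group("section"), match.group("label"))
```

Note that in verbose mode a literal "#" outside a character class has to be escaped, since it would otherwise start a comment.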
http://pywikipediabot.sourceforge.net/
======================================
... get the lastest version ...
latest version
interwiki-graphs/README
=======================
... get the pagacke.
package.
login-data/README
=================
... get the pagacke.
package.
----------------------------------------------------------------------
Comment By: Merlijn S. van Deen (valhallasw)
Date: 2007-07-01 05:14
Message:
Logged In: YES
user_id=687283
Originator: NO
Regular Expressions are write-only anyway. A tool like regexpbuddy (or, if
it exists, its open-source brother) gives a much better insight than
splitting the regexp to multiple lines - and when splitting into multiple
lines, using such a tool gets impossible.
I fixed the spelling in the /README files; 80 characters should be used
but often you'll run into some trouble with several indented blocks. It
does not have a big priority however as most people use an editor that can
handle longer lines.
----------------------------------------------------------------------
Comment By: siebrand (siebrand)
Date: 2007-04-26 15:22
Message:
Logged In: YES
user_id=1107255
Originator: NO
Please let us know if this patch is still applicable to the current code.
If no response is given, the patch will be denied and the issue will be
closed. This message was added in an effort to reduce the number of open
issues on this project. Siebrand
----------------------------------------------------------------------
Bugs item #1898707, was opened at 2008-02-21 13:36
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1898707&group_…
Category: General
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Dashiva (magnusrk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Infinity loop on missing api.php
Initial Comment:
If api.php isn't at the default /w/api.php, or apipath() is defined but gives the wrong value, the bot keeps requesting the page over and over with no notification to the user.
Eventually Python reaches the maximum recursion depth and raises an exception, and the script pauses for a minute before starting again. At each pause it prints a warning saying it couldn't find <url>, but none of the warnings appeared until I stopped the script with ctrl-c (might be windows/py2.4 specific).
I think the best solution would be to distinguish between "Could not load api.php" and "Successfully loaded some page, but it doesn't look like api.php". In the latter case, there isn't much to gain from trying over and over.
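A minimal sketch of that distinction, assuming the request asks for JSON output and that the page is loaded through a caller-supplied fetch(url) function; all names here are hypothetical and do not exist in the bot framework:

```python
import json
import time


class NotAnApiResponse(Exception):
    """A page was loaded, but it does not look like api.php output."""


def looks_like_api_output(body):
    # Assumption: JSON output was requested, so valid output parses as a JSON object.
    try:
        return isinstance(json.loads(body), dict)
    except ValueError:
        return False


def load_api(fetch, url, max_attempts=3, wait_seconds=0):
    """Retry only when loading fails; give up at once on a wrong page."""
    last_error = None
    for _ in range(max_attempts):
        try:
            body = fetch(url)
        except IOError as error:
            # "Could not load api.php": possibly transient, so retry (bounded).
            last_error = error
            time.sleep(wait_seconds)
            continue
        if not looks_like_api_output(body):
            # "Loaded some page, but not api.php": retrying will not help.
            raise NotAnApiResponse("%s does not look like api.php" % url)
        return body
    raise IOError("Could not load %s after %d attempts: %s"
                  % (url, max_attempts, last_error))
```

Using a loop with a fixed number of attempts, rather than a function that calls itself, also avoids running into the recursion limit.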
(Off topic: How about a scriptpath() for setting path/querypath/apipath all in one? They're usually in the same directory.)
----------------------------------------------------------------------
Hi
OK. I have recently been tearing my hair out trying to understand why
a character set encoding bug suddenly popped up in my application.
After a bit of debugging I think I may have an explanation. Let me
introduce the problem...
Consider the simple fragment:
page = wikipedia.Page(wikipedia.getSite(), 'Main_Page')
content = page.get()
The error I was getting was:
AssertionError: charset for mylocalwiki:en changed from UTF-8 to iso-8859-1
So what is happening here?
1) A call to get() is made on the page object
2) A call to _getEditPage() is made on the page object
3) A call to getURL() is made on the Site object with path argument =
"/wiki/index.php?title=Main_Page&action=edit".
4) The full URL is created adding a protocol and hostname etc.
5) A connection to the URL is opened. The resulting HTTP headers and
html text are buffered
6) The content type is read from the resulting header. The charset
variable is found from the content type header. (UTF-8 in my case)
7) The character set is checked to make sure it matches the family
file setting. It does. The Site object charset instance variable is
set to the same value.
8) The resulting text is converted to unicode using the found charset.
9) A call to _getUserData() is made on the Site object
10) A call to isBlocked() is made on the Site object
11) A call to api_address() is made on the Site object
12) A call to api_address() is made on the Family object
13) A call to apipath() is made on the Family object
14) apipath() returns the relative hardcoded URL="/w/api.php" back to
the caller!!! However this URL is invalid! The wiki prefix should be
"/wiki" in my case, not "/w".
15) The relative URL is now:
"/w/api.php?action=query&meta=userinfo&uiprop=blockinfo"
16) The method getURL() proceeds to call this URL prefixing the URL
with the protocol and server host name
17) The URL is not found. A 404 is returned from the underlying Apache
server. The returned content type is once again parsed, but now the
Apache server is returning a charset of ISO-8859-1, not UTF-8.
18) checkCharset() is once again called on the Site object leading to
the AssertionError.
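Steps 6, 7, 17 and 18 can be reproduced in isolation with a small sketch. The class below is a hypothetical stand-in for the charset check on the Site object, not the actual code in wikipedia.py; it uses the standard library to read the charset from the Content-Type header, which returns the value in lower case:

```python
from email.message import Message


def charset_of(content_type):
    """Extract the charset parameter from a Content-Type header value."""
    message = Message()
    message["Content-Type"] = content_type
    return message.get_content_charset()  # lower-cased, or None if absent


class CharsetGuard:
    """Hypothetical stand-in for the per-site charset check."""

    def __init__(self, name):
        self.name = name
        self.charset = None

    def check(self, content_type):
        found = charset_of(content_type)
        if self.charset is None:
            self.charset = found  # first response fixes the expected charset
        if found != self.charset:
            raise AssertionError("charset for %s changed from %s to %s"
                                 % (self.name, self.charset, found))


guard = CharsetGuard("mylocalwiki:en")
guard.check("text/html; charset=UTF-8")           # the edit page
try:
    guard.check("text/html; charset=iso-8859-1")  # the 404 page for /w/api.php
    message = None
except AssertionError as error:
    message = str(error)
```

The second call fails only because the 404 error page comes from a different source than the wiki pages, which is why the assertion message points at the charset rather than at the wrong URL.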
So how to solve it? One might argue that the Apache server should also
be returning UTF-8, but that will most likely only solve half the
problem.
Instead, I created a new method in my family file, overriding the
default version in wikipedia.py:
    def apipath(self, code):
        return '/wiki/api.php'
Maybe someone should look at the Family class. It seems both
querypath() and path() also use the hardcoded "/w" prefix.
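One way to keep the prefix in a single place is sketched below. This is a hypothetical, much reduced stand-in for the Family class, not its actual code; the scriptpath() name is an assumption, as is the query.php file name:

```python
class Family:
    """Hypothetical, reduced stand-in for the Family base class."""

    def scriptpath(self, code):
        # The one place where the default prefix is defined.
        return "/w"

    def path(self, code):
        return self.scriptpath(code) + "/index.php"

    def querypath(self, code):
        # Assumption: the query script is named query.php.
        return self.scriptpath(code) + "/query.php"

    def apipath(self, code):
        return self.scriptpath(code) + "/api.php"


class MyLocalWikiFamily(Family):
    """A family file for a wiki installed under /wiki instead of /w."""

    def scriptpath(self, code):
        return "/wiki"


family = MyLocalWikiFamily()
api = family.apipath("en")
```

A family file would then override one method instead of three, and the three paths could no longer drift apart.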
I am surprised no one else has experienced this error before - unless
everyone is overriding the methods from Family.py.
Comments?
Regards
Lee Francis
--
_____
In theory, there is no difference between theory and practice. But, in
practice, there is.
-- Jan L.A. van de Snepscheut
Patches item #1898557, was opened at 2008-02-21 10:12
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1898557&group_…
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: NicDumZ — Nicolas Dumazet (nicdumz)
Assigned to: Nobody/Anonymous (nobody)
Summary: noreferences.py : better ident for the references section
Initial Comment:
When adding a <references/> tag, watch for the indent (heading level) of the next section, instead of always using a simple == %s == heading.
Before :
== See Also ==
+ == References ==
+
+ <references/>
=== External links ===
Now :
== See Also ==
+ === References ===
+
+ <references/>
=== External links ===
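The idea can be sketched as follows. This is not the submitted patch; the function name and the fallback to a level-2 heading when no later section exists are assumptions:

```python
import re

# A wiki heading: the same run of "=" signs at both ends of a line.
HEADING = re.compile(r"^(={1,6})[ \t]*[^=\n].*?\1[ \t]*$", re.MULTILINE)


def references_section(following_text, title="References"):
    """Build a references section at the heading level of the next section."""
    match = HEADING.search(following_text)
    marks = match.group(1) if match else "=="  # assumed default: level 2
    return "%s %s %s\n\n<references/>\n" % (marks, title, marks)


section = references_section("=== External links ===\n* [http://example.org]\n")
```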
Cheers :)
Nicolas Dumazet.
----------------------------------------------------------------------
Patches item #1898418, was opened at 2008-02-21 05:24
Message generated for change (Settings changed) made by rotemliss
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1898418&group_…
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Terrance Rengarasu (trengarasu)
Assigned to: Nobody/Anonymous (nobody)
Summary: Interwiki.py Tamil language localization
Initial Comment:
I have added Tamil language localization codes. Did not touch any other codes. Thank you in advance.
----------------------------------------------------------------------
>Comment By: Rotem Liss (rotemliss)
Date: 2008-02-21 09:01
Message:
Logged In: YES
user_id=1327030
Originator: NO
Applied to r5064.
----------------------------------------------------------------------