I want to add a certain category, [[Category:Wiki Green pages]]
<http://www.appropedia.org/Category:Wiki_Green_pages>,
to a list of pages. The problem is that the list is old, and many of the pages
have been moved since then.
Is there a way I can make the bot detect a redirect, follow it, then add the
category on the new page? Or at least some easier way than looking each page
up separately and adding the category?
Many thanks!
--
Chris Watkins (a.k.a. Chriswaterguy)
Appropedia.org - Sharing knowledge to build rich, sustainable lives.
blogs.appropedia.org
I like this: five.sentenc.es
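One way to approach the question above, sketched under assumptions: resolve each listed title through its redirects first, then add the category to whatever page the chain ends on. The `redirects` mapping and `pages` dictionary below are hypothetical stand-ins for the wiki. With the framework itself, the lookup would presumably be done with the page object's `isRedirectPage()` and `getRedirectTarget()` methods, but treat that as an assumption to verify against your copy of wikipedia.py.

```python
def resolve_redirect(title, redirects, max_hops=5):
    """Follow redirect entries until a non-redirect title is reached."""
    seen = set()
    while title in redirects:
        # Stop on redirect loops and on suspiciously long chains.
        if title in seen or len(seen) >= max_hops:
            raise ValueError("redirect loop or chain too long at %r" % title)
        seen.add(title)
        title = redirects[title]
    return title


def add_category(text, category):
    """Append a category link unless the page text already contains it."""
    link = "[[Category:%s]]" % category
    if link in text:
        return text
    return text.rstrip("\n") + "\n" + link + "\n"


# Hypothetical data: old title -> new title, and the text of existing pages.
redirects = {"Old solar page": "Solar cooker", "Solar cooker": "Solar cookers"}
pages = {
    "Solar cookers": "Text about solar cookers.\n",
    "Compost": "Text about compost.\n",
}

for old_title in ["Old solar page", "Compost"]:
    target = resolve_redirect(old_title, redirects)
    pages[target] = add_category(pages[target], "Wiki Green pages")
```

Checking for an existing category link before appending keeps the run safe to repeat over the same list.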
Revision: 6112
Author: btongminh
Date: 2008-11-22 19:42:46 +0000 (Sat, 22 Nov 2008)
Log Message:
-----------
Add unicode BIDI chars to whitespace list.
Modified Paths:
--------------
trunk/pywikipedia/commonsdelinker/delinker.py
Modified: trunk/pywikipedia/commonsdelinker/delinker.py
===================================================================
--- trunk/pywikipedia/commonsdelinker/delinker.py 2008-11-21 14:15:01 UTC (rev 6111)
+++ trunk/pywikipedia/commonsdelinker/delinker.py 2008-11-22 19:42:46 UTC (rev 6112)
@@ -47,7 +47,10 @@
import wikipedia
import config
-
+
+# FIXME: They should be defined *somewhere* in the Python library, not?
+WHITESPACE = u' \t\u200e\u200f\u202a\u202a\u202b\u202c\u202d\u202e'
+
def wait_callback(object):
output(u'%s Connection has been lost in %s. Attempting reconnection.' % (threading.currentThread(), repr(object)), False)
if hasattr(object, 'error'):
@@ -255,7 +258,7 @@
if prev in ('', '\r', '\n') and replacement is None:
# Kill all spaces after end
while (end + 1) < len(new_text):
- if new_text[end + 1] in (' ', '\t'):
+ if new_text[end + 1] in WHITESPACE:
end += 1
else:
break
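The effect of r6112 can be illustrated with a small standalone sketch. It is not the delinker code itself: `skip_trailing_whitespace` and `sample` are hypothetical names, and only the loop mirrors the diff. The committed string lists \u202a twice, which is harmless for a membership test; the sketch lists it once.

```python
# Space, tab, and the Unicode BIDI control characters from the commit:
# LRM, RLM, LRE, RLE, PDF, LRO, RLO.
WHITESPACE = u' \t\u200e\u200f\u202a\u202b\u202c\u202d\u202e'


def skip_trailing_whitespace(text, end):
    """Advance `end` over whitespace that directly follows it.

    Returns the index of the last skipped character, or `end`
    unchanged when the next character is not in WHITESPACE.
    """
    while (end + 1) < len(text) and text[end + 1] in WHITESPACE:
        end += 1
    return end


# A link followed by a space, a left-to-right mark, and a tab.
sample = u'[[Image:X.jpg]] \u200e\tnext'
end = skip_trailing_whitespace(sample, sample.index(u']]') + 1)
remainder = sample[end + 1:]
```

With the old `(' ', '\t')` tuple, the loop would have stopped at the left-to-right mark and left it, together with the tab, in the text.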
Feature Requests item #2326967, was opened at 2008-11-22 22:48
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=2326967&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: interwiki
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Woo-Jin Kim (kwj2772)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki.py -wiktionary
Initial Comment:
interwiki.py -wiktionary takes too much time to process.
Currently, the interwiki bot retrieves 60 en articles, then 60 fr articles. (The language codes are just examples.)
I suggest that the bot retrieve all wiki entries with the same name at the same time, to reduce retrieval time.
Thank you.
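The difference between the two orderings can be sketched as follows. This is only an illustration of the request, not how interwiki.py schedules its work; `pending` and `group_by` are hypothetical names.

```python
from collections import defaultdict


def group_by(pending, index):
    """Group (site, title) pairs by the element at position `index`."""
    groups = defaultdict(list)
    for item in pending:
        groups[item[index]].append(item)
    return dict(groups)


# Hypothetical queue of pages waiting to be fetched.
pending = [("en", "water"), ("fr", "water"), ("en", "fire"), ("fr", "fire")]

# Behaviour described in the report: one batch per language.
by_site = group_by(pending, 0)

# Requested behaviour: one batch per entry name, across all languages.
by_title = group_by(pending, 1)
```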
----------------------------------------------------------------------
Bugs item #2198717, was opened at 2008-10-26 20:34
Message generated for change (Comment added) made by sf-robot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2198717&group_…
Category: General
Group: None
>Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: maksim j (maksimpp)
Assigned to: Nobody/Anonymous (nobody)
Summary: Cannot read AllPages
Initial Comment:
Cannot read all categories from it.wikipedia:
nsp=14
start=u''
for page in mysite.allpages(start = start, namespace = nsp):
    wikipedia.output(page.title())
After Categoria:Progetto:Biografie/Tabella monitoraggio automatico - scrittura nc
it gets Categoria:Birmingham
and continues in an infinite loop.
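Until the listing itself behaves, a caller can at least protect itself from a listing that starts over. The sketch below is a client-side guard only, not a fix for the underlying problem; `unique_until_repeat` and `looping_listing` are hypothetical names, and the generator merely imitates a listing that restarts.

```python
def unique_until_repeat(items, key=lambda item: item):
    """Yield items, stopping as soon as a key is seen a second time."""
    seen = set()
    for item in items:
        k = key(item)
        if k in seen:
            # The listing has wrapped around; stop instead of looping forever.
            break
        seen.add(k)
        yield item


def looping_listing():
    """Hypothetical stand-in for a listing that restarts endlessly."""
    titles = ["Categoria:A", "Categoria:B", "Categoria:C"]
    while True:
        for title in titles:
            yield title


result = list(unique_until_repeat(looping_listing()))
```

With page objects, `key` could be something like `lambda page: page.title()`. The set of seen keys grows with the listing, so this trades memory for safety.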
----------------------------------------------------------------------
>Comment By: SourceForge Robot (sf-robot)
Date: 2008-11-22 02:20
Message:
This Tracker item was closed automatically by the system. It was
previously set to a Pending status, and the original submitter
did not respond within 14 days (the time period specified by
the administrator of this Tracker).
----------------------------------------------------------------------
Comment By: Andre Engels (a_engels)
Date: 2008-11-03 10:44
Message:
Claimed to have been corrected in the MediaWiki code (might not yet be live
on Wikipedia, but will be soon)
----------------------------------------------------------------------
Comment By: Andre Engels (a_engels)
Date: 2008-11-03 07:38
Message:
This seems to be a problem not with the bot code, but with the MediaWiki
API. I have submitted a bug report at
https://bugzilla.wikimedia.org/show_bug.cgi?id=16225 which, when resolved,
should correct this issue.
----------------------------------------------------------------------
Feature Requests item #1876050, was opened at 2008-01-20 22:58
Message generated for change (Comment added) made by purodha
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603141&aid=1876050&group_…
Category: None
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Nobody/Anonymous (nobody)
Summary: Interwiki.py - earlier abort of a page wanted
Initial Comment:
When interwiki.py says:
WARNING: [[zxx:pagetitle1]] doesn't seem to be a disambiguation page, but [[und:pagetitle2]] is one. Follow it anyway? ([y]es, [n]o, [a]dd an alternative)
, one may be unable to decide, e.g. for lack of skill in reading the languages or scripts. It does not matter which answer one gives here, since later, when the same page comes up to be finalized, one would usually have to give up on it.
I suggest adding a choice to abandon processing the page at this earlier stage, when the above message is shown. So, it could be extended as:
... Follow it anyway? ([y]es, [n]o, [a]dd an alternative, [g]ive up)
----
The same holds for the message:
WARNING: [[zxx:namespace1:pagetitle1]] is in namespace 1, but [[und:namespace2:pagetitle2]] is in namespace 2. Follow it anyway? ([y]es, [n]o)
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha)
Date: 2008-11-21 14:16
Message:
2nd suggestion only: done with r6111.
(1st suggestion had been implemented earlier by someone else)
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha)
Date: 2008-11-21 12:05
Message:
The 1st suggestion (disambiguation mismatch) has already been taken care
of.
I am working now on the 2nd suggestion (namespace mismatch).
----------------------------------------------------------------------
Revision: 6111
Author: purodha
Date: 2008-11-21 14:15:01 +0000 (Fri, 21 Nov 2008)
Log Message:
-----------
Can give up on a namespace conflict now.
See: https://sourceforge.net/tracker2/?func=detail&aid=1876050&group_id=93107&at…
Modified Paths:
--------------
trunk/pywikipedia/interwiki.py
Modified: trunk/pywikipedia/interwiki.py
===================================================================
--- trunk/pywikipedia/interwiki.py 2008-11-21 10:43:34 UTC (rev 6110)
+++ trunk/pywikipedia/interwiki.py 2008-11-21 14:15:01 UTC (rev 6111)
@@ -603,7 +603,7 @@
Also remembers where we found the page, regardless of whether it had
already been found before or not.
- Returns True iff the page is new.
+ Returns True if the page is new.
"""
if self.forcedStop:
return False
@@ -622,12 +622,12 @@
counter.plus(page.site())
return True
- def namespaceMismatch(self, linkingPage, linkedPage):
+ def namespaceMismatch(self, linkingPage, linkedPage, counter):
"""
Checks whether or not the given page has another namespace
- as the origin page.
+ than the origin page.
- Returns True iff the namespaces are different and the user
+ Returns True if the namespaces are different and the user
has selected not to follow the linked page.
"""
if self.foundIn.has_key(linkedPage):
@@ -651,11 +651,14 @@
wikipedia.output(u"NOTE: Ignoring link from page %s in namespace %i to page %s in namespace %i because page %s in the correct namespace has already been found." % (self.originPage.aslink(True), self.originPage.namespace(), linkedPage.aslink(True), linkedPage.namespace(), preferredPage.aslink(True)))
return True
else:
- choice = wikipedia.inputChoice('WARNING: %s is in namespace %i, but %s is in namespace %i. Follow it anyway?' % (self.originPage.aslink(True), self.originPage.namespace(), linkedPage.aslink(True), linkedPage.namespace()), ['Yes', 'No'], ['y', 'n'])
+ choice = wikipedia.inputChoice('WARNING: %s is in namespace %i, but %s is in namespace %i. Follow it anyway?' % (self.originPage.aslink(True), self.originPage.namespace(), linkedPage.aslink(True), linkedPage.namespace()), ['Yes', 'No', 'give up'], ['y', 'n', 'g'])
if choice != 'y':
# Fill up foundIn, so that we will not ask again
self.foundIn[linkedPage] = [linkingPage]
- wikipedia.output(u"NOTE: ignoring %s and its interwiki links" % linkedPage.aslink(True))
+ if choice == 'g':
+ self.makeForcedStop(counter)
+ else:
+ wikipedia.output(u"NOTE: ignoring %s and its interwiki links" % linkedPage.aslink(True))
return True
else:
# same namespaces, no problem
@@ -678,7 +681,7 @@
Returns a tuple (skip, alternativePage).
- skip is True iff the pages have mismatching statuses and the bot
+ skip is True if the pages have mismatching statuses and the bot
is either in autonomous mode, or the user chose not to use the
given page.
@@ -805,7 +808,7 @@
elif not globalvar.followredirect:
wikipedia.output(u"NOTE: not following redirects.")
else:
- if not (self.isIgnored(redirectTargetPage) or self.namespaceMismatch(page, redirectTargetPage) or self.wiktionaryMismatch(redirectTargetPage) or (page.site().family != redirectTargetPage.site().family)):
+ if not (self.isIgnored(redirectTargetPage) or self.namespaceMismatch(page, redirectTargetPage, counter) or self.wiktionaryMismatch(redirectTargetPage) or (page.site().family != redirectTargetPage.site().family)):
if self.addIfNew(redirectTargetPage, counter, page):
if config.interwiki_shownew:
wikipedia.output(u"%s: %s gives new redirect %s" % (self.originPage.aslink(), page.aslink(True), redirectTargetPage.aslink(True)))
@@ -865,7 +868,7 @@
self.done.remove(page)
iw = ()
for linkedPage in iw:
- if not (self.isIgnored(linkedPage) or self.namespaceMismatch(page, linkedPage) or self.wiktionaryMismatch(linkedPage)):
+ if not (self.isIgnored(linkedPage) or self.namespaceMismatch(page, linkedPage, counter) or self.wiktionaryMismatch(linkedPage)):
if globalvar.followinterwiki or page == self.originPage:
if self.addIfNew(linkedPage, counter, page):
# It is new. Also verify whether it is the second on the
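The control flow that r6111 adds to namespaceMismatch can be read as a three-way decision. The sketch below is a stripped-down illustration, not the interwiki.py code: the `Subject` class and `handle_mismatch` are hypothetical, and setting `forcedStop` directly stands in for the `makeForcedStop(counter)` call in the diff.

```python
class Subject(object):
    """Hypothetical, minimal stand-in for the interwiki subject object."""

    def __init__(self):
        self.forcedStop = False
        self.ignored = []

    def handle_mismatch(self, linked_title, choice):
        """Return True when the linked page must not be followed.

        choice is 'y' (follow anyway), 'n' (ignore this link)
        or 'g' (give up on the whole subject).
        """
        if choice == 'y':
            return False
        # Remember the page so that the question is not asked again.
        self.ignored.append(linked_title)
        if choice == 'g':
            # Stands in for self.makeForcedStop(counter) in the diff.
            self.forcedStop = True
        return True


subject = Subject()
followed = not subject.handle_mismatch('und:pagetitle2', 'y')
skipped = subject.handle_mismatch('und:pagetitle3', 'n')
gave_up = subject.handle_mismatch('und:pagetitle4', 'g')
```

Both 'n' and 'g' skip the link. Only 'g' also marks the subject as stopped, which is what lets the operator abandon the page at the first unanswerable question.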
Bugs item #2137018, was opened at 2008-09-29 19:06
Message generated for change (Comment added) made by russblau
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2137018&group_…
Category: None
Group: None
Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Purodha B Blissenbach (purodha)
Assigned to: Purodha B Blissenbach (purodha)
Summary: special_page_limit not honored
Initial Comment:
In the config, I have:
special_page_limit = 5000
but my recentchanges bot only operates on 1000 entries, then it stops. Maybe the limitation
is within MediaWiki, or the bot must be run with
admin powers, or something else; I have not
investigated.
I do not really know when this was changed, but
some months ago it worked as expected, and
nothing has been changed in the config file since.
----------------------------------------------------------------------
>Comment By: Russell Blau (russblau)
Date: 2008-11-21 07:31
Message:
Did this change really fix your problem? I don't see how changing
redirect.py would have any effect at all on other bots, and you originally
posted that the problem was with a recentchanges bot. redirect.py reads
off Special:DoubleRedirects, and that page is limited to 1000 entries by a
hard-coded limit inside MediaWiki, so changing the special_page_limit won't
allow you to retrieve any more pages.
----------------------------------------------------------------------
Comment By: Purodha B Blissenbach (purodha)
Date: 2008-11-21 05:51
Message:
Fixed in r...
----------------------------------------------------------------------