Revision: 4462
Author: a_engels
Date: 2007-10-17 14:55:24 +0000 (Wed, 17 Oct 2007)
Log Message:
-----------
Re-instating the -array command line option (number of pages to work on at once) and creating the -query command line option (number of pages to load at once)
Modified Paths:
--------------
trunk/pywikipedia/interwiki.py
Modified: trunk/pywikipedia/interwiki.py
===================================================================
--- trunk/pywikipedia/interwiki.py 2007-10-17 13:35:27 UTC (rev 4461)
+++ trunk/pywikipedia/interwiki.py 2007-10-17 14:55:24 UTC (rev 4462)
@@ -171,6 +171,16 @@
will be changed if there are that number or more links to
change or add
+The following arguments influence how many pages the bot works on at once:
+ -array: The number of pages the bot tries to be working on at once.
+ If the number of pages loaded is lower than this number,
+ a new set of pages is loaded from the starting wiki. The
+ default is 100, but can be changed in the config variable
+ interwiki_min_subjects
+
+ -query: The maximum number of pages that the bot will load at once.
+ Default value is 60.
+
Some configuration option can be used to change the working of this robot:
interwiki_min_subjects: the minimum amount of subjects that should be processed
@@ -354,6 +364,7 @@
bracketonly = False
rememberno = False
followinterwiki = True
+ minsubjects = config.interwiki_min_subjects
class Subject(object):
"""
@@ -1253,7 +1264,7 @@
# Do we still have enough subjects to work on for which the
# home language has been retrieved? This is rough, because
# some subjects may need to retrieve a second home-language page!
- if len(self.subjects) - mycount < config.interwiki_min_subjects:
+ if len(self.subjects) - mycount < globalvar.minsubjects:
# Can we make more home-language queries by adding subjects?
if self.pageGenerator and mycount < globalvar.maxquerysize:
timeout = 60
@@ -1510,6 +1521,10 @@
globalvar.bracketonly = True
elif arg == '-localright':
globalvar.followinterwiki = False
+ elif arg.startswith('-array:'):
+ globalvar.minsubjects = int(arg[7:])
+ elif arg.startswith('-query:'):
+ globalvar.maxquerysize = int(arg[7:])
else:
generator = genFactory.handleArg(arg)
if generator:
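The two new options can be sketched outside the bot as follows. This is a minimal, self-contained illustration in modern Python, not the code from interwiki.py: the Settings class, parse_args and needs_more_subjects are hypothetical names, the defaults (100 and 60) are taken from the help text above, and the refill check is a simplified version of the two conditions in the diff (it leaves out the page generator).

```python
from dataclasses import dataclass


@dataclass
class Settings:
    # Defaults as documented in the help text above (assumed, not read from config).
    minsubjects: int = 100   # -array: pages the bot tries to be working on at once
    maxquerysize: int = 60   # -query: maximum pages loaded at once


def parse_args(args):
    """Consume -array:N and -query:N, return (settings, remaining arguments)."""
    settings = Settings()
    remaining = []
    for arg in args:
        if arg.startswith('-array:'):
            settings.minsubjects = int(arg[len('-array:'):])
        elif arg.startswith('-query:'):
            settings.maxquerysize = int(arg[len('-query:'):])
        else:
            remaining.append(arg)
    return settings, remaining


def needs_more_subjects(total_subjects, mycount, settings):
    """Simplified form of the refill check in the diff above."""
    return (total_subjects - mycount < settings.minsubjects
            and mycount < settings.maxquerysize)


settings, rest = parse_args(['-array:20', '-query:5', '-localright'])
```

With these arguments the bot would try to keep 20 pages in work and load at most 5 at a time; -localright is passed on untouched.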
Patches item #1809629, was opened at 2007-10-08 19:26
Message generated for change (Settings changed) made by wikipedian
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1809629&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: shizhao (wikishizhao)
Assigned to: Nobody/Anonymous (nobody)
Summary: welcome.py update
Initial Comment:
Add zh Wikipedia message and change timeselected.
----------------------------------------------------------------------
Patches item #1809474, was opened at 2007-10-08 15:47
Message generated for change (Settings changed) made by wikipedian
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603140&aid=1809474&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
>Resolution: Accepted
Priority: 5
Private: No
Submitted By: Filnik (filnik)
Assigned to: Nobody/Anonymous (nobody)
Summary: Found a bug on welcome.py
Initial Comment:
Although I've tested welcome.py a lot, there seems to be a small bug caused by the massive rewrite. It is caused by an old variable "project", which has since been replaced by wsite.family.name. So if the bot needs to create the report page, it crashes because it cannot find this variable.
I've also uploaded a patch. I've tested the bot three times to check that it now works, and it does. If you want to test the patch again, make sure you are a sysop, because you need to delete the page.
Bye, Filnik.
----------------------------------------------------------------------
Revision: 4460
Author: wikipedian
Date: 2007-10-17 13:33:31 +0000 (Wed, 17 Oct 2007)
Log Message:
-----------
applied patch [ 1809474 ] Found a bug on welcome.py by Filnik
Modified Paths:
--------------
trunk/pywikipedia/welcome.py
Modified: trunk/pywikipedia/welcome.py
===================================================================
--- trunk/pywikipedia/welcome.py 2007-10-17 12:30:51 UTC (rev 4459)
+++ trunk/pywikipedia/welcome.py 2007-10-17 13:33:31 UTC (rev 4460)
@@ -400,7 +400,8 @@
if another_page.exists():
text_get = another_page.get()
else:
- text_get = u'This is a report page for the Bad-username, please translate me. --[[User:%s|%s]]' % (config.usernames[project], config.usernames[project])
+ nameBot = config.usernames[wsite.family.name][wsite.lang]
+ text_get = u'This is a report page for the Bad-username, please translate me. --[[User:%s|%s]]' % (nameBot, nameBot)
pos = 0
# The talk page includes "_" between the two names, in this way i replace them to " ".
username = wikipedia.url2link(username, wsite, wsite)
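The lookup introduced by this patch can be illustrated in isolation. This is only a sketch: it assumes that config.usernames is a dictionary keyed first by family name and then by language code, as the patched line suggests, and the account names and the helper functions are made up for the example.

```python
# Hypothetical stand-in for config.usernames: family name -> language -> account.
usernames = {
    'wikipedia': {
        'it': 'EsempioBot',
        'zh': 'ExampleBot',
    },
}


def bot_username(usernames, family, lang):
    """Return the bot account for one site; raises KeyError if none is configured."""
    return usernames[family][lang]


def report_stub(name):
    """Build the placeholder text for a missing report page, as in the diff above."""
    return ('This is a report page for the Bad-username, '
            'please translate me. --[[User:%s|%s]]' % (name, name))


stub = report_stub(bot_username(usernames, 'wikipedia', 'zh'))
```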
Revision: 4459
Author: wikipedian
Date: 2007-10-17 12:30:51 +0000 (Wed, 17 Oct 2007)
Log Message:
-----------
console_encoding can be autodetected
Modified Paths:
--------------
trunk/pywikipedia/config.py
Modified: trunk/pywikipedia/config.py
===================================================================
--- trunk/pywikipedia/config.py 2007-10-17 12:25:57 UTC (rev 4458)
+++ trunk/pywikipedia/config.py 2007-10-17 12:30:51 UTC (rev 4459)
@@ -86,9 +86,10 @@
# The encoding that's used in the user's console, i.e. how strings are encoded
# when they are read by raw_input(). On Windows systems' DOS box, this should
# be 'cp850' ('cp437' for older versions). Linux users might try 'iso-8859-1'
-# or 'utf-8'. If this variable is set to None, the default is 'cp850' on
-# windows, and iso-8859-1 on other systems
-console_encoding = None
+# or 'utf-8'.
+# This default code should work fine, so you don't have to think about it.
+# TODO: consider getting rid of this config variable.
+console_encoding = __sys.stdout.encoding
# The encoding in which textfiles are stored, which contain lists of page titles.
textfile_encoding = 'utf-8'
@@ -452,13 +453,6 @@
else:
print "WARNING: Configuration variable %r is defined but unknown. Misspelled?"%_key
-# Fix up default console_encoding
-if console_encoding == None:
- if __sys.platform=='win32':
- console_encoding = 'cp850'
- else:
- console_encoding = 'iso-8859-1'
-
# Save base_dir for use by other modules
base_dir = _base_dir
#
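One caveat with reading the encoding from the output stream: the encoding attribute can be None or missing, for example when output is redirected under Python 2 or when stdout has been replaced by another object. A hedged sketch of a defensive variant follows, written for modern Python. The function name and the 'utf-8' fallback are assumptions for illustration and are not part of this commit.

```python
import sys


def detect_console_encoding(stream=None, fallback='utf-8'):
    """Return the stream's encoding, or `fallback` when it is unset.

    `fallback` is an assumed default, not a value taken from config.py.
    """
    if stream is None:
        stream = sys.stdout
    # getattr guards against replacement objects without an encoding attribute.
    return getattr(stream, 'encoding', None) or fallback


console_encoding = detect_console_encoding()
```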
Revision: 4458
Author: wikipedian
Date: 2007-10-17 12:25:57 +0000 (Wed, 17 Oct 2007)
Log Message:
-----------
fixed bug [ 1803037 ] windows console encoding problem
Modified Paths:
--------------
trunk/pywikipedia/wikipedia.py
Modified: trunk/pywikipedia/wikipedia.py
===================================================================
--- trunk/pywikipedia/wikipedia.py 2007-10-17 09:01:52 UTC (rev 4457)
+++ trunk/pywikipedia/wikipedia.py 2007-10-17 12:25:57 UTC (rev 4458)
@@ -4926,12 +4926,13 @@
moduleName = calledModuleName()
nonGlobalArgs = []
for arg in args[1:]:
- if sys.platform=='win32':
- # Windows gives parameters encoded as windows-1252,
- # regardless of console encoding
+ if sys.platform=='win32' and config.console_encoding == 'cp850':
+ # Western Windows versions give parameters encoded as windows-1252
+ # even though the console encoding is cp850.
arg = unicode(arg, 'windows-1252')
else:
- # Linux uses the same encoding for both
+ # Linux uses the same encoding for both.
+ # I don't know how non-Western Windows versions behave.
arg = unicode(arg, config.console_encoding)
if arg == '-help':
showHelp(moduleName)
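The effect of the new condition can be reproduced without the bot. The sketch below uses modern Python, where command line bytes are decoded with bytes.decode instead of unicode(); decode_arg is a hypothetical helper that mirrors the branch in the diff, and the sample bytes are simply one Korean character encoded as cp949.

```python
def decode_arg(raw, platform, console_encoding):
    """Decode one raw command line argument, following the logic of rev 4458."""
    if platform == 'win32' and console_encoding == 'cp850':
        # Western Windows: parameters arrive as windows-1252.
        return raw.decode('windows-1252')
    # Everything else: parameters use the console encoding.
    return raw.decode(console_encoding)


raw = u'\ud55c'.encode('cp949')  # one Hangul syllable as a cp949 console would pass it

korean_console = decode_arg(raw, 'win32', 'cp949')   # decoded correctly
western_console = decode_arg(raw, 'win32', 'cp850')  # same bytes read as windows-1252
```

On a cp949 console the argument now survives intact, whereas the old unconditional windows-1252 decoding turned the same two bytes into two unrelated Latin letters.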
Bugs item #1803037, was opened at 2007-09-26 22:22
Message generated for change (Comment added) made by wikipedian
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1803037&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: General
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: windows console encoding problem
Initial Comment:
There is this code in wikipedia.py:
if sys.platform=='win32':
# Windows gives parameters encoded as windows-1252,
# regardless of console encoding
arg = unicode(arg, 'windows-1252')
It seems to fix a console bug.
Unfortunately, if you use cp949 as console encoding, for example, all non-ASCII text gets cracked, because Windows then gives parameters encoded as cp949, not cp1252.
So I've used a workaround patch, changing
if sys.platform=='win32':
to
if sys.platform=='win32' and config.console_encoding != 'cp949':
to prevent the encoding error. I guess some other non-Latin codepages have the same trouble, but I'm not sure.
-- [[:ko:User:Klutzy]]
----------------------------------------------------------------------
>Comment By: Daniel Herding (wikipedian)
Date: 2007-10-17 14:24
Message:
Logged In: YES
user_id=880694
Originator: NO
Changed in SVN. Now it only converts to Windows-1252 when the codepage is
cp850. If there are any problems with non-Western Windows versions, please
report them, as I can't test it.
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2007-09-26 23:07
Message:
Logged In: NO
By cracked, User:Klutzy means garbled.
----------------------------------------------------------------------
Bugs item #1802910, was opened at 2007-09-26 18:29
Message generated for change (Settings changed) made by wikipedian
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=1802910&group_…
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: interwiki
Group: None
>Status: Closed
>Resolution: Fixed
Priority: 5
Private: No
Submitted By: Alleborgo (alleborgo)
Assigned to: Nobody/Anonymous (nobody)
Summary: interwiki.py - sv.wiki page starting with "S:t something"
Initial Comment:
On sv.wiki there are many articles named "S:t something". When the bot finds one, it searches for "T something" and not for "S:t something", which IS the page name. You can see the problem here:
http://en.wikipedia.org/w/index.php?title=St._Bernard_%28dog%29&diff=159614…
and here: http://en.wikipedia.org/w/index.php?title=Marttila&diff=152227566&oldid=137…
----------------------------------------------------------------------
Comment By: Francesco Cosoleto (cosoleto)
Date: 2007-09-27 16:28
Message:
Logged In: YES
user_id=181280
Originator: NO
Fixed in revision #4371. This was a general bug, not only related to
interwiki.py.
----------------------------------------------------------------------
Hi all! Like many bot owners, I receive many warnings every week, such as
"Your bot is making wrong edits" or "You are a vandal" or something
like that. Sometimes they are very irritating. I think we should
introduce a template like {{nointerwikibots}} which does nothing except
tell bots all over the world not to work on the article that contains
it. It will be easy for admins to check how many and which
pages are "disabled", and we can finally say to everyone: "Put the
template and shut up!".
I hope to find your support. Best regards to all.
Alleborgo
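
To make the proposal concrete, a bot-side check could look roughly like the sketch below. It is only an illustration of the idea: the template name comes from the message above, while the function name, the regular expression and the tolerance for parameters and case are assumptions, and nothing like this exists in the framework as of this message.

```python
import re

# Matches {{nointerwikibots}}, optionally with whitespace or parameters.
# The exact syntax the community would accept is an assumption.
NO_INTERWIKI_BOTS = re.compile(
    r'\{\{\s*nointerwikibots\s*(\|[^{}]*)?\}\}', re.IGNORECASE)


def bots_may_edit_interwiki(page_text):
    """Return False if the page text opts out of interwiki bot edits."""
    return NO_INTERWIKI_BOTS.search(page_text) is None


allowed = bots_may_edit_interwiki(u'Some article text.\n[[sv:Exempel]]')
blocked = bots_may_edit_interwiki(u'{{nointerwikibots}}\nSome article text.')
```

A bot would call such a check on the page text before saving and skip the page when it returns False.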