Bugs item #2114223, was opened at 2008-09-16 16:46
Message generated for change (Comment added) made by silvonen
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=603138&aid=2114223...

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: interwiki
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: André Malafaya Baptista (malafaya)
Assigned to: Nobody/Anonymous (nobody)
Summary: Socket timeout breaks out
Initial Comment:
VERSION.PY
==========
Pywikipedia [svn+ssh] wikimedia/svnroot/pywikipedia/trunk/pywikipedia (r5898, Sep 16 2008, 11:50:17)
Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)]
DESCRIPTION
===========
For the past few days, a socket timeout has been interrupting the bot. I believe the stack trace below is self-explanatory. I used the command line:
interwiki.py -family:wiktionary -autonomous -start:Category:! -lang:io
OUTPUT
======
NOTE: The first unfinished subject is [[io:Kategorio:Albaniana vorti]]
NOTE: Number of pages queued is 59, trying to add 60 more.
Sleeping for 4.1 seconds, 2008-09-16 14:31:06
Dump io (wiktionary) saved
Traceback (most recent call last):
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1735, in <module>
    bot.run()
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1486, in run
    self.queryStep()
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1460, in queryStep
    self.oneQuery()
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1428, in oneQuery
    site = self.selectQuerySite()
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1402, in selectQuerySite
    self.generateMore(globalvar.maxquerysize - mycount)
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\interwiki.py", line 1336, in generateMore
    page = self.pageGenerator.next()
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\pagegenerators.py", line 688, in DuplicateFilterPageGenerator
    for page in generator:
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\pagegenerators.py", line 239, in AllpagesPageGenerator
    for page in site.allpages(start = start, namespace = namespace, includeredirects = includeredirects):
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\wikipedia.py", line 5166, in allpages
    text = self.getUrl(api_url)
  File "D:\Work\pywikipediabot-HEAD\pywikipedia\wikipedia.py", line 4485, in getUrl
    text = f.read()
  File "D:\Program Files\Python\lib\socket.py", line 291, in read
    data = self._sock.recv(recv_size)
socket.timeout: timed out
----------------------------------------------------------------------
Comment By: Mikko Silvonen (silvonen) Date: 2008-11-24 21:31
Message: My autonomous run was interrupted twice today because of a socket timeout. I think the problem is server-related, as I have a 110 Mbps / 5 Mbps connection.
Traceback (most recent call last):
  File "interwiki.py", line 1769, in <module>
    bot.run()
  File "interwiki.py", line 1518, in run
    self.queryStep()
  File "interwiki.py", line 1492, in queryStep
    self.oneQuery()
  File "interwiki.py", line 1488, in oneQuery
    subject.workDone(self)
  File "interwiki.py", line 792, in workDone
    iw = page.interwiki()
  File "c:\svn\pywikipedia\wikipedia.py", line 1691, in interwiki
    ll = getLanguageLinks(self.get(), insite=self.site(),
  File "c:\svn\pywikipedia\wikipedia.py", line 668, in get
    self._contents = self._getEditPage(get_redirect = get_redirect, throttle = throttle, sysop = sysop)
  File "c:\svn\pywikipedia\wikipedia.py", line 712, in _getEditPage
    text = self.site().getUrl(path, sysop = sysop)
  File "c:\svn\pywikipedia\wikipedia.py", line 4589, in getUrl
    text = f.read()
  File "C:\Python25\lib\socket.py", line 291, in read
    data = self._sock.recv(recv_size)
socket.timeout: timed out
C:\svn\pywikipedia>python version.py
Pywikipedia [http] trunk/pywikipedia (r6114, Nov 23 2008, 12:41:02)
Python 2.5.1 (r251:54863, May 1 2007, 17:47:05) [MSC v.1310 32 bit (Intel)]
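
Both tracebacks end in an uncaught socket.timeout raised from f.read() inside getUrl, which is what stops the run. As a rough illustration only, and not the actual pywikipedia code, a caller could retry the read a few times before giving up. The names read_with_retry, fetch, retries and delay are hypothetical and exist only in this sketch:

```python
import socket
import time


def read_with_retry(fetch, retries=3, delay=0.0):
    """Call fetch() and retry when it raises socket.timeout.

    fetch is any zero-argument callable that performs the network read.
    All names here are hypothetical; this is not pywikipedia's API.
    """
    attempt = 0
    while True:
        try:
            return fetch()
        except socket.timeout:
            attempt += 1
            if attempt > retries:
                # Out of retries: let the caller see the original timeout.
                raise
            # Wait a little before the next attempt.
            time.sleep(delay)
```

With this shape, a transient server-side stall costs a few extra requests instead of ending an autonomous run, while a persistent outage still surfaces as socket.timeout after the last retry.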
----------------------------------------------------------------------
Comment By: NicDumZ — Nicolas Dumazet (nicdumz) Date: 2008-09-20 04:41
Message: The page generator, even with the new API implementation, seems to be working; I'm currently listing the pages of eo.wikt without any timeout. Might your connection just be slower than usual? Or does it time out when the WM websites are under heavy load? You can tweak the socket timeout in user-config.py by setting socket_timeout to the number of seconds to wait (the default is 120 seconds, which is quite long...)
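
For reference, the setting mentioned above would look roughly like this in user-config.py. The value 300 is an arbitrary example chosen for illustration, not a recommendation from this thread:

```python
# user-config.py (fragment)
# Number of seconds to wait on a socket before timing out.
# 120 is the default mentioned above; 300 is only an example value.
socket_timeout = 300
```

A larger value makes the bot wait longer on a slow server; it does not by itself stop the run from ending if the timeout is still reached.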
----------------------------------------------------------------------