https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
Web browser: --- Bug ID: 55127 Summary: core hangs on killed http(s) connection Product: Pywikibot Version: unspecified Hardware: All OS: All Status: NEW Severity: normal Priority: Unprioritized Component: General Assignee: Pywikipedia-bugs@lists.wikimedia.org Reporter: legoktm.wikipedia@gmail.com Classification: Unclassified Mobile Platform: ---
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/1657/ Reported by: valhallasw Created on: 2013-08-22 15:14:10.237000 Subject: core hangs on killed http(s) connection Original description: Steps to reproduce:
p = pywikibot.Page(pywikibot.getSite('nl', 'wikipedia'), 'Wikipedia') p.get()
# now kill the connection, e.g. using tcpview in windows
p = pywikibot.Page(pywikibot.getSite('nl', 'wikipedia'), 'Wikipedia') p.get()
# hangs
Expected result: pywikibot reconnects and executes the .get()
Actual result: complete hang (deadlock?)
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #1 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- Stack trace during hang:
*** STACKTRACE - START ***
# ThreadID: 140208457426688 File: "/usr/lib/python2.7/threading.py", line 524, in __bootstrap self.__bootstrap_inner() File: "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner self.run() File: "connectionkilltest.py", line 16, in run for filename, lineno, name, line in traceback.extract_stack(stack):
# ThreadID: 140208472311552 File: "/usr/lib/python2.7/threading.py", line 524, in __bootstrap self.__bootstrap_inner() File: "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner self.run() File: "pywikibot/comms/threadedhttp.py", line 343, in run item.data = self.http.request(*item.args, **item.kwargs) File: "pywikibot/comms/threadedhttp.py", line 219, in request max_redirects, connection_type File: "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1444, in request (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey) File: "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1196, in _request (response, content) = self._conn_request(conn, request_uri, method, body, headers) File: "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line 1166, in _conn_request response = conn.getresponse() File: "/usr/lib/python2.7/httplib.py", line 1030, in getresponse response.begin() File: "/usr/lib/python2.7/httplib.py", line 407, in begin version, status, reason = self._read_status() File: "/usr/lib/python2.7/httplib.py", line 365, in _read_status line = self.fp.readline() File: "/usr/lib/python2.7/socket.py", line 430, in readline data = recv(1) File: "/usr/lib/python2.7/ssl.py", line 241, in recv return self.read(buflen) File: "/usr/lib/python2.7/ssl.py", line 160, in read return self._sslobj.read(len)
# ThreadID: 140208510514944 File: "pwb.py", line 103, in <module> run_python_file(fn, args) File: "pwb.py", line 49, in run_python_file exec compile(source, filename, "exec") in main_mod.__dict__ File: "connectionkilltest.py", line 36, in <module> p.put_async('test edit') File: "pywikibot/page.py", line 887, in put_async async=True, callback=callback, **kwargs) File: "pywikibot/page.py", line 872, in put async=async, callback=callback, **kwargs) File: "pywikibot/page.py", line 782, in save if not force and not self.botMayEdit(): File: "pywikibot/page.py", line 710, in botMayEdit username = self.site.user() File: "pywikibot/site.py", line 206, in user if self.logged_in(True): File: "pywikibot/site.py", line 842, in logged_in if sysop and 'sysop' not in self.userinfo['groups']: File: "pywikibot/site.py", line 910, in getuserinfo uidata = uirequest.submit() File: "pywikibot/data/api.py", line 284, in submit body=paramstring) File: "pywikibot/comms/http.py", line 118, in request request.lock.acquire() File: "/usr/lib/python2.7/threading.py", line 323, in acquire self.__cond.wait() File: "/usr/lib/python2.7/threading.py", line 243, in wait waiter.acquire()
*** STACKTRACE - END ***
So: probably not a deadlock, just the ssl library hanging.
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #2 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- - **Priority**: 5 --> 2
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #3 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- Works without any issues in windows - could be either related to linux (v. windows) or killing the connection locally or upstream (the original was run from a VM). Reducing priority.
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #4 from Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com --- - **Group**: confirmed --> rewrite
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
Kunal Mehta (Legoktm) legoktm.wikipedia@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- See Also| |https://sourceforge.net/p/p | |ywikipediabot/bugs/1657
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #5 from xqt info@gno.de --- I do not know whether this is similar this bug reported above and am unable to investigate the reason of that bug. I only found the last output where the bot checks a wiki page every 15 seconds:
defendant: Textkontrolle accuser: Neozoon time: 2013 Okt 20 21:26
05:20:37:
defendant: Textkontrolle accuser: Neozoon time: 2013 Okt 20 21:26
05:20:54:
... a lot of lines deleted ...
05:42:19: 05:42:33:
...
05:53:41: 05:53:55: 05:54:10: 05:54:24: 05:54:38: 05:54:53: 05:55:07: 05:55:21:
now it stops, CTL-C does not work, I have to kill the process via task manager. I checked my router and found it had an autoreset of the internet connection 5 minutes before 6 a.m. this morning and I guess this was also yesterday. On the other hand the same bot worked well the three days before.
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #6 from xqt info@gno.de --- Additional comment: pwb.py version
Pywikibot: pywikibot/__init__.py (r-1 (unknown), ???????, 2013/09/21, 15:53:03, OUTDATED) Release version: 2.0b1 Python: 2.7.3 (default, Apr 10 2012, 23:24:47) [MSC v.1500 64 bit (AMD64)] unicode test: ok
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
xqt info@gno.de changed:
What |Removed |Added ---------------------------------------------------------------------------- Priority|Unprioritized |High Version|unspecified |core (2.0) Severity|normal |major
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #7 from xqt info@gno.de --- Yes my bug depends on a killed internet connection while my router was off for 20 seconds!
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #8 from xqt info@gno.de --- Here are the last log entries before the bot hanged:
2013-11-20 05:50:37 api.py, 685 in __iter__: DEBUG PropertyGenerator received [u'7165976']; limit=-1 2013-11-20 05:50:51 vm-auto-erl-neu.py, 79 in output: INFO >> 05:50:51: 2013-11-20 05:50:51 threadedhttp.py, 97 in pop_connection: DEBUG Retrieved connection from 'https:de.wikipedia.org' pool. 2013-11-20 05:50:51 threadedhttp.py, 215 in request: DEBUG ('https://de.wikipedia.org/w/api.php', 'POST', 'maxlag=5000&format=json&bkprop=user%7Cby%7Ctimestamp%7Cexpiry%7Creason&list=blocks&bkusers=&meta=userinfo&action=query&bkend=20131120005959&uiprop=blockinfo%7Chasmsg', {'cookie': 'centralauth_Token=8c1df3ec61f7a92677f4c0b561c30585; centralauth_User=Xqbot; forceHTTPS=1; forceHTTPS=true; dewikiUserName=Xqbot; dewikiSession=3144518b40881659814792e3c0fc34cc; dewikiUserID=627628', 'connection': 'keep-alive', 'Content-Type': 'application/x-www-form-urlencoded', 'user-agent': 'pwb/r-1 (unknown) Pywikipediabot/2.0'}, 5, None) 2013-11-20 05:50:51 api.py, 308 in submit: DEBUG API response received: {"query":{"blocks":[],"userinfo":{"id":627628,"name":"Xqbot"}}} 2013-11-20 05:50:51 api.py, 611 in update_limit: DEBUG PropertyGenerator: Set query_limit to 5000. 2013-11-20 05:50:51 threadedhttp.py, 97 in pop_connection: DEBUG Retrieved connection from 'https:de.wikipedia.org' pool. 2013-11-20 05:50:51 threadedhttp.py, 215 in request: DEBUG ('https://de.wikipedia.org/w/api.php', 'POST', 'maxlag=5000&format=json&rvprop=ids%7Cflags%7Ctimestamp%7Cuser%7Ccomment%7Ccontent&prop=info%7Crevisions&titles=Wikipedia%3AVandalismusmeldung&meta=userinfo&indexpageids=&action=query&uiprop=blockinfo%7Chasmsg', {'cookie': 'centralauth_Token=8c1df3ec61f7a92677f4c0b561c30585; centralauth_User=Xqbot; forceHTTPS=1; forceHTTPS=true; dewikiUserName=Xqbot; dewikiSession=3144518b40881659814792e3c0fc34cc; dewikiUserID=627628', 'connection': 'keep-alive', 'Content-Type': 'application/x-www-form-urlencoded', 'user-agent': 'pwb/r-1 (unknown) Pywikipediabot/2.0'}, 5, None)
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
xqt info@gno.de changed:
What |Removed |Added ---------------------------------------------------------------------------- Blocks| |58392
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #9 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 108325 had a related patch set uploaded by Ladsgroup: [BUGFIX] fixing bug 55127
https://gerrit.wikimedia.org/r/108325
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
Gerrit Notification Bot gerritadmin@wikimedia.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |PATCH_TO_REVIEW
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #10 from Merlijn van Deen valhallasw@arctus.nl --- Minimal test case for the SSL issue:
import httplib2
h = httplib2.Http() resp, content = h.request("https://en.wikipedia.org/", "GET") raw_input("kill the connection...") resp, content = h.request("https://en.wikipedia.org/", "GET") print "done"
So there are three issues: - A connection can hang indefinitely if the connection is killed. There should be some timeout for receiving some data here. - The framework does not respond to ctrl-c when a thread doesn't return. This can be solved by changing pywikibot/comms/http.py to be non-blocking (e.g. in a loop with a 100ms time.sleep) - Threads do not receive KeyboardInterrupts (which is OK!) but this means a non-returning call will hang forever (i.e. will not be killed when python exits)
Amirs solution might be useful for the last issue, but I'm not 100% sure about that. I think I'd rather hook the atexit handler to kill all processing threads. Maybe we can also start threads as daemons, which should kill them automatically when the main thread exits...
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #11 from xqt info@gno.de --- I tried this on _windows_ but I didn't got it hanging:
Traceback (most recent call last): File "<pyshell#4>", line 1, in <module> resp, content = h.request("https://en.wikipedia.org/", "GET") File "C:\Pywikipedia\ssh\pywikibot\core\externals\httplib2\python2\httplib2__init__.py", line 1591, in request (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey) File "C:\Pywikipedia\ssh\pywikibot\core\externals\httplib2\python2\httplib2__init__.py", line 1333, in _request (response, content) = self._conn_request(conn, request_uri, method, body, headers) File "C:\Pywikipedia\ssh\pywikibot\core\externals\httplib2\python2\httplib2__init__.py", line 1289, in _conn_request response = conn.getresponse() File "C:\Python27\lib\httplib.py", line 1030, in getresponse response.begin() File "C:\Python27\lib\httplib.py", line 407, in begin version, status, reason = self._read_status() File "C:\Python27\lib\httplib.py", line 365, in _read_status line = self.fp.readline() File "C:\Python27\lib\socket.py", line 430, in readline data = recv(1) File "C:\Python27\lib\ssl.py", line 241, in recv return self.read(buflen) File "C:\Python27\lib\ssl.py", line 160, in read return self._sslobj.read(len) error: [Errno 10060] Ein Verbindungsversuch ist fehlgeschlagen, da die Gegenstelle nach einer bestimmten Zeitspanne nicht richtig reagiert hat, oder die hergestellte Verbindung war fehlerhaft, da der verbundene Host nicht reagiert hat
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #12 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 110853 had a related patch set uploaded by Merlijn van Deen: (bug 55127) Implement network timeouts
https://gerrit.wikimedia.org/r/110853
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #13 from Merlijn van Deen valhallasw@arctus.nl --- This solves the basic issue, but does not fix the other two issues (the main thread will not respond to ctrl-c until the timeout is reached; also the thread will not be stopped until the timeout is raised)
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #14 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 110866 had a related patch set uploaded by Merlijn van Deen: (bug 55127) Allow user to cut off pending network requests at shutdown
https://gerrit.wikimedia.org/r/110866
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #15 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 110867 had a related patch set uploaded by Merlijn van Deen: (bug 55127) Allow user to abort during http request
https://gerrit.wikimedia.org/r/110867
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #16 from Merlijn van Deen valhallasw@arctus.nl --- And with the latest two patches, this should also be fixed.
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #17 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 110853 merged by jenkins-bot: (bug 55127) Implement network timeouts
https://gerrit.wikimedia.org/r/110853
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #18 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 110866 merged by jenkins-bot: (bug 55127) Allow user to cut off pending network requests at shutdown
https://gerrit.wikimedia.org/r/110866
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #19 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 110867 merged by jenkins-bot: (bug 55127) Allow user to abort during http request
https://gerrit.wikimedia.org/r/110867
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
--- Comment #20 from Gerrit Notification Bot gerritadmin@wikimedia.org --- Change 108325 abandoned by Xqt: [BUGFIX] fixing bug 55127
Reason: https://gerrit.wikimedia.org/r/110853 https://gerrit.wikimedia.org/r/110866 https://gerrit.wikimedia.org/r/110867
https://gerrit.wikimedia.org/r/108325
https://bugzilla.wikimedia.org/show_bug.cgi?id=55127
xqt info@gno.de changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|PATCH_TO_REVIEW |RESOLVED Resolution|--- |FIXED
pywikipedia-bugs@lists.wikimedia.org