I have recently begun using pywikipediabot again, and for some reason it is appallingly slow.
Logging into my wiki takes upwards of a minute, and every single action, e.g. page moves, takes about as long. My wiki is otherwise quite quick, and pywikipediabot was previously reasonably fast.
Has anyone else encountered issues like this? What could be causing these huge delays?
I found the problem. It was a problem with the Python socket library. Apparently the socket wasn't being buffered properly, or at least that's my understanding.
The problem was in the socket.py library, found in Ubuntu 9.10 under /usr/lib/python2.6/socket.py.
At around line 500, replace this line:
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
with this one:
    for res in getaddrinfo(host, port, AF_INET, SOCK_STREAM):
Apparently now the socket will be properly buffered. With the old line, wiki login took about 90 seconds for me. Now it takes less than a second.
Hope that helps someone else.
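A quick way to check whether the address family argument really accounts for the delay, without editing socket.py, is to time the two getaddrinfo() calls directly. The following is a minimal sketch in Python 3, not taken from the thread; the helper names time_getaddrinfo and compare_families are made up for illustration. If the AF_UNSPEC call is the slow one, that would suggest the time is going into address resolution (for example IPv6 lookups) rather than socket buffering, though the thread does not confirm this.

```python
import socket
import time


def time_getaddrinfo(host, port, family):
    """Return (seconds, results) for a single getaddrinfo() call."""
    start = time.perf_counter()
    results = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    return time.perf_counter() - start, results


def compare_families(host, port=80):
    """Time resolution with AF_UNSPEC (the stock value, 0) and with AF_INET."""
    timings = {}
    for name, family in (("AF_UNSPEC", socket.AF_UNSPEC),
                         ("AF_INET", socket.AF_INET)):
        elapsed, results = time_getaddrinfo(host, port, family)
        timings[name] = elapsed
        print("%-9s %.3f s, %d result(s)" % (name, elapsed, len(results)))
    return timings
```

Calling compare_families() with the wiki's hostname prints one timing line per address family.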
Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
2009/12/16 ObsessiveMathsFreak obsessivemathsfreak@obsessivemathsfreak.org:
I found the problem. It was a problem with the Python socket library. Apparently the socket wasn't being buffered properly, or at least that's my understanding.
The problem was in the socket.py library, found in Ubuntu 9.10 under /usr/lib/python2.6/socket.py.
At around line 500, replace this line:
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
with this one:
    for res in getaddrinfo(host, port, AF_INET, SOCK_STREAM):
Apparently now the socket will be properly buffered. With the old line, wiki login took about 90 seconds for me. Now it takes less than a second.
Hope that helps someone else.
While it might be interesting for some pywikipedia users, I think that this error should be reported upstream. Can you please do it, explaining more thoroughly what the problem was and how you diagnosed it?
2009/12/16 Nicolas Dumazet nicdumz@gmail.com:
Looking again at this, I think that your fix is quite suspicious.
0 == AF_UNSPEC, and AF_UNSPEC means: "A value of AF_UNSPEC for ai_family means that the caller shall accept any address family." If create_connection(...) is significantly slower than creating a socket connection manually because of this generic parameter (i.e. finding which address family should be used is a slow operation), then we should not rely on create_connection.
And if we don't call create_connection directly, then the library responsible for this issue should be modified (urllib, httplib, etc.).
But your fix will break your Python installation, I'm afraid. In particular, you won't be able to call create_connection() on UNIX sockets, which may or may not break other installed software.
So, again, can you tell us more about that issue? How did you narrow down the problem to that specific call? Can you include caller information?
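An alternative that leaves the system's socket.py untouched is to restrict the address family in the calling code. The sketch below is a hypothetical helper, create_connection_ipv4, which is not part of the standard library or of pywikipediabot; it roughly follows the loop in create_connection() but passes AF_INET to getaddrinfo(). It assumes Python 3 and shares one limitation with the socket.py edit: hosts that only have IPv6 addresses become unreachable.

```python
import socket


def create_connection_ipv4(address, timeout=None):
    """Connect to (host, port), trying IPv4 addresses only.

    Hypothetical helper for illustration; not a standard library function.
    """
    host, port = address
    last_error = None
    # AF_INET instead of 0 (AF_UNSPEC): only IPv4 results are returned.
    for family, socktype, proto, _canonname, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_INET, socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        try:
            if timeout is not None:
                sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            # Remember the failure and try the next address, if any.
            last_error = err
            sock.close()
    if last_error is None:
        raise OSError("getaddrinfo returned no IPv4 addresses")
    raise last_error
```

Because the restriction lives in the caller, other programs on the machine keep the stock AF_UNSPEC behaviour.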
Everything still seems to be working for me. I don't think the python installation has been affected.
I narrowed it down by pressing Ctrl-C in the running application and finding that it was always hanging on that one line in socket.py. A little extra googling revealed a buffer problem with socket.py in Python. Beyond that, I really don't know.
I don't know where to go to inform the upstream Python maintainers of the problem, and to my knowledge they're already aware of it. It could just be an Ubuntu thing.
pywikipediabot is working now and I really don't use Python, so this issue is closed for me. Unless Python stops working, in which case I'll probably end up passing it on.
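The Ctrl-C approach can be made more systematic, and can capture the caller information asked for earlier in the thread, by timing getaddrinfo() from inside the running script. What follows is only a sketch under assumptions: the names timed_getaddrinfo, install, uninstall and slow_calls are invented for this example, it targets Python 3, and it replaces socket.getaddrinfo at runtime for the current process only instead of editing the library file.

```python
import socket
import time
import traceback

_original_getaddrinfo = socket.getaddrinfo
THRESHOLD_SECONDS = 1.0
slow_calls = []  # entries are (seconds, positional args, formatted call stack)


def timed_getaddrinfo(*args, **kwargs):
    """Call the real getaddrinfo() and record calls at or above the threshold."""
    start = time.perf_counter()
    try:
        return _original_getaddrinfo(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        if elapsed >= THRESHOLD_SECONDS:
            # The stack shows which code path asked for the lookup.
            stack = "".join(traceback.format_stack(limit=8))
            slow_calls.append((elapsed, args, stack))


def install(threshold=1.0):
    """Swap in the timing wrapper for this process only."""
    global THRESHOLD_SECONDS
    THRESHOLD_SECONDS = threshold
    socket.getaddrinfo = timed_getaddrinfo


def uninstall():
    """Restore the original getaddrinfo()."""
    socket.getaddrinfo = _original_getaddrinfo
```

Calling install() before the bot logs in and inspecting slow_calls afterwards should show which code path triggered each slow lookup.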