Hello!
It apparently happens often that some people install Python 3 and try to run pywikipedia. pywikipedia then fails ungracefully, with unclear Syntax Errors, because of syntax changes:
1) Calling login.py

$ python3 login.py
  File "login.py", line 61
    'en': u'Wikipedia:Registered bots',
                                     ^
SyntaxError: invalid syntax

2) Importing wikipedia

$ python3
Python 3.0.1+ (r301:69556, Apr 15 2009, 15:59:22)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import wikipedia
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "wikipedia.py", line 354
    raise InvalidTitle(u'Bad page title : %s' % t)
                                            ^
SyntaxError: invalid syntax
Beginners then conclude that pywikipedia is broken :) I tried to fix this issue today, so that we at least output some meaningful info when Python 3 dies.

Problem: because Python 3 fails on syntax errors, no imports are processed and no code is ever interpreted: the parser first parses the whole file, raising syntax errors, and only then starts interpreting the code. This makes the issue troublesome.
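To make the parse-before-run behaviour concrete, here is a minimal sketch, assuming it is run under Python 3. The two-line source string is a hypothetical stand-in for a pywikipedia file, not real framework code. It shows that a syntax error on the second line prevents the first line from ever executing:

```python
# A two-line "module": line 1 is valid everywhere, line 2 is Python 2-only syntax.
source = (
    "executed.append('first line ran')\n"
    "print 'Python 3.x is _not_ supported'\n"
)

executed = []
try:
    # compile() parses the whole source before any statement can run.
    code = compile(source, "<demo>", "exec")
    exec(code, {"executed": executed})
except SyntaxError as err:
    print("SyntaxError on line %s; statements executed: %d"
          % (err.lineno, len(executed)))
```

Under Python 3 the `executed` list stays empty, which is why a version check placed in the same file as Python 2-only syntax never gets a chance to run.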
I came up with this patch:
Index: login.py
===================================================================
--- login.py    (revision 6857)
+++ login.py    (working copy)
@@ -50,7 +50,11 @@
 #
 __version__='$Id$'
 
-import re
+import re, sys
+
+if sys.version >= '3':
+    print 'Python 3.x is _not_ supported. Use Python 2.x'
+
 import urllib2
 import wikipedia, config
 
Index: wikipedia.py
===================================================================
--- wikipedia.py    (revision 6858)
+++ wikipedia.py    (working copy)
@@ -123,6 +123,10 @@
 __version__ = '$Id$'
 
 import os, sys
+
+if sys.version >= '3':
+    print 'Python 3.x is _not_ supported. Use Python 2.x'
+
 import httplib, socket, urllib
 import traceback
 import time, threading, Queue
Behavior after:

1) Calling login.py

$ python3 login.py
  File "login.py", line 56
    print 'Python 3.0 is _not_ supported. Use Python 2.x'
                                                        ^
SyntaxError: invalid syntax

2) Importing wikipedia:

$ python3
Python 3.0.1+ (r301:69556, Apr 15 2009, 15:59:22)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import wikipedia
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "wikipedia.py", line 128
    print 'Python 3.0 is _not_ supported. Use Python 2.x'
                                                        ^
SyntaxError: invalid syntax
As you see, Python3 will still fail on a syntax error. The hackish, bad-looking "trick" is only to make it fail on a line that shows, in a text message, what's wrong.
Do you have any better suggestions on how to do this?
I need to document the "hack" in the code of course; I just included a minimal patch for the mailing list. I thought of adding this check to wikipedia.py and login.py, since they seem to be the first entry points. Should I add it somewhere else? test.py maybe?
Thanks!
Nicolas Dumazet wrote:
Index: login.py
[...]
-import re
+import re, sys
+
+if sys.version >= '3':
+    print 'Python 3.x is _not_ supported. Use Python 2.x'
[...]
Index: wikipedia.py
[...]
+if sys.version >= '3':
+    print 'Python 3.x is _not_ supported. Use Python 2.x'
This is a hack, it is very ugly, and it "works" only by chance. Just updating the home page and the CONTENTS file (which should be renamed to README), or providing a branch for Python 3, look like better solutions to me.
Is it OK to move everything else to another module, and import it after checking the Python version?

# wikipedia.py
import sys
if sys.version_info[0] >= 3:
    print('python 3.x is not supported')
    sys.exit(1)
from wikipedia2 import *

# wikipedia2.py
# (everything in wikipedia.py originally)
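The wrapper idea can be tried in isolation. The sketch below assumes Python 3 and uses two hypothetical throwaway modules, `demo_guard` and `demo_legacy`, written to a temporary directory as stand-ins for wikipedia.py and wikipedia2.py. It only covers the case where the guard module itself is the first file Python has to parse:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-ins for wikipedia.py and wikipedia2.py (not the real files).
GUARD_SOURCE = (
    "import sys\n"
    "if sys.version_info[0] >= 3:\n"
    "    print('python 3.x is not supported')\n"
    "    sys.exit(1)\n"
    "from demo_legacy import *\n"
)
# Python 2-only syntax, standing in for the original module body.
LEGACY_SOURCE = "print u'legacy module loaded'\n"


def import_guard_module():
    """Import the guard module from a temp directory and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in (("demo_guard.py", GUARD_SOURCE),
                             ("demo_legacy.py", LEGACY_SOURCE)):
            with open(os.path.join(tmp, name), "w") as handle:
                handle.write(source)
        sys.path.insert(0, tmp)
        try:
            importlib.import_module("demo_guard")
        except SystemExit as exc:
            # sys.exit() in the guard surfaces here as SystemExit.
            return "guard ran and exited with status %s" % exc.code
        except SyntaxError as err:
            return "SyntaxError in %s" % os.path.basename(err.filename)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_guard", None)
    return "imported without error"


outcome = import_guard_module()
print(outcome)
```

Because the guard file contains only syntax that both Python 2 and Python 3 accept, the version check runs and exits before the legacy module is ever parsed.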
Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
Liangent wrote:
Is it OK to move everything else to another module, and import it after checking the Python version?

# wikipedia.py
import sys
if sys.version_info[0] >= 3:
    print('python 3.x is not supported')
    sys.exit(1)
from wikipedia2 import *

# wikipedia2.py
# (everything in wikipedia.py originally)

It doesn't work if the scripts that import the wikipedia module aren't themselves compatible with Python 3 syntax.

If Python 3.x is running, the script will exit before the import is executed, so there will be no syntax problem.
Liangent wrote:
Is it OK to move everything else to another module, and import it after checking the Python version?
If Python 3.x is running, the script will exit before the import is executed, so there will be no syntax problem.

The import statement isn't executed because of the syntax error. Isn't that the problem?
Besides, the wikipedia module is a library, so it shouldn't terminate the Python interpreter.
gettext is probably a good idea for improving the bots' compatibility with 3.0.
Francesco Cosoleto cosoleto@gmail.com wrote:
The import statement isn't executed because of the syntax error. Isn't that the problem?
Besides, the wikipedia module is a library, so it shouldn't terminate the Python interpreter.
gettext is probably a good idea for improving the bots' compatibility with 3.0.
The syntax error will not occur if Python exits before importing.
Python 3.0 includes major changes that are not backwards-compatible, so it would be hard to make the framework 3.0-compatible. For example, the u'string' syntax is no longer valid, because all strings are now Unicode by default; also, many modules no longer function in Python 3.0. Maintaining two versions of the code (one for Python 2.x and one for 3.0) would involve a lot of duplication of effort for little gain, since Python 3.0 is not widely adopted yet.
I think a warning that pywikipediabot is not Python 3.0-compatible may be better for now, unless someone wants to dedicate their free time to maintaining a Python 3.0 branch.
Jesse (Pathoschild) wrote:
The import statement isn't executed because of the syntax error. Isn't that the problem?
Besides, the wikipedia module is a library, so it shouldn't terminate the Python interpreter.
gettext is probably a good idea for improving the bots' compatibility with 3.0.
The syntax error will not occur if Python exits before importing.

The syntax error always occurs, because all our scripts (except maybe udp-log.py) are incompatible with Python 3.0 syntax. And Nicolas' patch intentionally raises a syntax error right on the line he added.
Liangent's patch would work only for Python 3.0-compatible scripts that try to import the wikipedia module. As I have already said, we can and should make more scripts Python 3.0 compatible. It isn't hard.
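The two positions in this exchange can be compared side by side. The following is a rough sketch, assuming Python 3. The script sources and the `tracking_import` helper are hypothetical and exist only for this illustration. It compiles a Python 2-only script and a Python 3-parsable script that both begin with `import wikipedia`, and records whether the import statement is ever reached:

```python
imports_attempted = []


def tracking_import(name, *args, **kwargs):
    """Stand-in for __import__ that only records which module was requested."""
    imports_attempted.append(name)
    raise ImportError("demo: import of %r intercepted" % name)


def run_script(source):
    """Compile, then execute, a script the way the interpreter would."""
    try:
        code = compile(source, "bot_script.py", "exec")
    except SyntaxError as err:
        return "SyntaxError on line %s, nothing executed" % err.lineno
    try:
        # The import statement looks up __import__ in the builtins given here.
        exec(code, {"__builtins__": {"__import__": tracking_import}})
    except ImportError:
        return "import statement reached"
    return "script finished"


py2_only_script = "import wikipedia\nprint u'Logging in'\n"
py3_parsable_script = "import wikipedia\nprint('Logging in')\n"

first_result = run_script(py2_only_script)
second_result = run_script(py3_parsable_script)
print(first_result)
print(second_result)
print(imports_attempted)
```

Only the second script gets as far as its import, so a guard inside the wikipedia module can help only when the calling script already parses under Python 3.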
2009/5/10 Francesco Cosoleto cosoleto@gmail.com:
The syntax error always occurs, because all our scripts (except maybe udp-log.py) are incompatible with Python 3.0 syntax. And Nicolas' patch intentionally raises a syntax error right on the line he added.

Yes. I don't think that anyone is planning to port pywikipedia to Python 3, so that ugly hack looked like the best option to me.

On the other hand, the rewrite branch is meant for Python > 2.5, and its separation between bytes and unicode is much cleaner. If anyone were to maintain a Python 3 branch of pywikipedia, I would strongly suggest branching from the pywikibot (rewrite) branch. Honestly, I think that Russell already did quite a lot of work on that branch to ease the migration to Python 3.

Please do not forget that our short-term goal should be to update the rewrite branch so it can be considered clean, and so that it can replace the old text-scraping pywikipedia framework that we have in trunk. With this in mind, I won't spend time branching the trunk to support Python 3. It would be a waste of time.
Nicolas Dumazet wrote:
replace the old text-scraping pywikipedia framework that we have in trunk. With this in mind, I won't spend time branching the trunk to support Python 3. It would be a waste of time.

Bot scripts shouldn't depend on the library used (the current trunk one, the rewrite branch, or another), so if somebody wants to improve Python 3.0 compatibility, they should feel free to do it.

I think the problem is only one of documentation. We should fix that (manual, manual translations, home page, README), and only if all that effort isn't enough should we spend our time fixing the code.
On Sat, May 9, 2009 6:02 am, Nicolas Dumazet wrote:
Because Python 3 fails on syntax errors, no imports are processed and no code is ever interpreted: the parser first parses the whole file, raising syntax errors, and only then starts interpreting the code. This makes the issue troublesome.
Please note that this means we will have to use this 'patch' on /all/ files. I have to say I find it slightly disturbing there is no nice way to signal things like this :(
--valhallasw
Nicolas Dumazet wrote:
-import re
+import re, sys
+
+if sys.version >= '3':
+    print 'Python 3.x is _not_ supported. Use Python 2.x'

Anyway, this is a bit less ugly:

u'Python 3.x is _not_ supported. Use Python 2.x'

No 'if' and no 'sys' module needed.
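Whether a given "guard line" actually triggers a syntax error depends on the interpreter version, and this can be checked directly. The sketch below uses a hypothetical helper, `fails_to_parse`, which is not part of pywikipedia, to compile both variants under whichever Python is running. Note that Python 3.3 and later accept the u'' prefix again, so on those versions the bare literal parses silently and only the print statement still fails:

```python
import sys


def fails_to_parse(source):
    """Return True if the running interpreter rejects `source` at parse time."""
    try:
        compile(source, "<guard>", "exec")
    except SyntaxError:
        return True
    return False


# The two guard variants discussed in this thread.
print_guard = "print 'Python 3.x is _not_ supported. Use Python 2.x'\n"
literal_guard = "u'Python 3.x is _not_ supported. Use Python 2.x'\n"

print_rejected = fails_to_parse(print_guard)
literal_rejected = fails_to_parse(literal_guard)
print("print statement rejected: %s" % print_rejected)
print("bare u'' literal rejected: %s" % literal_rejected)
```

On Python 3.0 to 3.2 both lines are rejected, which is the behaviour this thread relies on.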