Hi! First, I realize that the problem I'm about to describe is probably easy to fix, but I'm not that familiar with Python yet. Second, it seems to me that this problem could affect more people than just me, since the task I'm trying to perform is not that uncommon, so a fix would help more than one person. =)
Here is the deal:
I'm trying to upload all the images from this page:
http://pt.wikipedia.org/wiki/Portal:Anarquia/Acervo_de_imagens
to another wiki website.
The bot seems to recognize the images, but instead of going to "pt.wikipedia.org/wiki/Ficheiro:name-of-image.xxx", where they actually are, it takes the given URL of the portal and appends the "ficheiro:name-of-image.xxx" part. For example, the first image of the archive should be found at http://pt.wikipedia.org/wiki/Ficheiro:Cutters1.preview.jpg, but instead the bot tries to upload from http://pt.wikipedia.org/wiki/Portal:Anarquia//wiki/ficheiro:cutters1.preview.jpg, which obviously does not work. Here is the result in my terminal: "HTTP Error 404: Not Found"
I'm wondering what I should do to fix it: which script to edit and what to change in it, since I am a Python newbie.
Thank you!
* "Ficheiro" is the Portuguese equivalent of "file", in this case.
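The malformed address described above is what you get when a root-relative link such as /wiki/Ficheiro:Cutters1.preview.jpg is glued onto the page's directory instead of being resolved against the host. A minimal Python 3 sketch of the difference, using only the standard library, is below. It is an illustration and not the actual imageharvest.py code; naive_join is a hypothetical helper that merely mimics the symptom.

```python
from urllib.parse import urljoin

PAGE_URL = "http://pt.wikipedia.org/wiki/Portal:Anarquia/Acervo_de_imagens"
LINK = "/wiki/Ficheiro:Cutters1.preview.jpg"  # root-relative link, as it would appear in the page


def naive_join(page_url, link):
    """Hypothetical helper: drop the last path segment and append the link."""
    base = page_url[: page_url.rfind("/") + 1]
    return base + link


def resolve(page_url, link):
    """Resolve the link against the page URL the way a browser would."""
    return urljoin(page_url, link)


broken = naive_join(PAGE_URL, LINK)
fixed = resolve(PAGE_URL, LINK)
print(broken)
print(fixed)
```

Because the link starts with a slash, urljoin keeps only the scheme and host of the page URL and replaces the whole path, which yields the address where the description page actually lives.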
On 09/14/2011 07:57 PM, Marcin Cieslak wrote:
This script was pretty broken. I have fixed it a bit so you can use the "-justshown" option to fetch images from your wiki page.
This script does not understand the concept of description pages (it thinks that everything ending in .png and .jpeg is an image), so if you try to use it with "-shown" or without an option, it will try to download the "/wiki/File:Image.png" description pages too.
Anyway, r9525 should at least let you download the images displayed on this page, which is probably what you want.
//Saper
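To illustrate the description-page problem: a check based only on the file extension accepts the page "/wiki/File:Image.png" just as readily as a real image file. A rough Python 3 sketch of a stricter filter follows. The function names and the list of namespace prefixes are assumptions made for illustration and are not taken from imageharvest.py; the example.org address is a placeholder.

```python
from urllib.parse import urlparse, unquote

IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg")
# Assumed namespace prefixes of description pages (English and Portuguese).
DESCRIPTION_PREFIXES = ("file:", "image:", "ficheiro:", "imagem:")


def looks_like_image(url):
    """Extension-only test, i.e. what a naive harvester would do."""
    return urlparse(url).path.lower().endswith(IMAGE_EXTENSIONS)


def is_description_page(url):
    """True if the last path segment starts with a file-namespace prefix."""
    last_segment = unquote(urlparse(url).path).rsplit("/", 1)[-1].lower()
    return last_segment.startswith(DESCRIPTION_PREFIXES)


def is_real_image(url):
    """Accept a URL only if it has an image extension and is not a wiki page."""
    return looks_like_image(url) and not is_description_page(url)


for candidate in ("/wiki/File:Image.png", "http://example.org/images/photo.jpg"):
    print(candidate, is_real_image(candidate))
```

A prefix list like this has to be extended per wiki language, so it is a stopgap rather than a general solution.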
Pywikipedia-l mailing list Pywikipedia-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
Thank you very much, Saper, the script almost did it. But it is not there yet. The bot does everything fine, except that, at what seems to be the final step, a Unicode error message is shown: "'ascii' codec can't decode byte 0xff in position 768: ordinal not in range(128)".
Well, I've looked at the Python documentation http://docs.python.org/howto/unicode.html to work out more or less what I have to do to solve it, but without any "harvest".
Sorry if I'm being too insistent.
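For context on the message itself: it appears whenever raw bytes are decoded as ASCII, and 0xff is the first byte of every JPEG file. In Python 2 such a decode can happen implicitly when a byte string is combined with a unicode string. Whether that is what happens inside the upload code is an assumption; the Python 3 sketch below only reproduces the same error text from a made-up byte string.

```python
# JPEG files begin with the marker bytes 0xFF 0xD8.
# The payload below is a made-up stand-in for real image data.
jpeg_like = b"\xff\xd8\xff\xe0" + b"\x00" * 16

try:
    jpeg_like.decode("ascii")
    message = None
except UnicodeDecodeError as err:
    message = str(err)

print(message)

# Binary data should stay as bytes; only text needs an explicit codec.
description = "imagem cedida por awalls.org"
encoded_description = description.encode("utf-8")
```

The position reported in the error is the offset of the offending byte in whatever string was being decoded, so a non-zero position suggests that some text came before the binary data.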
On 09/15/2011 02:47 PM, Marcin Cieslak wrote:
Can you paste a copy of the script output as well as the parameters used? Did the script upload some pictures to the wiki? I tried the link you gave me, but only for the first few pictures, and I successfully uploaded them to my wiki.
//Saper
On 09/15/2011 05:13 PM, estrangeiro wrote:
Of course!
One note, first: I really don't believe that either the server or my connection is the cause; I suspect the way this script handles encoding in relation to the site, since I can run other scripts on the wiki. But these are just thoughts, I'm not sure (!) of any of it.
---
usuario@linux-x5qr:~> cd ~/Documents/_Programming/Python/pywikipedia
usuario@linux-x5qr:~/Documents/_Programming/Python/pywikipedia> python login.py
unicode test: triggers problem #3081100
Password for user Robopioneiro on protopia:pt-br:
Logging in to protopia:pt-br as Robopioneiro via API.
Should be logged in now
usuario@linux-x5qr:~/Documents/_Programming/Python/pywikipedia> python imageharvest.py -justshown
unicode test: triggers problem #3081100
From what URL should I get the images? http://pt.wikipedia.org/wiki/Portal:Anarquia/Acervo_de_imagens
What text should be added at the end of the description of each image from this url? imagem retirada do Acervo de imagens do Portal Anarquia da Wikipedia lusofona
Include image http://upload.wikimedia.org/wikipedia/commons/thumb/3/30/Anarc-Foto.png/80px... ([y]es, [N]o, [s]top) n
Include image http://upload.wikimedia.org/wikipedia/commons/thumb/3/34/Wikipedia-logo_A_pt... ([y]es, [N]o, [s]top) n
# [...]
Include image http://upload.wikimedia.org/wikipedia/commons/thumb/e/e6/Anarc-Data.png/40px... ([y]es, [N]o, [s]top) n
Include image http://upload.wikimedia.org/wikipedia/commons/thumb/e/e4/Cutters1.preview.jp... ([y]es, [N]o, [s]top) y
Give the description of this image: imagem cedida por awalls.org
Specify a category (or press enter to end adding categories)
Reading file http://upload.wikimedia.org/wikipedia/commons/thumb/e/e4/Cutters1.preview.jp...
The filename on the target wiki will default to: 400px-Cutters1.preview.jpg
Enter a better name, or press enter to accept:
The suggested description is:
imagem cedida por awalls.org

imagem retirada do Acervo de imagens do Portal Anarquia da Wikipedia lusofona
Do you want to change this description? ([y]es, [N]o) n
Uploading file to protopia:pt-br via API....
'ascii' codec can't decode byte 0xff in position 768: ordinal not in range(128)
WARNING: Could not open 'http://pt.protopia.at/api.php'. Maybe the server or your connection is down. Retrying in 1 minutes...
----
- Keeps retrying until I enter Ctrl + Z
- I've already tried to edit the script replacing iso8859-1 for iso8859-15 and utf-8; no effect.
Thank you!
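Since the error shows up at the upload step, one plausible cause (an assumption, not confirmed in this thread) is that the request body mixes text fields with the raw image bytes. Below is a self-contained Python 3 sketch of assembling a multipart/form-data body purely as bytes. build_multipart and the field names are hypothetical, and this is not pywikipedia's upload code.

```python
import uuid


def build_multipart(fields, file_field, filename, file_bytes,
                    content_type="application/octet-stream"):
    """Return (body, content_type_header) for a multipart/form-data POST.

    Text fields are encoded as UTF-8; file_bytes is appended untouched.
    """
    boundary = uuid.uuid4().hex
    delimiter = ("--" + boundary).encode("ascii")
    parts = []
    for name, value in fields.items():
        parts.append(delimiter)
        parts.append(
            'Content-Disposition: form-data; name="{}"'.format(name).encode("utf-8")
        )
        parts.append(b"")
        parts.append(value.encode("utf-8"))
    parts.append(delimiter)
    parts.append(
        'Content-Disposition: form-data; name="{}"; filename="{}"'
        .format(file_field, filename).encode("utf-8")
    )
    parts.append("Content-Type: {}".format(content_type).encode("ascii"))
    parts.append(b"")
    parts.append(file_bytes)  # raw bytes, never decoded
    parts.append(delimiter + b"--")
    parts.append(b"")
    body = b"\r\n".join(parts)
    return body, "multipart/form-data; boundary=" + boundary


# Made-up payload standing in for the real image data.
body, header = build_multipart(
    {"comment": "imagem cedida por awalls.org"},
    "file", "400px-Cutters1.preview.jpg",
    b"\xff\xd8\xff\xe0 fake image data",
    content_type="image/jpeg",
)
print(len(body), header.split(";")[0])
```

The design point is that every piece is converted to bytes before joining, so no step ever tries to interpret the image data as text.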
Oh yeah, the "triggers problem #3081100" message... that has been shown since forever and is certainly the reason.
pywikipedia-l@lists.wikimedia.org