pywikibot June 2017

pywikibot@lists.wikimedia.org

10 participants
9 discussions

[Pywikipedia-l] Urlencoded section titles
by Bináris 13 Sep '18

Happy Monday,

There are strange people who make links like this (kind of URL-encoded?):

[[Második világháború#Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban .28Huskey hadm.C5.B1velet.29|Huskey hadműveletben]]

So the section title must have been copied from the URL. Do we have a ready-made tool to fix these?

-- Bináris
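The link above looks like MediaWiki's older anchor encoding, where each UTF-8 byte of a non-ASCII character is written as a dot followed by two uppercase hex digits. A minimal sketch of a decoder follows, assuming that reading is right. The names `decode_section_anchor` and `fix_wikilink` are hypothetical and are not part of pywikibot. A title that legitimately contains a dot followed by two hex digits (for example ".10") would also be rewritten, so results should be reviewed before saving.

```python
import re

# A run of ".XX" groups, where XX is an uppercase hex byte (assumed encoding).
_DOT_RUN = re.compile(r"(?:\.[0-9A-F]{2})+")

# [[Title#Section|Label]] with an optional label part.
_SECTION_LINK = re.compile(r"\[\[([^\[\]|#]*)#([^\[\]|]*)(\|[^\[\]]*)?\]\]")


def decode_section_anchor(anchor):
    """Turn dot-encoded byte runs back into text, if they are valid UTF-8."""
    def _decode(match):
        raw = bytes.fromhex(match.group(0).replace(".", ""))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not a valid UTF-8 sequence: leave the original text alone.
            return match.group(0)

    return _DOT_RUN.sub(_decode, anchor)


def fix_wikilink(wikitext):
    """Decode the section part of every [[Title#Section|Label]] link."""
    def _fix(match):
        title, section, label = match.group(1), match.group(2), match.group(3) or ""
        return "[[%s#%s%s]]" % (title, decode_section_anchor(section), label)

    return _SECTION_LINK.sub(_fix, wikitext)


link = ("[[Második világháború#Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban "
        ".28Huskey hadm.C5.B1velet.29|Huskey hadműveletben]]")
fixed_link = fix_wikilink(link)
print(fixed_link)
```

Only the section part of the link is touched; the page title and the label are passed through unchanged.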

3 11

Fwd: Re: Which username for Wikidata test ?
by Yongmin H. 29 Jun '17

Forgot to "send to list".

-------- Forwarded Message --------
Subject: Re: [pywikibot] Which username for Wikidata test ?
Date: Fri, 30 Jun 2017 00:30:56 +0900
From: Yongmin H. <lists(a)revi.pe.kr>
Organization: Wikimedia
To: Jean-Baptiste Pressac <Jean-Baptiste.Pressac(a)univ-brest.fr>

Hi,

Are you sure you have an account on test.wikidata.org? I cannot find it in [[Special:ListUsers]][1]. Even if you have SUL, you need to visit the wiki once to get the account auto-created.

[1]: https://test.wikidata.org/w/index.php?title=Special%3AListUsers&username=Tr…

Thanks,

PS: The mailing list archive is available here: https://lists.wikimedia.org/pipermail/pywikibot/

On 2017-06-30 00:15, Jean-Baptiste Pressac wrote:
> Hello,
>
> I created a /user-config.py/ with /generate_user_files.py/ to use
> Wikidata test (mylang = 'test'). But as I try to login I have this error
> message:
>
> pywikibot.exceptions.NoUsername: Username 'trucmuche' does not exist on
> wikidata:test
>
> Where trucmuche is my usual account on Wikidata. Is there a special
> username for Wikidata test ?
>
> What is the URL of Wikidata test ?
>
> Thanks,
>
> PS : Is there a way to search in the forum archives ?
>
> --
> Jean-Baptiste Pressac
>
> Traitement et analyse de bases de données
> Production et diffusion de corpus numériques
>
> Centre de Recherche Bretonne et Celtique
> Unité mixte de service (UMS) 3554
> 20 rue Duquesne
> CS 93837
> 29238 Brest cedex 3
>
> tel : +33 (0)2 98 01 68 95
> fax : +33 (0)2 98 01 63 93
>
> _______________________________________________
> pywikibot mailing list
> pywikibot(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/pywikibot

--
Yongmin Hong
https://wp.revi.blog
Please note that this address is a list-only address and any non-mailing-list mail will be treated as spam.
Please use https://encrypt.to/0x947f156f16250de39788c3c35b625da5beff197a

1 0

Which username for Wikidata test ?
by Jean-Baptiste Pressac 29 Jun '17

Hello,

I created a /user-config.py/ with /generate_user_files.py/ to use Wikidata test (mylang = 'test'). But when I try to log in, I get this error message:

pywikibot.exceptions.NoUsername: Username 'trucmuche' does not exist on wikidata:test

Here trucmuche is my usual account on Wikidata. Is there a special username for Wikidata test?

What is the URL of Wikidata test?

Thanks,

PS: Is there a way to search the forum archives?

--
Jean-Baptiste Pressac

Traitement et analyse de bases de données
Production et diffusion de corpus numériques

Centre de Recherche Bretonne et Celtique
Unité mixte de service (UMS) 3554
20 rue Duquesne
CS 93837
29238 Brest cedex 3

tel : +33 (0)2 98 01 68 95
fax : +33 (0)2 98 01 63 93

1 0

Re: Decoding strings issue in PWB
by Dan 25 Jun '17

Thank you very much for your answer, Merlijn. I have no plans to switch to Python 3 in the near future :) Your second and third solutions work fine. I'm sticking with the third solution for now. Thanks again for your help.

Regards,
Dan

Saturday, 24 June 2017 18:43:46, Merlijn van Deen (valhallasw) <valhallasw(a)arctus.nl> wrote:

> Hi Dan,
>
> On 23 June 2017 at 19:41, Dan <dan15i(a)yahoo.com> wrote:
>> Hi. Do PWB has issues with decoding URL strings?
>
> Nothing in your example suggests it does:
>
>> test1 = urllib.unquote(m)
>> test2 = urllib.unquote_plus(m)
>> test3 = m.decode('utf8')
>> test4 = m.encode('utf8')
>
> These are all questions of what the Python built-in urllib module does. In the case of Python 2, the behavior is a bit odd, and I think this is what is causing your issue.
>
> In your example, m = u'%c3%85', i.e., a unicode string with the text "%C3%85". Urldecoding this should yield two bytes: the bytes C3 and 85, i.e., the UTF-8 representation of Å. However, what Python 2 does is interpret u'%c3%85' to mean 'a unicode string with characters U+00C3 U+0085', i.e., the characters Ã and [unprintable]. There is no clean way to fix the situation once we have ended up there.
>
> Now -- how to solve this?
>
> - The most obvious solution is 'Use Python 3', where the unquote function correctly processes the string.
> - Another option is to turn your URL into a bytestring first, i.e., m = m.encode('utf-8'), then call unquote, then decode the string again.
> - As you already have a dependency on pywikibot, the last option is to use pywikibot.page.url2unicode, which works correctly, even on Python 2.
>
> Best,
> Merlijn
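The second option in the reply (encode to bytes, unquote, then decode) can be sketched as follows. This is an illustration written for Python 3, not code from the thread: `unquote_utf8` is a hypothetical helper name, and the `latin-1` call is used only to reproduce the U+00C3 U+0085 result that the reply attributes to Python 2. On Python 2 the equivalent of the helper would be the `m.encode('utf-8')`, `urllib.unquote`, `.decode('utf-8')` chain described above.

```python
from urllib.parse import unquote, unquote_to_bytes


def unquote_utf8(text):
    """Percent-decode text and interpret the resulting bytes as UTF-8."""
    raw = unquote_to_bytes(text.encode("utf-8"))  # b'\xc3\x85' for '%c3%85'
    return raw.decode("utf-8")


m = "%c3%85"

# Each escaped byte becomes its own character: the mis-decoding described above.
mis_decoded = unquote(m, encoding="latin-1")

# The escaped bytes are collected first and decoded together as UTF-8.
fixed = unquote_utf8(m)

print(repr(mis_decoded), repr(fixed))
```

On Python 3, a plain `unquote(m)` already decodes as UTF-8 by default, which is why the first option in the reply needs no extra steps.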

1 0

Decoding strings issue in PWB
by Dan 24 Jun '17

Hi.

Does PWB have issues with decoding URL strings? Try this script:

    from __future__ import absolute_import, unicode_literals

    import re, urllib

    import pywikibot

    mylist = \
    [
        u"Åge Hovengen",
        u"Åge Konradsen",
        u"Åge Ramberg",
    ]

    for a in mylist:
        ssite = pywikibot.getSite("en")
        spage = pywikibot.Page(ssite, a)
        text = spage.get()
        m0 = re.search(ur"\{\{\s*Stortingetbio\s*\|\s*(?:id=)?\s*([^\s}\|]+)\s*[\|\}]", text, flags=re.IGNORECASE)
        if m0:
            m = m0.group(1)
            test1 = urllib.unquote(m)
            test2 = urllib.unquote_plus(m)
            test3 = m.decode('utf8')
            test4 = m.encode('utf8')
            pywikibot.output(test1)
            pywikibot.output(test2)
            pywikibot.output(test3)
            pywikibot.output(test4)

For me it doesn't decode %c3%85 to Å, while on http://repl.it/Izdw/2 you can see that pure Python can decode that sequence with urllib.unquote and urllib.unquote_plus. Is this a PWB bug or what?

2 1

archivebot.py improvements
by MarcoAurelio 23 Jun '17

Hello, I wonder if some of you could maybe take a look at https://phabricator.wikimedia.org/T119791 and the archivebot.py script in general? It'd be good if the script supported some other functions such as different n=x archiving and immediate archiving if a template is there in a thread. Best regards, M.

4 3

Script which adds a template to articles created by the ContentTranslation tool does not work on the grid
by Martin Urbanec 16 Jun '17

Hello,

I have a script that adds a template to articles created by the ContentTranslation tool (the template's parameters depend on the language and revision used as the source, which is why I use a separate script). It can be found at https://github.com/urbanecm/addPrekladCT/blob/master/addmissing.py. The script works perfectly on my local PC and on the bastion host, but I can't get it to work on the grid.

The script is run with

    python3 addmissing.py -always -file:pages.txt -search:'-insource:/\{\{[Pp]řeklad/'

and requires a pages.txt file and a preklads.txt file (at https://tools.wmflabs.org/urbanecmbot/test/preklads.txt). The first contains the pages to process and acts as the generator; the second is something like a database with the exact templates to insert. Both files are attached as examples.

When I run it on the Tool Labs bastion, everything works as it should. When I submit the script to the grid, it does not work (see the sample output below). Why? Can somebody help me with it?

Thank you in advance,
Martin Urbanec / Urbanecm

Output:

    urbanecm@tools-bastion-02 ~/Documents/cswiki/addPrekladCT $ cat test.sh
    python3 addmissing.py -always -file:pages.txt -search:'-insource:/\{\{[Pp]řeklad/'
    urbanecm@tools-bastion-02 ~/Documents/cswiki/addPrekladCT $ jsub bash test.sh
    Your job 6201363 ("bash") has been submitted
    urbanecm@tools-bastion-02 ~/Documents/cswiki/addPrekladCT $ qstat
    job-ID  prior   name  user      state  submit/start at      queue                            slots  ja-task-ID
    -----------------------------------------------------------------------------------------------------------------
    6201363 0.30000 bash  urbanecm  r      06/16/2017 18:14:42  task(a)tools-exec-1404.eqiad.wmf  1
    urbanecm@tools-bastion-02 ~/Documents/cswiki/addPrekladCT $ ls ~/bash.*
    /home/urbanecm/bash.err  /home/urbanecm/bash.out
    urbanecm@tools-bastion-02 ~/Documents/cswiki/addPrekladCT $ cat ~/bash.*
    Traceback (most recent call last):
      File "addmissing.py", line 223, in <module>
        main()
      File "addmissing.py", line 183, in main
        local_args = pywikibot.handle_args(args)
      File "/shared/pywikipedia/core/pywikibot/bot.py", line 954, in handle_args
        writeToCommandLogFile()
      File "/shared/pywikipedia/core/pywikibot/bot.py", line 1128, in writeToCommandLogFile
        command_log_file.write(s + os.linesep)
      File "/usr/lib/python3.4/codecs.py", line 711, in write
        return self.writer.write(data)
      File "/usr/lib/python3.4/codecs.py", line 368, in write
        data, consumed = self.encode(object, self.errors)
    UnicodeEncodeError: 'utf-8' codec can't encode character '\udcc5' in position 67: surrogates not allowed
    CRITICAL: Closing network session.
    <class 'UnicodeEncodeError'>
    urbanecm@tools-bastion-02 ~/Documents/cswiki/addPrekladCT $
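One plausible reading of the traceback, not confirmed anywhere in this thread: '\udcc5' is a lone surrogate of the kind Python 3 produces when it decodes undecodable bytes with the `surrogateescape` error handler, and 0xC5 is the first UTF-8 byte of "ř". That would fit a grid job running under a non-UTF-8 locale, so that the "ř" in the `-search:` argument arrives as escaped bytes and later cannot be written to the UTF-8 command log. The sketch below only reproduces that mechanism and shows one way to undo it; `repair_surrogates` is a hypothetical helper, and the ASCII decoding step stands in for whatever the grid environment actually does.

```python
def repair_surrogates(arg):
    """Turn surrogate-escaped bytes back into the UTF-8 text they came from."""
    # surrogateescape maps each lone surrogate back to its original byte;
    # ordinary characters are encoded as usual.
    return arg.encode("utf-8", "surrogateescape").decode("utf-8")


original = r"-search:-insource:/\{\{[Pp]řeklad/"

# Simulate an argument decoded under an ASCII-only locale (assumption).
mangled = original.encode("utf-8").decode("ascii", "surrogateescape")

try:
    mangled.encode("utf-8")
    encode_failed = False
except UnicodeEncodeError:
    # Same class of error as in the traceback: surrogates not allowed.
    encode_failed = True

repaired = repair_surrogates(mangled)
print(encode_failed, repaired == original)
```

If this diagnosis holds, setting a UTF-8 locale in the job script before calling python3 would be the less invasive thing to try first; whether that is enough on the grid is untested here.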

1 0

Getting code review flowing again
by Maarten Dammers 05 Jun '17

Hi everyone,

At the recent hackathon in Vienna we talked about the large number of changes that are still open and how to get the flow back. We currently have over 300 open changes going back to 2014 ( https://gerrit.wikimedia.org/r/#/q/status:open+project:pywikibot/core ). A change is in Gerrit because the developer wants code review in order to get it merged. Code review might not be a lot of fun, and this huge backlog makes it worse. A lot of the changes have issues that prevent merging:

* Merge conflict, needs to be rebased
* Not verified, tests fail
* Code review -1, -2

My proposal is to abandon the changes we're not going to work on anyway and focus our attention on the changes we do want to get merged. I understand that some changes in which people invested a lot of time and effort will get abandoned, but I think the benefit of getting the code review process back on track outweighs that. Abandoned changes are not gone; we can always reopen them.

I ask everyone who has (a lot of) old open changes to have a look at them and decide: pick it up or abandon it. If the change is linked to a Phabricator task, it would be nice to update the task too.

Thank you,
Maarten

4 4

EventStreams
by info＠gno.de 02 Jun '17

Hi folks,

I've added a patch [1] for the new EventStreams web service [2], which will replace RCStream soon. The new library part is ready for review (and two of my scripts use it as a long-term test). I've added a test suite, but unfortunately it fails because the required sseclient is not installed. Could anybody give a hint on how to set up the nose tests so that this library gets installed?

Thanks a lot
xqt

[1] https://gerrit.wikimedia.org/r/#/c/346164/
[2] https://wikitech.wikimedia.org/wiki/EventStreams

1 0
