Happy Monday,
There are strange people who make links like this (kind of URL-encoded?):
[[Második világháború#Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban
.28Huskey hadm.C5.B1velet.29|Huskey hadműveletben]]
So the section title must have been copied from the URL.
Do we have a ready tool to fix these?
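For reference, these anchors look like the old MediaWiki section-anchor
encoding, where the '%' of percent-encoded UTF-8 is replaced by a dot. A
minimal decoding sketch, using only the stdlib (the helper name is an
illustration, not an existing pywikibot function):

```python
import re
import urllib.parse

def decode_section_anchor(anchor):
    """Decode an old-style MediaWiki section anchor, where the '%' of
    percent-encoded UTF-8 bytes was replaced by '.' (e.g. '.C3.A1' -> 'á').
    Hypothetical helper, not part of any pywikibot module."""
    # Turn '.XX' hex escapes back into '%XX', then percent-decode as UTF-8.
    # Caveat: a literal dot followed by two hex digits would be decoded too.
    percent = re.sub(r'\.([0-9A-F]{2})', r'%\1', anchor)
    return urllib.parse.unquote(percent)

print(decode_section_anchor(
    'Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban .28Huskey hadm.C5.B1velet.29'))
# -> 'Partraszállás Szicíliában (Huskey hadművelet)'
```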
--
Bináris
Hello all
From one of my assignments as a bot operator I have some code that
does template parsing and general text parsing (e.g. Image/File tags).
It does not use regex and is thus able to correctly parse nested
templates and other such nasty things. I have written these as library
classes, with tests that cover almost all of the code.
I would now really like to contribute that code back to the community.
Would you be interested in adding this code to the pywikibot
framework? If yes, can I send the code to someone for code review or
how do you usually operate?
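For context, the reason a plain regex cannot handle nesting is that
`{{...}}` pairs must be matched like brackets. A minimal sketch of the
brace-matching idea (an illustration only, not the library code being
offered here):

```python
def find_templates(text):
    """Return all {{...}} spans in text, innermost first, by matching
    brace pairs with a stack instead of a regex, so nesting works."""
    results = []
    stack = []  # positions of unmatched '{{' openers
    i = 0
    while i < len(text) - 1:
        if text[i:i + 2] == '{{':
            stack.append(i)
            i += 2
        elif text[i:i + 2] == '}}' and stack:
            start = stack.pop()
            results.append(text[start:i + 2])
            i += 2
        else:
            i += 1
    return results

print(find_templates('{{outer|{{inner|x}}}}'))
# -> ['{{inner|x}}', '{{outer|{{inner|x}}}}']
```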
Greetings
Hannes
PS: wiki userpage is http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
I found this in the source code of scripts/reflinks.py:
Distributed under the terms of the GPL
This seems to be the only such case in the whole repository. Is it
compatible with our license conventions?
It doesn't even have a full GNU-style license header.
Hi all,
I want to do the following: I want to extract all templates from a
Wikipedia page with pywikibot.extract_templates_and_params(pagetext),
which works fine. Now, for certain fields in certain templates, I want
to process the parameter value itself. Put better: I want to resolve
(is this the correct term?) any templates included in such parameters,
so basically I want the wikitext that results after expanding any
templates within such parameters. An example, in case my description is
a bit too vague: I have this template
{{Infobox number
| number = 4
| following-number = {{add_one_and_link_it|4}}
}}
The wikitext returned by the "add_one_and_link_it" template would be
"[[5]]". Now, can I do this with pywikibot, too? That is, pass some
string (in my case, one extracted from a template) to a function and
have the bot send it to Wikipedia to get the final wikitext (which I
then want to parse)?
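MediaWiki calls this "expanding" templates, and the API exposes it as
action=expandtemplates (recent pywikibot core wraps it, I believe as
site.expand_text()). A hedged stdlib-only sketch; the endpoint URL,
helper names, and the example template are my assumptions:

```python
import json
import urllib.parse
import urllib.request

API = 'https://en.wikipedia.org/w/api.php'  # assumed endpoint; adjust per wiki

def build_expand_params(wikitext, title=None):
    """Build the query for MediaWiki's expandtemplates API."""
    params = {'action': 'expandtemplates', 'text': wikitext,
              'prop': 'wikitext', 'format': 'json'}
    if title:
        params['title'] = title  # page context for {{PAGENAME}} etc.
    return params

def expand(wikitext, title=None):
    """Ask the wiki to expand all templates in the given wikitext."""
    url = API + '?' + urllib.parse.urlencode(build_expand_params(wikitext, title))
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)['expandtemplates']['wikitext']

# e.g. expand('{{add_one_and_link_it|4}}') should return '[[5]]',
# provided that template exists on the wiki.
```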
Frank
Hello,
I have a list of names which exist in both the article and category namespaces:
Foo | Category:Foo
Bar | Category:Bar
I want to link them together:
To every category I want to add {{Catmore}}, so I use:
add_text.py -file:skwiki.txt -up
-text:"{{Catmore|{{subst:PAGENAME}}}}" -except:"\{\{[Cc]atmore(.*?)"
-lang:sk
And I want to add "[[Category:{{subst:PAGENAME}}| ]]" to every
article, ideally as the first category, but I couldn't find any
suitable script for this.
I can add it as the last category without checking whether it already
exists, but that would lead to duplicate categories in the article:
add_text.py -file:skwiki1.txt -text:"[[Category:{{subst:PAGENAME}}| ]]" -lang:sk
Do you have any idea how to do this?
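One way to get the existence check that add_text.py lacks is to
pre-process each article's text yourself and only insert the category
when it is missing, placing it before the first existing category. A
sketch of the text manipulation only (the function name is mine;
actually saving the page would still go through pywikibot):

```python
import re

def add_sort_category(text, pagename):
    """Insert [[Category:<pagename>| ]] before the first existing
    category, unless the page is already in that category."""
    cat = '[[Category:%s| ]]' % pagename
    # Already categorized? Match the title followed by '|' or ']'.
    if re.search(r'\[\[Category:%s[\|\]]' % re.escape(pagename), text):
        return text
    m = re.search(r'\[\[Category:', text)
    if m:  # put it in front of the first existing category
        return text[:m.start()] + cat + '\n' + text[m.start():]
    return text + '\n' + cat  # no categories yet: append at the end
```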
JAnD
I've given up trying to solve a bug that popped up in my scripts a
couple of days ago. I run a bot for Wookieepedia, over at Wikia, and
run three simple
scripts on a daily basis. They are set up to run automatically through
Windows Task Scheduler. Since they run automatically, they run in the
background through pythonw.exe, i.e. without a console, and therefore I
need a means of getting the output. My solution for the past two months has
been to redirect sys.stdout and sys.stderr to the same StringIO() instance,
then at the end call getvalue() on that and email it to myself.
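For what it's worth, one mechanism that produces exactly this symptom:
an object that stores a reference to sys.stdout when it is created
keeps writing to the old stream after a later redirect, while print()
looks up sys.stdout on every call. A minimal reproduction (the UI
class is a hypothetical stand-in, not pywikibot's real interface):

```python
import sys
from io import StringIO

class UI:
    """Stand-in for an interface object that captures sys.stdout
    at creation time."""
    def __init__(self):
        self.stdout = sys.stdout  # reference frozen here

    def output(self, text):
        self.stdout.write(text)

ui = UI()          # created before the redirect (e.g. at import time)
buf = StringIO()
sys.stdout = buf   # redirect afterwards

print('via print')   # print() looks up sys.stdout now -> captured
ui.output('via ui')  # goes to the stream stored earlier -> NOT captured

sys.stdout = sys.__stdout__
print(buf.getvalue())  # only 'via print' made it into the buffer
```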
This worked perfectly until a couple of days ago. Suddenly, I stopped
receiving anything sent through pywikibot.output() or its cousins,
although I continued to receive the output produced by my own print
statements.
After some experimenting in the interactive interpreter, I determined that
somehow pywikibot.ui (the interface instance) is not storing the correct
stdout and stderr, but I don't know what's causing this.
Nothing in my scripts changed around the time this started happening, and I
had not updated pywikibot or python itself in quite a while. I did update
pywikibot to the newest nightly version, but the bug persists. I'm asking
here since this is directly connected to pywikibot. Any idea what could be
going on?
(By the way, the answer is NOT "switch to core". I have tried to get core
to run on my system and failed miserably after two hours of repeated
attempts without even getting it to talk to the wiki. Compat worked
perfectly on the first try. Until such time as core can be installed by a
beginner, it is not for me.)
Jonathan Goble
Many scripts accept page titles spanning multiple command-line
arguments, usually collected into an array called titleParts and
joined together. This is redundant with the pagegenerators argument
-page:"...", and a poor equivalent of it, as only one page can be
specified with titleParts. Also, leaving the quotes off on the command
line allows the command interpreter to mangle the arguments before
they are given to the script.
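The pattern under discussion can be sketched like this (my own minimal
illustration of the titleParts idiom, not code from any particular
script):

```python
def parse_args(argv):
    """Sketch of the titleParts pattern: every argument that is not an
    option is treated as part of a single page title and joined with
    spaces -- so only one title can ever be given this way."""
    title_parts = [arg for arg in argv if not arg.startswith('-')]
    return ' '.join(title_parts)

# 'script.py Main Page' and 'script.py -page:"Main Page"' both target one
# page, but only the quoted form survives shell word-splitting untouched.
print(parse_args(['-lang:en', 'Main', 'Page']))  # -> 'Main Page'
```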
We have one changeset proposing to remove that functionality in core.
https://gerrit.wikimedia.org/r/#/c/137354/
And I vaguely recall that a similar change by Ricordisamoa to another
script has already been merged.
I agree with Ricordisamoa that the titleParts pattern isn't a very
good one and 'should' be removed, but... do users find it convenient?
Is it mostly for Windows? If it is desirable, we could build this
functionality into pagegenerators, with the option to enable or
disable it in the config.
--
John Vandenberg