Happy Monday,
There are strange people who make links like this (kind of URL-encoded?):
[[Második világháború#Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban
.28Huskey hadm.C5.B1velet.29|Huskey hadműveletben]]
So the section title must have been copied from the URL.
Do we have a ready tool to fix these?
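For reference, these anchors look like the old MediaWiki section-anchor
encoding, where the '%' of percent-encoded UTF-8 is replaced by a dot. A
minimal decoding sketch, using only the stdlib (the helper name is an
illustration, not an existing pywikibot function):

```python
import re
import urllib.parse

def decode_section_anchor(anchor):
    """Decode an old-style MediaWiki section anchor, where the '%' of
    percent-encoded UTF-8 bytes was replaced by '.' (e.g. '.C3.A1' -> 'á').
    Hypothetical helper, not part of any pywikibot module."""
    # Turn '.XX' hex escapes back into '%XX', then percent-decode as UTF-8.
    # Caveat: a literal dot followed by two hex digits would be decoded too.
    percent = re.sub(r'\.([0-9A-F]{2})', r'%\1', anchor)
    return urllib.parse.unquote(percent)

print(decode_section_anchor(
    'Partrasz.C3.A1ll.C3.A1s Szic.C3.ADli.C3.A1ban .28Huskey hadm.C5.B1velet.29'))
# -> 'Partraszállás Szicíliában (Huskey hadművelet)'
```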
--
Bináris
Hello all
From one of my assignments as a bot operator I have some code that
does template parsing and general text parsing (e.g. Image/File tags).
It does not use regex and is thus able to correctly parse nested
templates and other such nasty things. I have written these as library
classes, with tests that cover almost all of the code.
I would now really like to contribute that code back to the community.
Would you be interested in adding this code to the pywikibot
framework? If yes, can I send the code to someone for code review or
how do you usually operate?
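For context, the reason a plain regex cannot handle nesting is that
`{{...}}` pairs must be matched like brackets. A minimal sketch of the
brace-matching idea (an illustration only, not the library code being
offered here):

```python
def find_templates(text):
    """Return all {{...}} spans in text, innermost first, by matching
    brace pairs with a stack instead of a regex, so nesting works."""
    results = []
    stack = []  # positions of unmatched '{{' openers
    i = 0
    while i < len(text) - 1:
        if text[i:i + 2] == '{{':
            stack.append(i)
            i += 2
        elif text[i:i + 2] == '}}' and stack:
            start = stack.pop()
            results.append(text[start:i + 2])
            i += 2
        else:
            i += 1
    return results

print(find_templates('{{outer|{{inner|x}}}}'))
# -> ['{{inner|x}}', '{{outer|{{inner|x}}}}']
```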
Greetings
Hannes
PS: wiki userpage is http://en.wikipedia.org/wiki/User:Hannes_R%C3%B6st
I found this in the source code of scripts/reflinks.py:
Distributed under the terms of the GPL
This seems to be the only such case in the whole repository. Is it
compatible with our license conventions?
It doesn't even have a full GNU-style license header.
Hi all,
I want to do the following: I want to extract all templates from a
Wikipedia page with pywikibot.extract_templates_and_params(pagetext),
which works fine. Now, for certain fields in certain templates, I want
to process the parameter value itself. Put better: I want to resolve
(is this the correct term?) any templates included in such parameters,
so basically I want the wikitext that results after expanding any
templates within such parameters. An example, in case my description is
a bit too vague: I have this template
{{Infobox number
| number = 4
| following-number = {{add_one_and_link_it|4}}
}}
The wikitext returned by the "add_one_and_link_it" template would be
"[[5]]". Now, can I do this with pywikibot, too? That is, pass some
string (in my case, one extracted from a template) to a function and
have the bot send it to Wikipedia to get the final wikitext (which I
then want to parse)?
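MediaWiki calls this "expanding" templates, and the API exposes it as
action=expandtemplates (recent pywikibot core wraps it, I believe as
site.expand_text()). A hedged stdlib-only sketch; the endpoint URL,
helper names, and the example template are my assumptions:

```python
import json
import urllib.parse
import urllib.request

API = 'https://en.wikipedia.org/w/api.php'  # assumed endpoint; adjust per wiki

def build_expand_params(wikitext, title=None):
    """Build the query for MediaWiki's expandtemplates API."""
    params = {'action': 'expandtemplates', 'text': wikitext,
              'prop': 'wikitext', 'format': 'json'}
    if title:
        params['title'] = title  # page context for {{PAGENAME}} etc.
    return params

def expand(wikitext, title=None):
    """Ask the wiki to expand all templates in the given wikitext."""
    url = API + '?' + urllib.parse.urlencode(build_expand_params(wikitext, title))
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)['expandtemplates']['wikitext']

# e.g. expand('{{add_one_and_link_it|4}}') should return '[[5]]',
# provided that template exists on the wiki.
```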
Frank
Hello,
I have a list of names which exist in both the article and category namespaces:
Foo | Category:Foo
Bar | Category:Bar
I want to link them together:
To every category I want to add {{Catmore}}, so I use:
add_text.py -file:skwiki.txt -up
-text:"{{Catmore|{{subst:PAGENAME}}}}" -except:"\{\{[Cc]atmore(.*?)"
-lang:sk
And I want to add "[[Category:{{subst:PAGENAME}}| ]]" to every
article, ideally as the first category, but I couldn't find any
suitable script for this.
I can add it as the last category without checking whether it already
exists, but that would lead to duplicate categories in the article:
add_text.py -file:skwiki1.txt -text:"[[Category:{{subst:PAGENAME}}| ]]" -lang:sk
Do you have any idea how to do this?
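One way to get the existence check that add_text.py lacks is to
pre-process each article's text yourself and only insert the category
when it is missing, placing it before the first existing category. A
sketch of the text manipulation only (the function name is mine;
actually saving the page would still go through pywikibot):

```python
import re

def add_sort_category(text, pagename):
    """Insert [[Category:<pagename>| ]] before the first existing
    category, unless the page is already in that category."""
    cat = '[[Category:%s| ]]' % pagename
    # Already categorized? Match the title followed by '|' or ']'.
    if re.search(r'\[\[Category:%s[\|\]]' % re.escape(pagename), text):
        return text
    m = re.search(r'\[\[Category:', text)
    if m:  # put it in front of the first existing category
        return text[:m.start()] + cat + '\n' + text[m.start():]
    return text + '\n' + cat  # no categories yet: append at the end
```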
JAnD
I've given up trying to solve a bug that popped up in my scripts a
couple of days ago. I run a bot for Wookieepedia, over at Wikia, and
run three simple
scripts on a daily basis. They are set up to run automatically through
Windows Task Scheduler. Since they run automatically, they run in the
background through pythonw.exe, i.e. without a console, and therefore I
need a means of getting the output. My solution for the past two months has
been to redirect sys.stdout and sys.stderr to the same StringIO() instance,
then at the end call getvalue() on that and email it to myself.
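For what it's worth, one mechanism that produces exactly this symptom:
an object that stores a reference to sys.stdout when it is created
keeps writing to the old stream after a later redirect, while print()
looks up sys.stdout on every call. A minimal reproduction (the UI
class is a hypothetical stand-in, not pywikibot's real interface):

```python
import sys
from io import StringIO

class UI:
    """Stand-in for an interface object that captures sys.stdout
    at creation time."""
    def __init__(self):
        self.stdout = sys.stdout  # reference frozen here

    def output(self, text):
        self.stdout.write(text)

ui = UI()          # created before the redirect (e.g. at import time)
buf = StringIO()
sys.stdout = buf   # redirect afterwards

print('via print')   # print() looks up sys.stdout now -> captured
ui.output('via ui')  # goes to the stream stored earlier -> NOT captured

sys.stdout = sys.__stdout__
print(buf.getvalue())  # only 'via print' made it into the buffer
```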
This worked perfectly until a couple of days ago. Suddenly, I stopped
receiving anything sent through pywikibot.output() or its cousins,
although I continued to receive the output produced by my own print
statements.
After some experimenting in the interactive interpreter, I determined that
somehow pywikibot.ui (the interface instance) is not storing the correct
stdout and stderr, but I don't know what's causing this.
Nothing in my scripts changed around the time this started happening, and I
had not updated pywikibot or python itself in quite a while. I did update
pywikibot to the newest nightly version, but the bug persists. I'm asking
here since this is directly connected to pywikibot. Any idea what could be
going on?
(By the way, the answer is NOT "switch to core". I have tried to get core
to run on my system and failed miserably after two hours of repeated
attempts without even getting it to talk to the wiki. Compat worked
perfectly on the first try. Until such time as core can be installed by a
beginner, it is not for me.)
Jonathan Goble
Many scripts accept page titles spanning multiple command-line
arguments, usually collected into an array called titleParts and
joined together. This is redundant with the pagegenerators argument
-page:"...", and a poor equivalent of it, as only one page can be
specified with titleParts. Also, leaving the quotes off on the command
line allows the command interpreter to mangle the arguments before
they are given to the script.
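The pattern under discussion can be sketched like this (my own minimal
illustration of the titleParts idiom, not code from any particular
script):

```python
def parse_args(argv):
    """Sketch of the titleParts pattern: every argument that is not an
    option is treated as part of a single page title and joined with
    spaces -- so only one title can ever be given this way."""
    title_parts = [arg for arg in argv if not arg.startswith('-')]
    return ' '.join(title_parts)

# 'script.py Main Page' and 'script.py -page:"Main Page"' both target one
# page, but only the quoted form survives shell word-splitting untouched.
print(parse_args(['-lang:en', 'Main', 'Page']))  # -> 'Main Page'
```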
We have one changeset proposing to remove that functionality in core.
https://gerrit.wikimedia.org/r/#/c/137354/
And I vaguely recall that a similar change by Ricordisamoa to another
script has already been merged.
I agree with Ricordisamoa that the titleParts pattern isn't a very
good one and 'should' be removed, but... do users find it convenient?
Is it mostly for Windows? If it is desirable, we could build this
functionality into pagegenerators, with the option to enable or
disable it in the config.
--
John Vandenberg