I found this in the source code of scripts/reflinks.py:
Distributed under the terms of the GPL
This seems to be the only such case in the whole repository. Is it
compatible with our licensing conventions?
It doesn't even have a full GNU-style license header.
Hi all,
I want to do the following: I want to extract all templates from a
Wikipedia page with pywikibot.extract_templates_and_params(pagetext),
and that works fine. Now, for certain fields in certain templates, I
want to parse the parameter value itself. Or better put: I want to
resolve (is that the correct term?) any templates included in such
parameters, so basically I want to get the wikitext that results after
any templates within those parameters have been expanded. As an
example, in case my description is a bit too vague: I have this
template
{{Infobox number
| number = 4
| following-number = {{add_one_and_link_it|4}}
}}
The wikitext returned by the "add_one_and_link_it" template would be
"[[5]]". Now, can I do this with pywikibot too? That is, pass some
string (in my case, one extracted from a template parameter) to a
function, and have the bot send it to Wikipedia and return the final
wikitext (which I then want to parse myself)?
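To make the question concrete, here is the flow I'm after, with the actual round-trip to Wikipedia stubbed out by a fake expander (expand_templates, fake_expand and the loop are all mine, purely for illustration):

```python
import re

def expand_templates(wikitext, expand):
    """Replace each innermost {{...}} template call in ``wikitext``
    using the callable ``expand``, which stands in for the round-trip
    to the wiki (e.g. the expandtemplates API)."""
    pattern = re.compile(r"\{\{[^{}]*\}\}")
    while True:
        new = pattern.sub(lambda m: expand(m.group(0)), wikitext)
        if new == wikitext:
            return new
        wikitext = new

# Fake expander standing in for the wiki: it only knows one template.
def fake_expand(call):
    if call == "{{add_one_and_link_it|4}}":
        return "[[5]]"
    return call
```

With this, expand_templates("following-number = {{add_one_and_link_it|4}}", fake_expand) would give "following-number = [[5]]" - what I want is a pywikibot function that plays the role of fake_expand here.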
Frank
Hello,
I have a list of names that exist in both the article and the category namespace:
Foo | Category:Foo
Bar | Category:Bar
I want to link them together:
To every category I want to add {{Catmore}}, so I use:
add_text.py -file:skwiki.txt -up
-text:"{{Catmore|{{subst:PAGENAME}}}}" -except:"\{\{[Cc]atmore(.*?)"
-lang:sk
And I want to add "[[Category:{{subst:PAGENAME}}| ]]" to every
article, ideally as the first category. But I couldn't find a suitable
script for this.
I can add it as the last category without checking whether it already
exists, but that will lead to duplicate categories in the article:
add_text.py -file:skwiki1.txt -text:"[[Category:{{subst:PAGENAME}}| ]]" -lang:sk
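What I have in mind is something like this (plain Python, the function name is mine; this is not an existing script, just to show the behaviour I'm after):

```python
import re

def add_sort_category(text, pagename):
    """Insert [[Category:<pagename>| ]] before the first existing
    category link, unless the page already carries that category."""
    cat = "[[Category:%s| ]]" % pagename
    # Skip pages that already have the category (with or without a sort key).
    if re.search(r"\[\[Category:%s(\||\])" % re.escape(pagename), text):
        return text
    match = re.search(r"\[\[Category:", text)
    if match:
        # Put the new category in front of the first existing one.
        return text[:match.start()] + cat + "\n" + text[match.start():]
    # No categories yet: append it at the end.
    return text + "\n" + cat
```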
Do you have any idea how to do this?
JAnD
I've given up trying to solve a bug that popped up in my scripts a
couple of days ago. I run a bot for Wookieepedia, over at Wikia, and run
three simple
scripts on a daily basis. They are set up to run automatically through
Windows Task Scheduler. Since they run automatically, they run in the
background through pythonw.exe, i.e. without a console, and therefore I
need a means of getting the output. My solution for the past two months has
been to redirect sys.stdout and sys.stderr to the same StringIO() instance,
then at the end call getvalue() on that and email it to myself.
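Roughly, the setup looks like this (a simplified, modern-Python version; the emailing step is omitted):

```python
import sys
from io import StringIO

# Redirect both streams to one buffer while the bot runs headless.
buffer = StringIO()
old_out, old_err = sys.stdout, sys.stderr
sys.stdout = sys.stderr = buffer
try:
    print("bot run output")      # stands in for the scripts' work
finally:
    sys.stdout, sys.stderr = old_out, old_err

report = buffer.getvalue()       # this is what gets emailed
```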
This worked perfectly until a couple of days ago. Suddenly, I stopped
receiving anything sent through pywikibot.output() or its cousins, although
I continued to receive my own output that was produced by print statements.
After some experimenting in the interactive interpreter, I determined that
somehow pywikibot.ui (the interface instance) is not storing the correct
stdout and stderr, but I don't know what's causing this.
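A minimal illustration of the kind of stream caching I suspect (this is not pywikibot's actual code, just the failure mode):

```python
import sys
from io import StringIO

class UI:
    """An interface object that grabs a reference to sys.stdout
    once, at construction time."""
    def __init__(self):
        self.stdout = sys.stdout          # cached reference

    def output(self, text):
        self.stdout.write(text)           # ignores later redirection

ui = UI()                     # created before the redirect
capture = StringIO()
real_stdout = sys.stdout
sys.stdout = capture          # redirect, as my scripts do
ui.output("via ui\n")         # still goes to the original stream
print("via print")            # lands in the StringIO as expected
sys.stdout = real_stdout
```

If pywikibot.ui is built before my redirection takes effect, its output would bypass my buffer exactly like this - but something must have changed about *when* that happens.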
Nothing in my scripts changed around the time this started happening, and I
had not updated pywikibot or python itself in quite a while. I did update
pywikibot to the newest nightly version, but the bug persists. I'm asking
here since this is directly connected to pywikibot. Any idea what could be
going on?
(By the way, the answer is NOT "switch to core". I have tried to get core
to run on my system and failed miserably after two hours of repeated
attempts without even getting it to talk to the wiki. Compat worked
perfectly on the first try. Until such time as core can be installed by a
beginner, it is not for me.)
Jonathan Goble
Many scripts accept page titles spanning multiple command line
arguments, usually put into an array called titleParts and joined
together. It is redundant given the pagegenerators argument
-page:"...", and a poor equivalent, as only one page can be specified
via titleParts. Also, not quoting the title on the command line allows
the command interpreter to mangle the arguments before they are given
to the script.
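The pattern in question boils down to something like this (simplified):

```python
# Simplified version of the titleParts pattern: every positional
# argument that is not an option is taken as one word of a single
# page title and joined with spaces.
def title_from_args(args):
    title_parts = [arg for arg in args if not arg.startswith('-')]
    return ' '.join(title_parts)
```

So `script.py -lang:en Main Page` and `script.py -lang:en -page:"Main Page"` end up naming the same page, but the former can only ever specify one title per invocation.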
We have one changeset proposing to remove that functionality in core.
https://gerrit.wikimedia.org/r/#/c/137354/
And I vaguely recall that a similar change by Ricordi Samoa to another
script has already been merged.
I agree with Ricordi that the titleParts pattern isn't a very good
one, and 'should' be removed, but ... do users find it convenient? Is
it mostly a Windows thing? If it is desirable, we could build the
functionality into pagegenerators and make it possible to
enable/disable it in the config.
--
John Vandenberg
The following might be a bit unclear, as it's a bit of a brain dump. It's
mainly meant as a response to
https://gerrit.wikimedia.org/r/#/c/137904/2/tests/l10n_tests.py and
https://gerrit.wikimedia.org/r/#/c/137924/ and as 'food for thought'.
Basically, the question is how we can make i18n not depend on the
hardcoded 'scripts.i18n' import - this is problematic for tests, for
pywikibot-installed-as-a-package (because there is no scripts.i18n
then), and for third-party authors (because they *have* to use the
scripts.i18n folder to store their translations). I have some thoughts
on this, and maybe we can make something cool out of it.
Essentially, we would want a script to be able to indicate /where/ its
i18n file is located. There are a few ways to do this, but I guess the
cleanest option is something like this:
- pywikibot.i18n gets an 'I18N' class which contains the current
twtranslate functions,
- this I18N class takes a parameter: the filename of the i18n translation
file (which, at some point, could also be a JSON file)
- maybe more filenames, if more translation files need to be loaded?
- or maybe a directory that contains translation files?
- we add a simple wrapper that would allow the current scripts to do
something like
import pywikibot.i18n
i18n = pywikibot.i18n.forScript(__file__)
where 'forScript' does some path parsing to change __file__ (= the filename
of the current file) from /path/to/original/file to
/path/to/original/i18n/file
which is the setup we are currently using.
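Just the path-parsing part of forScript could look like this (a hypothetical sketch, only the path logic; the I18N class it would feed is not shown):

```python
import os

def forScript(script_file):
    """Hypothetical helper: map a script's __file__ to the matching
    translation file in its i18n subdirectory, e.g.
    /path/to/original/file.py -> /path/to/original/i18n/file.py."""
    directory, name = os.path.split(script_file)
    return os.path.join(directory, 'i18n', name)
```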
I'm not sure about backwards compatibility, but I guess we could have a
pre-prepared pywikibot.i18n.twtranslate doing what it does now, via the
I18N class (listing all files, maybe?)
Please let me know if this sounds like a good idea to implement.
Merlijn
In https://gerrit.wikimedia.org/r/#/c/139792
I noticed that Page.delete() and Page.protect() contain a lot of user
interaction logic that would normally live in a script, e.g. asking
the user which action to take. They also set flags on the site object,
i.e. site._noDeletePrompt = True and site._noProtectPrompt = True.
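To illustrate what I mean (a stripped-down sketch of the pattern, not the real implementation):

```python
class Site:
    """Stand-in for the site object the flag gets stored on."""

class Page:
    """Model-layer delete() that prompts the user and stashes the
    answer on the site object - the pattern in question."""
    def __init__(self, site, title):
        self.site = site
        self.title = title

    def delete(self, ask):
        # ask() stands in for the interactive prompt; it returns
        # 'y', 'n' or 'a' ("yes to all").
        if not getattr(self.site, '_noDeletePrompt', False):
            answer = ask('Delete %s?' % self.title)
            if answer == 'a':
                self.site._noDeletePrompt = True
            elif answer != 'y':
                return False
        # ... the actual API delete call would go here ...
        return True
```

Once one page has been answered with "yes to all", every later Page on the same site silently skips the prompt - which is UI state living in the model layer.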
Is there any reason for it being in Page?
--
John Vandenberg