Folks, I have really put a lot of effort into this, and even asked somebody to
help IRL, but I am tired.
I want to develop Pywikibot, but instead I am struggling with the working
environment. Although git is a hundred times as complicated as SVN and gerrit
is a nightmare, my main problem is with git installing the i18n submodule.
See https://phabricator.wikimedia.org/T329452
It causes two main problems:
- I cannot run tests. I have another copy of Pywikibot from a downloaded
zip, and I can run tests there, but the same command fails in the git copy.
- I cannot push my commits. For some reason an i18n part is always
included, which makes Jenkins fail. I can now remove it after pushing, but
Jenkins still fails, see
https://gerrit.wikimedia.org/r/c/pywikibot/core/+/888745
Error message:
https://integration.wikimedia.org/ci/job/pywikibot-core-tox-doctest-docker/…
: FAILURE in 33s
I am very frustrated and disappointed, but I cannot do anything until
T329452 is solved somehow.
--
Bináris
https://www.mediawiki.org/wiki/Manual:Pywikibot/Development/Guidelines#Misc…
says:
"Prefer f-strings over string.format(). Modulo operator % for string
formatting should be avoided."
I tried to rewrite a modulo-formatted regex as an f-string, but then realized
that in f-strings all literal curly braces must be doubled, which makes regexes
very hard to read and easy to mistype.
What is the best practice when you substitute a variable into a regex?
--
Bináris
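To make the trade-off concrete, here is a minimal sketch of three ways to substitute a variable into a regex. The variable name and sample value are hypothetical, and this is not taken from the Pywikibot guidelines; it only shows that the regex literal can stay unchanged if the variable part is concatenated instead of interpolated.

```python
import re

name = "Foo.bar"  # hypothetical value to substitute into the pattern

# Option 1: raw f-string; the quantifier braces must be doubled.
pattern_f = rf"{re.escape(name)}\d{{2,4}}"

# Option 2: concatenation; the regex part stays exactly as you would write it.
pattern_concat = re.escape(name) + r"\d{2,4}"

# Option 3: modulo formatting; braces are untouched, only % needs escaping.
pattern_mod = r"%s\d{2,4}" % re.escape(name)

# All three build the same pattern string.
assert pattern_f == pattern_concat == pattern_mod

print(re.fullmatch(pattern_concat, "Foo.bar123") is not None)
```

re.escape() is used in every variant so that characters such as "." in the substituted value are matched literally.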
I tried to list delete log entries:
from pywikibot.data.api import LogEntryListGenerator
gen = LogEntryListGenerator('delete')
print(next(gen))
I get a warning and an error:
WARNING: c:\Pywikibot\pywikibot\data\api\_generators.py:857: RuntimeWarning: LogEntryListGenerator invoked without a site
super().__init__('logevents', **kwargs)
Traceback (most recent call last):
File "pwb.py", line 39, in <module>
sys.exit(main())
File "pwb.py", line 35, in main
runpy.run_path(str(path), run_name='__main__')
File "C:\Python37\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Python37\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Python37\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "pywikibot\scripts\wrapper.py", line 516, in <module>
main()
File "pywikibot\scripts\wrapper.py", line 500, in main
if not execute():
File "pywikibot\scripts\wrapper.py", line 487, in execute
run_python_file(filename, script_args, module)
File "pywikibot\scripts\wrapper.py", line 148, in run_python_file
main_mod.__dict__)
File "scripts\userscripts\reasonstat.py", line 141, in <module>
main()
File "scripts\userscripts\reasonstat.py", line 112, in main
print(next(gen))
File "C:\Python37\lib\_collections_abc.py", line 317, in __next__
return self.send(None)
File "c:\Pywikibot\pywikibot\tools\collections.py", line 275, in send
return next(self._started_gen)
File "c:\Pywikibot\pywikibot\data\api\_generators.py", line 628, in
generator
yield from self._extract_results(resultdata)
File "c:\Pywikibot\pywikibot\data\api\_generators.py", line 572, in
_extract_results
result = self.result(item)
File "c:\Pywikibot\pywikibot\data\api\_generators.py", line 864, in result
return self.entryFactory.create(pagedata)
File "c:\Pywikibot\pywikibot\logentries.py", line 339, in create
return self._creator(logdata)
File "c:\Pywikibot\pywikibot\logentries.py", line 329, in <lambda>
self._creator = lambda data: logclass(data, self._site)
File "c:\Pywikibot\pywikibot\logentries.py", line 42, in __init__
.format(expected_type, self.type()))
pywikibot.exceptions.Error: Wrong log type! Expecting delete, received review instead.
CRITICAL: Exiting due to uncaught exception <class 'pywikibot.exceptions.Error'>
What's wrong?
--
Bináris
I want to turn wikitext into HTML for display on a web front-end I'm building. For what I'm doing, all I need is a few constructs like wikilinks, bold, and italic, which I'm able to do with a smallish amount of mwparserfromhell code.
The one annoyance I've got now is that I'm using bootstrap <https://getbootstrap.com/> in the web front-end, so I don't want <b> and <i> HTML tags. What I want is <span class="fw-bold"> <https://getbootstrap.com/docs/5.0/utilities/text/#font-weight-and-italics> (and likewise class="fst-italic"). Is there some way to tell mwparserfromhell.nodes.Tag to use that alternate markup when it processes bold or italic wikicode?
[[:en:Template:Did you know/Queue/NextPrep]] contains:
> 4<noinclude> <!-- There is no Prep 8! -->
> {{documentation|content=This number indicates the next DYK prep set to move into the queue.}}
> </noinclude>
What I want to get is just the "4". Is Page.extract() what I'm looking for? Experimentally, it does what I want, but it's not clear if this is actually the intended use case.
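For comparison, a minimal sketch that works on the raw wikitext instead, using only the standard library. It assumes the wikitext has already been fetched (for example from page.text), and the helper name strip_noinclude is made up for illustration. It says nothing about whether Page.extract() is meant for this case.

```python
import re

# Matches a <noinclude>...</noinclude> block, including newlines inside it.
NOINCLUDE_RE = re.compile(r"<noinclude>.*?</noinclude>",
                          re.DOTALL | re.IGNORECASE)


def strip_noinclude(wikitext):
    """Return the wikitext with all noinclude blocks removed (hypothetical helper)."""
    return NOINCLUDE_RE.sub("", wikitext).strip()


raw = (
    "4<noinclude> <!-- There is no Prep 8! -->\n"
    "{{documentation|content=This number indicates the next DYK prep set "
    "to move into the queue.}}\n"
    "</noinclude>"
)
print(strip_noinclude(raw))  # prints "4"
```

This does not handle nested or unclosed noinclude tags, which is probably acceptable for a page as simple as this one.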
I'm trying to parse DYK prep area templates, for example Template:Did you know/Preparation area 3 <https://en.wikipedia.org/wiki/Template:Did_you_know/Preparation_area_3>. Unfortunately, these are more like flat text files than any kind of nicely structured data. The stuff of interest is everything between two HTML comments:
> <!--Hooks-->
> {{main page image/DYK|image=Melissa Ong.webp|caption=Selfie of Ong, commonly replicated by the Step Chickens<!--the caption length is intentional, it highlights that this image is there for a specific purpose and isn't just any image of Ong – please don't shorten it! Same for the ''(shown)'' –leek -->}}
> * ... that "Step Chickens" on TikTok replace their profile pictures with an image ''(shown)'' of '''[[Melissa Ong]]''', whom they call "Mother Hen"?
> * ... that '''[[interfaith greetings in Indonesia]]''' include phrases from Islam, Christianity, Hinduism, Buddhism, and Confucianism?
> * ... that '''[[Kimmo Leinonen]]''' helped establish both the [[Finnish Hockey Hall of Fame]] and the [[IIHF Hall of Fame]]?
> * ... that the [[Pulitzer Prize for Fiction|Pulitzer Prize]]-winning novel '''''[[All the Light We Cannot See]]''''' contains a sympathetic [[Nazism|Nazi]]?
> * ... that a {{Convert|10|ft|m|adj=mid|-tall|0}} '''[[Lady Rainier|statue of a woman]]''' in [[Seattle]] was commissioned by a local brewery in 1903?
> * ... that ...
> * ... that prior to entering politics, '''[[Herbert Salvatierra]]''' led a troupe of [[carnival]] ''[[comparsa]]s''?
> * ... that [[Winston Churchill]] published '''[[Are There Men on the Moon?|an essay on extraterrestrial life]]''' during the Second World War?
> <!--HooksEnd-->
I can find the comments with Wikicode.filter_comments(). But once I've found the two delimiting comments, how do I grab the text between them? Or is the parser the wrong tool? Would I do better to treat the content of the page as flat text and just iterate over it line by line, teasing it apart with regexes?
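The flat-text route mentioned above could look roughly like the sketch below. It uses only the standard library and assumes the two delimiter comments appear exactly as <!--Hooks--> and <!--HooksEnd-->. The function name extract_hooks and the shortened sample text are made up for illustration.

```python
import re
from typing import List

# Non-greedy match of everything between the two delimiter comments.
HOOKS_RE = re.compile(r"<!--Hooks-->(.*?)<!--HooksEnd-->", re.DOTALL)


def extract_hooks(wikitext: str) -> List[str]:
    """Return the bulleted hook lines found between the delimiters."""
    match = HOOKS_RE.search(wikitext)
    if match is None:
        return []
    return [
        line[1:].strip()
        for line in match.group(1).splitlines()
        if line.startswith("*")
    ]


sample = (
    "intro\n"
    "<!--Hooks-->\n"
    "{{main page image/DYK|image=Melissa Ong.webp}}\n"
    "* ... that '''[[Kimmo Leinonen]]''' helped ...?\n"
    "* ... that ...\n"
    "<!--HooksEnd-->\n"
    "outro\n"
)
for hook in extract_hooks(sample):
    print(hook)
```

Each returned hook is still wikitext, so it can be passed to mwparserfromhell.parse() individually if the links or templates inside it need to be inspected.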