Without the screenshot this time....
> On Mar 28, 2023, at 2:01 PM, Roy Smith <roy(a)panix.com> wrote:
>
> Hmmm. What I'm doing requires Page.expand_text(), which looks like it does a Page.get() followed by a Site.expand_text(), and it's the latter which actually takes most of the time. That becomes an action=expandtemplates API call <https://www.mediawiki.org/w/api.php?action=help&modules=expandtemplates>, which I don't see any way to batch.
>
>
>
>
> <Screen Shot 2023-03-28 at 1.55.55 PM.png>
>
>> On Mar 28, 2023, at 1:04 PM, Kunal Mehta <legoktm(a)debian.org> wrote:
>>
>> Hi,
>>
>> On 3/27/23 15:57, Roy Smith wrote:
>>> I need to issue a bunch of Page.get() requests in parallel.
>>
>> Please don't. From <https://www.mediawiki.org/wiki/API:Etiquette#Request_limit>:
>>
>> "Making your requests in series rather than in parallel, by waiting for one request to finish before sending a new request, should result in a safe request rate."
>>
>> Instead of making parallel requests, you should make batched requests, which is how the preloading stuff Xqt mentioned works.
>>
>> -- Kunal / Legoktm
>> _______________________________________________
>> pywikibot mailing list -- pywikibot(a)lists.wikimedia.org
>> Public archives at https://lists.wikimedia.org/hyperkitty/list/pywikibot@lists.wikimedia.org/m…
>> To unsubscribe send an email to pywikibot-leave(a)lists.wikimedia.org
>>
>
I need to issue a bunch of Page.get() requests in parallel. My understanding is that pywikibot uses the requests library, which is incompatible with asyncio, so that's out. So what do people use? Threading <https://docs.python.org/3.9/library/threading.html>? Or, I see there's an asyncio-friendly port of requests <https://github.com/rdbhost/yieldfromRequests>. Is there a way to make pywikibot use that?
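The batching Kunal recommends above means folding many page titles into one API request rather than firing requests concurrently. In pywikibot this is what `pywikibot.pagegenerators.PreloadingGenerator` does (it fetches page text in groups; the default group size of 50 is my recollection, so treat it as an assumption). As a library-free illustration of the chunking idea behind it:

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(titles: Iterable[str], groupsize: int = 50) -> Iterator[List[str]]:
    """Yield successive chunks of titles, one chunk per API request.

    Mirrors the idea behind pywikibot's PreloadingGenerator: instead of
    one request per page (or many in parallel), each chunk becomes a
    single action=query request with the titles joined by '|'.
    """
    it = iter(titles)
    while True:
        chunk = list(islice(it, groupsize))
        if not chunk:
            return
        yield chunk
```

Note that this only helps with the `Page.get()` half of the problem: as Roy observes, `action=expandtemplates` takes one text per call and has no batch form, so the `Site.expand_text()` calls cannot be grouped this way.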
Today in wikipedia:hu, namespaces 118 and 119 were set. See T333083.
https://phabricator.wikimedia.org/T333083
Now I have problems with the new namespaces.
Message: KeyError: '118 is not a known namespace. Maybe you should clear the api cache.'
Source: Pywikibot\pywikibot\site\_namespace.py
>>> list(site.namespaces())
[-2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 90, 91, 92,
93, 100, 101, 710, 711, 828, 829, 2300, 2301, 2302, 2303]
How do I clear the API cache? Should Pywikibot automatically recognize the
new namespaces, or shall I alter the code somewhere?
--
Bináris
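The KeyError message above suggests clearing pywikibot's on-disk API cache. Pywikibot keeps cached API responses in a directory next to its configuration (commonly named `apicache-py3` under `pywikibot.config.base_dir` — the exact name and location are an assumption here; check your installation). A minimal sketch of the cleanup, written generically so the path is explicit:

```python
import os
import shutil


def clear_api_cache(cache_dir: str) -> bool:
    """Delete an on-disk API cache directory if it exists.

    cache_dir would typically be something like
    os.path.join(pywikibot.config.base_dir, 'apicache-py3') -- that
    name is an assumption, not a guaranteed pywikibot constant.
    Returns True if a directory was actually removed.
    """
    if os.path.isdir(cache_dir):
        shutil.rmtree(cache_dir)
        return True
    return False
```

After clearing the cache, the next `site.namespaces` lookup forces a fresh `meta=siteinfo` request, which should pick up newly created namespaces.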
To my surprise, wikicode.get_parent() does not get you the section a node is part of:
> import mwparserfromhell as mwp
>
> text = """==foo==
> {{Template:Foo}}
> """
> wikicode = mwp.parse(text)
> print(wikicode.get_tree())
>
> print('++++++++++')
>
> node = wikicode.nodes[-2]
> print(f"{node=}")
> print(f"{wikicode.get_parent(node)=}")
prints:
> ==
> foo
> ==
> \n
> {{
> Template:Foo
> }}
> \n
> ++++++++++
> node='{{Template:Foo}}'
> wikicode.get_parent(node)=None
Am I just doing this wrong?
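For context on the behaviour above: mwparserfromhell parses a page into a flat list of top-level nodes, and a heading node does not contain the nodes that follow it, so `get_parent()` on a top-level node returns None by design (it only reports actual nesting, e.g. a template inside another template). The usual way to ask "which section is this node in" is `Wikicode.get_sections()`. As a library-free illustration of the same idea, here is a stdlib-only sketch (`section_of` and its heading regex are my own simplified construction, not mwparserfromhell API):

```python
import re
from typing import Optional

# Matches a wikitext heading line such as '== foo =='.
HEADING = re.compile(r'^(={1,6})\s*(.*?)\s*\1\s*$')


def section_of(text: str, needle: str) -> Optional[str]:
    """Return the heading title of the section containing needle.

    Lines before the first heading belong to the lead section ''.
    Returns None if needle never appears on a non-heading line.
    """
    current = ''
    for line in text.splitlines():
        m = HEADING.match(line)
        if m:
            current = m.group(2)
        elif needle in line:
            return current
    return None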
I've got some code which is essentially:
> wikicode = mwp.parse(self.page.get())
> for node in wikicode.filter_templates(recursive=False, matches=title):
> wikicode.remove(node)
> self.page.text = str(wikicode)
> self.page.save()
which works, but it leaves an extra blank line behind where the template used to be. This is intended to be run on [[:en:Template talk:Did you know/Approved]], i.e. one template per line.
What's the best way to get rid of the blank lines? I'm trying to avoid just running a regex replacement on the raw text because that's fragile, but maybe there's really no good alternative here?
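The blank line appears because `wikicode.remove(node)` removes only the template node; the `\n` that followed it survives as a separate Text node. On a page like the DYK Approved page, where every template sits on its own line and blank lines carry no meaning, one option is to drop now-empty lines from the serialized text before saving. A sketch under that assumption:

```python
def drop_blank_lines(text: str) -> str:
    """Remove lines that are empty or whitespace-only.

    Suitable only for pages where blank lines carry no meaning (e.g. a
    list page with one template per line); NOT safe in general, since
    blank lines separate paragraphs in wikitext.
    """
    kept = [line for line in text.splitlines() if line.strip()]
    return '\n'.join(kept) + '\n' if kept else ''
```

Applied after the removal loop, i.e. `self.page.text = drop_blank_lines(str(wikicode))`. An alternative that stays inside the parse tree: when removing each template, also check whether the following node is a Text node starting with a newline and trim it, so no blank line is produced in the first place.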
I'm gearing up to do some work (hopefully diving into fixing https://phabricator.wikimedia.org/T326650). I've gotten as far as cloning the repo and running the existing unit tests. I get 4 failures:
FAILED tests/make_dist_tests.py::TestMakeDist::test_handle_args - AssertionError: '/Users/roy/pywikibot/pywikibot-git/tests/make_dist_tests.py' != '/Users/roy/pywikibot/venv/bin/pytest'
FAILED tests/make_dist_tests.py::TestMakeDist::test_handle_args_empty - AssertionError: '/Users/roy/pywikibot/pywikibot-git/tests/make_dist_tests.py' != '/Users/roy/pywikibot/venv/bin/pytest'
FAILED tests/make_dist_tests.py::TestMakeDist::test_handle_args_nodist - AssertionError: '/Users/roy/pywikibot/pywikibot-git/tests/make_dist_tests.py' != '/Users/roy/pywikibot/venv/bin/pytest'
FAILED tests/site_detect_tests.py::MediaWikiSiteTestCase::test_proofreadwiki - RuntimeError: Unsupported url: https://www.proofwiki.org/wiki/
Are these known issues? Or something wrong with my environment?
I'm working on macOS Monterey with Python 3.9.
Hi,
I just noticed that I can no longer fetch information from Wikidata. I’m not editing, I’m just reading:
wikidata_item = pywikibot.ItemPage(wikidata_repo, arg)
and I get:
Sleeping for 25.8 seconds, 2023-03-07 11:15:43
Is there some problem on Wikidata infrastructure?
Salut
Dennis