Hi folks,
Some thoughts on the following from someone who worked on Pywikibot
tests a /very/ long time (>10 years...) ago:
On 26/06/2023 00:47, Roy Smith wrote:
I'm concerned that we're writing tests which depend on knowing the
current contents of a live production wiki.
On 15/05/2023 21:11, YiFei Zhu wrote:
This way, if a unit test fails, you know it's Pywikibot's code that is
to blame and not something like a MediaWiki change or the environment
setup.
First of all: tests that randomly fail, and tests that require some
not-so-well-documented set of steps to run at all, are /a pain/. So
there's certainly room for improvement there.
One thing to keep in mind is that Pywikibot is a bit weird, as there is
only a limited set of things it does /on its own/. The value of Pywikibot
lies in the interaction with MediaWiki, and by and large /with the
specific version of MediaWiki deployed on Wikimedia wikis (including all
the weird extensions)/. And because of that, there is massive value in
tests that test /the integration with Wikimedia wikis/: it's great if
mocked API calls work, but if it breaks against the real API, Pywikibot
is not doing its job for the user. Unfortunately that does mean the
/contents/ of the wiki are now a dependency for the test as well.
The alternative is mocking bits of Pywikibot: either the API requests or
the Site object itself. The former is something I've played with in the
past by recording (caching) the requests sent to the actual site.
Although that works, there are two main issues: you regularly need to
update the 'logged' requests (as some may be missing, etc.), and you run
the risk of leaking private data. The second method sounds more
appealing, but in practice, you'll be mocking /every/ method. In effect,
at that point you're re-implementing the MediaWiki API in Python. Might
as well talk to the real one then?
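To make the recording approach a bit more concrete, here is a minimal
sketch. It is not what Pywikibot actually does: the ReplayCache class,
the fetch callable and the list of sensitive parameter names are all
made up for illustration. It shows both pain points: a missing recording
fails loudly in replay mode, and private values have to be scrubbed
before anything is written to disk.

```python
import hashlib
import json
from pathlib import Path

# Assumed list of parameters that must never be stored; illustration only.
SENSITIVE_KEYS = {"token", "lgtoken", "lgpassword"}


def scrub(params):
    """Return a copy of params with sensitive values masked."""
    return {
        key: ("<redacted>" if key in SENSITIVE_KEYS else value)
        for key, value in params.items()
    }


def cache_key(params):
    """Stable hash of the scrubbed request parameters."""
    blob = json.dumps(scrub(params), sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


class ReplayCache:
    """Hypothetical record/replay layer for API requests.

    With a fetch callable, unknown requests are sent on and recorded.
    Without one, only previously recorded requests can be answered.
    """

    def __init__(self, directory, fetch=None):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def request(self, params):
        path = self.directory / (cache_key(params) + ".json")
        if path.exists():
            # Replay: answer from disk without touching the network.
            return json.loads(path.read_text(encoding="utf-8"))["response"]
        if self.fetch is None:
            # This is the 'missing logged request' case.
            raise LookupError("no recording for %r" % scrub(params))
        response = self.fetch(params)
        # Only the scrubbed request is stored next to the response.
        record = {"request": scrub(params), "response": response}
        path.write_text(json.dumps(record), encoding="utf-8")
        return response
```

Note that this only scrubs the /request/. Responses can contain private
data as well (user info, for example), so a real setup would need to
filter those too.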
With Docker containers being a fairly standard thing (although still a
pain under Windows), spinning up a test container with a well-known MW
version and well-known content might be the best of both worlds. You get
a stable development environment while still being able to run the tests
against both a 'latest' MW container and a live site (but that can then
just be done on the CI server rather than trying to run all local tests
on a live site?).
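One way to let the same test suite run against either target is to pick
the API endpoint from the environment. The sketch below is only an
illustration of that idea: the PWB_TEST_API variable, the default
container URL and the live_only decorator are hypothetical names, not
existing Pywikibot settings.

```python
import os
import unittest
from urllib.parse import urlparse

# Hypothetical names: neither the variable nor the default URL comes from
# Pywikibot; they only illustrate the idea.
TARGET_VARIABLE = "PWB_TEST_API"
LOCAL_CONTAINER_API = "http://localhost:8080/api.php"
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def api_url(environ=None):
    """Return the API endpoint to test against (local container by default)."""
    environ = os.environ if environ is None else environ
    return environ.get(TARGET_VARIABLE, LOCAL_CONTAINER_API)


def is_live(url):
    """Treat anything that is not a local host as a live site."""
    return urlparse(url).hostname not in LOCAL_HOSTS


def live_only(test):
    """Skip the decorated test unless the target is a live site."""
    return unittest.skipUnless(is_live(api_url()), "needs a live site")(test)


class ExampleTests(unittest.TestCase):
    def test_runs_everywhere(self):
        # Runs locally against the container and on CI against a live site.
        self.assertTrue(api_url())

    @live_only
    def test_needs_real_extensions(self):
        # Only meaningful against a deployed wiki, so it is skipped locally.
        self.assertTrue(is_live(api_url()))
```

Developers would then run everything against the container by default,
and the CI server would set the variable to point at a live site.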
In the end, all approaches have their own share of pain. Live testing
requires a fiddly live-site setup, mocking is brittle, and Docker...
has its own steep learning curve. The only true advice I can give here
is 'choose your battles' and 'document everything you can, and then
some' :-)
Cheers,
Merlijn