Sure. The main problem I see with the proposed setup [1] is that you
have no way of ensuring the MediaWiki you are hitting is in a consistent
state; having tests fail because of edit conflicts, modified pages,
users that already exist or have been blocked, etc., as a result of
other tests is very annoying. (Tests can't be relied on to clean up
after themselves, and one failure should not cause the rest of the
suite to fail until it is manually fixed.)
The diagram was created with manual testing against the usability
prototypes in mind. I'll update it to include automated testing with a
test runner and Code Review once we work out the plan.
To some extent this can be worked around by using carefully selected
random parameters for many things, but that is a horrible hack, and it
requires extra work when writing test scripts; though since I
assume/hope they will be written in PHP, that's not a huge difficulty,
provided we teach everyone how.
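For illustration, the hack amounts to something like this (a sketch
only; the prefixes and variable names are invented):

    <?php
    // Give each run collision-resistant fixtures so that leftovers from
    // earlier or concurrent runs don't trip the test up. uniqid() is
    // only time-based, so this reduces, rather than removes, the risk.
    $suffix   = uniqid();
    $pageName = 'SeleniumTest_Page_' . $suffix;
    $userName = 'SeleniumTest_User_' . $suffix;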
Much cleaner is to have a setup like the current parser tests, where
each test can specify which articles it expects to exist and with what
content (Selenium tests may also wish to specify which users exist,
with what privileges/preferences), in addition to being able to tweak
configuration settings (otherwise we're going to need a fair few
MediaWikis even to test the configurations that are live at Wikimedia).
This is quite readily doable if you run a MediaWiki instance on the
same machine as the test runner, and I imagine it would also be
possible by building a communication protocol between the two, though
that seems like a waste of effort.
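For reference, the parser test file format already handles the article
half of this; a test file can declare the articles it needs alongside
the tests themselves, roughly like so (user and configuration
declarations would be a new addition in the same spirit):

    !! article
    Test fixture page
    !! text
    Content the test expects to find.
    !! endarticle

    !! test
    Simple paragraph
    !! input
    This is a simple paragraph.
    !! result
    <p>This is a simple paragraph.
    </p>
    !! end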
This is what I was hoping for. I think the test runner should
reconfigure the wiki for each test. If we want to be able to run
multiple tests in parallel, we should have multiple wikis that can be
reconfigured and tested against independently.
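As a sketch of how cheap that could be (the MW_TEST_INSTANCE
environment variable is invented; $wgDBprefix is MediaWiki's standard
table-prefix setting), each parallel wiki could simply get its own set
of tables:

    <?php
    // Hypothetical fragment of the test wikis' shared LocalSettings.php:
    // each parallel test runner exports its own instance number, and so
    // gets an isolated set of tables to reconfigure and test against.
    $instance = getenv( 'MW_TEST_INSTANCE' );
    if ( $instance !== false ) {
        $wgDBprefix = 'selenium' . intval( $instance ) . '_';
    }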
Do the parser tests only test core parser functionality, or do they
also test extensions like ParserFunctions and SyntaxHighlight GeSHi?
It is likely we'll have tests that will need to include extensions
dynamically, and to configure them dynamically as well.
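Dynamic inclusion could be as simple as a conditional include in the
test wiki's LocalSettings.php; a sketch, assuming the usual
ExtensionName/ExtensionName.php layout and a made-up list supplied by
the test runner:

    <?php
    // Hypothetical: the test runner tells the wiki which extensions the
    // current test script needs, and only those get loaded.
    $seleniumExtensions = array( 'ParserFunctions', 'SyntaxHighlight_GeSHi' );
    foreach ( $seleniumExtensions as $ext ) {
        require_once( "$IP/extensions/$ext/$ext.php" );
    }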
This won't be a problem for local developers: the test runner, the
browser, and MediaWiki are all on localhost; the tests can be written
in PHP (or exported from Selenium IDE into PHP) and run with a wrapper
script (maintenance/seleniumTests.php, or whatever) that handles the
configuration; output is handled by PHPUnit; all happy.
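For example, a minimal test written against PHPUnit's Selenium RC
extension might look like this (the URL, class name, and assertion are
placeholders):

    <?php
    require_once 'PHPUnit/Extensions/SeleniumTestCase.php';

    class MainPageTest extends PHPUnit_Extensions_SeleniumTestCase {
        protected function setUp() {
            $this->setBrowser( '*firefox' );
            // The wrapper script would point this at whichever test
            // wiki it has just configured; localhost is a placeholder.
            $this->setBrowserUrl( 'http://localhost/' );
        }

        public function testMainPageLoads() {
            $this->open( '/index.php/Main_Page' );
            // getTitle() returns "<page> - <sitename>", so just check
            // that the page title appears in it.
            $this->assertContains( 'Main Page', $this->getTitle() );
        }
    }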
For a Selenium Grid setup, it's not so obvious how to do this. I'd
suggest that, instead of having developers run scripts against the grid
themselves, they simply request a run on a server designed for this
task, which runs the test through the grid using a hostname that
resolves back to the runner. This allows easy local control over the
MediaWiki instance, and makes it reasonably easy to write an interface
for normal developers who won't/can't run Selenium to run tests against
MediaWiki.
For the grid setup, we were exploring the possibility of a test runner
that automatically tests commits and reports the results to Code
Review, like the parser tests do now. For the most part, people
shouldn't be hitting the grid directly, only bots, unless we have a QA
team doing something special.
I don't think reconfiguring MediaWiki per test script, or per set of
test scripts, is an outrageous overhead; Selenium is a very
"enterprise" tool, and booting virtual machines with browsers in them
is likely much more costly than that. The advantages it gives are
obvious: tests should not fail because of faults in the testing
environment; that just wastes time.
Yeah, depending on the browser, OS, and hardware specs of the machine,
browsers can take 10-70 seconds to run even simple scripts. The
overhead of reconfiguring the wiki is nothing in comparison.
Cleaning the state of the browsers is probably not so critical here,
but it's another gotcha: if one test leaves the user logged in and the
next test tries to click the "Log in" link, it explodes, and vice
versa. Selenium can help somewhat here (if you persuade it to, and to
varying extents in various browser versions), but it's likely easier
to cleanse the database.
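(For what it's worth, a cheap belt-and-braces defence in a PHP test is
to start each script from a known logged-out state; a sketch, reusing
the PHPUnit wrapper idea from above:)

    <?php
    require_once 'PHPUnit/Extensions/SeleniumTestCase.php';

    class LoginLinkTest extends PHPUnit_Extensions_SeleniumTestCase {
        protected function setUp() {
            $this->setBrowser( '*firefox' );
            $this->setBrowserUrl( 'http://localhost/' );
        }

        public function testLoginLinkPresent() {
            // Start from a known state: visiting Special:UserLogout
            // clears any session a previous script left behind.
            $this->open( '/index.php/Special:UserLogout' );
            $this->waitForPageToLoad( '30000' );

            $this->open( '/index.php/Main_Page' );
            $this->assertTrue( $this->isTextPresent( 'Log in' ) );
        }
    }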
When Selenium launches a browser, it does so using a clean profile; it
launches a fresh browser from a new profile for every test it runs.
This shouldn't be an issue.
Respectfully,
Ryan Lane