<quote name="George Herbert" date="2014-03-06" time="15:53:50 -0800">
To take a couple of steps back...
This happened because testing isn't robust enough?
That should be discussed and followed up on.
Ish.
The suite of automatic tests caught this bug, actually. It's how the mobile team found out about it as they got to work this morning. So the testing is quite robust.
I'd argue, that the revert didn't happen fast enough.
"Whoa! Greg! Don't be incendiary!"
What I mean by that is:
I want us (where 'us' == any developer writing MediaWiki or extension code) to get to the point where we reject and revert any commit which breaks the test suite. Basically, when a test fails after a commit you (as the developer) should:
A) fix it right away (like, now) or B) revert your commit that broke it and work on a fix
B enables other people to continue working with a good state of the software.
Doing C, which is what happened here, makes *your work stop other people's productivity*. Period.
This should happen no matter where the test fails; if it's "your code" or not. Your code caused it (in the sense that the test didn't fail before), so you should work on fixing it.
There's a bit more nuance here:
A test can fail for any number of reasons including badly written tests or the test infrastructure failing somehow. Part of the above decision tree should include determining what actually broke. If the test was badly written, rewrite it or remove it if it no longer applies.
We aren't to this point yet; we need a bit more test coverage and we need to speed up the feedback cycle for auto browser tests, but we're headed in this direction. On purpose.
We need to get in the habit of making every commit deployable. No more breaking beta cluster for a day while we work something out. No more breaking other parts of the code base and ignoring it because 'it's not core'.
Greg