Brion Vibber wrote:
Tim Starling wrote:
Is there any way to obtain regular updates of the database of test results? Since Brion removed the "TODO" tags, it appears to be very difficult to distinguish expected-fail tests from newly failing tests without it.
The big old "[Has never passed]" seems like a clue to me. :)
Here is the output from my working copy:
This is MediaWiki version 1.10alpha (r21715).
Reading tests from "maintenance\parserTests.txt"...
Running test URL-encoding in URL functions (single parameter)... FAILED!
Running test URL-encoding in URL functions (multiple parameters)... FAILED!
Running test Table security: embedded pipes (http://mail.wikipedia.org/pipermail/wikitech-l/2006-April/034637.html)... FAILED!
Running test Link containing double-single-quotes '' (bug 4598)... FAILED!
Running test message transform: <noinclude> in transcluded template (bug 4926)... FAILED!
Running test message transform: <onlyinclude> in transcluded template (bug 4926)... FAILED!
Running test BUG 1887, part 2: A <math> with a thumbnail- math enabled... FAILED!
Running test HTML bullet list, unclosed tags (bug 5497)... FAILED!
Running test HTML ordered list, unclosed tags (bug 5497)... FAILED!
Running test HTML nested bullet list, open tags (bug 5497)... FAILED!
Running test HTML nested ordered list, open tags (bug 5497)... FAILED!
Running test Fuzz testing: image with bogus manual thumbnail... FAILED!
Running test Inline HTML vs wiki block nesting... FAILED!
Running test Mixing markup for italics and bold... FAILED!
Running test dt/dd/dl test... FAILED!
Running test Images with the "|" character in the comment... FAILED!
Running test Parents of subpages, two levels up, without trailing slash or name.... FAILED!
Running test Parents of subpages, two levels up, with lots of extra trailing slashes.... FAILED!
Reading tests from "......\C:\htdocs\w2\extensions\Cite\citeParserTests.txt"...
Running test Simple <ref>, with <references/>... FAILED!
Running test <ref> with a simple template... FAILED!
Running test <ref> with a <nowiki>... FAILED!
Running test <ref> in a <nowiki>... FAILED!
Running test <ref> in a <!--comment-->... FAILED!
Running test <!--comment--> in a <ref> (bug 5384)... FAILED!
Running test <references> after <gallery> (bug 6164)... FAILED!
Running test {{REVISIONID}} on page with <ref> (bug 6299)... FAILED!
Running test Blank ref followed by ref with content... FAILED!
Running test Regression: non-blank ref "0" followed by ref with content... FAILED!
Running test Regression sanity check: non-blank ref "1" followed by ref with content... FAILED!
Passed 476 of 505 tests (94.26%)... 29 tests failed!
OK: 29 tests failed here, but only 18 failed in the email. Which ones are different? I don't see any [Has never passed] notes. I could put this test output side by side with the email and manually mark off the differences, but avoiding that tedious task was the reason I put the TODO annotations in the test names in the first place. I could make a copy of my working directory, revert any relevant changes, run parserTests.php on each copy and diff the results, but that's hardly easier. I could set up a script to run parserTests.php, say, once a day and save the results to the database, but then expected failures newly checked in by others would be difficult to distinguish from the effects of my own working-copy changes. I could run it every time I do an svn up, but half a dozen complications with that approach immediately come to mind.
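To put that revert-and-diff option in concrete terms, it amounts to something like this (a sketch only; the directory names are placeholders, and it assumes a second pristine checkout at the same revision sitting next to my working copy):

    cd pristine-checkout
    php maintenance/parserTests.php > /tmp/before.txt
    cd ../working-copy
    php maintenance/parserTests.php > /tmp/after.txt
    diff /tmp/before.txt /tmp/after.txt

Two full runs plus a manual read of the diff, every time I want to know whether I broke something. That's exactly the kind of busywork the TODO annotations made unnecessary.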
Your reason for removing the TODO notes was "The test runner can now indicate items which have never passed, and twiddling with the title constantly means they can't be tracked, so it's better not to have such markings."
They could be tracked if we added a numeric identifier to each test, as has been suggested several times. Then we could tweak the description in all sorts of ways and still be able to track them. And we wouldn't have to try to think of a unique regex to refer to a test when we want to run it.
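Something like this, say (purely illustrative: no "!! id" section exists in the current format, and I've elided the test body):

    !! id 4598-1
    !! test
    Link containing double-single-quotes '' (bug 4598)
    !! input
    ...
    !! result
    ...
    !! end

Then an invocation along the lines of "php maintenance/parserTests.php --id 4598-1" (a hypothetical switch) could replace guessing at a regex that matches one title and no others.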
The only person who knows whether a new test is an expected pass or an expected fail is the person who adds the test to parserTests.txt. If someone commits a new failing test, you have to ask them whether it was intended. It makes perfect sense to me to annotate parserTests.txt in some way (not necessarily in the test title), to indicate expected fails.
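For instance (hypothetical again: "expectfail" is not a recognised option, but the "!! options" section seems like the natural place for it):

    # "expectfail" is hypothetical: no such option exists yet
    !! test
    HTML bullet list, unclosed tags (bug 5497)
    !! options
    expectfail
    !! input
    ...
    !! result
    ...
    !! end

The runner could then print EXPECTED FAIL for these instead of FAILED, and anything reported as a plain failure would actually be news.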
Not that I understand why we have expected failing tests at all. But that's a crusade for another day.
-- Tim Starling