I have never been a QA engineer. However, it doesn't take great experience to see that the MW software development process is broken. I offer the following comments in a constructive, not destructive, spirit. The success of the MW software is obvious. However, in my view, unless the development process introduces some QA procedures, the code will eventually collapse and its reputation will degrade.
My interest in MW (the software, not the organization) is driven by a desire to provide an enhancement in the form of an extension. So, I began by building a small development environment on my machine (a work in progress). Having developed software for other organizations, I intuitively sought out what I needed in terms of testing in order to provide a good quality extension. This meant I needed to develop unit tests for my extension and also to perform regression testing on the main code base after installing it. Hence some of my previous questions to this email list.
It soon became apparent that the MW development process has little or no required testing. Sure, there are the parser tests, but I couldn't find any requirement that developers run them before submitting patches.
Out of curiosity, I decided to download 1.16alpha (r52088), reuse the LocalSettings file from my local installation (1.14), and run some parser tests. This is not a scientific experiment, since the only justification for using these particular extensions in the tests is that I had them installed in my personal wiki. However, there is at least one thing to learn from the results:
MediaWiki r52088 Parser Tests
Extensions: 1) Nuke, 2) Renameuser, 3) Cite, 4) ParserFunctions, 5) CSS Style Sheets, 6) ExpandTemplates, 7) Gadgets, 8) Dynamic Page List, 9) Labeled Section Transclusion. The last extension is loaded via three require_once files: a) lst.php, b) lsth.php, and c) compat.php.
Test  Extensions           Parser Test Fails
 1    1,2,3,4,5,6,7,8,9    19
 2    1                    14
 3    2                    14
 4    3                    14
 5    4                    14
 6    5                    14
 7    6                    14
 8    7                    14
 9    8                    14
10    9 (abc)              19
11    9 (a)                18
12    9 (ab)               19
13    1,2,3,4,6,7          14
Note that the extension that introduces all of the unexpected parser test failures is Labeled Section Transclusion. According to its documentation, it is installed on *.wikisource.org, test.wikipedia.org, and en.wiktionary.org.
I am new to this development community, but my guess is that, since there are no testing requirements for extensions, its author did not run the parser tests before publishing it. (I don't mean to malign him, and I am open to the correction that it ran without unexpected errors on the MW version he tested against.)
This rather long preamble leads me to the point of this email. The MW software development process needs at least some rudimentary QA procedures. Here are some thoughts on this. I offer these to initiate debate on this issue, not as hard positions.
* Before a developer commits a patch to the code base, he must run parser tests against the change. The patch should not be committed if it increases the number of parser test failures. He should document the results in the bugzilla bug report.
* If a developer commits a patch without running the parser tests, or commits one that increases the number of parser test failures, he should be warned. If he does so again within some interval (say, 6 months), his commit privileges are revoked for some period (say, 6 months). The second time he becomes a candidate for revocation, his privileges are revoked permanently.
* An extension developer also should run parser tests against a MW version with the extension installed. The results of this should be provided in the extension documentation. An extension should not be added to the extension matrix unless it provides this information.
* A test harness that performs regression tests (currently only parser tests) against every trunk version committed in the last 24 hours should be run nightly. The installed extensions should be those used on the WMF machines. The results should be published on a page on the MediaWiki site. If any version increases the number of parser test failures, the procedure described above for developers is initiated.
* A group of developers should have the responsibility of reviewing the nightly test results to implement this QA process.
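The first rule above — refusing a patch that increases the parser test failure count — could be checked mechanically. Here is a sketch in Python, with the caveat that the summary-line format is an assumption (the output of parserTests.php varies by version) and the helper names are invented for illustration:

```python
import re

# Hypothetical pre-commit gate. The summary-line format below is an
# assumption; adjust the regex to whatever parserTests.php actually prints.
def failure_count(summary: str) -> int:
    """Extract the number of failing tests from a runner summary line."""
    m = re.search(r"Passed (\d+) of (\d+) tests", summary)
    if not m:
        raise ValueError("unrecognized parser-test summary: %r" % summary)
    passed, total = map(int, m.groups())
    return total - passed

def commit_allowed(baseline: str, candidate: str) -> bool:
    """A patch may keep or reduce the failure count, never increase it."""
    return failure_count(candidate) <= failure_count(baseline)

print(commit_allowed("Passed 612 of 626 tests", "Passed 612 of 626 tests"))  # → True
print(commit_allowed("Passed 612 of 626 tests", "Passed 607 of 626 tests"))  # → False
```

The point is only that "did the failure count go up?" is a mechanical question, so nothing stops it from being automated rather than left to developer discipline.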
I am sure there are a whole bunch of other things that might be done to improve MW QA. The point of this message is to initiate a discussion on what those might be.
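The nightly harness proposed above could likewise start as a small driver script. Everything in this sketch is hypothetical — the revision numbers, the stubbed runner, and the baseline are invented, not existing WMF tooling:

```python
# Hypothetical nightly driver: for each trunk revision committed in the last
# 24 hours, run the parser tests and flag any revision that raises the
# failure count over the previous baseline.

def revisions_since_yesterday():
    # Stub: a real harness would query the repository for recent revisions.
    return [52086, 52087, 52088]

def run_parser_tests(revision):
    # Stub: a real harness would check out `revision`, run
    # maintenance/parserTests.php with the WMF extension set installed,
    # and parse the failure count from its output.
    return {52086: 14, 52087: 14, 52088: 19}[revision]

def nightly_report(baseline_failures=14):
    report = []
    for rev in revisions_since_yesterday():
        fails = run_parser_tests(rev)
        status = "OK" if fails <= baseline_failures else "REGRESSION"
        report.append((rev, fails, status))
    return report

for rev, fails, status in nightly_report():
    print(rev, fails, status)
```

Publishing the resulting report is then just a matter of posting it to a wiki page from cron.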
Michael Rosenthal <rosenthal3000@googlemail.com> wrote:

Please note that there are some parser tests which in theory should pass but have never passed in any version (the behavior they describe was never implemented in the software).
On Thu, Jul 16, 2009 at 5:55 PM, dan nessett <dnessett@yahoo.com> wrote: [...]
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
I understand. This was pointed out in a previous thread (see "Is this the right list to ask questions about parserTests").
--- On Thu, 7/16/09, Michael Rosenthal <rosenthal3000@googlemail.com> wrote: [...]
dan nessett wrote:
I understand. This was pointed out in a previous thread (see "Is this the right list to ask questions about parserTests").
Dan, I certainly can say it'd be great to extend the parser test suite with a "known to fail" switch we can add to the tests which are known to fail. This would be helpful to folks trying to sort out their changes from longstanding minor issues without just losing the information in those tests (e.g., that we know there are cases where we produce bad tag nesting).
The test suite can definitely use some love; interested in poking at it?
-- brion
Sure. I'll give it a try. It would be a good way to learn more about the code base.
--- On Thu, 7/16/09, Brion Vibber <brion@wikimedia.org> wrote: [...]
Brion Vibber <brion@wikimedia.org> wrote:

[...] it'd be great to extend the parser test suite with a "known to fail" switch we can add to the tests which are known to fail. [...]
Perl's take on TAP provides "TODO" blocks for exactly that purpose (and reports unexpected successes should they occur :-)); doesn't PHP (PHPUnit?) have something similar?
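For comparison, Python's stock unittest module implements the same semantics: expected failures are tracked separately from real failures, and an unexpected success is reported as such. A minimal, self-contained sketch (the test content is invented for illustration):

```python
import unittest

class ParserBehavior(unittest.TestCase):
    def test_passing(self):
        self.assertEqual(2 + 2, 4)

    @unittest.expectedFailure
    def test_known_broken_nesting(self):
        # Stand-in for a longstanding known failure (e.g. bad tag nesting):
        # it fails, but is recorded as "expected" rather than as breakage.
        self.assertEqual("<b><i></b></i>", "<b><i></i></b>")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParserBehavior)
result = unittest.TextTestRunner(verbosity=0).run(suite)
# Known failures do not count as new breakage; if the decorated test ever
# started passing, it would be flagged in result.unexpectedSuccesses.
print(len(result.failures), len(result.expectedFailures))  # → 0 1
```

A "known to fail" switch for the parser tests would give the same benefit: a clean run stays clean, and a fixed test announces itself instead of being forgotten.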
Tim
On Thu, Jul 16, 2009 at 6:18 PM, Tim Landscheidt <tim@tim-landscheidt.de> wrote:
Perl's take on TAP provides "TODO" blocks for that purpose (and reports unexpected successes should they occur :-)); doesn't have PHP (PHPUnit?) something similar?
We don't use any standard testing framework. The parser tests were written by us from the ground up, and they don't even attempt to cover the large majority of the software's operation.
At the developers' conference in Berlin this past spring, the void in our current testing procedures was a common topic of conversation.
Put simply, QA is not exciting work for most people and our volunteer-oriented development process tends to result in very little or often no automated QA. We do have other ways of reviewing code, but until we have a staff QA person (which could happen in the near future) it's unlikely that this will change much.
That is, unless, all of a sudden, all of our volunteers get all excited about QA work and pitch in on developing and participating in a more robust QA process.
- Trevor
On 7/16/09 8:55 AM, dan nessett wrote: [...]
On Thu, Jul 16, 2009 at 12:40 PM, Trevor Parscal <tparscal@wikimedia.org> wrote: [...]
I wrote up a post last week about regression testing and updates in MW.
I don't feel like repeating myself, so I'll just link: http://lists.wikimedia.org/pipermail/wikitech-l/2009-July/043967.html
-Chad