<quote name="Martijn Hoekstra" date="2014-03-10" time="08:51:38 +0100">
If the test infrastructure can't handle running every test on every commit (why can't it, by the way, and could that be remedied?), would it be possible and desirable to make sure a more elaborate test suite with all available tests is run as a gate before cutting the production branch?
I'm hesitant to make a suite of tests that only runs once a week a hard go/no-go gate.
I'm hesitant because that won't help us actually create better code. Sure, it'll prevent broken code from going out, but we're already pretty good about that (since we only deploy to testwikis initially and find breakages before the code goes out to 'real' (minus mediawiki.org) wikis). Put another way, the breakages that the tests find are a subset of what Reedy finds and fixes/reverts when he deploys to the testwikis. And the breakages that somehow make it to 'real' wikis are, by definition, not caught by the test suite (as the test suite has already run against production wikis).
But it won't make the developers writing bad code write better code. The only thing that will do that is Jenkins reporting back into Gerrit: "You shall not pass!" And the only way to be able to do that is if we (Jenkins) know which change caused the breakage. That means per-commit tests. Not weekly tests.
We already have twice-daily tests running (and those are the ones that caused the change that started this thread) because they can't be run for every commit. I'd rather not just pick a random one of those (effectively) and say "you're more important than the other identical runs of yourself".
Feedback. Developer feedback is the only way to make this better.
The solution, honestly, is making all tests run per commit.
https://upload.wikimedia.org/wikipedia/en/7/74/Continuous_Delivery_process_d...
s/Delivery Team/Developers/ and that's what should happen for every commit.
This isn't easy, and it is what we (the WMF Release Engineering & QA Team) are working on.
Greg
On Mar 10, 2014 4:19 PM, "Greg Grossmeier" greg@wikimedia.org wrote:
<quote name="Martijn Hoekstra" date="2014-03-10" time="08:51:38 +0100"> > If the test infrastructure can't handle running every test on every
commit
(why can't they by the way, and could that be remedied?) Would it be possible and desireable to make sure a more elaborate test suite with
all
available tests in run as a gate before cutting the production branch?
I'm hesitant on making a suite of tests that only run once a week be a hard go/no-go.
I'm hesitant because that won't help us actually create better code. Sure, it'll prevent broken code from going out, but we're already pretty good about that (since we only deploy to testwikis initially and find breakages before it goes out to 'real' (minus mediawiki.org) wikis). Put another way, the breakages that the tests find are a subset of what Reedy finds and fixes/reverts when he deploys to the testwikis. And the breakages that somehow make it to 'real' wikis are by definition, not caught by the test suite (as the test suite has already run against production wikis).
But it won't make the developers writing bad code write better code. The only that that will do that is Jenkins reporting back into Gerrit "You shall not pass!". And the only way of being able to do that is if we (Jenkins) knows which change caused the breakage. That means per-commit tests. Not weekly tests.
We already have twice-daily tests running (and those are the ones that cause the change that started this thread) because they can't be run for every commit. I'd rather not just pick a random one of those (effectively) and say "you're more important that the other identical runs of yourself".
Feedback. Developer feedback is the only way to make this better.
The solution, honestly, is making all tests run per commit.
https://upload.wikimedia.org/wikipedia/en/7/74/Continuous_Delivery_process_d...
s/Delivery Team/Developers/ and that's what should happen for every commit.
This isn't easy, and it is what we (the WMF Release Engineering & QA Team) are working on.
Greg
That sounds great! In case it proves too much of a hurdle, I've been thinking of a second possible workflow that I think would have the same results. It would be roughly as follows:
1. Do a first Jenkins run as it is done now. Mark a patch that passes it as Jenkins Considered.
2. While there are patches marked as Jenkins Considered: accumulate and merge[1] all of them into a new integration branch, and run the expensive tests on the integration branch. If all tests pass, all patches in it are marked as passing by Jenkins. If the run fails, split the patches in two, rinse, repeat, until the failing patches are known. Mark the failing patches as failed by Jenkins.
If most patches pass all tests, this requires far fewer test runs than running everything on every commit (see the sketch below). If Zuul is capable of such a workflow, it could be considered. If not, maybe we could convince the Zuul maintainers to make it possible in a future version (possibly by donating the needed patches).
[1] because git performs these merges automatically and correctly. Always. Right?
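To make step 2 concrete, here is a rough Python sketch of the bisection loop I have in mind. It's only an illustration under my own assumptions: run_tests stands in for "merge these patches into an integration branch and run the expensive suite", and I'm assuming a merged batch fails exactly when it contains at least one bad patch (i.e. failures don't come from patch interactions).

    def find_failing_patches(patches, run_tests):
        """Return the patches that make the expensive suite fail.

        Assumes a batch fails if and only if it contains at least
        one failing patch.
        """
        if not patches:
            return []
        if run_tests(patches):      # one expensive run for the whole batch
            return []               # everything in this batch passes
        if len(patches) == 1:
            return patches          # isolated a culprit
        mid = len(patches) // 2
        return (find_failing_patches(patches[:mid], run_tests) +
                find_failing_patches(patches[mid:], run_tests))

If all n patches in a batch are good this costs a single run; with k bad patches it's roughly k * log(n) runs instead of n, which is where the savings over strict per-commit runs would come from.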
--Martijn
This would be a great place to get to. Even if we only did this every other commit, or every five commits, it would be a step in the right direction!
On Tue, Mar 11, 2014 at 2:44 AM, Martijn Hoekstra martijnhoekstra@gmail.com wrote:
[snip]
On Tue, Mar 11, 2014 at 8:23 PM, Jon Robson jdlrobson@gmail.com wrote:
[snip]
I've been trying to get an overview of the current infrastructure, and I found http://www.mediawiki.org/wiki/Continuous_integration which is helping me a lot in understanding the current setup. Is it fully up to date?
Which are the tests that failed in the incident that sparked this discussion? Are those the tests run on cloudbees, or are they other tests? Is this whole issue more or less https://bugzilla.wikimedia.org/show_bug.cgi?id=53697 ?
Also, is there an easy way to see, per Jenkins job, how it's triggered in Zuul? (I suppose this is the inverse of the Zuul configuration setup.)
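To illustrate what I mean by "the inverse", here's a rough, hypothetical Python sketch that inverts a Zuul-style layout.yaml into a job -> (project, pipeline) lookup. I'm going by my reading of the documented layout format (a 'projects' list whose entries map pipeline names to job lists, possibly nested as {parent job: [dependent jobs]}), so treat the structure and file name as assumptions, not a description of our actual config.

    import yaml
    from collections import defaultdict

    def flatten_jobs(jobs):
        # A job list may nest dicts of {parent_job: [dependent_jobs]};
        # yield every job name at any depth.
        for entry in jobs:
            if isinstance(entry, dict):
                for parent, dependents in entry.items():
                    yield parent
                    for job in flatten_jobs(dependents):
                        yield job
            else:
                yield entry

    def jobs_to_triggers(layout_path):
        # Build job -> [(project, pipeline)], the inverse of the
        # per-project pipeline configuration.
        triggers = defaultdict(list)
        with open(layout_path) as f:
            layout = yaml.safe_load(f)
        for project in layout.get('projects', []):
            name = project['name']
            for pipeline, jobs in project.items():
                if pipeline != 'name' and isinstance(jobs, list):
                    for job in flatten_jobs(jobs):
                        triggers[job].append((name, pipeline))
        return triggers

Something like jobs_to_triggers('layout.yaml')['some-job-name'] (job name made up) would then directly answer "which project/pipeline combinations trigger this job".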
cc'ing the QA mailing list.
<quote name="Martijn Hoekstra" date="2014-03-12" time="11:04:23 +0100">
I've been trying to get an overview of the current infrastructure, and I found http://www.mediawiki.org/wiki/Continuous_integration which is helping me a lot in understanding the current setup. Is it fully up to date?
Generally yes, but there's a lot of documentation there (awesome!), which also makes it easier for it to fall out of date (booo!).
See also: https://doc.wikimedia.org/mw-tools-releng/html/devdeployflow/index.html
(re-flowing that to not be so wide is on my list, along with figuring out the available fonts on the integration machine so I can pick a good one.)
Which are the tests that failed in the incident that sparked this discussion? Are those the tests run on cloudbees, or are they other tests? Is this whole issue more or less https://bugzilla.wikimedia.org/show_bug.cgi?id=53697 ?
There's also:
- https://bugzilla.wikimedia.org/show_bug.cgi?id=45499
- https://bugzilla.wikimedia.org/show_bug.cgi?id=51492
- https://bugzilla.wikimedia.org/show_bug.cgi?id=52424
- https://bugzilla.wikimedia.org/show_bug.cgi?id=58040
- https://bugzilla.wikimedia.org/show_bug.cgi?id=62144
- https://bugzilla.wikimedia.org/show_bug.cgi?id=52425
- https://bugzilla.wikimedia.org/show_bug.cgi?id=60347
- https://bugzilla.wikimedia.org/show_bug.cgi?id=50576
(all related/different parts of the same issue)
Also, is there an easy way to see, per Jenkins job, how it's triggered in Zuul? (I suppose this is the inverse of the Zuul configuration setup.)
I'll let Antoine or Chris or Zeljko answer this one.
Greg
On Wed, Mar 12, 2014 at 5:22 PM, Greg Grossmeier greg@wikimedia.org wrote:
Also, is there an easy way to see, per Jenkins job, how it's triggered in Zuul? (I suppose this is the inverse of the Zuul configuration setup.)
I'll let Antoine or Chris or Zeljko answer this one.
Antoine takes care of Zuul as far as I know, so I will let him answer. :)
Željko