Failing MobileFrontend browser tests

Juliusz Gonera

9 Jul 2014

10:57 p.m.

Today I worked a bit on fixing failing browser tests. The good news is that some tests detected a regression in core that caused full text search on mobile to not work. The bad news is that many of the failures seem to be caused by problems with Saucelabs and/or beta labs, examples:

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... Editor doesn't seem to load, possible causes: beta labs API error, or problem with connection between saucelabs and beta labs

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... getaddrinfo: Name or service not known (SocketError) - seems like a problem with network on saucelabs

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... Saucelabs recording shows "no data received" error in Chrome, either beta labs problem or saucelabs network problem

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... same as above
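One rough way to separate failures like these from real regressions is to match the error text against known infrastructure patterns. The sketch below is only an illustration: the two patterns are taken from the error messages quoted above, and `classify_failure` and `INFRA_PATTERNS` are hypothetical names, not part of the actual test suite.

```python
import re

# Patterns copied from the error messages quoted in this thread. The list is
# an assumption and would need extending as new infrastructure errors appear.
INFRA_PATTERNS = [
    re.compile(r"getaddrinfo: Name or service not known"),
    re.compile(r"no data received", re.IGNORECASE),
]


def classify_failure(message: str) -> str:
    """Return "infrastructure" for a known network/DNS error, else "needs-review"."""
    for pattern in INFRA_PATTERNS:
        if pattern.search(message):
            return "infrastructure"
    # Anything unrecognised still needs a human to look at it.
    return "needs-review"
```

A failure that does not match any pattern is deliberately left as "needs-review" rather than being labelled a regression, since the pattern list cannot be complete.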

Those are just a few examples from recent failures, but they make tracking regressions really tedious and time-consuming. I know we are planning to move away from Saucelabs and use our own servers to run the tests. When will this happen? Is there any deadline?

Thanks,

-- Juliusz

Jon Robson

10 Jul

12:54 a.m.

Indeed. The tests have been failing for a month now, and had been passing green before the move to integration.wikimedia.org. It would be really good to get these back to being useful.

I'm not sure how our interaction with saucelabs changed during that move, but is there anything that can be done in the short term to get things back to how they were before, when we were on cloudbees?

Thanks Juliusz for the good summary of the problems!

Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l

-- Jon Robson * http://jonrobson.me.uk * https://www.facebook.com/jonrobson * @rakugojon

Tomasz Finc

5:51 p.m.

ChrisMC,

Are these failures unique to mobile? They look to be at the infrastructure level, so I'm guessing they would affect others.

What other information do you need from us to be able to remedy these?

--tomasz

Arthur Richards

6:08 p.m.

Heads-up that Chris McMahon is on vacation all this week. That said, it would be great to hear from anyone in QA about this - it has been a long-standing issue.

-- Arthur Richards Team Practices Lead [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687

Tomasz Finc

6:09 p.m.

RobLa, is this something that we should be pulling in Greg for?

--tomasz

Rob Lanphier

7:06 p.m.

Hi all,

Antoine and Zeljko are the right people to talk about this while Chris is out, and it's late in the day for them. I'm sure they'll get back to you tomorrow. Greg may be able to say more about this, but honestly, the nature of this thread is a little bit like little kids in the backseat saying "are we there yet? are we there yet?" repeatedly :-)

Antoine's response seems to answer the substance of what y'all are asking about. We moved from Cloudbees to directly using Saucelabs so that we could debug these issues directly. Now that we're on Saucelabs, we have the info (see Antoine's mail). As he said, we have no plans to set up our own version of Saucelabs.

It may be that the very first thing we need to do is put some sort of environment health check prior to executing the actual test portion to avoid these false failures. Given that the team just completed the migration off of Cloudbees, give them a little time to figure things out.
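The kind of environment health check suggested above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's implementation: `check_environment` is a hypothetical helper, the target URL is supplied by the caller, and the DNS and HTTP steps are injectable so the check can be exercised without a network.

```python
import socket
import urllib.request
from urllib.parse import urlparse


def check_environment(url, resolve=socket.getaddrinfo,
                      fetch=urllib.request.urlopen, timeout=10):
    """Return a list of problems; an empty list means the target looks healthy."""
    problems = []
    host = urlparse(url).hostname
    try:
        # Catches the "getaddrinfo: Name or service not known" class of error.
        resolve(host, None)
    except OSError as error:  # socket.gaierror is a subclass of OSError
        problems.append(f"DNS lookup failed for {host}: {error}")
        return problems  # no point fetching a host that does not resolve
    try:
        with fetch(url, timeout=timeout) as response:
            if response.status != 200:
                problems.append(f"{url} returned HTTP {response.status}")
    except OSError as error:  # urllib.error.URLError is a subclass of OSError
        problems.append(f"request to {url} failed: {error}")
    return problems
```

A job could call this with the site it is about to test and, if the list is non-empty, report the run as an environment problem instead of a test failure. Note that it only checks reachability from the machine running the check, which may differ from what the remote browser sees.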

Thanks Rob

On Thu, Jul 10, 2014 at 11:09 AM, Tomasz Finc tfinc@wikimedia.org wrote:

...

RobLa, is this something that we should be pulling in Greg for?

--tomasz

On Thu, Jul 10, 2014 at 11:08 AM, Arthur Richards arichards@wikimedia.org wrote:

...
Head's up that Chris Mcmahon is on vacation all this week. That said, it would be great to hear from anyone in QA about this - it has been a long standing issue.

On Thu, Jul 10, 2014 at 10:51 AM, Tomasz Finc tfinc@wikimedia.org

wrote:

...
...
ChrisMC,

Are these failures unique to mobile? They seem look to be at the infrastructure level so i'm guessing it would affect others.

What other information do you need from us to be able to remedy these?

--tomasz

On Wed, Jul 9, 2014 at 5:54 PM, Jon Robson jdlrobson@gmail.com wrote:

...
Indeed. The tests have been failing for a month now, and had been passing green before the move to integration.wikimedia.org It would be really good to get these back to being useful.

I'm not sure how our interaction with saucelabs changed during that move, but is there anything that can be done on the short term to get it back to how they were before when we were on cloudbees?

Thanks Juliusz for the good summary of the problems!

On Wed, Jul 9, 2014 at 3:57 PM, Juliusz Gonera <jgonera@wikimedia.org

...
...
wrote:

...
Today I worked a bit on fixing failing browser tests. The good news

is

...
...
...
...
that some tests detected a regression in core that caused full text search on mobile to not work. The bad news is that many of the failures seem to be caused by problems with Saucelabs and/or beta labs, examples:

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi...

...
...
...
...
Editor doesn't seem to load, possible causes: beta labs API error, or problem with connection between saucelabs and beta labs

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi...

...
...
...
...
getaddrinfo: Name or service not known (SocketError) - seems like a problem with network on saucelabs

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi...

...
...
...
...
Saucelabs recording shows "no data received" error in Chrome, either beta labs problem or saucelabs network problem

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi...

...
...
...
...
same as above

Those are just a few examples from recent failures, but they make tracking regressions really tedious and time consuming. I know we are planning to move away from Saucelabs and use our own servers to run the tests.

When

...
...
...
...
will this happen? Is there any deadline?

Thanks,

Juliusz

Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l

-- Jon Robson

http://jonrobson.me.uk

https://www.facebook.com/jonrobson

@rakugojon

Mobile-l mailing list Mobile-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mobile-l

-- Arthur Richards Team Practices Lead [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687

Greg Grossmeier

7:13 p.m.

1) Yeah, feel free to pull me in on these conversations. I'm here to help :)

2) What Robla said plus: Please feel free, actually, please feel encouraged to report bugs on these specific infrastructure-failure tests and assign to Chris or Zeljko and bring up in the SoS as appropriate. Any infrastructure migration will be bumpy when there are some odd black holes we need to debug. They'll debug/process them as fast as they can.

Greg

(PS: I trimmed cc's to just the lists, assuming all were on one of the two)

-- | Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E | | identi.ca: @greg A18D 1138 8E47 FAC8 1C7D |

Arthur Richards

8:38 p.m.

Thanks for the replies, everyone - however I haven't seen Antoine's response (beyond the bit quoted in Chris Steipp's reply), did it not get reply all'd or otherwise not sent to mobile-l? Further response inline.

On Thu, Jul 10, 2014 at 12:06 PM, Rob Lanphier robla@wikimedia.org wrote:

...

Antoine and Zeljko are the right people to talk about this while Chris is out, and it's late in the day for them. I'm sure they'll get back to you tomorrow. Greg may be able to say more about this, but honestly, the nature of this thread is a little bit like little kids in the backseat saying "are we there yet? are we there yet?" repeatedly :-)

Rob, the last time we inquired about this (to my knowledge) was one month ago (see email subject 'migrating MobileFrontend browser tests to WMF Jenkins'). We saw no substantive followup, and our browser test builds have been broken since the migration over one month ago as a result of infrastructural issues. This not only rendered the automated browser tests essentially useless for us but it has also become a significant drain on time and focus. This isn't a case of us asking 'are we there yet' - rather it's a case of us trying to understand when we'll be able to rely on our browser tests again and to see if there's any way we can help to improve the situation.

-- Arthur Richards Team Practices Lead [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687

Juliusz Gonera

11 Jul

12:08 a.m.

I think Arthur found the right words to describe what the problem is for us. If "we're not there yet" then we should disable all browser test notifications altogether, because there's no point in getting several emails about failing tests on mobile-tech if we know for sure they will fail.

Rob Lanphier

1:37 a.m.

On Thu, Jul 10, 2014 at 1:38 PM, Arthur Richards arichards@wikimedia.org wrote:

...

Thanks for the replies, everyone - however I haven't seen Antoine's response (beyond the bit quoted in Chris Steipp's reply), did it not get reply all'd or otherwise not sent to mobile-l? Further response inline.

On Thu, Jul 10, 2014 at 12:06 PM, Rob Lanphier robla@wikimedia.org wrote:

...
Antoine and Zeljko are the right people to talk about this while Chris is out, and it's late in the day for them. I'm sure they'll get back to you tomorrow. Greg may be able to say more about this, but honestly, the nature of this thread is a little bit like little kids in the backseat saying "are we there yet? are we there yet?" repeatedly :-)

Rob, the last time we inquired about this (to my knowledge) was one month ago (see email subject 'migrating MobileFrontend browser tests to WMF Jenkins'). We saw no substantive followup, and our browser test builds have been broken since the migration over one month ago as a result of infrastructural issues. This not only rendered the automated browser tests essentially useless for us but it has also become a significant drain on time and focus. This isn't a case of us asking 'are we there yet' - rather it's a case of us trying to understand when we'll be able to rely on our browser tests again and to see if there's any way we can help to improve the situation.

The completion of the migration to Saucelabs was announced July 3. If you aren't getting replies, it's probably because they are stuck in the mobile-l moderation queue. Antoine did send his email to the mobile-l list, but is probably not a member. In general, there has been a fair amount of conversation on the topic on the qa list, so I'd encourage you to check out the activity there if you haven't already seen it.

Sorry for being glib earlier. I understand the current situation must be frustrating for you all, and I hope Zeljko and Antoine are able to provide a satisfactory answer on this. Chris will also be back on Monday, though unfortunately Antoine will be gone at that point.

Rob

Tomasz Finc

5:41 p.m.

On Thu, Jul 10, 2014 at 6:37 PM, Rob Lanphier robla@wikimedia.org wrote:

...

The completion of the migration to Saucelabs was announced July 3. If you aren't getting replies, it's probably because they are stuck in the mobile-l moderation queue. Antoine did send his email to the mobile-l list, but is probably not a member. In general, there has been a fair amount of conversation on the topic on the qa list, so I'd encourage you to check out the activity there if you haven't already seen it.

There are zero messages being held in the moderation queue.

--tomasz

Željko Filipin

14 Jul

12:33 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Fri, Jul 11, 2014 at 7:41 PM, Tomasz Finc tfinc@wikimedia.org wrote:

...

There are zero messages being held in the moderation queue.

This might be the reason:

http://lists.wikimedia.org/pipermail/qa/2014-July/001706.html

Željko

Arthur Richards

11 Jul

7:35 p.m.

While the migration from Cloudbees was completed earlier this month, the MobileFrontend jobs were migrated off of Cloudbees over one month ago.

Chris McMahon sent an email announcing this on June 6 [0]. In addition, Chris said 'The tests for MF on beta labs running in headless Firefox under xvfb are reliably green as of today and we'll be working to keep them that way.'

However, those tests have been consistently failing since early June. After the digging through MobileFrontend test failures done in particular by Jon and Juliusz, *it's clear that the consistency of the failures is related to architectural/infrastructural issues*. Jon brought this up on June 9 [1], with a response from Zeljko on June 28 [2] mentioning that the issue was known, but that they hadn't had time to debug the problem. Fast forward two weeks, and Juliusz resurfaced the problem with this thread, since we haven't heard any additional information about resolving the issue while dealing with a high degree of noise from build failures.

We really appreciate all of the hard work that release/qa/platform/etc has put into this and we understand that resolving issues takes time. When we had more reliable builds, we found the automated browser tests to be incredibly valuable. We want to regain that value so that we can again more reliably catch issues before they find their way to production.

*Is there anyone currently owning or willing to own digging into and resolving these issues? Can we get any kind of timeline for resolving this* - even if it's just in regards to when the issue will be able to be investigated? In the meantime, let's remove the mobile web team from the failure notifications until such time that the builds are reliable and we can depend on a better signal-to-noise ratio.

As an aside - I didn't know the migration work off of Cloudbees was complete until it was addressed on this thread. It looks like the only announcement about it was made on the qa list, where I and many folks affected by this change are not subscribed. Please make announcements about things as significant as this on broad-reaching lists like wikitech-l.

[0] http://lists.wikimedia.org/pipermail/qa/2014-June/001515.html
[1] http://lists.wikimedia.org/pipermail/qa/2014-June/001535.html
[2] http://lists.wikimedia.org/pipermail/qa/2014-June/001615.html

-- Arthur Richards Team Practices Lead [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687

Željko Filipin

14 Jul

12:52 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Fri, Jul 11, 2014 at 9:35 PM, Arthur Richards arichards@wikimedia.org wrote:

...

Chris McMahon sent an email announcing this on June 6 [0]. In addition, Chris said 'The tests for MF on beta labs running in headless Firefox under xvfb are reliably green as of today and we'll be working to keep them that way.'

Running tests using xvfb proved to be more unstable than Sauce Labs. We have moved to Sauce until we have some time to investigate failures.

...

*Is there anyone currently owning or willing to own digging into and resolving these issues? Can we get any kind of timeline for resolving this* - even if it's just in regards to when the issue will be able to be investigated?

Rob, Chris, as far as I know, I have no big projects at the moment. Should I focus on this?

...

In the meantime, let's remove the mobile web team from the failure notifications until such time that the builds are reliable and we can depend on a better signal-to-noise ratio.

Done[1]. I have added everybody from this thread to the reviewers. (I could not find Tomasz in Gerrit.)

Željko -- 1: https://gerrit.wikimedia.org/r/#/c/146056/

Željko Filipin

11 Jul

4:54 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Thu, Jul 10, 2014 at 10:38 PM, Arthur Richards arichards@wikimedia.org wrote:

...

We saw no substantive followup, and our browser test builds have been broken since the migration over one month ago as a result of infrastructural issues. This not only rendered the automated browser tests essentially useless for us but it has also become a significant drain on time and focus. This isn't a case of us asking 'are we there yet' - rather it's a case of us trying to understand when we'll be able to rely on our browser tests again and to see if there's any way we can help to improve the situation.

I took a quick look at the MobileFrontend Jenkins jobs[1-3]. 2 out of 3 jobs have failures with age 1 (meaning the test failed just once, probably an intermittent problem), but also tests that have been failing for the last 3-53 runs (so the problem is stable).

I will start debugging the problems that happen every time. I have no answer on when the tests will be green (no failures) and sunny (no failures for the last 5 test runs) again.

Željko -- 1: https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... 2: https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... 3: https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi...
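A rough sketch of the triage described above, assuming failure ages and per-run failure counts are already available as plain numbers. The method names and the threshold for "persistent" are hypothetical; the message only says that age 1 is probably intermittent and that ages 3-53 are stable failures.

```ruby
# Hypothetical triage helpers; they do not read anything from Jenkins.

# Age is the number of consecutive runs in which a test has failed.
# The cut-off of 3 is an assumption taken from the "3-53" range above.
def classify_failure(age, persistent_from: 3)
  return :intermittent if age <= 1
  age >= persistent_from ? :persistent : :unclear
end

# failures_per_run lists the failure count of each run, oldest first.
# "Green" means the latest run had no failures; "sunny" means no failures
# in the last five runs.
def job_status(failures_per_run, sunny_window: 5)
  return :unknown if failures_per_run.empty?
  return :failing if failures_per_run.last > 0
  recent = failures_per_run.last(sunny_window)
  recent.size == sunny_window && recent.all?(&:zero?) ? :sunny : :green
end

ages = [1, 3, 53].map { |age| classify_failure(age) }
puts ages.inspect
puts job_status([2, 0])
puts job_status([0, 0, 0, 0, 0])
```

Sorting failures this way only helps decide where to start debugging; it says nothing about the cause.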

Željko Filipin

5:17 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Thu, Jul 10, 2014 at 10:38 PM, Arthur Richards arichards@wikimedia.org wrote:

...

This isn't a case of us asking 'are we there yet' - rather it's a case of us trying to understand when we'll be able to rely on our browser tests again and to see if there's any way we can help to improve the situation.

I have noticed that pairing is a great way to share knowledge and get things done. As far as I know, the majority of the mobile team is in San Francisco, so it is not the easiest thing to arrange pairing with me.

I am free almost every Monday, Tuesday and Wednesday, 8-9am San Francisco time (5-6pm my time). Just saying. ;)

Željko

Željko Filipin

4:08 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Thu, Jul 10, 2014 at 7:51 PM, Tomasz Finc tfinc@wikimedia.org wrote:

...

Are these failures unique to mobile? They seem to be at the infrastructure level, so I'm guessing it would affect others.

We have noticed similar problems across all repositories.

...

What other information do you need from us to be able to remedy these?

All we need is time. We finished the migration from Cloudbees to Wikimedia Jenkins a few days ago; the next step is making the jobs as green as possible.

Željko

Željko Filipin

4:06 p.m.

On Thu, Jul 10, 2014 at 2:54 AM, Jon Robson jdlrobson@gmail.com wrote:

...

I'm not sure how our interaction with saucelabs changed during that move, but is there anything that can be done on the short term to get it back to how they were before when we were on cloudbees?

We have a new account now: instead of being able to run 2-3 tests in parallel, we are now able to run 10-15, but that should not cause any problems.

Nothing comes to mind that could be done in the short term.

Željko

Željko Filipin

4:04 p.m.

New subject: [QA] Failing MobileFrontend browser tests

Hi Juliusz,

comments are inline.

On Thu, Jul 10, 2014 at 12:57 AM, Juliusz Gonera jgonera@wikimedia.org wrote:

...

Today I worked a bit on fixing failing browser tests. The good news is that some tests detected a regression in core that caused full text search on mobile to not work. The bad news is that many of the failures seem to be caused by problems with Saucelabs and/or beta labs, examples:

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... Editor doesn't seem to load, possible causes: beta labs API error, or problem with connection between saucelabs and beta labs

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... getaddrinfo: Name or service not known (SocketError) - seems like a problem with network on saucelabs

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... Saucelabs recording shows "no data received" error in Chrome, either beta labs problem or saucelabs network problem

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... same as above

All links now lead to passing tests. When copy/pasting links from Jenkins, please make sure to use URLs with build number instead of URLs with "lastBuild".
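A small sketch of pinning a link before sharing it, assuming the usual Jenkins URL layout in which `lastBuild` stands in for the build number. The helper name, host, job name and build number are all hypothetical, since the links in this thread are truncated.

```ruby
# Hypothetical helper: replaces the "lastBuild" path segment of a Jenkins URL
# with a fixed build number, so the link keeps pointing at the same build.
def pin_build_url(url, build_number)
  url.sub(%r{/lastBuild(?=/|\z)}, "/#{build_number}")
end

# Placeholder URL and build number, not taken from the thread.
floating = "https://jenkins.example.org/job/example-job/lastBuild/testReport/"
pinned = pin_build_url(floating, 42)
puts pinned
```

A URL that already contains a build number is returned unchanged.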

...

Those are just a few examples from recent failures, but they make tracking regressions really tedious and time consuming. I know we are planning to move away from Saucelabs and use our own servers to run the tests.

We have tried moving away from third party services (Cloudbees, Sauce Labs) and we have succeeded in moving all Jenkins jobs from Cloudbees to Wikimedia Jenkins.

We have tried running tests in local browsers (instead of at Sauce Labs) but the tests were also sometimes failing for unclear reasons, so we are at the moment again using Sauce Labs. Better the devil you know than the devil you don't...[1]

I will continue testing and debugging tests with both local and Sauce Labs browsers and I will let you know the results.

...

When will this happen? Is there any deadline?

As far as I know, there is no deadline.

Željko -- 1: http://www.usingenglish.com/reference/idioms/better+the+devil+you+know.html

Dan Duvall

6:39 p.m.

New subject: [QA] Failing MobileFrontend browser tests

...

We have tried running tests in local browsers (instead of at Sauce Labs) but the tests were also sometimes failing for unclear reasons, so we are at the moment again using Sauce Labs. Better the devil you know than the devil you don't...[1]

I have a lot of learning to do before I can be of much help with the Sauce Labs side of things, but if improving the state of MobileFrontend browser tests in mw-vagrant would be the best first step here, I can certainly help with that.

Juliusz (or anyone else on Mobile that has the time), let me know if you're available for pairing next week. I'm available during SF hours in the office Monday, Wednesday, Friday, and available for hangouts Tuesday and Thursday. Fair warning: I'm still getting up-to-speed with all things MediaWiki. That said, I feel I now understand enough about browser tests and the mw-vagrant environment to be helpful. For things still mysterious, I can always take notes and lean on Zeljko for answers. :)

Dan

-- Dan Duvall Automation Engineer Wikimedia Foundation http://wikimediafoundation.org

Željko Filipin

5:12 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Thu, Jul 10, 2014 at 12:57 AM, Juliusz Gonera jgonera@wikimedia.org wrote:

...

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... Editor doesn't seem to load, possible causes: beta labs API error, or problem with connection between saucelabs and beta labs

My guess is that you are talking about this failure[1]. Looking at the Sauce Labs screencast, it looks to me that labs was just slow to respond and the test failed after 5 seconds with a pretty descriptive error message[2]. It is hard for me to say why labs is slow. As Antoine has suggested, looking at logs for that date/time could help.

If that happens a lot, a short-term workaround would be to make the test wait for 10 seconds instead of 5.

Željko -- 1: https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... 2: timed out after 5 seconds, waiting for {:css=>".wikitext-editor", :tag_name=>"textarea"} to become present (Watir::Wait::TimeoutError)
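A minimal sketch of why a longer wait helps when the backend is merely slow. `wait_until` and `WaitTimeoutError` are hypothetical stand-ins written for this illustration, not the Watir or Selenium API, and the 0.3-second delay only simulates a slow response from labs.

```ruby
# Hypothetical polling helper: calls the block until it returns a truthy
# value, and raises once the timeout has elapsed.
class WaitTimeoutError < StandardError; end

def wait_until(timeout: 5, interval: 0.1)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise WaitTimeoutError, "timed out after #{timeout} seconds"
    end
    sleep interval
  end
end

# Simulated slow page: the "editor" only becomes present after 0.3 seconds.
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
editor_present = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) - started > 0.3 }

# A timeout shorter than the delay fails even though nothing is broken.
short_result = begin
  wait_until(timeout: 0.1, interval: 0.02) { editor_present.call }
rescue WaitTimeoutError => e
  e.message
end

# A timeout longer than the delay passes as soon as the element shows up.
long_result = wait_until(timeout: 1, interval: 0.02) { editor_present.call }

puts short_result
puts long_result
```

The longer wait returns as soon as the condition holds, so it costs extra time only when the element never appears.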

Rob Lanphier

7:39 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Fri, Jul 11, 2014 at 10:12 AM, Željko Filipin zfilipin@wikimedia.org wrote:

...

On Thu, Jul 10, 2014 at 12:57 AM, Juliusz Gonera jgonera@wikimedia.org wrote:

...
https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... Editor doesn't seem to load, possible causes: beta labs API error, or problem with connection between saucelabs and beta labs

My guess is that you are talking about this failure[1]. Looking at the Sauce Labs screencast, it looks to me that labs was just slow to respond and the test failed after 5 seconds with pretty descriptive error message[2]. It is hard for me to say why labs is slow. As Antoine has suggested, looking at logs for that date/time could help.

If that happens a lot, a short workaround would be to make the test wait for 10 seconds instead of 5.

Thanks for the response, Zeljko. I'm going to make the dumb manager response and ask the question (not just to Zeljko, but to everyone): as a permanent fix, can we change the wait to 60 seconds and call it good? How was 5 seconds arrived at as the time for an automated test to fail? Labs was never intended for performance testing, and it's not suited for it, so if the rationale is because we're testing end-user performance, we should stop. SauceLabs also isn't designed for it, so any effort to use it will also end in sadness. 60 seconds may be a bit extreme, but really, let's set a number here (and everywhere we have timeouts) that's high enough to stop getting false positives, and leave it there. In cases where we want to automate a responsiveness test, let's make sure we're doing it against test2 or some production cluster machine, and that we're doing it from a client that isn't also likely to introduce random delays.

Rob

...

1: https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... 2: timed out after 5 seconds, waiting for {:css=>".wikitext-editor", :tag_name=>"textarea"} to become present (Watir::Wait::TimeoutError)

Arthur Richards

10:19 p.m.

New subject: [QA] Failing MobileFrontend browser tests

Greg, Rob, Tomasz and I just had an IRL conversation about this. Given some of the ambiguity of the test failures we've been discussing as related to 'infrastructure/architecture issues', we should be filing specific bug reports in Bugzilla for the issues we see. I'll follow up with the mobile web team directly to start digging into this. Further, it was clarified that Greg G has ownership of getting the issues resolved. We also agreed that for the time being, mobile-tech will be removed from the list of recipients of the failure emails until the issues are resolved. However, no one in the room was sure how to actually do this - Zeljko, Chris, Dan, is this something one of you can help out with?

Finally, I'd like to mention that none of the conversation on this thread was intended to question the integrity or validity of the hard work that the QA team has put in to making improvements to the testing infrastructure. We're all on the same (figurative) team, and we understand that it takes time to iron out inevitable issues particularly when it pertains to complex systems, migrations, etc. At the end of the day, we're very eager to be able to fully leverage an automated test system to help us ship better quality stuff, and the heart of this conversation is about resolving the things currently standing in the way of that goal.

On Fri, Jul 11, 2014 at 12:39 PM, Rob Lanphier robla@wikimedia.org wrote:

...

On Fri, Jul 11, 2014 at 10:12 AM, Željko Filipin zfilipin@wikimedia.org wrote:

...
On Thu, Jul 10, 2014 at 12:57 AM, Juliusz Gonera jgonera@wikimedia.org wrote:

...
https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... Editor doesn't seem to load, possible causes: beta labs API error, or problem with connection between saucelabs and beta labs

My guess is that you are talking about this failure[1]. Looking at the Sauce Labs screencast, it looks to me that labs was just slow to respond and the test failed after 5 seconds with pretty descriptive error message[2]. It is hard for me to say why labs is slow. As Antoine has suggested, looking at logs for that date/time could help.

If that happens a lot, a short workaround would be to make the test wait for 10 seconds instead of 5.

Thanks for the response, Zeljko. I'm going to make the dumb manager response and ask the question (not just to Zeljko, but to everyone): as a permanent fix, can we change the wait to 60 seconds and call it good? How was 5 seconds arrived at as the time for an automated test to fail? Labs was never intended for performance testing, and it's not suited for it, so if the rationale is because we're testing end-user performance, we should stop. SauceLabs also isn't designed for it, so any effort to use it will also end in sadness. 60 seconds may be a bit extreme, but really, let's set a number here (and everywhere we have timeouts) that's high enough to stop getting false positives, and leave it there. In cases where we want to automate a responsiveness test, let's make sure we're doing it against test2 or some production cluster machine, and that we're doing it from a client that isn't also likely to introduce random delays.

Rob

...
1: https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... 2: timed out after 5 seconds, waiting for {:css=>".wikitext-editor", :tag_name=>"textarea"} to become present (Watir::Wait::TimeoutError)

-- Arthur Richards Team Practices Lead [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687

James Forrester

10:50 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On 11 July 2014 15:19, Arthur Richards arichards@wikimedia.org wrote:

...

Greg, Rob, Tomasz and I just had an IRL conversation about this. Given some of the ambiguity of the test failures we've been discussing as related to 'infrastructure/architecture issues', we should be filing specific bug reports in Bugzilla in regards to the issues we see. I'll followup with the mobile web team directly to start digging into this. Further, it was clarified that Greg G has ownership of getting the issues resolved. We also agreed that for the time being, mobile-tech will be removed from the list of recipients of the failure emails until the issues are resolved. However, no one in the room was sure how to actually do this - Zeljko, Chris, Dan, is this something one of you can help out with?

Steps (I had to do this last week; sharing the learning rather than just replicating the issue):

1. Go to https://integration.wikimedia.org/ci/view/BrowserTests/
2. Be logged in as someone with admin permissions (I *think* that's automatic for ldap/wmf)
3. Go to the browser test you want to modify (e.g. the MobileFrontend Chrome enwiki BetaLabs one https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/ )
4. Click "configure" in the upper-left of the project page.
5. Scroll down to "Project Recipient List"
6. Add/remove as needed.
7. Press "Save" at the bottom of the page.

Have done this for MobileFrontend's Chrome and Firefox enwiki BetaLabs, and the Firefox test2 Prod projects. They now only send to qa-alerts and Chris McMahon.

HTH!

-- James D. Forrester Product Manager, Editing Wikimedia Foundation, Inc. jforrester@wikimedia.org | @jdforrester

Dan Duvall

11:04 p.m.

New subject: [QA] Failing MobileFrontend browser tests

I believe many of the jobs listed there are defined in integration/jenkins-job-builder-config (see jobs.yaml in the cloudbees branch). Whether changes made through the web interface will be clobbered by the next import, I'm not sure. Antoine will likely know more.

On Fri, Jul 11, 2014 at 3:50 PM, James Forrester jforrester@wikimedia.org wrote:

...

On 11 July 2014 15:19, Arthur Richards arichards@wikimedia.org wrote:

...
Greg, Rob, Tomasz and I just had an IRL conversation about this. Given some of the ambiguity of the test failures we've been discussing as related to 'infrastructure/architecture issues', we should be filing specific bug reports in Bugzilla in regards to the issues we see. I'll followup with the mobile web team directly to start digging into this. Further, it was clarified that Greg G has ownership of getting the issues resolved. We also agreed that for the time being, mobile-tech will be removed from the list of recipients of the failure emails until the issues are resolved. However, no one in the room was sure how to actually do this - Zeljko, Chris, Dan, is this something one of you can help out with?

Steps (I had to do this last week; sharing the learning rather than just replicating the issue):

1. Go to https://integration.wikimedia.org/ci/view/BrowserTests/
2. Be logged in as someone with admin permissions (I *think* that's automatic for ldap/wmf)
3. Go to the browser test you want to modify (e.g. the MobileFrontend Chrome enwiki BetaLabs one https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/ )
4. Click "configure" in the upper-left of the project page.
5. Scroll down to "Project Recipient List"
6. Add/remove as needed.
7. Press "Save" at the bottom of the page.

Have done this for MobileFrontend's Chrome and Firefox enwiki BetaLabs, and the Firefox test2 Prod projects. They now only send to qa-alerts and Chris McMahon.

HTH!

J.

James D. Forrester Product Manager, Editing Wikimedia Foundation, Inc.

jforrester@wikimedia.org | @jdforrester

QA mailing list QA@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/qa

-- Dan Duvall Automation Engineer Wikimedia Foundation http://wikimediafoundation.org

Željko Filipin

14 Jul

1:45 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Sat, Jul 12, 2014 at 1:04 AM, Dan Duvall dduvall@wikimedia.org wrote:

...

I believe many of the jobs listed there are defined in integration/jenkins-job-builder-config (see jobs.yaml in the cloudbees branch).

_All_ jobs are managed via JJB.

...

Whether changes made through the web interface will be clobbered by the next import, I'm not sure.

Yes, the changes made via the web interface will be overwritten.

Željko

Arthur Richards

11 Jul

11:04 p.m.

New subject: [QA] Failing MobileFrontend browser tests

Nice! Thanks James :)

On Fri, Jul 11, 2014 at 3:50 PM, James Forrester jforrester@wikimedia.org wrote:

...

On 11 July 2014 15:19, Arthur Richards arichards@wikimedia.org wrote:

...
Greg, Rob, Tomasz and I just had an IRL conversation about this. Given some of the ambiguity of the test failures we've been discussing as related to 'infrastructure/architecture issues', we should be filing specific bug reports in Bugzilla in regards to the issues we see. I'll followup with the mobile web team directly to start digging into this. Further, it was clarified that Greg G has ownership of getting the issues resolved. We also agreed that for the time being, mobile-tech will be removed from the list of recipients of the failure emails until the issues are resolved. However, no one in the room was sure how to actually do this - Zeljko, Chris, Dan, is this something one of you can help out with?

Steps (I had to do this last week; sharing the learning rather than just replicating the issue):

1. Go to https://integration.wikimedia.org/ci/view/BrowserTests/
2. Be logged in as someone with admin permissions (I *think* that's automatic for ldap/wmf)
3. Go to the browser test you want to modify (e.g. the MobileFrontend Chrome enwiki BetaLabs one https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/ )
4. Click "configure" in the upper-left of the project page.
5. Scroll down to "Project Recipient List"
6. Add/remove as needed.
7. Press "Save" at the bottom of the page.

Have done this for MobileFrontend's Chrome and Firefox enwiki BetaLabs, and the Firefox test2 Prod projects. They now only send to qa-alerts and Chris McMahon.

HTH!

J.

James D. Forrester Product Manager, Editing Wikimedia Foundation, Inc.

jforrester@wikimedia.org | @jdforrester

-- Arthur Richards Team Practices Lead [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687

Željko Filipin

14 Jul

1:44 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Sat, Jul 12, 2014 at 12:50 AM, James Forrester jforrester@wikimedia.org wrote:

...

Steps (I had to do this last week; sharing the learning rather than just replicating the issue):

1. Go to https://integration.wikimedia.org/ci/view/BrowserTests/
2. Be logged in as someone with admin permissions (I *think* that's automatic for ldap/wmf)
3. Go to the browser test you want to modify (e.g. the MobileFrontend Chrome enwiki BetaLabs one https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/ )
4. Click "configure" in the upper-left of the project page.
5. Scroll down to "Project Recipient List"
6. Add/remove as needed.
7. Press "Save" at the bottom of the page.

Oh noes! That is _not_ the way to do it! :)

We use JJB[1] for job configuration. This[2] is how to do it.

You could _temporarily_ change a Jenkins job via the web interface (useful for debugging a job), but the next time somebody pushes a change via JJB, your change will be overwritten.

James, the changes you made have been overwritten, since we updated jobs via JJB last week. Let me know if you need help making changes to jobs.

Željko -- 1: http://ci.openstack.org/jenkins-job-builder/ 2: https://gerrit.wikimedia.org/r/#/c/146056/

James Forrester

3:28 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On 14 July 2014 06:44, Željko Filipin zfilipin@wikimedia.org wrote:

...

On Sat, Jul 12, 2014 at 12:50 AM, James Forrester < jforrester@wikimedia.org> wrote:

...
Steps (I had to do this last week; sharing the learning rather than just replicating the issue):

1. Go to https://integration.wikimedia.org/ci/view/BrowserTests/
2. Be logged in as someone with admin permissions (I *think* that's automatic for ldap/wmf)
3. Go to the browser test you want to modify (e.g. the MobileFrontend Chrome enwiki BetaLabs one https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-MobileFrontend-en.m.wikipedia.beta.wmflabs.org-linux-chrome-sauce/ )
4. Click "configure" in the upper-left of the project page.
5. Scroll down to "Project Recipient List"
6. Add/remove as needed.
7. Press "Save" at the bottom of the page.

Oh noes! That is _not_ the way to do it! :)

We use JJB[1] for job configuration. This[2] is how to do it.

You could _temporarily_ change a Jenkins job via the web interface (useful for debugging a job), but the next time somebody pushes a change via JJB, your change will be overwritten.

James, the changes you have made are overwritten, since we have been updating jobs via JJB last week. Let me know if you need help making changes to jobs.

Ah, interesting. All the changes I've made to the VisualEditor jobs have stuck for weeks – presumably they've not been updated since? Apologies all for the misinformation.

-- James D. Forrester Product Manager, Editing Wikimedia Foundation, Inc. jforrester@wikimedia.org | @jdforrester

Željko Filipin

17 Jul

1:18 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Mon, Jul 14, 2014 at 5:28 PM, James Forrester jforrester@wikimedia.org wrote:

...

All the changes I've made to the VisualEditor jobs have stuck for weeks – presumably they've not been updated since?

Probably. Let me know if you need help changing VisualEditor jobs.

Željko

Arthur Richards

11 Jul

11:47 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Fri, Jul 11, 2014 at 3:19 PM, Arthur Richards arichards@wikimedia.org wrote:

...

Given some of the ambiguity of the test failures we've been discussing as related to 'infrastructure/architecture issues', we should be filing specific bug reports in Bugzilla in regards to the issues we see.

Juliusz is going to coordinate this - should bugs get filed under Wikimedia -> Quality Assurance, or is there a better place for them?

-- Arthur Richards Team Practices Lead [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687

Greg Grossmeier

12 Jul

12:51 a.m.

New subject: [QA] Failing MobileFrontend browser tests

...

On Fri, Jul 11, 2014 at 3:19 PM, Arthur Richards arichards@wikimedia.org wrote:

...
Given some of the ambiguity of the test failures we've been discussing as related to 'infrastructure/architecture issues', we should be filing specific bug reports in Bugzilla in regards to the issues we see.

Juliusz is going to coordinate this - should bugs get filed under Wikimedia -> Quality Assurance, or is there a better place for them?

If the issue is a browser test or as-of-yet unclear, yeah. If it turns out to really be a bug somewhere else, we can move it.

Greg

(PS: I trimmed the cc's again, assuming robla was on qa and/or mobile, and arthur was on either and maryana was on mobile-l, let me know if that's wrong.)

-- | Greg Grossmeier GPG: B2FA 27B1 F7EB D327 6B8E | | identi.ca: @greg A18D 1138 8E47 FAC8 1C7D |

Željko Filipin

14 Jul

1:51 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Sat, Jul 12, 2014 at 1:47 AM, Arthur Richards arichards@wikimedia.org wrote:

...

Juliusz is going to coordinate this - should bugs get filed under Wikimedia -> Quality Assurance, or is there a better place for them?

That is a good place, especially for anything related to Ruby, Selenium, Cucumber, page object pattern and friends.

Wikimedia > Continuous integration is a good place for anything related to Jenkins.

Željko

Rob Lanphier

12 Jul

1:43 a.m.

New subject: [QA] Failing MobileFrontend browser tests

On Fri, Jul 11, 2014 at 3:19 PM, Arthur Richards arichards@wikimedia.org wrote:

...

Greg, Rob, Tomasz and I just had an IRL conversation about this. Given some of the ambiguity of the test failures we've been discussing as related to 'infrastructure/architecture issues', we should be filing specific bug reports in Bugzilla in regards to the issues we see. I'll followup with the mobile web team directly to start digging into this. Further, it was clarified that Greg G has ownership of getting the issues resolved. We also agreed that for the time being, mobile-tech will be removed from the list of recipients of the failure emails until the issues are resolved. However, no one in the room was sure how to actually do this - Zeljko, Chris, Dan, is this something one of you can help out with?

Finally, I'd like to mention that none of the conversation on this thread was intended to question the integrity or validity of the hard work that the QA team has put in to making improvements to the testing infrastructure. We're all on the same (figurative) team, and we understand that it takes time to iron out inevitable issues particularly when it pertains to complex systems, migrations, etc. At the end of the day, we're very eager to be able to fully leverage an automated test system to help us ship better quality stuff, and the heart of this conversation is about resolving the things currently standing in the way of that goal.

Arthur, thanks for the excellent recap and for the conversation earlier. I'm really happy we were able to hash things out, and my apologies again for being glib about the test failures.

Rob

Željko Filipin

14 Jul

1:36 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Sat, Jul 12, 2014 at 12:19 AM, Arthur Richards arichards@wikimedia.org wrote:

...

We also agreed that for the time being, mobile-tech will be removed from the list of recipients of the failure emails until the issues are resolved. However, no one in the room was sure how to actually do this - Zeljko, Chris, Dan, is this something one of you can help out with?

Done[1].

Željko -- 1: https://gerrit.wikimedia.org/r/#/c/146056/

Željko Filipin

1:34 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Fri, Jul 11, 2014 at 9:39 PM, Rob Lanphier robla@wikimedia.org wrote:

...

as a permanent fix, can we change the wait to 60 seconds and call it good?

That would be a workaround, not really a fix, especially not a permanent one. It is doable[1-2] and it might help, but it also means that every time a test really fails because an element is not present, it will fail only after waiting 60 seconds instead of 5. That might make the test runs longer.

I will create a few test jobs and see if it helps.

A permanent fix would be to run the Jenkins job, the browser and the MediaWiki instance on the same machine (or as close as possible), instead of reaching over the internet to Sauce Labs, WMF labs or production.

...

How was 5 seconds arrived at as the time for an automated test to fail?

We have an option of using three (yes, 3) APIs to drive the browser, but I think all three of them default to waiting for an element for 5 seconds and then giving up.

Željko -- 1: https://code.google.com/p/selenium/wiki/RubyBindings#Implicit_waits 2: http://rdoc.info/gems/selenium-webdriver/Selenium/WebDriver/Timeouts#implici...
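A back-of-the-envelope sketch of the trade-off above, assuming every genuinely failing step waits out the full timeout exactly once. `BROWSER_TIMEOUT` and both method names are hypothetical and exist only for this illustration; they are not settings of the test suite.

```ruby
# Hypothetical: read the element wait timeout (in seconds) from an
# environment-like hash, falling back to the 5-second default discussed above.
def element_timeout(env = ENV)
  Integer(env.fetch("BROWSER_TIMEOUT", "5"))
end

# Extra seconds a run spends waiting when `failing_steps` steps really fail
# and each of them waits for the whole timeout.
def extra_wait_seconds(failing_steps, old_timeout, new_timeout)
  failing_steps * (new_timeout - old_timeout)
end

default_timeout = element_timeout({})
raised_timeout = element_timeout({ "BROWSER_TIMEOUT" => "60" })
extra = extra_wait_seconds(10, default_timeout, raised_timeout)

puts "10 genuinely failing steps cost #{extra} extra seconds at #{raised_timeout}s"
```

Keeping the value in one configurable place would let a few trial jobs use a longer wait without touching the rest.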

S Page

16 Jul

9:37 p.m.

New subject: [QA] Failing MobileFrontend browser tests

On Wed, Jul 9, 2014 at 3:57 PM, Juliusz Gonera jgonera@wikimedia.org wrote:

...

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Mobi... getaddrinfo: Name or service not known (SocketError) - seems like a problem with network on saucelabs

Three Flow Chrome browsertests on beta labs run at Sauce Labs failed today with "getaddrinfo: Name or service not known (SocketError)" on Jul 16, 2014 6:26:46 PM (UTC?, I think 11:26 AM SF time). See https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Flow...

15 minutes earlier a Firefox test also failed with the getaddrinfo error, see https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Flow...

So I filed *Bug 68125* https://bugzilla.wikimedia.org/show_bug.cgi?id=68125 - browser tests failing with "getaddrinfo: Name or service not known (SocketError)"

[Sage manager] suggested

...

can we change the wait to 60 seconds and call it good? How was 5 seconds arrived at as the time for an automated test to fail?

The other Firefox test failure on that run was adding a topic took 6 seconds, thus triggering

timed out after 5 seconds, Element still visible after 5 seconds (Watir::Wait::TimeoutError)

Flow tests often fail with these timeouts, yet the expected result appears in the screencast or ends up on the test page. So yes, increasing the wait timeout to 10 seconds would cut down our false failures.

QA folk, is there a way to "grep" all browser tests for getaddrinfo and "timed out after 5 seconds" to see if there's a pattern to when and how often they occur?

Thanks indeed,

-- =S Page Features engineer
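One possible shape for the "grep" asked about above, assuming the console output of each job has already been saved as text. The job names and sample log lines are made up for the example, reusing only the two error strings quoted in this thread; nothing here talks to Jenkins.

```ruby
# Patterns for the two error messages quoted in this thread.
PATTERNS = {
  dns_failure: /getaddrinfo: Name or service not known/,
  wait_timeout: /timed out after \d+ seconds/
}.freeze

# logs maps a job name to its console output; returns per-job counts of the
# log lines that match each pattern.
def count_failures(logs)
  counts = Hash.new { |hash, job| hash[job] = Hash.new(0) }
  logs.each do |job, text|
    text.each_line do |line|
      PATTERNS.each do |label, pattern|
        counts[job][label] += 1 if line.match?(pattern)
      end
    end
  end
  counts
end

# Made-up sample data with hypothetical job names.
sample_logs = {
  "flow-chrome" => "getaddrinfo: Name or service not known (SocketError)\n" * 3,
  "flow-firefox" => "getaddrinfo: Name or service not known (SocketError)\n" \
                    "timed out after 5 seconds, Element still visible after 5 seconds (Watir::Wait::TimeoutError)\n"
}

counts = count_failures(sample_logs)
counts.each { |job, per_pattern| puts "#{job}: #{per_pattern}" }
```

Adding the build timestamp to each log would be needed to look for patterns over time.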

mobile-l@lists.wikimedia.org

participants (10)

Arthur Richards
Dan Duvall
Greg Grossmeier
James Forrester
Jon Robson
Juliusz Gonera
Rob Lanphier
S Page
Tomasz Finc
Željko Filipin