On Wed, Jul 9, 2014 at 3:57 PM, Juliusz Gonera <jgonera@wikimedia.org> wrote:

Three Flow Chrome browser tests on beta labs, run at Sauce Labs, failed today with "getaddrinfo: Name or service not known (SocketError)" on Jul 16, 2014 at 6:26:46 PM (UTC? I think that is 11:26 AM SF time). See https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-chrome-sauce/65/

Fifteen minutes earlier a Firefox test also failed with the getaddrinfo error; see https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox-sauce/72/

So I filed Bug 68125 - browser tests failing with "getaddrinfo: Name or service not known (SocketError)"

[Sage manager] suggested:
can we change the wait to 60 seconds and call it good?  How was 5 seconds arrived at as the time for an automated test to fail?

The other Firefox test failure on that run happened because adding a topic took 6 seconds, which triggered
timed out after 5 seconds, Element still visible after 5 seconds (Watir::Wait::TimeoutError)
Flow tests often fail with these timeouts even though the expected result appears in the screencast or ends up on the test page. So yes, increasing the wait timeout to 10 seconds would cut down our false failures.
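
To make the 5-versus-10-second point concrete, here is a minimal sketch of a polling wait. It is not Watir's implementation; wait_until, WaitTimeout, and the clock/sleep parameters are hypothetical names of mine, used only to show why an action that needs 6 seconds fails under a 5-second limit and passes under a 10-second one.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition is still false at the deadline."""


def wait_until(condition, timeout=5.0, interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it is true or `timeout` seconds have elapsed.

    clock and sleep are injectable (an assumption of this sketch) so the
    behaviour can be checked without really waiting.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            raise WaitTimeout(f"timed out after {timeout:g} seconds")
        sleep(interval)
```

With a condition that only becomes true at the 6-second mark, timeout=5 raises WaitTimeout while timeout=10 returns True, which is the false-failure pattern described above.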

QA folk, is there a way to "grep" all browser test results for "getaddrinfo" and "timed out after 5 seconds" to see if there's a pattern to when and how often they occur?
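
In case it helps, a rough sketch of what I have in mind, assuming the console logs of the runs have already been saved locally as *.log text files (I don't know whether that is how they are actually stored). The names count_failures, scan_logs, and the "console-logs" directory are made up for illustration.

```python
import re
from pathlib import Path

# The two failure signatures quoted in this thread.
PATTERNS = {
    "getaddrinfo": re.compile(r"getaddrinfo: Name or service not known"),
    "timeout_5s": re.compile(r"timed out after 5 seconds"),
}


def count_failures(log_text):
    """Return how many times each signature occurs in one log's text."""
    return {name: len(rx.findall(log_text)) for name, rx in PATTERNS.items()}


def scan_logs(log_dir):
    """Map each *.log file in log_dir that has a match to its counts."""
    results = {}
    for path in sorted(Path(log_dir).glob("*.log")):
        text = path.read_text(encoding="utf-8", errors="replace")
        counts = count_failures(text)
        if any(counts.values()):
            results[path.name] = counts
    return results
```

Calling scan_logs("console-logs") would return one entry per log that contains either error, so sorting the file names by run would show when and how often each one occurs.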

Thanks indeed,
--
=S Page  Features engineer