[QA] Tests timing out when running via Jenkins/SauceLabs

Sat Nov 16 23:22:23 UTC 2013

On Fri, Nov 15, 2013 at 12:28 PM, Jeff Hall <jhall at wikimedia.org> wrote:

>  I spent some time today looking at automated test failures via
> CloudBees/Jenkins (https://wmf.ci.cloudbees.com), and a pretty common
> theme of tests that fail inconsistently is "Watir::Wait::TimeoutError"
> issues.  Here's an example of a recent failure that falls into this
> category:
>
>
> https://wmf.ci.cloudbees.com/job/browsertests-commons.wikimedia.beta.wmflabs.org-linux-chrome/463/testReport/(root)/UploadWizard/Navigate_to_Describe_page/
>

Yes, this is a legitimate timeout issue, unlike the UploadWizard test
failure on Nov 14, which was another of those maddening unexplained JS
errors from a ResourceLoader call:
https://wmf.ci.cloudbees.com/job/browsertests-commons.wikimedia.beta.wmflabs.org-linux-chrome/461/

> From previous experience working with SauceLabs, I know that this is not
> unusual, since by definition you're initiating a test workflow that creates
> a lot of network traffic, and latencies are probably inevitable.
>
> What I'm wondering is whether or not it might be a good idea to use the
> page-object "wait_until" method more widely.  For example, we currently use
> it in aftv5_steps.rb
> <https://github.com/wikimedia/qa-browsertests/blob/master/features/step_definitions/aftv5_steps.rb>.
>

You're right about the root cause, but in this case I would rather provide
a longer timeout value to the when_present() call than the default 5
seconds.

You can see the line in question, line 17 of
https://github.com/wikimedia/qa-browsertests/blob/master/features/step_definitions/upload_wizard_steps.rb
which should read

(UploadPage).continue_element.when_present.click

We can make that

 (UploadPage).continue_element.when_present(15).click

so that the step waits up to 15 seconds for the element, rather than the
default 5, before timing out.
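
To illustrate why a larger timeout argument helps on a slow connection,
here is a minimal plain-Ruby sketch. SlowElement, its TimeoutError and
the appears_after parameter are hypothetical stand-ins invented for this
example; this is not the page-object or Watir implementation, only the
general shape of "poll for presence, give up after N seconds".

```ruby
# Hypothetical stand-in for a page element; NOT the page-object or Watir API.
class SlowElement
  class TimeoutError < StandardError; end

  # appears_after: seconds until the element counts as "present"
  def initialize(appears_after)
    @ready_at = now + appears_after
  end

  def present?
    now >= @ready_at
  end

  # Poll until the element is present, or raise once `timeout` seconds pass.
  # The default of 5 mirrors the default mentioned in this thread.
  def when_present(timeout = 5)
    deadline = now + timeout
    until present?
      raise TimeoutError, "not present after #{timeout}s" if now >= deadline
      sleep 0.01
    end
    self
  end

  def click
    :clicked
  end

  private

  # Monotonic clock, so wall-clock adjustments cannot distort the wait.
  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end
end
```

With this sketch, an element that takes 8 seconds to appear would raise
under when_present (default 5) but succeed under when_present(15), and
the longer limit costs nothing extra when the element shows up quickly.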

> I realize that adding any type of sleep or wait behavior to a test just
> causes overall test execution time to increase, but I'm thinking it's more
> important to have fewer failing tests overall, so that folks can focus
> their trouble-shooting efforts on test failures that may be a consequence
> of actual bugs (and not just timeouts).
>
> I'd love to hear other opinions on this topic, so please speak up if you
> have an opinion ;)
>

I agree that explicit sleep() calls are bad design, but I like polling for
conditions:

when_present() polls until an element can be engaged
wait_until() polls until a condition returns true
wait_while() polls until a condition returns false

That AFTv5 test is a good example:

page.wait_until(10) do
  page.text.include? 'Thanks!'
end
page.text.should include 'Your post can be viewed on this feedback page.'

says: "We'll hang out for up to 10 seconds until the AFT post is
processed. We know that the processing is finished when the page
contains the text 'Thanks!'. At that point we should have a message
showing a link to the feedback page."
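
For anyone unfamiliar with the difference between polling and sleeping,
the idea can be sketched in plain Ruby as below. The names poll_until,
poll_while, WaitTimeout and monotonic_now are hypothetical helpers made
up for this illustration; they are not the page-object or Watir source,
and the real methods may differ in details such as the polling interval.

```ruby
# Hypothetical helpers illustrating condition polling; NOT the page-object source.
class WaitTimeout < StandardError; end

def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Re-check the block until it returns a truthy value, or raise WaitTimeout
# once `timeout` seconds have passed. Returns the block's truthy result.
def poll_until(timeout = 5, interval = 0.05)
  deadline = monotonic_now + timeout
  loop do
    result = yield
    return result if result
    raise WaitTimeout, "condition not met within #{timeout}s" if monotonic_now >= deadline
    sleep interval
  end
end

# Re-check the block until it returns a falsy value, or raise WaitTimeout.
def poll_while(timeout = 5, interval = 0.05)
  poll_until(timeout, interval) { !yield }
  nil
end
```

Unlike a fixed sleep(10), a call such as poll_until(10) { ... } returns
as soon as the condition holds, so the 10 seconds are only an upper
bound that is paid in full when the test is about to fail anyway.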

-Chris