As QA Lead for WMF, one of the things I want to do is to create an institutional suite of automated cross-browser regression tests using Selenium. I have two goals for this suite: first, to make it attractive and convenient for the greater software testing community to both use the suite locally and also to contribute to it. Second, to have the suite be a useful regression test tool within WMF, tied to Beta Labs and controlled by Jenkins.
For various reasons, I think the best language for this project is Ruby. I realize that is a controversial choice, and I would like to explain my reasoning. First let me address what I think will be the most serious objections:
** Ruby gems are incompatible with Debian/Ubuntu apt packaging, making it difficult or impossible to maintain Ruby code in production.
The selenium test suite is not intended to run on production servers. There are two targets for this code. The first target is users' local machines, including and especially Windows users. The second target is a single dedicated headless Labs Ubuntu instance, controlled by Jenkins, serving as a client to selenium-server, where selenium-server is running on the various Windows etc. VMs that exist today on the WMF VPN.
** It's not PHP.
As of today, PHP has no complete or authoritative implementation of selenium-webdriver/Selenium 2.0. That situation is unlikely to change any time soon. This leaves a choice of Ruby or Python. For various reasons I think Ruby is the superior choice.
** Design goals and their implementation.
In the interest of making this project as attractive as possible to the greater testing community, I have defined a browser automation "stack", using the most current and accepted practices for browser test automation. The stack looks like this:
* Selenium-webdriver low-level "toolbox" API.
* Higher-level API for consistent access to pages and elements without the user having to create their own handling for timeouts, exceptions, "stale objects", multiple access criteria, etc.
* Modern, BDD-style assertions for pass/fail criteria (as opposed to xUnit style assertions)
* "Page Object" design pattern in place and functioning out of the box
* Support for mobile emulators
* Institutional support for Jenkins integration
I submit that having these things in place and working when the user downloads the suite is what will make this project attractive to the global testing community.
Taking these point by point:
Even as I write this, the W3C is meeting in London to approve webdriver as an internet standard, with Selenium 2.0 as the reference implementation of that standard. This means that the Selenium 2.0 API is only a fraction of the size of that of Selenium 1.0, and serves a very different purpose. The Selenium 2.0 API can be considered a "toolbox" from which higher-level APIs are constructed. Both Ruby and Python have full implementation and support for the Selenium 2.0 API.
At higher levels of the stack, the Python and Ruby communities take different design approaches, with Python being more DIY/NIH and Ruby being more shared and organized around communities of practice. This is what makes Ruby so attractive as I implemented the rest of the stack.
Watir (Web Application Testing In Ruby) is a browser test project that pre-dates Selenium itself. Watir provides an intuitive, thoughtful, well-designed high-level API for access to pages and elements that the native selenium-webdriver API does not. While historically Watir and Selenium have been different projects, as of about 2010 watir-webdriver is simply a wrapper for the low-level selenium API that preserves all of the convenience and good design Watir has always had. Watir is in use today at Facebook, The Gap, and many other places. Python has no such equivalent high-level API.
Behavior Driven Development (BDD) style assertions are implemented in Ruby with the RSpec library. This approach to assertions has been a generally accepted standard in the Ruby community for some time. "The RSpec Book" was published by Pragmatic Bookshelf in 2010. Python has no equivalent BDD-style assertion library. In fact, RSpec takes advantage of Ruby's affordances for metaprogramming in ways that are much harder to reproduce in Python.
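To make the contrast with xUnit-style assertions concrete, here is a toy sketch of how open classes and method_missing allow assertions that read like sentences. This is not RSpec's implementation or syntax; the names ToyShould, ToyExpectationFailed and toy_should are invented for illustration only.

```ruby
# Toy illustration only: hypothetical names, not part of RSpec.
class ToyExpectationFailed < StandardError; end

class ToyShould
  def initialize(actual)
    @actual = actual
  end

  # Lets a test read "actual.toy_should == expected".
  def ==(expected)
    return true if @actual == expected
    raise ToyExpectationFailed, "expected #{expected.inspect}, got #{@actual.inspect}"
  end

  # Turns any "be_xxx" call into a call to "xxx?" on the wrapped object,
  # so "list.toy_should.be_empty" checks "list.empty?".
  def method_missing(name, *args)
    text = name.to_s
    return super unless text.start_with?('be_')
    predicate = text.sub(/\Abe_/, '') + '?'
    return true if @actual.public_send(predicate, *args)
    raise ToyExpectationFailed, "expected #{@actual.inspect} to satisfy #{predicate}"
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?('be_') || super
  end
end

# Reopen Object so every value gains the toy_should entry point.
class Object
  def toy_should
    ToyShould.new(self)
  end
end

(1 + 1).toy_should == 2   # passes silently
[].toy_should.be_empty    # calls [].empty? under the hood

begin
  'MediaWiki'.toy_should == 'Mediawiki'
rescue ToyExpectationFailed => e
  puts e.message          # the failure message names both values
end
```

The xUnit equivalent would be something like assert_equal(2, 1 + 1). The point of the sketch is only that the BDD form puts the value under test first and reads left to right.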
"Page Objects" is a design pattern for browser testing that has become a generally accepted practice in the browser testing community, with a lot of momentum particularly in the last one or two years. Page Objects have institutional support in Ruby with the 'page-object' gem, which supports both watir-webdriver and selenium-webdriver syntax. Python has no such institutional support for page objects; each Python project using page objects implements the pattern locally, from scratch.
Some support for mobile emulators is provided by Ruby's webdriver-user-agent gem, created by Alister Scott of Thoughtworks. This gem piggybacks on either the watir-webdriver or selenium-webdriver APIs in Ruby. I am unaware of any such mobile emulator support in Python.
Ruby selenium tests are run via 'rake', the Ruby version of make. Jenkins has an officially supported plugin for rake.
Both Ruby and Python are viable choices for this project. The difference is that Python would require a lot of custom infrastructure and scaffolding, with all of the risk and maintenance that that entails, whereas support for the elements of the "stack" in Ruby is already in place, well-designed, and supported by the greater Ruby community. Not only does this make maintenance of the stack for WMF purposes much simpler, it also makes the project significantly more attractive to casual users in the testing community, who can get started with the suite immediately by simply installing a few gems locally and reading some public documentation, rather than having to face a daunting pile of custom code.
For reference and further research, I created a prototype of this project that implements this stack here: https://github.com/chrismcmahon/Page-Object-WMF-spike
Addendum: I am aware of two projects at WMF already using Selenium, one with OmniTI for AFTv5, the other from Jeremy Postlethwaite. Since the purpose of this project is to attract members of the global testing community, and to provide an institutional regression test suite for Mediawiki/Wikipedia, I see no reason to require individual WMF projects to use this framework, although I suspect that individual WMF projects might find this framework more convenient than other approaches.
On Fri, Apr 6, 2012 at 7:45 AM, Chris McMahon cmcmahon@wikimedia.org wrote:
For various reasons, I think the best language for this project is Ruby. I realize that is a controversial choice, and I would like to explain my reasoning. First let me address what I think will be the most serious objections:
** Ruby gems are incompatible with Debian/Ubuntu apt packaging, making it difficult or impossible to maintain Ruby code in production.
You will want to talk to OPs; AFAIK they would rather not see any more Ruby on the cluster, since very little of the dev community knows it and/or how to maintain it.
On Thu, 05 Apr 2012 14:53:27 -0700, K. Peachey p858snake@gmail.com wrote:
You will want to talk to OPs; AFAIK they would rather not see any more Ruby on the cluster, since very little of the dev community knows it and/or how to maintain it.
We really need to start surveying real statistics on what programming languages community members know. I've seen this assertion waved around again and again, but don't see where it originates from, besides the very unreliable fact that we just don't talk much about ruby around here.
I for one know ruby. And frankly ruby isn't like our other issue. Ruby is nothing like how hard OCaml would be to learn and maintain.
We really should beat down this notion that anything written in ruby should be avoided. I don't like how we reject the possibility of using well written existing open-source projects simply because they were written using ruby. It's almost as bad as NIH.
How many languages can we reasonably support? We're currently using PHP, Python, Java, OCaml and Javascript (and probably more). Should we also throw Ruby in here as well? What level of support are the Selenium tests really going to get if they require developers to use Ruby?
We've already gone down the Ruby road once. I think a lot of the people involved with that would say it was a bad call, especially ops.
- Ryan
On Thu, Apr 5, 2012 at 5:25 PM, Ryan Lane rlane32@gmail.com wrote:
How many languages can we reasonably support? We're currently using PHP, Python, Java, OCaml and Javascript (and probably more). Should we also throw Ruby in here as well? What level of support are the Selenium tests really going to get if they require developers to use Ruby?
I was initially pretty skeptical on the Ruby choice, and I'm not going to say that I'm sold yet. However, the part of Chris's proposal that's most persuasive to me is the fact that Ruby/Selenium seems to be the most built out, with the largest community. In particular, RSpec seems to be a rather important piece of all of this.
I decided to see if there was a Python equivalent of RSpec, and as often is the case these days, the exact question was asked on Stack Overflow: http://stackoverflow.com/questions/7079855/are-there-technical-reasons-a-rub...
In short, Ruby lends itself to that kind of thing a lot more.
In reading the Stack Overflow thing, I was reminded of a talk I saw a couple of years ago titled "Python vs. Ruby: A Battle to The Death" http://blog.extracheese.org/2010/02/python-vs-ruby-a-battle-to-the-death.htm...
...which, as it turns out, specifically talks about RSpec.
Anyway, the point that I'm making here is that it's quite possible that Ruby really is the right tool for the job. Since I'm much more comfortable with Python than Ruby, I'd be much more comfortable if we stuck with Python. Sticking with Python may mean a lot of wheel reinvention to accommodate language orthodoxy which really kinda sucks.
I'd like us to test the assertion that the Ruby/Selenium combination is much more mature than the Python/Selenium combination, but if that's true, then I think we may want to suppress the anti-Ruby bias and figure out how we can work with Ruby/Selenium.
Rob
On Thu, Apr 5, 2012 at 5:25 PM, Ryan Lane rlane32@gmail.com wrote:
How many languages can we reasonably support? We're currently using PHP, Python, Java, OCaml and Javascript (and probably more). Should we also throw Ruby in here as well? What level of support are the Selenium tests really going to get if they require developers to use Ruby?
It might be good to see examples of what MW developers would actually have to do to implement new Selenium tests once the framework is complete. There's a login example in the github prototype that's straightforward, but I assume it will get simpler as more is written which can be reused. I doubt it will require much in terms of actual ruby finesse.
We've already gone down the Ruby road once. I think a lot of the people involved with that would say it was a bad call, especially ops.
Ruby at scale can certainly be a lulz engine, especially for those on the sidelines. This project doesn't seem to place any software demands on the production cluster, or even necessarily require anything from ops though.
I assume the road you refer to was the mobile gateway; I consider that to have been a train wreck primarily from a project standpoint as opposed to a technical one. When I stumbled upon it, there wasn't an employee with the combination of access and knowledge required to commit code changes to its read-only-to-us repo, and to deploy those changes. We were essentially passing bits of duct tape back and forth by transatlantic carrier pigeon. For a slew of reasons, it makes much more sense to do what we're doing now with MobileFrontend, but we've yet to reach the point where it does anything the ruby gateway couldn't have done with a bit of iteration. In its last incarnation, it was typically faster than the current MobileFrontend for a request not served by the frontend caching layer. The point being, I don't think language was the main issue there.
Chris makes a compelling argument that his preferred route is closer to being off the shelf and widely supported by industry and community. I have no comment on what QA engineers prefer to hack on, but I think the ease of hiring new ones who are good at what they do and excited about the tools they get to use should be part of the decision.
-A
Le 06/04/12 08:43, Asher Feldman a écrit :
passing bits of duct tape back and forth by transatlantic carrier pigeon.
At least that was using a standard:
RFC 1149 http://www.ietf.org/rfc/rfc1149.txt "A Standard for the Transmission of IP Datagrams on Avian Carriers"
PHPUnit has been working on Selenium2 with the backwards compatibility API:
https://github.com/sebastianbergmann/phpunit-selenium/blob/master/PHPUnit/Ex...
I have not had a chance to test it out, it is still experimental.
It would seem appropriate for someone to add it to the UnitTest extension:
https://www.mediawiki.org/wiki/Extension:UnitTest
I have spoken to Chris and Mark about the extension.
Correct me if I am wrong, Chris: I think we came to the conclusion that UnitTest may be better suited for local developer usage instead of automated testing.
I would also like to see examples of Watir.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Correct me if I am wrong, Chris: I think we came to the conclusion that UnitTest may be better suited for local developer usage instead of automated testing.
Jeremy, yes, I'm aware of what you're doing and I think it makes sense in context. In my original post I said " Since the purpose of this project is to attract members of the global testing community, and to provide an institutional regression test suite for Mediawiki/Wikipedia, I see no reason to require individual WMF projects to use this framework..."
I would also like to see examples of Watir.
Watir is pretty slick: http://rubydoc.info/gems/watir-webdriver/0.5.4/frames . I'm not sure if you're using selenium-webdriver or the old Selenium 1.0, but Watir solves a lot of issues for selenium-webdriver. In fact, if you're using selenium-webdriver in any language, you probably end up hacking up something that looks like Watir anyway. Watir has the advantage of nearly a decade of design and refinement behind it, plus the API is generated directly from the HTML5 spec today. (Just an aside: Selenium 1.0 is not yet fully deprecated and there is still a ton of Se1.0 out in the world, but the Se1.0 API will eventually be fully retired.)
-- Jeremy Postlethwaite jpostlethwaite@wikimedia.org 415-839-6885 x6790 Backend Software Developer Wikimedia Foundation http://wikimediafoundation.org/
We've already gone down the Ruby road once. I think a lot of the people involved with that would say it was a bad call, especially ops.
Ruby at scale can certainly be a lulz engine, especially for those on the sidelines. This project doesn't seem to place any software demands on the production cluster, or even necessarily require anything from ops though.
I had mentioned in IRC that I don't have any major ops objections as long as whatever gems are required are installable via apt in some way, rather than using gems.
I assume the road you refer to was the mobile gateway; I consider that to have been a train wreck primarily from a project standpoint as opposed to a technical one. When I stumbled upon it, there wasn't an employee with the combination of access and knowledge required to commit code changes to its read-only-to-us repo, and to deploy those changes. We were essentially passing bits of duct tape back and forth by transatlantic carrier pigeon. For a slew of reasons, it makes much more sense to do what we're doing now with MobileFrontend, but we've yet to reach the point where it does anything the ruby gateway couldn't have done with a bit of iteration. In its last incarnation, it was typically faster than the current MobileFrontend for a request not served by the frontend caching layer. The point being, I don't think language was the main issue there.
Ubuntu support for ruby was problematic. The need for a bunch of gems was also problematic.
This particular use of ruby is less problematic since we won't need to run a web service using ruby. My main objection is really that we're fracturing our codebase into a bunch of languages. It's hard enough getting people to write tests, and making people write their code in PHP, some of their tests in PHP, and some of the tests in Ruby is likely prone to failure. I don't think python is a reasonable choice here either, honestly.
Chris makes a compelling argument that his preferred route is closer to being off the shelf and widely supported by industry and community. I have no comment on what QA engineers prefer to hack on, but I think the ease of hiring new ones who are good at what they do and excited about the tools they get to use should be part of the decision.
I'd be more interested in what's more likely to grow and keep our community, and give us proper tests, than who's easier to hire. We're based on the concept of scaling people by building communities.
- Ryan
On Fri, 06 Apr 2012 12:33:40 -0700, Ryan Lane rlane32@gmail.com wrote:
I had mentioned in IRC that I don't have any major ops objections as long as whatever gems are required are installable via apt in some way, rather than using gems.
Ryan, I only ran into it recently. But look over bundler: http://gembundler.com/rationale.html If another situation where something needs gems without existing apt packages comes up it may be a helpful thing to have in your toolkit of solutions.
Two useful things that bundler seems to provide:
- In deployment it can install gems in a place local to the application. So instead of using `gem` or apt to globally install the gems needed, they'll be installed locally in a way that won't conflict with other applications.
- Bundler uses a Gemfile.lock setup. When you initially install and update gems in development, it tracks the installed versions of every single dependency (even indirect ones you didn't depend on directly), and you check this file into version control with the rest of your source code. With this setup, bundler ensures that every gem you install using bundler (especially under deployment) is installed at the exact same version that you used during development.
So if you're using apt and a central apt sources server to ensure that all servers have the same package versions installed, bundler should help you attain that same goal with gems that don't have apt packages.
If that goal is constantly changing versions of packages that may or may not have proper security patches applied due to dependency chains, then yes, it meets the goal.
I hate programming language package installers like pip, gems, etc. When Ubuntu ships versions of things, they keep stable versions and backport security fixes. This ensures that you'll have a consistent environment until you upgrade the OS, and that security patches are applied properly for everything in this environment.
If your application depends on gem blah-0.1, and specifies that, then you won't get security patches since gems expects you'll just upgrade to blah-0.9. It fills me with rage.
- Ryan
On Fri, Apr 6, 2012 at 10:17 AM, Daniel Friesen lists@nadir-seen-fire.com wrote:
I've seen this assertion waved around again and again, but don't see where it originates from, besides the very unreliable fact that we just don't talk much about ruby around here.
When I said the community doesn't know it, I was referring more to the OPs community (compared to the wider developer community) and others who are more likely to be around {if|when} something breaks and the perceivable monkey is flinging things. Since it is a testing setup, that wouldn't matter as much as it would for other utilities. It would also matter if circumstances change in the future and it needs maintaining (e.g. the person responsible leaves, is assigned to other projects and doesn't have time, etc.).
You should talk to Jeremy Postlethwaite jpostlethwaite@wikimedia.org. He did some work on setting up automated Selenium tests for our fundraiser.
Ryan Kaldari
I wasn't able to make the selenium work (presumably selenium 1.0). How hard would it be to make this work? Who would be responsible for sorting out any problems encountered doing that and ensuring all developers can locally run those tests? (Actually, I'd try this before creating too many tests on that platform.)
Say I've fixed a bug and want to add a test. How would I do it? In which language? Do I need to know ruby for that?
Suppose there's a test:
* Failing in jenkins but passing locally. OR
* Failing just for a single person. OR
* Failing but it's apparently ok. OR
* Passing but should be failing.
How do you debug it? Do you need to know ruby for that? To which extent?
On Sun, Apr 8, 2012 at 10:52 AM, Platonides Platonides@gmail.com wrote:
I wasn't able to make the selenium work (presumably selenium 1.0). How hard would it be to make this work? Who would be responsible for sorting out any problems encountered doing that and ensuring all developers can locally run those tests? (Actually, I'd try this before creating too many tests on that platform.)
Because Selenium manipulates browsers at the UI level, whether a test passes or fails depends on the environment the test runs against. Ultimately I would like to have the labs beta wikis serve as the test environment of record, but we're not there yet. Until then, my approach would be to grow the number of these tests slowly and to keep them shallow (but still useful) as the test environment becomes more dependable.
As with Selenium in any language, running tests locally requires enough infrastructure in place to do that. I have not yet written any detailed instructions for doing so, but the prototype I mentioned earlier is entirely functional, and a few people in the Ruby testing community have already tried it out and offered some critique.
Say I've fixed a bug and want to add a test. How would I do it? In which language? Do I need to know Ruby for that?
Assuming that the change is manifested in the UI, one would add a test according to a Page Object model: define the structure of the page in question if it is not already defined, then create a test to manipulate that structure.
In Ruby there is a standard way to define a Page Object with the 'page-object' gem, and a standard way to create a test with Rspec. The documentation for these is widely available.
This is harder in other languages, because there is no standard Page Object model or standard reporting structure in, say, Python. This is an issue, for example, for Mozilla, where anyone who wants to contribute a test first has to understand a custom implementation of the Page Object model and reporting structure in Python.
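To make the pattern concrete, here is a minimal hand-rolled sketch in plain Ruby. It is not the 'page-object' gem's API and not code from the prototype: `FakeBrowser` is a stand-in so the example runs without Selenium, and the URL and element ids (`wpName1`, `wpPassword1`, `wpLoginAttempt`) are assumptions made for illustration.

```ruby
# Stand-in for a real browser driver so the sketch runs without Selenium.
# It only records what the page object asks it to do.
class FakeBrowser
  attr_reader :actions

  def initialize
    @actions = []
  end

  def goto(url)
    @actions << [:goto, url]
  end

  def set_text(id, value)
    @actions << [:set_text, id, value]
  end

  def click(id)
    @actions << [:click, id]
  end
end

# Page object: the only place that knows the login page's URL and element ids.
class LoginPage
  URL = 'http://wiki.example/index.php?title=Special:UserLogin' # hypothetical

  def initialize(browser)
    @browser = browser
  end

  def visit
    @browser.goto(URL)
    self
  end

  def login_with(username, password)
    @browser.set_text('wpName1', username)     # assumed element id
    @browser.set_text('wpPassword1', password) # assumed element id
    @browser.click('wpLoginAttempt')           # assumed element id
    self
  end
end

# A test talks to the page object, never to element ids directly.
browser = FakeBrowser.new
LoginPage.new(browser).visit.login_with('TestUser', 'secret')
puts browser.actions.length
```

If the login form changes, only `LoginPage` needs updating; tests that call `login_with` stay as they are.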
Unlike unit tests, browser/UI tests are by their nature expensive to run and expensive to maintain. The Selenium test suite should have a tight focus on exercising high-risk/high-value paths through the application. I don't foresee having Selenium tests for every aspect of every feature ever developed, but rather having a manageable set of browser tests with a clear focus on value.
Suppose there's a test:
- Failing in jenkins but passing locally.
Since Jenkins would be the test environment of record, if a test fails there, it would indicate either an issue with the code in the test environment, or else a need to fix the test itself.
OR
- Failing just for a single person.
Could indicate any number of things depending on circumstance.
OR
- Failing but it's apparently ok.
Browser tests need maintenance. A test that fails but does not indicate a problem of some sort will be removed or subject to maintenance.
OR
- Passing but should be failing.
This would indicate a useless test. Again, such a test would be either removed or altered to be useful.
How do you debug it? Do you need to know Ruby for that? To what extent?
My idea right now is that maintaining the Selenium test suite as run by Jenkins would be primarily a QA activity, with contributions from any other interested parties in the greater community or among the Mediawiki/Wikipedia dev/ops community. Contributions from developers would be welcome but not required.
Some time ago some people from the test framework team started working on a Selenium Framework for MediaWiki [1], in PHP and with Selenium 1.0. One of the reasons the project was discontinued was that there was no clear case of when Selenium would be useful as opposed to unit tests, esp. using QUnit and TestSwarm for UI testing. I still see some use cases, though:
* Smoke-testing the whole application on a very high level
* Testing complex interaction patterns with several page reloads (e.g. maybe producing edit conflicts)
* Testing cross-browser compatibility (e.g. using the screenshot feature)
* It's easy to record tests using the Selenium IDE. So basically anyone could write tests, esp. for some less used extensions
* This also might be useful when filing bugs (make them reproducible)
While I see the case for a new approach based on Ruby and Watir, I still think some of the experience we gained back then could be useful in the design of the new testing environment. Here are some points of what we have so far:
* a (rudimentary) set of methods that generalize login, page calls, page preparation with wikitext etc.
* a configuration system to run the tests against different wikis, and also a test suite layout that helps to select a subset of tests.
* a command-line test runner
* a way of reconfiguring the wiki under test so that we can standardize some settings, such as language (which is an issue in some cases when testing UI). This also provides a means to switch the wiki database to a test db and test images folder.
* some ideas about the design of tests, suites and individual assertions
I would be happy to contribute and take some of this to the next level within the new environment :)
Cheers, Markus (mglaser)
[1] https://www.mediawiki.org/wiki/Selenium
-----Original Message----- From: wikitech-l-bounces@lists.wikimedia.org [mailto:wikitech-l-bounces@lists.wikimedia.org] On Behalf Of Chris McMahon Sent: Thursday, April 5, 2012 23:46 To: wikitech-l@lists.wikimedia.org Subject: [Wikitech-l] selenium browser testing proposal and prototype
[...]
Thanks for the point-by-point notices!
On Tue, Apr 10, 2012 at 4:17 PM, Markus Glaser glaser@hallowelt.biz wrote:
- Testing complex interaction patterns with several page reloads (e.g. maybe producing edit conflicts)
Yes, and also in environments where interactions among many extensions are not well understood.
- Testing cross-browser compatibility (e.g. using the screenshot feature)
IE should really get more attention than it does. I've found IE-only problems in a number of places.
- It's easy to record tests using the Selenium IDE. So basically anyone could write tests, esp. for some less used extensions
Actually, creating tests with the IDE is a really bad idea. Such tests are almost always brittle and impossible to maintain. The IDE is occasionally useful for hints about how to identify tricky page elements, but really should not be used to generate whole tests.
- This also might be useful when filing bugs (make them reproducible)
Definitely.
While I see the case for new approach based on Ruby and Watir, I still think, some of the experiences we made back then could be useful in the design of the new testing environment. Here are some points of what we have so far:
- a (rudimentary) set of methods that generalize login, page calls, page preparation with wikitext etc.
Definitely. Much of this sort of thing is implicit and automatic when using the Page Object design pattern. While it is possible to implement Page Objects in any language (Mozilla has done a good job of it in Python), Ruby is the only webdriver-supporting language with a common, documented, well-supported implementation of the pattern.
- a configuration system to run the tests against different wikis, and also a test suite layout that helps to select a subset of tests.
This already exists using Rspec and rake.
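As a rough illustration of running the same tests against different wikis, here is a plain-Ruby sketch of one common approach: reading the target from an environment variable. This is an assumption about how it could be done, not a description of the prototype. The variable name `WIKI_URL`, the default URL, and both helper methods are hypothetical.

```ruby
# Hypothetical default; any reachable wiki would do.
DEFAULT_WIKI_URL = 'http://wiki.example/wiki/'

# Return the base URL of the wiki under test. The environment is passed in
# as a parameter so the lookup can be exercised without touching ENV.
def target_wiki_url(env = ENV)
  url = env.fetch('WIKI_URL', DEFAULT_WIKI_URL)
  url.end_with?('/') ? url : "#{url}/"
end

# Build the full URL for a page title on the selected wiki.
def page_url(title, env = ENV)
  target_wiki_url(env) + title.tr(' ', '_')
end

puts page_url('Main Page')
```

With something like this in place, the same test files could be pointed at a local wiki or a Beta Labs wiki by setting the variable before the run, without editing the tests.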
- a command-line test runner
Rake controls running Rspec tests, and Jenkins has institutional support for rake integration.
- a way of reconfiguring the wiki under test so that we can standardize some settings, such as language (which is an issue in some cases when testing UI). This also provides means to switch the wiki database to a test db and test images folder.
I'm not sure about this. Browser tests should be about controlling browsers, not databases, not files on the filesystem. I want users to be able to download the test suite and have it Just Work, whether they are on Linux, OSX, or Windows. Beta Labs wikis should help here.
- some ideas about the design of tests, suites and individual assertions
I would be happy to contribute and take some of this to the next level within the new environment :)
Thanks Markus, there is still much to be done, but I'm hoping to have something useful very soon.
-Chris
Cheers, Markus (mglaser)
Markus Glaser glaser@hallowelt.biz wrote:
Some time ago some people from the test framework team started working on a Selenium Framework for MediaWiki [1], in PHP and with Selenium 1.0. One of the reasons the project discontinued was that there was no clear case of when Selenium would be useful as opposed to unit tests, esp. using QUnit and TestSwarm for UI testing. I still see some use cases, though:
- This also might be useful when filing bugs (make them reproducible)
One example I would like to have scripted is the following:
- have a sysop account watch a non-existing page name
- create that page with some content
- have a sysop delete this page
A very good test case for DB transaction related problems.
I assume doing such tests should be possible in the new framework?
//Saper
Yes. This is a good example of where browser testing shows an advantage over unit or api testing, because it exercises a whole (interesting) path through the application.
I'll see about implementing exactly this situation and show it to you. (I have to get the proper test accounts in place safely).
-Chris
Chris McMahon <cmcmahon <at> wikimedia.org> writes:
As QA Lead for WMF, one of the things I want to do is to create an institutional suite of automated cross-browser regression tests using Selenium. I have two goals for this suite: first, to make it attractive and convenient for the greater software testing community to both use the suite locally and also to contribute to it. Second, to have the suite be a useful regression test tool within WMF, tied to Beta Labs and controlled by Jenkins.
For various reasons, I think the best language for this project is Ruby. I realize that is a controversial choice, and I would like to explain my reasoning. First let me address what I think will be the most serious objections:
[...]
** It's not PHP.
As of today, PHP has no complete or authoritative implementation of selenium-webdriver/Selenium 2.0. That situation is unlikely to change any time soon. This leaves a choice of Ruby or Python. For various reasons I think Ruby is the superior choice.
Not sure what counts as authoritative, but there are a number of fairly usable PHP implementations such as php-webdriver [1] from Facebook or phpunit-selenium [2] from the PHPUnit framework, both of which are non-complete but very easy to extend (and in practice, you don't use most Selenium commands anyway). Using one of them is more troublesome than choosing a language in which there is a reference Selenium implementation, but on the other hand, you don't need to introduce another language, you can write the tests in a language all MediaWiki developers are comfortable with, and you leave open the option of reusing MediaWiki components in tests to handle setup/teardown of fixtures in a clean way.
Also, my (admittedly very superficial) experience with BDD is that Cucumber/Gherkin is much better for acceptance testing than RSpec (which is more suited for unit testing). Gherkin tests are clean, human-readable descriptions which are easier to read than program code, and can be easily understood by non-developers (end users, QA people, managers) even if they have no idea what a programming language is. On the other hand, Gherkin is not a real programming language, so you lose some flexibility (such as the ability to use page objects), but IMO it is well worth it. And while RSpec relies on Ruby's elegant but obscure poetry mode, and thus cannot be easily copied in other languages, Gherkin has a simple custom syntax which is trivial to implement in any language; specifically, there is a good PHP implementation called Behat [3] which has its own Selenium implementation (Mink [4]) but also can be used with any other Selenium library.
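To give a feel for why Gherkin-style steps are simple to implement, here is a toy sketch in plain Ruby. It is neither Cucumber nor Behat, and it handles only step lines (no `Feature:` or `Scenario:` headers). `StepRunner` and the three example steps are made up for illustration.

```ruby
# Minimal Gherkin-style runner: step definitions are regular expressions
# mapped to blocks; each scenario line is matched against them in order.
class StepRunner
  KEYWORDS = /\A(?:Given|When|Then|And|But)\s+/

  def initialize
    @steps = []
  end

  # Register a step definition.
  def step(pattern, &block)
    @steps << [pattern, block]
  end

  # Run every non-blank line of the scenario text; raise on unknown steps.
  def run(scenario)
    scenario.each_line do |line|
      text = line.strip.sub(KEYWORDS, '')
      next if text.empty?

      pattern, block = @steps.find { |pat, _| pat.match?(text) }
      raise ArgumentError, "undefined step: #{text}" unless pattern

      # Captured groups become the block's arguments.
      block.call(*pattern.match(text).captures)
    end
  end
end

log = []
runner = StepRunner.new
runner.step(/\AI am logged in as "(.+)"\z/) { |user| log << "login #{user}" }
runner.step(/\AI edit the page "(.+)"\z/) { |title| log << "edit #{title}" }
runner.step(/\Athe page "(.+)" should exist\z/) { |title| log << "check #{title}" }

runner.run(<<~SCENARIO)
  Given I am logged in as "TestUser"
  When I edit the page "Sandbox"
  Then the page "Sandbox" should exist
SCENARIO

puts log
```

The scenario text stays readable for non-developers, while each block is free to call whatever browser library sits underneath.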
Mink has the additional advantage that it abstracts away the Selenium interface so that Selenium can be replaced with some other browser simulator without changing the tests; while that makes doing Selenium-specific things more complicated, it can yield huge speedups for tests which don't require JavaScript, where Selenium can be replaced with some simple browser emulator. (Yes, Selenium2 has its own browser emulator, but it is still a fair bit slower than something like Goutte [5]).
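The driver-swapping idea can be sketched in a few lines of plain Ruby. This is not Mink's API: `Session`, `FakeBrowserDriver` and `FakeHeadlessDriver` are hypothetical stand-ins, and a real setup would wrap Selenium or an HTTP-only emulator behind the same two methods.

```ruby
# Both drivers expose the same interface, so tests never name one directly.
class FakeBrowserDriver
  def name
    'browser'
  end

  # Pretend to load the page in a full browser.
  def fetch(path)
    "<h1>#{path}</h1><script>/* would run JavaScript */</script>"
  end
end

class FakeHeadlessDriver
  def name
    'headless'
  end

  # Pretend to fetch the page over plain HTTP, with no JavaScript.
  def fetch(path)
    "<h1>#{path}</h1>"
  end
end

# Session is the only thing tests talk to; the driver is injected.
class Session
  def initialize(driver)
    @driver = driver
  end

  def visit(path)
    @body = @driver.fetch(path)
    self
  end

  def has_content?(text)
    @body.to_s.include?(text)
  end
end

# The same check runs unchanged against either driver.
[FakeBrowserDriver.new, FakeHeadlessDriver.new].each do |driver|
  session = Session.new(driver).visit('Main_Page')
  puts "#{driver.name}: #{session.has_content?('Main_Page')}"
end
```

The trade-off described above shows up here too: anything only one driver can do (JavaScript, in this sketch) is not reachable through the shared interface.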
So maybe a PHP - Mink (or other Selenium library) - Behat stack instead of a Ruby - Watir - RSpec stack would be worth considering.
[1] https://github.com/facebook/php-webdriver
[2] https://github.com/sebastianbergmann/phpunit-selenium
[3] http://behat.org/
[4] http://mink.behat.org/
[5] https://github.com/fabpot/Goutte
On Mon, Apr 16, 2012 at 5:14 AM, Gergő Tisza gtisza@gmail.com wrote:
Not sure what counts as authoritative, but there are a number of fairly usable PHP implementations such as php-webdriver [1] from Facebook or phpunit-selenium [2] from the PHPUnit framework, both of which are non-complete but very easy to extend (and in practice, you don't use most Selenium commands anyway).
I might disagree with both those assertions.
Also, my (admittedly very superficial) experience with BDD is that Cucumber/Gherkin is much better for acceptance testing than RSpec (which is more suited for unit testing).
Cucumber adds a layer of abstraction I think is unnecessary-- these tests are to be read by developers, many of whom will not be expert. Rspec is a nice alternative to xUnit-style assertions, and the standard among Ruby developers.
That said, if in the future some context were to come along where Cucumber makes sense, this framework allows adding that level of ATDD easily.
Mink has the additional advantage that it abstracts away the Selenium interface so that Selenium can be replaced with some other browser simulator without changing the tests; while that makes doing Selenium-specific things more complicated, it can yield huge speedups for tests which don't require JavaScript, where Selenium can be replaced with some simple browser emulator.
The point of the exercise is to test browsers, not browser emulators.
On 16/04/12 16:45, Chris McMahon wrote:
Cucumber adds a layer of abstraction I think is unnecessary-- these tests are to be read by developers, many of whom will not be expert. Rspec is a nice alternative to xUnit-style assertions, and the standard among Ruby developers.
Which seem to be the only ones that will be able to write our tests.
You can talk about how "there's a standard way" or "it's documented", but that doesn't help if the tests aren't easy to develop. For example, X.509 is as standard as you could want and really flexible, but you wouldn't want to touch it manually :) Moreover, making tests is generally not fun to do, so you want the barrier to be as low as possible.
My idea right now is that maintaining the Selenium test suite as run by Jenkins would be primarily a QA activity, with contributions from any other interested parties in the greater community or among the Mediawiki/Wikipedia dev/ops community. Contributions from developers would be welcome but not required.
Not that I have anything against making tests by opening bugs, and letting the QA team (who forms that?) struggle to make them work. It's simple to just be lazy and let you test it manually, with a monkey farm, gems or a street rat. But I don't think that's the way we should go. The developers should be able to maintain them, even if they're not daily involved with it. Not only for being able to keep them the day you stop doing it, but also for detecting why they fail and maintaining them when changing the expected output.
It wouldn't be fun to hold a discussion in Gerrit about why a bunch of tests suddenly fail after merging a big feature, wondering what they could have been trying to test.