The first rule of “UI automated regression tests” is “You do not perform any UI automated regression tests”.
The second rule of “UI automated regression tests” is “In some contexts, if you really know what you are doing, then some automated regression tests performed via the UI can be relevant”.
Martin Fowler explained it very well in his Test Pyramid article: “a common problem is that teams conflate the concepts of end-to-end tests, UI tests, and customer facing tests. These are all orthogonal characteristics. For example a rich javascript UI should have most of its UI behaviour tested with javascript unit tests using something like Jasmine”. So the top of the testing pyramid should be:
- built on top of a portfolio of unit and service/component/API tests (which should include some tests focussed on the UI layer)
- a (small) set of end-to-end tests performed via the GUI, to check that we did not miss anything with our more focussed tests.
For those end-to-end tests, the usual suspect on the open source scene is Selenium. Driving the browser through our app along the most common paths is a good way to gain some final confidence in the SUT (system under test). But one should really understand what Selenium checks: the presence of elements on the page and the events associated with them. “If I fill this input box and click on that button, then I expect to see a table with this and that string in it” — but it does not check the visual aspect of the page. To put it another way, with Selenium we are checking the nervous system of our SUT, but not its skin. Here come the visual regression testing tools.
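To make that distinction concrete, here is a dependency-free sketch of the kind of DOM-level check such a test performs. It uses the stdlib `html.parser` in place of a live browser, and the HTML snippet and `TableScraper` helper are made up for illustration:

```python
# A sketch of the kind of check a Selenium-style test performs: it asserts
# that elements and text are present in the DOM, but sees nothing of the
# rendered "skin" (CSS, layout, fonts). Stdlib html.parser stands in for a
# live browser; the HTML snippet is a made-up stand-in for the SUT.
from html.parser import HTMLParser

# Pretend this is the page rendered after filling the form and clicking the
# button. The inline style is invisible to the assertions below.
PAGE = """
<table id="results" style="color: white; background: white">
  <tr><td>alice</td><td>42</td></tr>
</table>
"""

class TableScraper(HTMLParser):
    """Collect the text content of every <td> cell."""
    def __init__(self):
        super().__init__()
        self.in_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_td = False

    def handle_data(self, data):
        if self.in_td:
            self.cells.append(data.strip())

scraper = TableScraper()
scraper.feed(PAGE)

# The functional assertions pass...
assert "alice" in scraper.cells
assert "42" in scraper.cells
```

The assertions pass even though the table renders white text on a white background: the nervous system is fine, the skin is broken, and a DOM-level test cannot tell.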
The ThoughtWorks Radar raised visual regression testing tools from “assess” to “trial” last July, with this comment: “Growing complexity in web applications has increased the awareness that appearance should be tested in addition to functionality. This has given rise to a variety of visual regression testing tools, including CSS Critic, dpxdt, Huxley, PhantomCSS, and Wraith. Techniques range from straightforward assertions of CSS values to actual screenshot comparison. While this is a field still in active development we believe that testing for visual regressions should be added to Continuous Delivery pipelines.”
So I wanted to give it a try and did a quick survey of the available tools. I wanted to know which projects were still active, what their language/ecosystem was, and which browsers they supported. Here is the list I built:
- CSS Critic: active, JavaScript, Firefox
- dpxdt: active, Python, PhantomJS
- Huxley: not active anymore (don’t know why)
- PhantomCSS: active, JavaScript, PhantomJS
- Wraith: active, JavaScript+Ruby, PhantomJS
- Grunt-photoBox: active, JavaScript
- Hardy: active, JavaScript/Cucumber, currently not working with Firefox
- diffux: active, Ruby, PhantomJS
- needle: active, Python/Nose/Selenium/pdiff, any browser supported by Selenium
My shopping list was:
- Python, as I am already using Robot Framework in that language
- Support for a browser other than PhantomJS, because my SUT does not render very well under PhantomJS at the moment
So I chose needle which, according to its author, is “still being maintained, but I no longer use it for anything”. I gave it a try and it works well indeed. I now have a basic smoke test that tries to catch visual regressions. More on that soon if I have more feedback to share.
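Needle, like most of these tools, follows a baseline workflow: the first run stores a reference capture on disk, and later runs compare fresh captures against it. The sketch below mirrors that workflow without a browser — the capture is a fake pixel grid serialized as JSON, and the `screenshot` and `assert_screenshot` helpers are hypothetical names, not needle’s API:

```python
# Sketch of the save-baseline / compare workflow used by visual regression
# tools: first run records a reference, later runs diff against it.
# The "capture" is a fake 2x2 pixel grid; all names here are made up.
import json
import tempfile
from pathlib import Path

def screenshot():
    """Stand-in for a real browser capture: a 2x2 grid of RGB values."""
    return [[[255, 255, 255], [255, 255, 255]],
            [[255, 255, 255], [0, 0, 0]]]

def assert_screenshot(name, baseline_dir, save_baseline=False):
    """Save the capture as the baseline, or compare against the saved one."""
    path = Path(baseline_dir) / f"{name}.json"
    capture = screenshot()
    if save_baseline or not path.exists():
        path.write_text(json.dumps(capture))   # first run: record reference
        return "baseline saved"
    baseline = json.loads(path.read_text())
    assert capture == baseline, f"{name} no longer matches its baseline"
    return "match"

with tempfile.TemporaryDirectory() as d:
    print(assert_screenshot("homepage", d, save_baseline=True))
    print(assert_screenshot("homepage", d))
```

In needle itself, as far as I understand it, the same idea is exposed through a Selenium-driven test case class with a screenshot assertion and a flag to record baselines on the first run.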