Visual Regression Tests

The first rule of “UI automated regression tests” is “You do not perform any UI automated regression tests”.

The second rule of “UI automated regression tests” is “In some contexts, if you really know what you are doing, then some automated regression tests performed via the UI can be relevant”.

Martin Fowler explained it very well in his Testing Pyramid article: “a common problem is that teams conflate the concepts of end-to-end tests, UI tests, and customer facing tests. These are all orthogonal characteristics. For example a rich javascript UI should have most of its UI behaviour tested with javascript unit tests using something like Jasmine”. So the top of a testing pyramid should be:

  • built on top of a portfolio of unit and service/component/api tests (that should include some tests focussed on the UI layer)
  • a (small) set of end-2-end tests performed via the GUI to check that we did not miss anything with our more focussed tests.

For those end-2-end tests, the usual suspect in the Open Source scene is Selenium. Driving the browser through our app via the most common paths is a good way to gain some final confidence on the SUT. But one should really understand that what Selenium will check is the presence of some elements on the page and the events associated with some elements. “if I fill this input box and click on that button then I expect to see a table with this and that strings in it”, but it does not check the visual aspect of the page. To put it another way, with Selenium we are checking the nervous system of our SUT, but not its skin. Here come the visual regression testing tools.

Thoughtworks Radar raised their level from “assess” to “trial” on “visual regression testing tools last July with this comment: “Growing complexity in web applications has increased the awareness that appearance should be tested in addition to functionality. This has given rise to a variety of visual regression testing tools, including CSS Critic, dpxdt, Huxley, PhantomCSS, and Wraith. Techniques range from straightforward assertions of CSS values to actual screenshot comparison. While this is a field still in active development we believe that testing for visual regressions should be added to Continuous Delivery pipelines.”

So I wanted to give it a try and did a quick survey of the tools available. I wanted to know which project were still active, what was their langage/ecosystem and what browser were supported. Here is list I just built:

My shopping list was:

  • Python, as I am already using Robot Framework in this langage
  • Support for another browser than PhantomJS because my SUT does not render very well under PhantomJS at the moment

So I chose needle which, according to its author is “still being maintained, but I no longer use it for anything”. I gave it a try and it works well indeed. I now have a basic smoke test trying to catch visual regression tests. More on that soon if I have more feedback to share.

Robot Framework, Selenium, Jenkins, and XVFB

Until recently, my test cases written with Robot Framework were not driven by the GUI. Those tests were interacting with the SUT using many different ways: process library, database access, file system, REST requests etc. Those access usually don’t change too often during the lifetime of a product as they often become a contract for users/tools communicating with product (i.e. API). Whereas driving the SUT through the UI can be brittle and engage the team in a constant refactoring of the tests. This is summarized in the famous pyramid of tests. But, having no automated GUI tests, my pyramid was headless! So I decided to invest a bit in the topic.

From there I had the choice between :

  • writing tc with Robot using the Robot Framework Selenium 2 library
    pro: integrated with other existing Robot tc + easy to start
    con: library a bit behind the official one + selenium users are mostly not using this API
  • writing tc in Java using the Selenium 2 Java binding
    pro: up to date + large community of users
    con: another set of tools to write/launch/report test cases in addition to Robot

I took the easy path and went for the Robot Library. There is an excellent demo of the library which helps to start-up. Before digging into the tests of my SUT, I added another 2 steps:

  1. going a bit further in my knowledge of Chrome Developer Tool. There is an excellent online course by Code School. Those tools are soon becoming very useful when creating GUI driven test case. Note that, Firefox dev console seems to offer the same service, but somehow I feel more ease with Chrome ones.
  2. running a testing dojo with a group of peers. We used Jenkins as SUT (as I already did to perform some stress test trials) and it worked quite well although Jenkins does not seem to be the easiest web app to test as we had to use XPATH to locate element which can become obsolete quite fast.

I was then able to create my first tests on the SUT I am testing. Those were more easy as expected as all the element I had to address had unique IDs! The biggest obstacle is to chain actions without using the horrible sleep keyword (see Martin again). With the Robot Selenium library, the easiest way is to use (and abuse) the “wait until page contains element” keyword.

Last but not least, I needed to launch the tests using Jenkins (very easy and quite nice with the plugin) on a linux VM…. without a display! Here enters PhantomJS that is a headless browser supported by the Robot Selenium library. But somehow, my tests were randomly failing with it, so I gave it up and went back to Firefox. And here is the last character of this little story: xvfb. This display server sends the graphic to memory so no need to have a full/real display available. The install is not immediate but some shared their feedback on it so it is quite smooth after all.

So the current configuration  of my Robot Selenium tc in Jenkins is:

  • xvfb running on the VM on which tests are going sending display to :89
  • in Jenkins, in my Robot task, in “Build Environment / Set environment variable” I put “DISPLAY=:89”

And that’s it, tests are running OK on this VM. And on the top of that, when there is a failure in the tests, the Robot Selenium lib is able to perform a screenshot and save it to disk to ease analysis of the issue!