Silicon Cowboys

Now that personal computer sales seem to have peaked, to the benefit of iOS and Android mobile devices, it is worth remembering how this market grew from hobbyists-only to a mass market during the ’80s. The Apple and Microsoft “war and peace” has made it into many books and documentaries, and Steve Jobs and Bill Gates are so famous that we are regularly reminded of their biographies. But at the same time, another battle took place: IBM versus its clones. The documentary “Silicon Cowboys” tells the story of how Compaq (founded in Texas, hence the “cowboys”) entered the personal computer market with a bet on high compatibility with IBM PCs and on portability. It is quite a fascinating watch to see “David” Compaq beating “Goliath” IBM in the PC market. An episode I was not totally familiar with is how Compaq and eight other manufacturers responded with the EISA standard to IBM’s move to launch an “incompatible” PS/2 computer. At the beginning of the ’90s the “IBM PC clones” market was huge, and as DOOM (released in 1993) was available on PC, it was about time for late teenagers to sell their Commodore/Atari and buy a 486. Unfortunately for Compaq, the market remained ultra-competitive, and Compaq was unable to fight with Dell or unbranded PCs assembled in small shops.

Watch “Silicon Cowboys” if you want a nice piece of geek history!
And if you want to go further on the topic, you might be interested in:
– the book written by Rod Canion about the story of Compaq: How Compaq Ended IBM’s PC Domination and Helped Invent Modern Computing (amazon/audible)
– the TV series Halt and Catch Fire (loosely based on Compaq’s history)
– the Internet History Podcast episode with Rod Canion


My favorite podcasts

Year after year, I am becoming more and more addicted to podcasts. Here is a list of the ones I am currently listening to:

  • Internet History Podcast – My all-time favorite. Starting with Mosaic and Netscape, it is the story of the Internet since the beginning of the web 20 years ago. A mix of “chapter episodes”, where Brian tells the story of this amazing period, and “interview episodes”, in which the actors of this story appear. Very, very well researched and delivered. Hats off to Brian for this fabulous (and useful) work!
  • This Week in Tech – Leo Laporte is a great host and has some very interesting guests to discuss all things tech. Sometimes a bit chatty but always full of energy.
  • NipTech – (in French). Their motto: “#tech #startups #inspiration”. Good content and rhythm. Ben and Mike are doing a great job! Fun to listen to!
  • Le RDV Tech – (in French). Tech news, rumors and stories by Patrick with various guests. He takes time to explain and dig into hot topics and is not afraid to share opinions on controversial subjects.
  • a16z – The podcast of the VC firm Andreessen Horowitz. High quality, with high-profile guests. Mostly about topics a16z is investing in (bitcoin, AI, big data, etc.).
  • Foundation – Interviews with tech entrepreneurs who share their paths, their advice and some anecdotes.
  • This American Life – My other all-time favorite, this time not tech related. Mostly true stories of everyday people. A deep dive into American life. Very, very well produced. Fascinating.

Should the test automation framework be written in the same language the application is being written or developed in?

I just tweeted an article published by Sauce Labs on “Reducing False Positives in Automated Testing” that takes the form of a Q&A about automation. I mostly agree with the article, but there is one answer that bothers me:

Should the test automation framework be written in the same language the application is being written or developed in?

A: Yes. Creating the automation framework in the same language of application will help the development team in testing specified areas whenever there is any code change or defect fix.[…]

I would replace the “yes” with “it depends”. I also see value in using another language to automate functional tests, especially when you have a dedicated QA/tester team. Here are a couple of reasons:

  • if your company produces different products in different languages, it could be difficult for your QA team to keep up with all those languages for their automation. Using a single language for the tests makes QA’s life easier. It also allows QA to share libraries/tools among tests for different SUTs.
  • hiring skilled QA is not easy. Finding people with a good functional understanding of (and curiosity about) the SUT *and* good technical knowledge can be difficult. Letting them code in a language like Python/Ruby rather than in C, C#, Java, etc. can make this easier to achieve.
  • using a different language than the SUT is also a good way to force QA (or developers, if they code the tests) to take a new perspective on the SUT. If you test your Java product in Java, you might swim in “integration tests” waters, and even when you try to black-box test your application, you might end up interacting with the SUT in a gray-box way (a short sketch of this black-box approach follows this list).
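
To illustrate that last bullet, here is a rough sketch of a black-box functional test written in Python against a hypothetical Java SUT that exposes an HTTP API (the endpoint, payload and assertions are invented for the example): the test only goes through the public interface, whatever language the SUT is written in.

# Hypothetical black-box functional test of a Java SUT, written in Python.
# The test only uses the public HTTP API, so it stays independent of the
# SUT's implementation language and internal classes.
import requests

BASE_URL = "http://localhost:8080"  # assumption: the SUT is already running

def test_create_and_fetch_user():
    # create a resource through the public API...
    created = requests.post(f"{BASE_URL}/users", json={"name": "alice"})
    assert created.status_code == 201

    # ...and check the observable behaviour, not the internal state
    user_id = created.json()["id"]
    fetched = requests.get(f"{BASE_URL}/users/{user_id}")
    assert fetched.status_code == 200
    assert fetched.json()["name"] == "alice"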

In the end it really depends on your organisation, culture, people, etc., but it is worth having a debate on that point. You might want to take a look at this other article on the same subject: “Your automated acceptance tests needn’t be written in the same language as your system being tested”.

Re-executing failed test cases and merging outputs with Robot Framework

In a previous post, I discussed solving intermittent issues, a.k.a. building more robust automated tests. A solution I did not mention is the simple “just give it another chance”. When you have big, long suites of automated tests (it is quite common for functional test suites to number in the thousands and last for hours), you might get a couple of tests randomly failing for unknown reasons. Why not launch only those failed tests again? If they fail once more, you are hitting a real problem. If they succeed, you might have hit an intermittent problem and you might decide to just ignore it.

Re-executing failed tests (--rerunfailed) appeared in Robot Framework 2.8, and since version 2.8.4 a new option (--merge) has been available in rebot to merge outputs from different runs. As explained in the User Guide, those two options make a lot of sense when used together:

# first execute all tests
pybot --output original.xml tests 
# then re-execute failing
pybot --rerunfailed original.xml --output rerun.xml tests 
# finally merge results
rebot --merge original.xml rerun.xml

This produces a single report where the second execution of a failed test replaces the first one. So every test appears once, and for those executed twice, we see the messages of both the first and second execution:

[screenshot: merged report showing the new and old status of the re-executed test]

Here, I propose to go a little bit further and show how to use --rerunfailed and --merge while:

  • writing output files in an “output” folder instead of the execution folder (using --outputdir). Writing the output files to a custom folder is quite a common practice, but it makes the whole pybot call syntax a bit more complex.
  • giving access to the log files of the first and second executions via links displayed in the report (using Metadata). Sometimes having the “new status” and “old status” (as in the previous screenshot) is not enough: we want details on what went wrong during the execution, and the merged report alone does not provide them.

To show this, let’s use a simple unstable test:

*** Settings ***
Library  String

*** Test Cases ***
stable_test
    should be true  ${True}

unstable_test
    ${bool} =  random_boolean
    should be true  ${bool}
    
*** Keywords ***
random_boolean
    ${nb_string} =  generate random string  1  [NUMBERS]
    ${nb_int} =  convert to integer  ${nb_string}
    Run keyword and return  evaluate  (${nb_int} % 2) == 0

The unstable_test will fail 50% of the time and the stable_test will always succeed.

And so, here is the script I propose for launching the suite:

# clean previous output files
rm -f output/output.xml
rm -f output/rerun.xml
rm -f output/first_run_log.html
rm -f output/second_run_log.html
 
echo
echo "#######################################"
echo "# Running portfolio a first time      #"
echo "#######################################"
echo
pybot --outputdir output "$@"
 
# we stop the script here if all the tests were OK
if [ $? -eq 0 ]; then
	echo "we don't run the tests again as everything was OK on first try"
	exit 0	
fi
# otherwise we go for another round with the failing tests
 
# we keep a copy of the first log file
cp output/log.html  output/first_run_log.html
 
# we launch the tests that failed
echo
echo "#######################################"
echo "# Running again the tests that failed #"
echo "#######################################"
echo
pybot --outputdir output --nostatusrc --rerunfailed output/output.xml --output rerun.xml "$@"
# Robot Framework generates file rerun.xml
 
# we keep a copy of the second log file
cp output/log.html  output/second_run_log.html
 
# Merging output files
echo
echo "########################"
echo "# Merging output files #"
echo "########################"
echo
rebot --nostatusrc --outputdir output --output output.xml --merge output/output.xml  output/rerun.xml
# Robot Framework generates a new output.xml

and here is an example of execution (a case where the unstable test fails the first time and then succeeds):

$ ./launch_test_and_rerun.sh unstable_suite.robot

#######################################
# Running portfolio a first time      #
#######################################

==========================================================
Unstable Suite
==========================================================
stable_test                                       | PASS |
----------------------------------------------------------
unstable_test                                     | FAIL |
'False' should be true.
----------------------------------------------------------
Unstable Suite                                    | FAIL |
2 critical tests, 1 passed, 1 failed
2 tests total, 1 passed, 1 failed
==========================================================
Output:  path/to/output/output.xml
Log:     path/to/output/log.html
Report:  path/to/output/report.html

#######################################
# Running again the tests that failed #
#######################################

==========================================================
Unstable Suite
==========================================================
unstable_test                                     | PASS |
----------------------------------------------------------
Unstable Suite                                    | PASS |
1 critical test, 1 passed, 0 failed
1 test total, 1 passed, 0 failed
==========================================================
Output:  path/to/output/rerun.xml
Log:     path/to/output/log.html
Report:  path/to/output/report.html

########################
# Merging output files #
########################

Output:  path/to/output/output.xml
Log:     path/to/output/log.html
Report:  path/to/output/report.html

So, the first part is done: we have a script that launches the suite twice if needed and puts all the output files in the “output” folder. Now let’s update the “Settings” section of our test to include links to the first and second run logs:

*** Settings ***
Library   String
Metadata  Log of First Run   [first_run_log.html|first_run_log.html]
Metadata  Log of Second Run  [second_run_log.html|second_run_log.html]

If we launch our script again, we will get a report with links to the first and second runs in the “Summary Information” section:

[screenshot: report with “Log of First Run” and “Log of Second Run” links in the Summary Information section]

The script and the test can be found in a GitHub repository. Feel free to comment if you have found more tips on those Robot Framework options.

Visual Regression Tests

The first rule of “UI automated regression tests” is “You do not perform any UI automated regression tests”.

The second rule of “UI automated regression tests” is “In some contexts, if you really know what you are doing, then some automated regression tests performed via the UI can be relevant”.

Martin Fowler explained it very well in his Test Pyramid article: “a common problem is that teams conflate the concepts of end-to-end tests, UI tests, and customer facing tests. These are all orthogonal characteristics. For example a rich javascript UI should have most of its UI behaviour tested with javascript unit tests using something like Jasmine”. So the top of the testing pyramid should be:

  • built on top of a portfolio of unit and service/component/API tests (which should include some tests focused on the UI layer)
  • a (small) set of end-2-end tests performed via the GUI to check that we did not miss anything with our more focused tests.

For those end-2-end tests, the usual suspect in the open source scene is Selenium. Driving the browser through our app via the most common paths is a good way to gain some final confidence in the SUT. But one should really understand that what Selenium checks is the presence of some elements on the page and the events associated with them: “if I fill this input box and click on that button, then I expect to see a table with this and that string in it”. It does not check the visual aspect of the page. To put it another way, with Selenium we are checking the nervous system of our SUT, but not its skin. Here come the visual regression testing tools.
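
To make this concrete, here is a rough Python sketch of such a Selenium check (the URL, selectors and expected content are invented for the example): it asserts on the content of the page, but a broken stylesheet or layout would still pass.

# Minimal Selenium sketch: functional check only, no visual check.
# URL, selectors and expected values are hypothetical.
from selenium import webdriver

driver = webdriver.Firefox()
try:
    driver.get("http://localhost:8000/search")
    driver.find_element_by_id("query").send_keys("laptop")
    driver.find_element_by_id("submit").click()

    # We assert on content and structure...
    table = driver.find_element_by_css_selector("table#results")
    assert "Compaq Portable" in table.text
    # ...but a broken layout or stylesheet would still pass this test.
finally:
    driver.quit()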

The ThoughtWorks Technology Radar raised its rating of “visual regression testing tools” from “assess” to “trial” last July, with this comment: “Growing complexity in web applications has increased the awareness that appearance should be tested in addition to functionality. This has given rise to a variety of visual regression testing tools, including CSS Critic, dpxdt, Huxley, PhantomCSS, and Wraith. Techniques range from straightforward assertions of CSS values to actual screenshot comparison. While this is a field still in active development we believe that testing for visual regressions should be added to Continuous Delivery pipelines.”

So I wanted to give it a try and did a quick survey of the available tools. I wanted to know which projects were still active, what their language/ecosystem was and which browsers were supported. Here is the list I built:

My shopping list was:

  • Python, as I am already using Robot Framework in this language
  • support for a browser other than PhantomJS, because my SUT does not render very well under PhantomJS at the moment

So I chose needle, which, according to its author, is “still being maintained, but I no longer use it for anything”. I gave it a try and it works well indeed. I now have a basic smoke test trying to catch visual regressions. More on that soon if I have more feedback to share.
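
For reference, here is a minimal sketch of such a check with needle (the URL and CSS selector are hypothetical, and this is not my actual smoke test): needle drives the browser through Selenium and compares a screenshot of the selected element against a baseline image recorded on a first, dedicated run.

# A minimal needle-based visual check (sketch). The URL and CSS selector
# are hypothetical; needle compares a screenshot of the selected element
# against a previously recorded baseline image.
from needle.cases import NeedleTestCase


class HomePageVisualTest(NeedleTestCase):

    def test_header_has_not_changed_visually(self):
        self.driver.get("http://localhost:8000/")
        # Fails if the rendered '#header' element differs from the baseline
        # screenshot (the baseline is recorded on a first, dedicated run).
        self.assertScreenshot("#header", "home-header")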

Create Jenkins Job for Robot Framework

Once you have created your first tests in Robot Framework, the next step is to include those tests in your Continuous Integration (CI) system. Here I will show the different steps to do so in Jenkins.

Let’s assume you have a Jenkins instance up and running with a job that builds your SUT.

First we create a new job to launch our Robot tests:

[screenshot: creating a new Jenkins job]

Once the job is created, we configure it.

  1. set up Source Code Management for the source code of the tests
  2. set up a first “Build Trigger” on the success of the job that builds the SUT
  3. set up a second “Build Trigger” on changes in the source code of the tests
    This way your tests will be launched either when there is a new build of the SUT or when your tests have changed. The second trigger is relevant because some modifications in your tests may have broken some of them, and you don’t want to wait for the next build of the SUT to find that out. In other words, when a test fails in Jenkins, it is good to know whether it is a consequence of a change in the SUT or of a change in the tests (if both changed, the analysis will be trickier).
  4. get the artifact from the project that builds your SUT, so that the SUT is available in the Jenkins workspace where the Robot tests will be run. To do so you can either use the Jenkins Copy Artifact Plugin or write a piece of batch/shell script.
  5. then comes the step in which the Robot tests are launched. For this you create an “Execute Shell” build step that contains, at least:
    pybot path/to/my_tests/

    plus all the --variable, --include, --exclude, etc. options that you use to customise your run.
    One noteworthy command line option in the context of a CI server is --NoStatusRC, which forces Robot’s return code to zero even when some tests fail. This way the status of the Jenkins build can be driven by the Robot Framework Jenkins Plugin, as you will see in the final step.

  6. finally, to get more granular settings for the test results, and to keep a copy of the report/log of the test executions on the Jenkins server, you can use the Robot Framework Plugin. Once the plugin is installed, it is available in the list of “Post-build Actions”. A simple configuration is enough to get started, and after a couple of builds the project page will show the trend of your Robot test results.

Once this basic setup is working, you will find many options in Jenkins and Robot Framework to get more value out of it. To give just one example, once the test portfolio becomes large and/or slow, it is no longer efficient to launch the full regression suite every time there is a change in the SUT or in the tests’ code. A good strategy is to have two Jenkins jobs. The first one (“smoke tests”) runs only a small portion of the whole suite that executes quickly (say 5-10 minutes):

pybot path/to/my_tests/ --include smoke --exclude not_ready

and the second job (“full tests”) launches all the tests:

pybot path/to/my_tests/ --exclude not_ready

but is launched only when the smoke tests have run successfully.

So if some essential feature of your SUT (covered by the smoke tests) is broken, you will save your machine a “full test” run and, more importantly, the team gets quicker feedback on the quality of the SUT build.

What editor for Robot Framework test cases

The Robot Framework home page lists a number of plugins for editing Robot Framework test cases, along with Robot’s own editor, RIDE. Here is some feedback based on my experience with some of those tools.

RIDE was my first choice four years ago when I started using Robot. We were a team of 20 quality engineers working on Windows machines. Some of us had a technical background and others were more on the banking business side (we were producing financial software). RIDE ended up being a good choice, especially for the non-tech people, as it hid part of the grammar of test cases: Suite Setup, Tags, Library, etc. are all input boxes in a GUI. The ability to launch test cases from RIDE was also very handy and kept us away from the command line (which most of the team never used). Overall we had a very good experience with RIDE and I would recommend it in a similar context.

TextMate (with its Robot bundle) was the editor I switched to when I moved to Mac. There were two motivations to move away from RIDE. The first one is that RIDE is not very slick on Mac (several issues have been open for a while) and even the installation is a bit complicated, with wxPython and Python version collisions. The second motivation was that I had joined a more technical company where manipulating the source code of the tests was much more frequent (e.g. editing a test case on a remote VM via SSH with vi, merging changes done by another tester…). So I no longer wanted a GUI layer hiding the source code of the tests. I chose TextMate because it was free, lightweight and worked out of the box with Robot and SVN. After some time, though, I started to miss keyword completion and quick access to keyword source code.

PyCharm is the editor I have been using for a couple of weeks now. This time the switch was motivated by some limitations of TextMate (see above) and also by looking over the shoulders of my colleagues, who were getting happier and happier with PyCharm. It looks like JetBrains’ IDE is gaining momentum in the Python community, as I hear/read more and more about it. There are currently two competing plugins for Robot Framework: Robot Plugin and IntelliBot. Both provide syntax highlighting, code completion and jump-to-source, with some small differences. The best thing is to try them both and see which one fits you best.

A side note on this editor topic: when I moved to PyCharm, the amount of syntax checking went up a level compared to my previous editors, and I was bothered by the fact that all my TXT files were being analysed by the plugin (making my non-Robot TXT files harder to read). So I changed the extension of my Robot test cases and libraries from .txt to .robot. This way I can configure the Robot Plugin to affect only my Robot files and not all the regular text files.

A look back at JFTL 2014

JFTL is a conference organised by the CFTL, the association behind software testing certifications in France. This year, for the 6th edition, 500 people were registered for the day. The introduction slides told us that the audience was evenly split between management and hands-on practitioners. Here is a quick recap of the sessions I was able to attend.

An interesting presentation on managing performance tests on a RATP project. The SUT is a web app for bus drivers. Classic methodology with a touch of Agile. The presenters shared their wish to run performance tests continuously, but admitted they had a lot of trouble doing so (to run performance tests, the SUT first has to be functionally correct, hence the need to wait for the end of the sprint or for the next one). On the tooling side: Gatling for the developers and NeoLoad for the testers.

SmartTesting presented its Zest solution, an online tool for writing functional tests. With this tool, you progressively build a DSL (a set of action words) to be used in test scenarios. The platform helps with writing (suggesting actions as you type) and with refactoring (renaming actions, creating actions for recurring patterns). If you want to automate these tests, you can export them as XML, which you then “only” have to translate into the automated test language/framework of your choice. I remain quite puzzled by the choice of not natively offering a way to execute the tests directly, as is possible with the likes of Cucumber, FitNesse and Robot Framework.

Pages Jaunes shared their experience using MaTeLo (from All4Tec) to test a few features of the pagesjaunes.fr website. MaTeLo is a Model-Based Testing tool: you can import your requirements, describe the states of your application and generate diagrams/flows using various algorithms. Once the test scenarios are generated, they can be automated with Selenium. Here again, the tool does not really convince me. No comment on the requirement/test case mapping side, which seems a bit heavy to me (but understandable in big organisations with separate business and delivery sides (MOA/MOE) and various contractors…). However, automatically generating dozens or hundreds of test scenarios that are then “automatically” exported to Java/Selenium goes against automation good practices, which rather call for automating business behaviours “below the UI” as much as possible (see this article for example).

Finally, a somewhat laborious presentation on test data generation. An interesting topic on paper, but when I start hearing about a “test generation unit”, it smells like extreme fragmentation and specialisation of teams. I would have liked a less lecture-like, more concrete presentation.

Overall, a not very technical conference on a very technical subject. In that respect, HP’s presentation of the new version of HP ALM/QC, which promises to automate all the tests and find all the bugs (as well as solve world hunger), seemed to captivate the audience, whereas it could easily have invited quite some sarcasm :-)

Also worth noting: the large place occupied by IT services companies (SSII) in many presentations. End clients never presented alone, but always accompanied by their SSII. Hence a very polite dialogue between a client with a business view of the problem and a services company with a “we will solve all your problems” pitch. Conspicuously absent were the testers who get their hands dirty. Maybe because they were held back in India…

An interesting day despite these few reservations. Thanks to the organisers and the presenters!

bash: grep: command not found

It took me a year to understand why my grep was failing every so often:

[MBP]$ ps -ef | grep openidm
-bash: grep: command not found

The explanation lies in the fact that I was typing this a bit too fast. When typing the “space” after the “pipe” (which on my Mac keyboard is shift+option+L), the option key was still pressed and I ended up typing option+space instead of space. And option+space is interpreted differently from a plain space in the terminal. I got this hint from this thread: http://hintsforums.macworld.com/showthread.php?p=644491
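
To see what the shell actually receives, here is a tiny Python illustration (a side note, not taken from the thread above): option+space produces a non-breaking space (U+00A0), a different character from the regular space, so bash does not use it to separate words.

# Option+space on macOS inserts a non-breaking space (U+00A0),
# which is not the same byte sequence as a regular space.
regular = " "
non_breaking = "\u00a0"

print(regular.encode("utf-8"))       # b' '
print(non_breaking.encode("utf-8"))  # b'\xc2\xa0'

# bash does not treat U+00A0 as a word separator, so the command it tries
# to run is "<invisible U+00A0>grep", hence "grep: command not found".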

One solution would be to carefully release the option key before pressing the space… but a more convenient one can be found here:
http://earthwithsun.com/questions/78245/how-to-disable-the-option-space-key-combination-for-non-breaking-spaces
I chose the iTerm2 configuration:
“I use iTerm2 for most of my work and the mapping can be added in the “Keys” preference pane, by adding a new key combination in Preferences -> Keys -> the plus button. Note when adding the key make sure to put a single space in the lower box as shown.” => works great.

Mystery explained and problem solved!

Randomizing test execution order

Many testing frameworks offer optional randomization of the test execution order. For example:
– Robot Framework with --randomize
– RSpec with --order random

I consider this option very useful and use it by default for all the automated test portfolios I run. The advantages I see are:

– we detect ordering dependencies as soon as possible. If we always execute tests A and B in the same order, test B could work only because test A left the system in a state that test B relies on. If one day we invert the order (by renaming the tests, for example, if the order depends on alphabetical order), then the suite will fail and it will take us some time to understand the problem (because tests A and B were maybe written months or years ago). The same happens if we insert a new test between A and B, or refactor test A or B. If we run the tests in random order all the time, we will detect this issue very early.

– we might detect bugs in the SUT that appear only in some specific sequence of actions that a random test order could stumble upon by luck. The problem then is how to reproduce the bug we just bumped into. For RSpec, randomness can be made predictable via seeding (a small sketch of this idea follows this list). JUnit hit a randomization problem when Java 7 came out, had to think it over, and came up with a deterministic but random-looking order. There is no such thing in Robot Framework, so we have to manually reproduce the test order that caused the failure.

– we won’t always run the same tests first. Usually when we read a test report, we start at the top and analyse errors one by one. Randomizing could help us not to analyse the “Access”, “Audit” and “Authentication” tests first every single time…
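
Here is the small seeding sketch mentioned above, in plain Python and independent of any test framework: the same seed always reproduces the same “random” order, which is what makes a randomly ordered failure replayable.

import random

tests = ["test_access", "test_audit", "test_authentication", "test_backup"]

def shuffled(tests, seed):
    # The same seed always yields the same "random" order,
    # so a failing run can be replayed exactly.
    rng = random.Random(seed)
    order = list(tests)
    rng.shuffle(order)
    return order

print(shuffled(tests, seed=1234))
print(shuffled(tests, seed=1234))  # identical to the previous line
print(shuffled(tests, seed=9876))  # a different, but still reproducible, order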

One could argue that a fixed test execution order is useful for running some smoke/sanity tests first, and then the rest of the portfolio. I think in that case it is better to split this into two different jobs: a first “smoke test” job that runs quickly (5-10 minutes) and a “full test” job that can take several hours. In Robot Framework, this can easily be achieved using tags.

Another reason to push for a fixed test execution order could be performance optimisation: test A prepares the system for test B to start, and when test B ends, the system is ready for test C to go. One of the reasons this is a bad pattern is that you won’t be able to run only test B or only test C! If the setup of a given test lives in the previous test, then you are doomed to always run the full portfolio. This is simply not bearable. When a full portfolio run detects a couple of failed tests, we want to be able to run just those tests once more to double-check that they are failing, and then start analysing the problem.

We could also introduce randomness in the tests themselves, but this is another topic… for a future post!