Testing in a Continuous Delivery Environment

In his book Out of the Crisis, W. Edwards Deming cautioned: “Cease dependence on inspection to achieve quality. Eliminate the need for inspection on a mass basis by building quality into the product in the first place.” Ever since, ‘building quality in’ has been a central tenet of quality-focused lean initiatives, including lean software development. Testing in software development is an example of inspection: inspection to find bugs and faults in the developed software. Static code analysis is another example. Quality matters in software development because bugs cost both users and software providers dearly: a study conducted on behalf of the US National Institute of Standards and Technology estimated the cost of software bugs to the US economy at around $60 billion a year.

Perhaps the extent of this scourge is not surprising, since in many organizations software testing is simply not effective: testing/QA teams run “quality gates” as an afterthought, and even then testing does not necessarily translate into quality. When Agile came around, practitioners devised new approaches to testing, aptly described under the banner of “Agile Testing”, that provided some improvement by driving more collaboration across teams and moving testing earlier in the development cycle. Now, with the advent of DevOps, testing has taken on a new level of significance, since continuous delivery is not just about delivering software rapidly, but about delivering software that works. A few have even coined a term for this new discipline: continuous testing. All that is well, but what does testing mean in a continuous integration/delivery environment?


In a continuous delivery (CD) environment, quality becomes the responsibility of all. This does not mean that QA and testing teams have no role to play in a CD environment. On the contrary, the QA and testing function moves into a strategic role, providing oversight, direction and leadership for driving overall quality. For example, instead of spending countless hours running manual tests, QA teams invest their time in developing and implementing a comprehensive test automation strategy, or in putting in place the governance processes, metrics and incentives that drive quality at every step.

An example of how quality becomes everybody’s responsibility is what development staff do in such an environment. Development teams in a CD environment are empowered to take on quite a bit of testing themselves. In addition to a ‘test first’ approach, developers may also be required to run pre-commit testing that exercises a suite of unit, component and integration tests. Indeed, many CI servers provide a ‘private build’ capability, which allows an individual developer to check whether their code changes can be integrated into the main trunk for a successful build. Pre-commit testing should enable developers to run a quick ‘smoke test’ to ensure that their work will not break the code in the main trunk; it may therefore include a selection of integration and acceptance tests. Once the developer checks code in after pre-commit testing, the CI server runs the commit-stage tests: any required static code analysis, component and integration testing, followed by system testing. Commit-stage results are immediately fed back to the development team so that any errors or bugs can be addressed. Successful commit-stage testing increases confidence that the build is a candidate for acceptance testing; builds that fail do not progress to the next stage: acceptance testing.
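To make the gating concrete, here is a minimal sketch of a commit-stage script. The choice of pytest and flake8 as the test runner and static analysis tool, and the directory layout, are assumptions for illustration, not a prescription:

```python
"""Illustrative commit-stage gate: static analysis first, then
unit, component and integration tests. Tool choices and paths
are assumptions for this sketch."""
import subprocess
import sys

# Each stage is a (label, command) pair; any non-zero exit code
# fails the build so it never becomes an acceptance candidate.
STAGES = [
    ("static analysis", ["flake8", "src/"]),
    ("unit tests", ["pytest", "tests/unit", "-q"]),
    ("component tests", ["pytest", "tests/component", "-q"]),
    ("integration tests", ["pytest", "tests/integration", "-q"]),
]

def run_commit_stage() -> bool:
    for label, command in STAGES:
        result = subprocess.run(command)
        if result.returncode != 0:
            # Feed the failure back to the team immediately;
            # the build does not progress to acceptance testing.
            print(f"Commit stage failed at: {label}")
            return False
    print("Commit stage passed: build is an acceptance candidate")
    return True

if __name__ == "__main__":
    sys.exit(0 if run_commit_stage() else 1)
```

A build that exits non-zero here is simply never promoted to the acceptance stage; the same script can back a developer’s pre-commit smoke test.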

Acceptance testing is the domain of the business analysts and business representatives assigned to the project team. This does not mean, however, that development staff have no involvement in it. Successful testing in a CD environment gives developers more ownership of quality by allowing them to run automated acceptance tests in their own development environments. Common obstacles to enabling this, such as insufficient licenses or manual deployment and setup processes, need to be removed. Acceptance testing is a critical step in the deployment pipeline: a release is deemed acceptable for deployment only if it passes the acceptance test stage, and the entire team should focus on fixing acceptance testing issues for a given release. A CD environment requires acceptance testing to be automated as much as possible: a fully automated acceptance test suite lets the tests be run against a build as and when needed, which speeds up development and also yields a powerful suite of regression tests that can be run over and over again. Some tools even offer capabilities to encode acceptance criteria and to programmatically generate acceptance tests from them, so that the tests, and hence the delivered software, never fall out of sync with evolving acceptance criteria and requirements.
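As a small illustration of encoding an acceptance criterion as an executable test, the sketch below expresses a hypothetical business rule (orders exceeding a customer’s balance are rejected) in pytest style; the function and field names are invented for the example:

```python
"""Illustrative automated acceptance test. The system under test
(place_order) and the acceptance criterion are hypothetical,
shown only to demonstrate encoding a business rule as a test."""

def place_order(items, account_balance):
    # Hypothetical system under test: reject orders the
    # customer cannot pay for.
    total = sum(price for _, price in items)
    if total > account_balance:
        return {"status": "rejected", "reason": "insufficient funds"}
    return {"status": "accepted", "total": total}

def test_order_rejected_when_balance_is_insufficient():
    # Given a customer with a balance of 50,
    # when they order goods totalling 80,
    result = place_order([("book", 30), ("lamp", 50)], account_balance=50)
    # then the order is rejected, per the acceptance criterion.
    assert result["status"] == "rejected"
    assert result["reason"] == "insufficient funds"
```

Because the criterion lives in the test itself, a change to the business rule forces a matching change to the test, which is what keeps delivered software in sync with the requirements.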

If the system under development is a high-performance system, some capacity and performance testing may become part of acceptance testing as well. Usually, however, capacity testing and testing for other ‘non-functional requirements’ form a separate stage in the CD deployment pipeline. Although a CD environment requires such tests to be as automated as possible, e.g. through the use of recorded interaction templates and similar devices, the success criteria for these tests are somewhat subjective: a release that technically fails automated capacity testing may still be greenlighted based on human judgment. Once the release clears the non-functional testing stage gate, it may be put through more traditional manual testing. This is where human testers excel, applying their expertise to UI testing, exploratory testing, and the creation of unique testing conditions that automated tests may not have covered. Manual testing is thus one of the last stages in the testing pipeline in a CD environment.
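That subjective pass/fail point can be reflected in the automation itself. Below is an assumed sketch in which a capacity check measures 95th-percentile latency and, rather than hard-failing the pipeline, flags a threshold breach for human review; the threshold and the timing harness are assumptions:

```python
"""Illustrative capacity check: a breach of the latency target
is surfaced for human judgment rather than treated as an
absolute failure. Threshold and harness are assumptions."""
import statistics
import time

THRESHOLD_MS = 200  # assumed service-level target, not a hard gate

def measure_latency_ms(request_fn, samples=50):
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        request_fn()
        timings.append((time.perf_counter() - start) * 1000)
    # 95th percentile of the observed latencies.
    return statistics.quantiles(timings, n=100)[94]

def capacity_verdict(p95_ms):
    if p95_ms <= THRESHOLD_MS:
        return "pass"
    # Technically a failure, but the release may still be
    # greenlighted; report the numbers for human review.
    return f"review: p95 {p95_ms:.0f}ms exceeds {THRESHOLD_MS}ms target"

if __name__ == "__main__":
    # Demo against a stand-in request function.
    p95 = measure_latency_ms(lambda: time.sleep(0.01))
    print(capacity_verdict(p95))
```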

If testing is indeed to become ‘continuous’ in nature, several critical factors need to be in place. Perhaps the most critical is test automation, which some practitioners dismiss as difficult or as adding no value. Whatever the reservations, testing in a CD environment cannot possibly be efficient and effective without automation, especially since tests are run in large numbers and run often. Automation is just one of the test design and execution strategies needed to make testing efficient, and thus successful, in a CD environment. For example, CD practitioners recommend a commit-testing stage lasting no more than 10 minutes, a hurdle that can be met only by adopting such strategies. Automation also applies to the provisioning of, and deployment to, environments: ‘push button’ deployment and provisioning of test environments are critical if developers are to run quick smoke and acceptance tests of their work. Similarly, test data needs to be managed effectively. Test design and test isolation should keep data requirements fit for purpose and parsimonious: wholesale replication of production data is neither feasible nor recommended in a CD environment. Data management, like environment management, needs to be fully automated, with configurable design and push-button techniques.
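One way to keep test data fit for purpose and parsimonious is a test-data factory: each test declares only the data its scenario depends on, generated on demand rather than cloned from production. A minimal sketch, with field names invented for the example:

```python
"""Illustrative test-data factory: tests request minimal,
self-contained records instead of relying on a replicated
production database. Field names and defaults are invented."""
import itertools
import uuid

_sequence = itertools.count(1)

def make_customer(**overrides):
    # Sensible defaults; a test overrides only the fields its
    # scenario actually depends on (fit for purpose).
    record = {
        "id": str(uuid.uuid4()),
        "name": f"customer-{next(_sequence)}",
        "status": "active",
        "credit_limit": 100,
    }
    record.update(overrides)
    return record

# A test isolating an 'over limit' rule needs just one field:
over_limit_customer = make_customer(credit_limit=0)
```

Because every record is generated, tests stay isolated from one another and the data setup is as push-button as the environment provisioning it accompanies.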

Testing has the opportunity to move from being a reactive, inspection-driven function to being a proactive, quality-focused initiative. Achieving this requires making a number of tough decisions about processes, division of responsibilities and the organization of the testing effort. In a traditional environment, making these tough decisions is often optional. Moving to a CD environment, however, mandates them, which should be reason enough for organizations to start examining today how they can evolve and improve their testing efforts toward that ultimate model.