Having defined the test levels, and in particular the system tests, it seems important to mention some very common traps into which novice testers can easily fall. These traps have various effects, from loss of product quality to a lower return on the testing investment. In any case, these practices can quickly prove counterproductive for companies.
1. Entering the “black box”
It can sometimes be tempting to peek inside the system, for example to check that an element has actually been inserted, but be careful! This “intrusion” into the black box couples the tests to the system’s implementation and exposes them to increased maintenance as the system evolves. Use it only when absolutely necessary, after weighing up the pros and cons!
2. Depending on the proper functioning of the system itself to run the tests
This happens when we want to avoid the sometimes significant complexity of setting up the test context (execution preconditions, test data, etc.). We may then choose to use the system itself to prepare that context before starting the test. In doing so, we risk losing sight of one of the objectives of testing: providing information on the proper functioning of the entire system.
Let’s take an example:
We would like to validate a feature that deletes an element from a system. To ensure the replayability of the test, we must have an element to delete at each run. Several options are then available to us, including:
- Playing a script before starting the test to insert the data (the best option in many cases)
- Reinstalling all test data each time the system is deployed and therefore redeploying the system before each test launch
- Having “enough” data to play the test a number of times (this option is clearly not the best because it only postpones the problem!)
- Inserting the data using the system’s data addition functionality: this is the option that hides a major trap!
The risk is losing visibility on the status of all system features if data insertion does not work.
If every test scenario starts by using the system to insert its test data, then retrievals, modifications and deletions can no longer be tested at all when insertion is broken.
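The first option, seeding the data with a script, could look like the following minimal sketch. It uses an in-memory SQLite database as a stand-in for the system’s datastore, and `delete_item` as a hypothetical stand-in for the feature under test:

```python
import sqlite3


def seed_database(conn):
    """Insert the row to delete directly, independently of the system's API."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute("DELETE FROM items")  # clean slate for replayability
    conn.execute("INSERT INTO items VALUES (1, 'to-delete')")
    conn.commit()


def delete_item(conn, item_id):
    """Stand-in for the system's delete feature under test."""
    conn.execute("DELETE FROM items WHERE id = ?", (item_id,))
    conn.commit()


def test_delete():
    conn = sqlite3.connect(":memory:")
    seed_database(conn)  # precondition set by a script, not by the system
    delete_item(conn, 1)
    remaining = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    assert remaining == 0


test_delete()
```

The deliberate intrusion into the datastore is confined to the setup step, so a broken insertion feature cannot mask the status of the delete feature.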
We find the same phenomenon when, in order to “save time” on setting up test contexts, we build one “super” scenario that carefully chains all the functionalities to be tested: we will only see the first problem encountered and remain blind to any later problems the scenario could have detected. We then expose ourselves to what I would call “chain-effect bugs”: we fix one bug, discover the next, fix it, find another, and so on, with no idea how many features of the system are broken at any given moment.
This makes decision-making and planning very difficult, because the correction effort cannot be estimated. Some dependence on certain system functionalities to validate others is inevitable, but it must be handled with care, and we must never lose sight of the objective of testing: providing continuous information on each functional rule of the system.
3. Reproducing the system’s behaviour in the tests
It may be tempting to reproduce a calculation algorithm, for example, to validate the result returned by the system. But this practice is very risky: when the two results differ, there is no way to tell whether the correct one is the system’s or the test tool’s. A common example is the generation of authentication tokens: do not try to compute a valid token in the test tool to compare against the one returned by the system. It is far better to use the received token and check that the system accepts it (while taking care not to fall into the previous trap!).
Be careful not to reimplement, in the test tool itself, the complexity of the system you are trying to test: the test tool must remain a “simple” observer of the system it validates, and that is the whole difficulty!
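A minimal sketch of the recommended approach, with a hypothetical `TokenService` standing in for the real system: the test feeds the received token back into the system instead of duplicating the signing logic.

```python
import hashlib
import hmac


class TokenService:
    """Hypothetical system under test that issues and verifies tokens."""

    SECRET = b"server-side-secret"  # known to the system, not to the test

    def issue_token(self, user):
        sig = hmac.new(self.SECRET, user.encode(), hashlib.sha256).hexdigest()
        return f"{user}:{sig}"

    def whoami(self, token):
        """Protected operation: returns the user if the token is valid."""
        user, sig = token.split(":")
        expected = hmac.new(self.SECRET, user.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(sig, expected):
            raise PermissionError("invalid token")
        return user


# The test stays an observer: it uses the token the system returned and
# checks the resulting behaviour, without reimplementing the HMAC logic.
service = TokenService()
token = service.issue_token("alice")
assert service.whoami(token) == "alice"
```

If the signing algorithm changes, this test keeps working; a test that recomputed the token would break, or worse, would silently agree with a wrong implementation.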
4. Waiting “for a while”
When some system processes take time or are asynchronous, you may want to wait a “certain time” before checking certain information. But how long should you wait?
- If we wait too little, the test fails even though the functionality is working correctly.
- If we wait too long, the execution time degrades even though the system has long finished processing, and this can have a very serious impact when we have hundreds or even thousands of tests.
We often find ourselves gradually increasing this waiting time to avoid false positives, and thus wasting a lot of time running the tests. It is in this situation that we can look at the secondary outputs of the system: we can watch for the appearance of a log entry or the emission of a particular notification to trigger the next step of the test. It is then the system itself that tells us when to continue.
We still need to decide on a reasonable timeout to conclude that something went wrong, but when the test succeeds, the time spent waiting is kept to a minimum.
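The log-watching idea above can be sketched with a small polling helper; the `wait_for` function and the log list are illustrative assumptions, not a real framework API:

```python
import time


def wait_for(condition, timeout=5.0, interval=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result  # stop waiting as soon as the system signals readiness
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout:.1f}s")


# Hypothetical usage: the "system" appends to a log we can observe.
log_lines = []
log_lines.append("job 17 finished")  # in reality written asynchronously

entry = wait_for(lambda: next((l for l in log_lines if "finished" in l), None))
assert entry == "job 17 finished"
```

The timeout only matters in the failure case; successful runs return as soon as the expected entry appears.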
5. Using dynamic reference values
To conduct the tests, we define “expected data” that is used to validate the system’s outputs. We must always be careful not to over-dynamise this data: it is fine to dynamise values tied to the current date, or to work around uniqueness constraints on certain fields (generated GUIDs, for example) in the outputs we want to check, but the reference values themselves must remain fixed and controlled. Otherwise, the risk is to validate only the format or structure of the data, and not its veracity, i.e. its relevance given the test dataset.
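A minimal sketch, assuming a hypothetical JSON-like response: only the unavoidably dynamic fields are masked before comparison, while the reference values stay fixed.

```python
def normalise(payload):
    """Mask the unavoidably dynamic fields; leave everything else intact."""
    masked = dict(payload)
    masked["id"] = "<any-guid>"
    masked["created_at"] = "<any-date>"
    return masked


expected = {
    "id": "<any-guid>",       # dynamic by nature, only its presence is checked
    "created_at": "<any-date>",
    "name": "widget",         # fixed, controlled reference value
    "price": 1250,            # checked for veracity, not just structure
}

actual = {
    "id": "3f2c9a1e-7b2d-4d6e-9f3a-1c5e8b2a4d60",
    "created_at": "2020-01-01T10:00:00Z",
    "name": "widget",
    "price": 1250,
}

assert normalise(actual) == expected
```

A wrong `price` would fail this comparison, whereas a fully dynamised expectation would only have confirmed that some number came back.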
6. Mass-updating reference data
This is often done when there are significant changes in the system’s outputs, or when establishing a first set of reference data on an existing system. Be careful: one or more bugs can hide behind the current outputs. Even if every difference seems explained by the current change to the system, one or more regressions can still be hidden in the newly obtained outputs. Reference data will be used for a long time to secure product deliveries, and embedding a bug in it can have a very significant effect on customers without anyone noticing.
Reference data must therefore be validated manually and on a case-by-case basis to get the full benefit of system testing.
7. Implementing test-specific behaviour in the system
This is the famous “if test” in production code! A particular behaviour is triggered when a certain header or other marker identifies the request as coming from a test. This validates the system’s behaviour as part of the tests, but not its real behaviour in production. I must admit that I have occasionally used it in very specific cases where it proved essential, but this practice must be handled with great caution.
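For illustration, the anti-pattern itself might look like this; the header name and the payment function are hypothetical:

```python
def process_payment(request):
    """Toy production function containing the famous "if test"."""
    if request.get("X-Test-Mode") == "1":       # test-only branch
        return {"status": "ok", "charged": 0}   # never runs in production
    # real behaviour: actually charge the customer
    return {"status": "ok", "charged": request["amount"]}


# The test exercises the fake path, not the one customers will hit:
assert process_payment({"X-Test-Mode": "1", "amount": 50})["charged"] == 0
```

The test above is green while the real branch, the one that charges customers, remains unvalidated.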
8. Producing only a global test report
Depending on the test tool used, the test cases can be run, and reports generated, in different ways. What is essential is to have, at the end of the tests, a status for each system functionality. We must avoid a single indicator giving a global status (which would very often be “red”), forcing us to dig through the console or a file to identify the problem or problems encountered.
9. Running the tests only after a system change
While it is important to run the system tests after each deployment that follows a system change, this is not enough. We must also regularly check that the system remains functional in its environment. The environment can evolve, and a system that is rarely modified can still be broken by external causes. It is therefore advisable to run the existing tests on a regular schedule, even when there are no system modifications. Otherwise, expect a number of surprises the next time the system is changed.
10. Testing only the happy path
While it is essential to validate a system’s happy paths, error cases are just as important, and sometimes even more so when we are interested in the non-functional characteristics of system quality, such as usability or security. Properly informing users when they misuse the system, for example, will improve the quality of their experience.
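As a sketch, with an illustrative `withdraw` function: the error cases get their own assertions, including the exact message shown to the user.

```python
def withdraw(balance, amount):
    """Toy feature under test: withdraw `amount` from `balance`."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount


def expect_error(fn, message):
    """Return True only if `fn` raises ValueError with exactly `message`."""
    try:
        fn()
    except ValueError as exc:
        return str(exc) == message
    return False


# Happy path
assert withdraw(100, 30) == 70

# Error cases: just as important, and they pin down the message the user sees.
assert expect_error(lambda: withdraw(100, 200), "insufficient funds")
assert expect_error(lambda: withdraw(100, -5), "amount must be positive")
```

Checking the message, and not merely that an error occurred, is what turns these tests into a guard on the usability of the error handling.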
You have probably already encountered one or more of these traps, and it is always a question of making a decision adapted to each situation. What matters in risk assessment and decision-making is knowing, as well as possible, the long-term impact the decision can have.
In software development, quality management is largely about the relevance of long-term investments. If you sometimes have to make decisions that increase the technical debt of your product, that is not a concern, as long as you do not forget to manage that debt properly as soon as possible. But that may be the subject of another article…
Photo credit: Michael Podger