Running out of time for testing? A genetic algorithm can reorder your tests!

Time-aware prioritization finds more faults when the testing clock is ticking!

Author

Gregory M. Kapfhammer

Published

2006

Introduction

If you have ever worked on a project that runs its entire test suite overnight, or worse, over an entire weekend, you know that testing time is a precious resource. When you only have a few hours to test before a release, which test cases should you run first? My colleagues and I addressed this question in “Time-Aware Test Suite Prioritization” (Walcott et al. 2006), published at the International Symposium on Software Testing and Analysis (ISSTA). The paper presents a genetic algorithm that reorders a test suite so that it detects faults as early as possible while always finishing within a specified time budget.

Key Contributions

Time-Aware Prioritization Problem: We formally define the problem of reordering a test suite to maximize fault detection within a given time limit. This formulation reduces to the NP-complete 0/1 knapsack problem, making it a natural fit for heuristic search techniques like the one presented in this paper.
Genetic Algorithm Approach: We design a genetic algorithm that considers both the coverage potential and the execution time of each test case. The algorithm uses crossover, mutation, and elitist selection operators to evolve test orderings that pack the most fault-detection capability into the available testing window.
Empirical Evaluation: Using two case study applications, we show that the genetic algorithm produces prioritizations with significantly higher fault detection rates than random orderings, the initial test ordering, and the reverse ordering. The technique proves especially valuable when the time budget is tight.
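
To make the idea concrete, here is a minimal sketch of a time-aware genetic algorithm. It is not the implementation from the paper: the test names, execution times, coverage sets, fitness weighting, and operator details are all assumptions chosen for illustration. Each individual is an ordering of the tests, only the prefix that fits within the time budget is scored, and earlier coverage earns a higher score.

```python
import random

# Hypothetical test data: name -> (execution time in seconds, covered code units).
TESTS = {
    "t1": (4.0, {"a", "b", "c"}),
    "t2": (1.0, {"a"}),
    "t3": (2.0, {"d", "e"}),
    "t4": (3.0, {"b", "f"}),
    "t5": (1.5, {"c", "g"}),
}


def within_budget(order, budget):
    """Return the longest prefix of the ordering that finishes within the budget."""
    prefix, elapsed = [], 0.0
    for name in order:
        cost = TESTS[name][0]
        if elapsed + cost > budget:
            break
        prefix.append(name)
        elapsed += cost
    return prefix


def fitness(order, budget):
    """Assumed fitness: reward new coverage, weighted by how early it is reached."""
    covered, elapsed, score = set(), 0.0, 0.0
    for name in within_budget(order, budget):
        cost, units = TESTS[name]
        elapsed += cost
        new_units = units - covered
        covered |= units
        # Units covered earlier in the budget contribute more to the score.
        score += len(new_units) * (budget - elapsed + 1.0)
    return score


def crossover(parent_a, parent_b, rng):
    """Keep a slice of parent_a and fill the other slots in parent_b's order."""
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    kept = parent_a[i:j]
    rest = [name for name in parent_b if name not in kept]
    return rest[:i] + kept + rest[i:]


def mutate(order, rng, rate=0.2):
    """With probability `rate`, swap two tests in a copy of the ordering."""
    order = list(order)
    if rng.random() < rate:
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
    return order


def prioritize(budget, generations=50, population_size=20, elite=2, seed=0):
    """Evolve orderings and return the best prefix that fits within the budget."""
    rng = random.Random(seed)
    names = list(TESTS)
    population = [rng.sample(names, len(names)) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=lambda o: fitness(o, budget), reverse=True)
        next_gen = population[:elite]  # elitism: carry the best orderings forward
        parents = population[: population_size // 2]
        while len(next_gen) < population_size:
            a, b = rng.sample(parents, 2)
            next_gen.append(mutate(crossover(a, b, rng), rng))
        population = next_gen
    best = max(population, key=lambda o: fitness(o, budget))
    return within_budget(best, budget)


if __name__ == "__main__":
    print(prioritize(budget=6.0))
```

One design choice worth noting: this sketch stops at the first test that would exceed the budget instead of skipping it and trying later tests. That keeps the scoring simple, but it is only one of several reasonable ways to enforce the time limit.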

Empirical Results

Our experiments revealed important trade-offs in time-aware prioritization. Test suites prioritized with basic-block-level coverage frequently achieved higher average percentage of faults detected (APFD) values than those prioritized with method-level coverage. The genetic algorithm consistently outperformed simpler strategies, and the advantage was most pronounced when the testing time budget was small, which is precisely when prioritization matters most. We also measured the time and space overheads of the approach and found that it is practical when there is a fixed set of time constraints, when prioritization occurs infrequently, or when the time budget is particularly tight.
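
For readers unfamiliar with APFD, the sketch below computes the commonly used form of the metric, 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of tests, m is the number of faults, and TF_i is the position of the first test that reveals fault i. It assumes that every fault is detected by some test in the ordering, so it may differ from how the paper handles orderings cut short by a time budget. The fault matrix in the usage example is hypothetical.

```python
def apfd(order, faults_detected_by):
    """Compute APFD, assuming every known fault is detected by a test in `order`."""
    all_faults = set().union(*faults_detected_by.values())
    n, m = len(order), len(all_faults)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one fault")
    first_position = {}
    for position, name in enumerate(order, start=1):
        for fault in faults_detected_by.get(name, set()):
            # Record only the earliest position at which each fault is revealed.
            first_position.setdefault(fault, position)
    if set(first_position) != all_faults:
        raise ValueError("some fault is never detected by this ordering")
    return 1.0 - sum(first_position.values()) / (n * m) + 1.0 / (2 * n)


if __name__ == "__main__":
    # Hypothetical fault matrix: test name -> faults that the test reveals.
    faults = {"t1": {"f1"}, "t2": {"f1", "f2"}, "t3": {"f3"}}
    print(apfd(["t2", "t3", "t1"], faults))
    print(apfd(["t1", "t3", "t2"], faults))
```

In this example the first ordering scores higher because it reveals two of the three faults with its very first test, which is the behavior that prioritization tries to encourage.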

Future Work

The paper discusses several enhancements to the baseline approach. These include using per-test coverage information to improve the fitness function, extending the technique to additional time-constrained testing scenarios, and reducing the overhead of the prioritization process itself. The central insight, that testing time should be treated as a first-class concern in test suite prioritization, will likely continue to influence research in regression testing. There is also a clear need for software testing tools that perform regression test suite prioritization with methods like the one presented in this paper.

Further Details

If you work in an environment where testing time is limited, I encourage you to read (Walcott et al. 2006) to learn more about how genetic algorithms can help you get the most out of your testing budget. If you have questions about test suite prioritization, please contact me. To stay updated on the latest developments in software testing research, you can also subscribe to my mailing list.

References

Walcott, Kristen R., Mary Lou Soffa, Gregory M. Kapfhammer, and Robert S. Roos. 2006. “Time-Aware Test Suite Prioritization.” In Proceedings of the International Symposium on Software Testing and Analysis.