TY - GEN
T1 - Adaptive random testing: An illusion of effectiveness?
T2 - 20th International Symposium on Software Testing and Analysis, ISSTA 2011
AU - Arcuri, Andrea
AU - Briand, Lionel
PY - 2011
Y1 - 2011
N2 - Adaptive Random Testing (ART) has been proposed as an enhancement to random testing, based on assumptions on how failing test cases are distributed in the input domain. The main assumption is that failing test cases are usually grouped into contiguous regions. Many papers have been published in which ART has been described as an effective alternative to random testing when using the average number of test case executions needed to find a failure (F-measure). But all the work in the literature is based either on simulations or case studies with unreasonably high failure rates. In this paper, we report on the largest empirical analysis of ART in the literature, in which 3727 mutated programs and nearly ten trillion test cases were used. Results show that ART is highly inefficient even on trivial problems when accounting for distance calculations among test cases, to an extent that probably prevents its practical use in most situations. For example, on the infamous Triangle Classification program, random testing finds failures in a few milliseconds whereas ART execution time is prohibitive. Even when assuming a small, fixed-size test set and looking at the probability of failure (P-measure), ART only fares slightly better than random testing, which is not sufficient to make it applicable in realistic conditions. We provide precise explanations of this phenomenon based on rigorous empirical analyses. For the simpler case of single-dimension input domains, we also perform formal analyses to support our claim that ART is of little use in most situations, unless drastic enhancements are developed. Such analyses help us explain some of the empirical results and identify the components of ART that need to be improved to make it a viable option in practice.
KW - distance
KW - diversity
KW - F-measure
KW - faulty region
KW - P-measure
KW - random testing
KW - shape
KW - similarity
UR - http://www.scopus.com/inward/record.url?scp=80051947344&partnerID=8YFLogxK
U2 - 10.1145/2001420.2001452
DO - 10.1145/2001420.2001452
M3 - Conference contribution
AN - SCOPUS:80051947344
SN - 9781450305624
T3 - 2011 International Symposium on Software Testing and Analysis, ISSTA 2011 - Proceedings
SP - 265
EP - 275
BT - 2011 International Symposium on Software Testing and Analysis, ISSTA 2011 - Proceedings
Y2 - 17 July 2011 through 21 July 2011
ER -