TY - JOUR
T1 - Cost-effective strategies for the regression testing of database applications
T2 - Case study and lessons learned
AU - Rogstad, Erik
AU - Briand, Lionel
N1 - Publisher Copyright: © 2015 Elsevier Inc. All rights reserved.
PY - 2016/3/1
Y1 - 2016/3/1
AB - Testing and, more specifically, the regression testing of database applications is highly challenging and costly. One can rely on production data or generate synthetic data, for example based on combinatorial techniques or operational profiles. Both approaches have drawbacks and advantages. Automating testing with production data is impractical, and combinatorial test suites might not be representative of system operations. In this paper, based on a large-scale case study in a representative development environment, we explore the cost and effectiveness of various approaches and their combination for the regression testing of database applications, based on production data and synthetic data generated through classification tree models of the input domain. The results confirm that combinatorial test suite specifications bear little relation to test suite specifications derived from the system operational profile. Nevertheless, combinatorial testing strategies are effective, not only in terms of the number of regression faults discovered but also, more surprisingly, in terms of the importance of these faults. However, our study also shows that relying solely on synthesized test data derived from test models could lead to important faults slipping to production. Thus, we recommend that testing on production data and combinatorial testing be combined to achieve optimal results.
KW - Classification tree modeling
KW - Database applications
KW - Regression testing
UR - http://www.scopus.com/inward/record.url?scp=84962339872&partnerID=8YFLogxK
U2 - 10.1016/j.jss.2015.12.003
DO - 10.1016/j.jss.2015.12.003
M3 - Article
AN - SCOPUS:84962339872
SN - 0164-1212
VL - 113
SP - 257
EP - 274
JO - Journal of Systems and Software
JF - Journal of Systems and Software
ER -