Developing Interpretable Models with Optimized Set Reduction for Identifying High-Risk Software Components

Lionel C. Briand, Victor R. Basili, Christopher J. Hetmanski

Research output: Contribution to journal, Article, peer-reviewed


Applying equal testing and verification effort to all parts of a software system is not very efficient, especially when resources are limited and scheduling is tight. Therefore, one needs to be able to differentiate low/high fault frequency components so that testing/verification effort can be concentrated where needed. Such a strategy is expected to detect more faults and thus improve the resulting reliability of the overall system. This paper presents the Optimized Set Reduction approach for constructing such models, intended to fulfill specific software-engineering needs. Our approach to classification is to measure the software system and build multivariate stochastic models for predicting high-risk system components. We present experimental results obtained by classifying Ada components into two classes: is or is not likely to generate faults during system and acceptance test. Also, we evaluate the accuracy of the model and the insights it provides into the error-making process.

Original language: English
Pages (from-to): 1028-1044
Number of pages: 17
Journal: IEEE Transactions on Software Engineering
Issue number: 11
Publication status: Published - Nov 1993
Externally published: Yes


  • Ada components
  • Classification tree
  • data analysis
  • fault-prone
  • logistic regression
  • machine learning
  • Optimized Set Reduction
  • stochastic modeling

