TY - GEN
T1 - BoostNSift
T2 - 21st IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2021
AU - Razzaq, Abdul
AU - Buckley, Jim
AU - Patten, James Vincent
AU - Chochlov, Muslim
AU - Sai, Ashish Rajendra
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - Locating bugs is an important but effort-intensive and time-consuming task when dealing with large-scale systems. To address this, Information Retrieval (IR) techniques are increasingly being used to suggest potential buggy source code locations for given bug reports. While IR techniques are very scalable, in practice their effectiveness in accurately localizing bugs in a software system remains low. Results of empirical studies suggest that the effectiveness of bug localization techniques can be augmented by the configuration of queries used to locate buggy code. However, in most IR-based bug localization techniques presented by researchers, the impact of the queries' configurations is not fully considered. In a similar vein, techniques consider all code elements as equally suspicious of being buggy while localizing bugs, but this is not always the case either. In this paper, we present a new method-level, information-retrieval-based bug localization technique called "BoostNSift". BoostNSift exploits the important information in queries by 'boost'ing that information, and then 'sift's the identified code elements, based on a novel technique that emphasizes the code elements' specific relatedness to a bug report over their generic relatedness to all bug reports. To evaluate the performance of BoostNSift, we employed a state-of-the-art empirical design that has been commonly used for evaluating file-level IR-based bug localization techniques: 6851 bugs are selected from the commonly used Eclipse, AspectJ, SWT, and ZXing benchmarks and made openly available for method-level analyses. The performance of BoostNSift is compared with the openly available state-of-the-art IR-based BugLocator, BLUiR, and BLIA techniques. Experiments show that BoostNSift improves on BLUiR by up to 324%, on BugLocator by up to 297%, and on BLIA by up to 120%, in terms of Mean Reciprocal Rank (MRR). Similar improvements are observed in terms of Mean Average Precision (MAP) and Top-N evaluation measures.
KW - Bug localization
KW - Code Sifting
KW - Code analysis
KW - Query boosting
KW - Query enhancement
KW - Software maintenance
UR - http://www.scopus.com/inward/record.url?scp=85123292156&partnerID=8YFLogxK
U2 - 10.1109/SCAM52516.2021.00019
DO - 10.1109/SCAM52516.2021.00019
M3 - Conference contribution
AN - SCOPUS:85123292156
T3 - Proceedings - IEEE 21st International Working Conference on Source Code Analysis and Manipulation, SCAM 2021
SP - 81
EP - 91
BT - Proceedings - IEEE 21st International Working Conference on Source Code Analysis and Manipulation, SCAM 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 27 September 2021 through 1 October 2021
ER -