TY - GEN
T1 - Automatic test suite generation for key-points detection DNNs using many-objective search (experience paper)
AU - Haq, Fitash Ul
AU - Shin, Donghwan
AU - Briand, Lionel C.
AU - Stifter, Thomas
AU - Wang, Jun
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/7/11
Y1 - 2021/7/11
N2 - Automatically detecting the positions of key-points (e.g., facial key-points or finger key-points) in an image is an essential problem in many applications, such as driver's gaze detection and drowsiness detection in automated driving systems. With the recent advances of Deep Neural Networks (DNNs), Key-Points detection DNNs (KP-DNNs) have been increasingly employed for that purpose. Nevertheless, KP-DNN testing and validation have remained a challenging problem because KP-DNNs predict many independent key-points at the same time - -where each individual key-point may be critical in the targeted application - -and images can vary a great deal according to many factors. In this paper, we present an approach to automatically generate test data for KP-DNNs using many-objective search. In our experiments, focused on facial key-points detection DNNs developed for an industrial automotive application, we show that our approach can generate test suites to severely mispredict, on average, more than 93% of all key-points. In comparison, random search-based test data generation can only severely mispredict 41% of them. Many of these mispredictions, however, are not avoidable and should not therefore be considered failures. We also empirically compare state-of-the-art, many-objective search algorithms and their variants, tailored for test suite generation. Furthermore, we investigate and demonstrate how to learn specific conditions, based on image characteristics (e.g., head posture and skin color), that lead to severe mispredictions. Such conditions serve as a basis for risk analysis or DNN retraining.
AB - Automatically detecting the positions of key-points (e.g., facial key-points or finger key-points) in an image is an essential problem in many applications, such as driver's gaze detection and drowsiness detection in automated driving systems. With the recent advances of Deep Neural Networks (DNNs), Key-Points detection DNNs (KP-DNNs) have been increasingly employed for that purpose. Nevertheless, KP-DNN testing and validation have remained a challenging problem because KP-DNNs predict many independent key-points at the same time - -where each individual key-point may be critical in the targeted application - -and images can vary a great deal according to many factors. In this paper, we present an approach to automatically generate test data for KP-DNNs using many-objective search. In our experiments, focused on facial key-points detection DNNs developed for an industrial automotive application, we show that our approach can generate test suites to severely mispredict, on average, more than 93% of all key-points. In comparison, random search-based test data generation can only severely mispredict 41% of them. Many of these mispredictions, however, are not avoidable and should not therefore be considered failures. We also empirically compare state-of-the-art, many-objective search algorithms and their variants, tailored for test suite generation. Furthermore, we investigate and demonstrate how to learn specific conditions, based on image characteristics (e.g., head posture and skin color), that lead to severe mispredictions. Such conditions serve as a basis for risk analysis or DNN retraining.
KW - Deep neural network
KW - Key-point detection
KW - Many-objective search algorithm
KW - Software testing
UR - http://www.scopus.com/inward/record.url?scp=85111437804&partnerID=8YFLogxK
U2 - 10.1145/3460319.3464802
DO - 10.1145/3460319.3464802
M3 - Conference contribution
AN - SCOPUS:85111437804
T3 - ISSTA 2021 - Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis
SP - 91
EP - 102
BT - ISSTA 2021 - Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis
A2 - Cadar, Cristian
A2 - Zhang, Xiangyu
PB - Association for Computing Machinery, Inc
T2 - 30th ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2021
Y2 - 11 July 2021 through 17 July 2021
ER -