TY - JOUR
T1 - Using machine learning to assist with the selection of security controls during security assessment
AU - Bettaieb, Seifeddine
AU - Shin, Seung Yeob
AU - Sabetzadeh, Mehrdad
AU - Briand, Lionel C.
AU - Garceau, Michael
AU - Meyers, Antoine
N1 - Publisher Copyright: © 2020, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2020/7/1
Y1 - 2020/7/1
N2 - Context: In many domains such as healthcare and banking, IT systems need to fulfill various requirements related to security. The elaboration of security requirements for a given system is in part guided by the controls envisaged by the applicable security standards and best practices. An important difficulty that analysts have to contend with during security requirements elaboration is sifting through a large number of security controls and determining which ones have a bearing on the security requirements for a given system. This challenge is often exacerbated by the scarce security expertise available in most organizations. Objective: In this article, we develop automated decision support for the identification of security controls that are relevant to a specific system in a particular context. Method and Results: Our approach, which is based on machine learning, leverages historical data from security assessments performed over past systems in order to recommend security controls for a new system. We operationalize and empirically evaluate our approach using real historical data from the banking domain. Our results show that, when one excludes security controls that are rare in the historical data, our approach has an average recall of ≈ 94% and average precision of ≈ 63%. We further examine through a survey the perceptions of security analysts about the usefulness of the classification models derived from historical data. Conclusions: The high recall – indicating only a few relevant security controls are missed – combined with the reasonable level of precision – indicating that the effort required to confirm recommendations is not excessive – suggests that our approach is a useful aid to analysts for more efficiently identifying the relevant security controls, and also for decreasing the likelihood that important controls would be overlooked. Further, our survey results suggest that the generated classification models help provide a documented and explicit rationale for choosing the applicable security controls.
KW - Automated decision support
KW - Machine learning
KW - Security assessment
KW - Security requirements engineering
UR - http://www.scopus.com/inward/record.url?scp=85083778635&partnerID=8YFLogxK
U2 - 10.1007/s10664-020-09814-x
DO - 10.1007/s10664-020-09814-x
M3 - Article
AN - SCOPUS:85083778635
SN - 1382-3256
VL - 25
SP - 2550
EP - 2582
JO - Empirical Software Engineering
JF - Empirical Software Engineering
IS - 4
ER -