TY - GEN
T1 - Automated Question Answering for Improved Understanding of Compliance Requirements
T2 - 30th IEEE International Requirements Engineering Conference, RE 2022
AU - Abualhaija, Sallam
AU - Arora, Chetan
AU - Sleimi, Amin
AU - Briand, Lionel C.
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Software systems are increasingly subject to regulatory compliance. Extracting compliance requirements from regulations is challenging. Ideally, locating compliance-related information in a regulation requires a joint effort from requirements engineers and legal experts, whose availability is limited. However, regulations are typically long documents spanning hundreds of pages, containing legal jargon, applying complicated natural language structures, and including cross-references, thus making their analysis effort-intensive. In this paper, we propose an automated question-answering (QA) approach that assists requirements engineers in finding the legal text passages relevant to compliance requirements. Our approach utilizes large-scale language models fine-tuned for QA, including BERT and three variants. We evaluate our approach on 107 question-answer pairs, manually curated by subject-matter experts, for four different European regulatory documents. Among these documents is the General Data Protection Regulation (GDPR), a major source for privacy-related requirements. Our empirical results show that, in approximately 94% of the cases, our approach finds the text passage containing the answer to a given question among the top five passages that our approach marks as most relevant. Further, our approach successfully demarcates, in the selected passage, the right answer with an average accuracy of approximately 91%.
AB - Software systems are increasingly subject to regulatory compliance. Extracting compliance requirements from regulations is challenging. Ideally, locating compliance-related information in a regulation requires a joint effort from requirements engineers and legal experts, whose availability is limited. However, regulations are typically long documents spanning hundreds of pages, containing legal jargon, applying complicated natural language structures, and including cross-references, thus making their analysis effort-intensive. In this paper, we propose an automated question-answering (QA) approach that assists requirements engineers in finding the legal text passages relevant to compliance requirements. Our approach utilizes large-scale language models fine-tuned for QA, including BERT and three variants. We evaluate our approach on 107 question-answer pairs, manually curated by subject-matter experts, for four different European regulatory documents. Among these documents is the General Data Protection Regulation (GDPR), a major source for privacy-related requirements. Our empirical results show that, in approximately 94% of the cases, our approach finds the text passage containing the answer to a given question among the top five passages that our approach marks as most relevant. Further, our approach successfully demarcates, in the selected passage, the right answer with an average accuracy of approximately 91%.
KW - BERT
KW - Language Models (LMs)
KW - Natural Language Processing (NLP)
KW - Question Answering
KW - Regulatory Compliance
KW - Requirements Engineering
UR - http://www.scopus.com/inward/record.url?scp=85138923987&partnerID=8YFLogxK
U2 - 10.1109/RE54965.2022.00011
DO - 10.1109/RE54965.2022.00011
M3 - Conference contribution
AN - SCOPUS:85138923987
T3 - Proceedings of the IEEE International Conference on Requirements Engineering
SP - 39
EP - 50
BT - Proceedings - 30th IEEE International Requirements Engineering Conference, RE 2022
A2 - Knauss, Eric
A2 - Mussbacher, Gunter
A2 - Arora, Chetan
A2 - Bano, Muneera
A2 - Schneider, Jean-Guy
PB - IEEE Computer Society
Y2 - 15 August 2022 through 19 August 2022
ER -