Extracting domain models from natural-language requirements: Approach and industrial evaluation

Chetan Arora, Mehrdad Sabetzadeh, Lionel Briand, Frank Zimmer

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Domain modeling is an important step in the transition from natural-language requirements to precise specifications. For large systems, building a domain model manually is a laborious task. Several approaches exist to assist engineers with this task, whereby candidate domain model elements are automatically extracted using Natural Language Processing (NLP). Despite the existing work on domain model extraction, important facets remain under-explored: (1) there is limited empirical evidence about the usefulness of existing extraction rules (heuristics) when applied in industrial settings; (2) existing extraction rules do not adequately exploit the natural-language dependencies detected by modern NLP technologies; and (3) an important class of rules developed by the information retrieval community for information extraction remains unutilized for building domain models. Motivated by addressing the above limitations, we develop a domain model extractor by bringing together existing extraction rules in the software engineering literature, extending these rules with complementary rules from the information retrieval literature, and proposing new rules to better exploit results obtained from modern NLP dependency parsers. We apply our model extractor to four industrial requirements documents, reporting on the frequency of different extraction rules being applied. We conduct an expert study over one of these documents, investigating the accuracy and overall effectiveness of our domain model extractor.

Original language: English
Title of host publication: Proceedings - 19th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2016
Publisher: Association for Computing Machinery, Inc
Pages: 250-260
Number of pages: 11
ISBN (Electronic): 9781450343213
Publication status: Published - 2 Oct 2016
Externally published: Yes
Event: 19th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2016 - Saint-Malo, France
Duration: 2 Oct 2016 to 7 Oct 2016

Publication series

Name: Proceedings - 19th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2016

Conference

Conference: 19th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2016
Country/Territory: France
City: Saint-Malo
Period: 2/10/16 to 7/10/16

Keywords

  • Case study research
  • Model extraction
  • Natural language processing
  • Natural-language requirements
