Towards Software Requirements Extraction Using Natural Language Approach

AMJAD HUDAIB, BASSAM HAMMO, YARA ALKHADER
DEPARTMENT OF COMPUTER INFORMATION SYSTEMS
University of Jordan
Amman 11942
JORDAN

Abstract: - In this paper, we present an automated support environment to reduce the time and effort required to produce and maintain a reusable specification document. Our proposed model has two operation modes. The first is the forward mode, in which the model automatically converts English natural language requirements into UML class diagram models. The second is the backward mode, in which the model automatically reverses UML class diagram models into English natural language requirements. We compared our model with previous models, and the results are promising.

Key-Words: - Requirements Engineering, UML Class Diagram, Natural Language Processing, Specification Document, Software Design.

1 Introduction
Requirements engineering is the first step toward building software [12] [13]. Its main concerns are to establish, structure and model software specifications in written documents, namely specification documents. These documents serve as the means of communication between the different stakeholders of a software system [2, 14]. Recognizing the importance of the specification document, much of the work in the requirements engineering field has targeted the way specification documents describe their requirements [1, 11]. Although there have been many attempts to use formal and semiformal languages for describing specification documents, informal natural language remains the most widely used [6] [5]. However, several problems emerge when natural language is used in specification documents. First, natural language requirements can be ambiguous, inconsistent and incomplete [10] [14].
Second, natural language requirements cannot be understood by computers directly without preprocessing [10]. Therefore, the requirements engineer usually reinterprets the natural language requirements before proceeding with system design and development [6]. This reinterpretation is non-trivial and error prone; it requires considerable experience and is time consuming. What makes it even less practical is the fact that requirements evolve to reflect real-world changes. Each change in the specification document requires reinterpreting the specifications into models and updating the software accordingly [9]. In our work, we present a model that has two operation modes. The forward operation mode automates the reinterpretation of natural language requirements into UML class diagram models. The backward operation mode automatically reverses the models into natural language requirement specifications. The advantage of this operation scheme is that it provides a seamless view between the model and the natural language.

2 Related Works
Reference [6] used the eXtensible Markup Language (XML) and Two Level Grammar (TLG) to transform natural language into the formal object-oriented Vienna Development Method (VDM++). The main concern of that work was to automate the management of formal requirements, keeping them compatible with their natural language counterparts. In [1], the authors built a requirements engineering support environment that analyzes and synthesizes different views of requirements written in natural language. They used a shared repository and multiple viewers and modelers to provide different interfaces for the given natural language requirements. In [9], the authors automated the transformation of natural language into the semiformal Unified Modeling Language (UML) using a role based technique, which is a conceptual model used to produce object-oriented static views.
In fact, they first translated
Proceedings of the 6th WSEAS Int. Conf. on Software Engineering, Parallel and Distributed Systems, Corfu Island, Greece, February 16-19, 2007, p. 155