Generation of an OWL Ontology from a Knowledge Domain Extended Lexicon

Karla Olmos-Sánchez (kolmos@uacj.mx)
Jorge Rodas-Osollo (jorge.rodas@uacj.mx)
Yanet Garay (yanet.garay@gmail.com)
Sonia Herrera (sonia.magdiel.269@gmail.com)

Universidad Autónoma de Ciudad Juárez
Av. Del Charro 450 Nte., Cd. Juárez, Chih., Mex.
(52) 656 6884841

ABSTRACT
Informally Structured Domains (ISD) are characterized by informal and unstructured information that depends on the context for interpretation; thus, most of the concepts and their relationships are defined by consensus, and domain specialists use large amounts of tacit knowledge to solve everyday situations. These characteristics make modeling this kind of domain a challenging and time-consuming task in which the representations do not correctly reflect reality. This paper proposes a new approach to the process of understanding and modeling an ISD, using a process for generating an OWL ontology from a lexicon named KDEL, with the aim of supporting a better understanding of the application domain and hence facilitating the development of products or solutions in ISD.

Keywords
Informally Structured Domains; Ontologies Building Process; Knowledge Domain Extended Lexicon

1. INTRODUCTION
The importance of domain knowledge for eliciting the requirements of a solution or product that fulfills the needs and expectations of clients and users is widely accepted in the research community [2][12], especially when the solution-solvers or product developers are not immersed in the application domain. Recently, the use of ontologies as a means to define this knowledge and make it explicit has come to be seen as a good option [14][6]. Domain ontologies can be used to facilitate understanding among stakeholders, to detect missing and erroneous information, and to describe the domain the way domain specialists think about it, hence avoiding ambiguous, insufficient and incomplete requirements. In this work, the term domain specialists refers to all people involved in the application domain, who may have partial and different knowledge of it depending on their role and experience.

In particular, the OWL Web Ontology Language has been designed for use by computer systems rather than only for presenting information to humans. OWL facilitates greater machine interpretability of Web content by providing additional vocabulary to XML, RDF and RDF Schema, along with a formal semantics. OWL allows us to describe the semantics of knowledge in a machine-accessible way; therefore it has promoted the development of multiple and varied software applications [1].

Nevertheless, ontology development is a complex and time-consuming activity that seems to be an art rather than a formal method. Besides, not all domains are equal: there are domains where not all concepts and their relationships can be formally defined, where the solutions to most problems are situated and diverse and cannot be described by an algorithm, and where domain specialists use large amounts of tacit knowledge to solve problems. In this kind of domain, named Informally Structured Domains, providing a structure that synthesizes the knowledge of domain specialists and makes it explicit is fundamental to developing a correct and appropriate solution or product [9]. This structure can be provided by an OWL ontology.

The objective of this paper is to present a new approach to generate an OWL ontology from the Knowledge Domain Extended Lexicon (KDEL), in order to facilitate the understanding, development and validation of an ISD. A similar approach is proposed in [3]; however, our work is designed with the ISD characteristics in mind. Thus, our motivation is to provide a tool that minimizes the time needed to structure and visualize the domain knowledge, with the further aim of helping to discover relationships that may have been hidden from the domain specialists' awareness.

The rest of this paper is structured as follows: Section 2 provides a detailed explanation of KDEL; Section 3 describes the OWL ontology language; Section 4 introduces the OWL building process from KDEL; Section 5 reports the results of applying the method to real ISD cases; finally, Section 6 concludes our work with future directions.

2. KDEL
The term Universe of Discourse (UofD) generally refers to the collection of objects being discussed in a specific domain. It is evident that there is a correlation between the domain knowledge and the terms used daily by domain specialists. Thus, to assist the solution-solver or product developer in understanding the terms of the domain specialists, several authors [8][13] propose the use of a glossary of common terms, with the additional aim of facilitating communication and understanding among all those involved in a project. Although natural language is ambiguous and depends on the context for interpretation, it is the only notation that is commonly readable and understandable by the domain specialists, and its use encourages them to participate actively in the first steps of any project [7].

One of these proposals is the Language Extended Lexicon (LEL) [11], which is a set of terms related to the application domain, with the aim of understanding the language of the problem without worrying about understanding the problem. In order to give an initial structure to the domain knowledge, each term in the UofD is classified as object, subject, verb or state and is described by a notion (denotation) and a behavioral response (connotation).

2.1 KDEL Process Building
In order to deal with the challenges of ISD, the Knowledge of Domain on an Extended Lexicon (KDEL) evolves LEL by modifying two aspects of it. The first one is that, besides the classification of object, subject, verb and state, it incorporates definitions and NF-Requirements. The rationale behind this is that in the early stage of any software development project, the domain specialists do not have a clear idea of what they want. In ISD, they do not even have a well-defined structure of the application domain knowledge, and a great quantity of it is tacit or implicit. Thus, the domain specialists interleave in their discourse needs, desires, domain properties, and current and future processes. To give a preliminary order to this information, KDEL characterizes the application domain in terms, which can be concepts, definitions and Non-Functional (NF) requirements, as explained below:

- Concepts. They are equivalent to the terms in LEL and are described by a notion (denotation) and a behavioral response (connotation). KDEL classifies the concepts as objects, subjects or verbs. Unlike LEL, state is not considered a concept because we consider that it is inherently attached to subjects or objects.
- Definitions. They are statements that assign a precise or consensual meaning to terms used in the application domain, but that cannot be considered as concepts; thus they cannot have a behavioral response. Definitions are necessary in order to understand the context of the application domain.
- NF Requirements. They refer to concerns not related to the functionality of the software, such as usability, flexibility, performance, interoperability and security [5]. One of the objectives of KDEL is to capture the NF-Requirements introduced in the early discourses of domain specialists; however, in subsequent stages of the process, solution-solvers or product developers can include more of them.

The second aspect that KDEL modifies is the internal structure of terms. The application domain will be affected after a solution or product is deployed in it; hence, the set of terms, as well as their denotation and connotation, will not be the same. To handle this issue, a future behavioral response is added to the structure of the terms; it is not mandatory and is only added if it is evident in the early stages of the project. It allows the requirements engineers to gain more domain knowledge and explore new possibilities of solution by understanding the problem and the structure of the solution. Figure 1 depicts a scheme of KDEL in a UML class diagram.

Figure 1. Conceptual scheme of KDEL in UML

Handling synonyms is a special issue in any representation of language. Two or more terms are synonymous if they share the meaning of a concept. In KDEL, when two or more terms are synonymous they share the same structure and the terms are separated by a slash.

Based on the rules to describe terms proposed in [7], a set of suggestions for the description of terms has been proposed for KDEL. Table 1 gives the rules of description for objects, Table 2 for subjects, Table 3 for verbs, Table 4 for definitions and Table 5 for NF-requirements.

Table 1. Rules of description of objects
Object
  Notion: Define the object and its relationships with other objects or subjects.
  Current Behavioral: Describe the actions that are done with the object in the current time.
  Future Behavioral: Describe the actions that are done with the object once the solution or product is deployed.
  States: Describe all possible states of the object and the event that triggers each of them.

Table 2. Rules of description of subjects
Subject
  Notion: Define the subject and its relationships with other objects or subjects.
  Current Behavioral: Describe the actions that are done by the subject in the current time.
  Future Behavioral: Describe the actions that are done by the subject once the solution or product is deployed.
  States: Describe all possible states of the subject and the event that triggers each of them.

Table 3. Rules of description of verbs
Verb
  Notion: Describe who performs the action represented by the verb and when it occurs.
  Current Behavioral: Describe in detail the action in the current time.
  Future Behavioral: Describe in detail the action in the future time.

Table 4. Rules of description of definitions
Definition
  Notion: Describe the meaning of the term in the domain.

Table 5. Rules of description of NF-requirements
NF-requirements
  Notion: Describe the NF-Requirement.
  Goals: Describe the goals to be achieved by the NF-requirement.

Besides the rules described above, the following tasks are also necessary in order to build KDEL:

- Apply techniques of discourse analysis to identify syntactic constructions that could hide tacit knowledge.
- Record, for each term in KDEL, questions or comments that will be raised with the domain specialists.

Table 6 depicts the representation of the term evaluator in a domain of cognitive diagnosis of multiple sclerosis patients.

Table 6. Structure of the term evaluator in KDEL
Term: Evaluator
  Classification: Subject
  Notion: The evaluator is a neuropsychologist certified in the cognitive diagnosis and rehabilitation of multiple sclerosis patients.
  Current Behavioral:
    - The evaluator applies the neuropsychological battery of tests for cognitive diagnosis to multiple sclerosis patients.
    - The evaluator proposes the rehabilitation cognitive training based on the result of the ...
    - The evaluator indicates to patients when the next test will be applied.
  Future Behavioral: The concept of evaluator disappears in the future system; its functionalities will be carried out by the software system.
  States: Not applicable.
  Questions:
    - Must the evaluator be a neuropsychologist?
    - How does the evaluator determine that the patient has multiple sclerosis?
    - How does he or she make the evaluation?
    - How does the evaluator define the date of the next test?

2.2 Limitations of KDEL
KDEL helps solution-solvers or product developers become familiar with the language of domain specialists; hence it minimizes the difficulties involved in describing the domain using a semi-formal or formal method. However, there are some drawbacks, such as the redundancy in the description of terms, which increases validation time. In addition, using KDEL to build a graphical conceptual model is a hard and time-consuming activity. Such a model is used by solution-solvers or product developers in two different ways: to facilitate the validation of the structure of the domain and to bring to light relationships that were hidden from domain specialists.

3. OWL ONTOLOGIES
An ontology is an explicit formal specification of how to represent the entities and relationships that exist in a domain. Ontologies describe the properties of a domain and support reasoning about it [4]. Ontologies have also been used to capture and synthesize knowledge from diverse domain specialists, especially when their knowledge depends on their own interests and points of view [6]. Thus, several authors have proposed the use of ontologies to formalize the application domain.

3.1 OWL Structure
OWL is a widely used formal language for ontologies [1], commonly written in RDF/XML syntax. The elements of an OWL ontology concern classes, properties, instances of classes, and relationships between these instances. This section presents the essential components of the language.

1) Classes are concrete representations of concepts; OWL classes are interpreted as sets of individuals with similar features. In RDF/XML, OWL classes are declared with the owl:Class element. The taxonomic constructor for classes is rdfs:subClassOf. It relates a more specific class to a more general class. If X is a subclass of Y, then every instance of X is also an instance of Y.

2) Individuals represent objects in the domain of discourse; they can be referred to as being instances of classes. In RDF/XML, an individual such as Individual_X is declared as a member of Class X through its rdf:type.

3) Properties are also known as roles in description logic or relations in UML and other object-oriented notations. In brief, they represent relationships. There are a number of ways to restrict the relation defined by a property: the domain and range can be specified, and the property can be defined to be a specialization (subproperty) of an existing property [36]. There are two main types of properties: object properties and datatype properties. Object properties are relationships between two individuals; for example, an object property can state that X has a link with Y. Datatype properties describe relationships between an individual and data values; for example, a datatype property can state that Person_X has an age whose value is a positive integer.

4. FROM KDEL TO AN OWL ONTOLOGY
As mentioned above, KDEL is a glossary of interrelated terms. They are classified as object, subject or verb and have a description. In particular, objects and subjects represent entities in the domain and have relationships with other terms, which are described in the current behavioral response. Thus, they correspond to OWL classes. Likewise, the relationships between concepts correspond to OWL properties.

A set of heuristics, that is, rules or methods that help to solve problems faster than an exhaustive computation would, has been proposed in order to facilitate the conversion of KDEL to OWL. They are listed as follows:

1) Each subject or object of KDEL becomes an OWL class.
2) For each KDEL construction with the structure X is synonym of Y/Z, the following OWL properties are created: X Is_Synonym_of Y and X Is_Synonym_of Z.
3) Descriptions of the terms (notion, current and future behavior) become OWL comments.
4) Relationships in KDEL with the syntactic structure Term + verb phrase + term are turned into OWL properties.
5) Definitions in KDEL are converted into OWL classes.
6) NF-Requirements are currently not part of the ontology; their handling is considered to be future work.

4.1 OWL Ontology Building Process
In order to build an OWL ontology from a KDEL lexicon, the following process is proposed. The process works together with the heuristics proposed above.

1) Perform a pre-processing of KDEL based on the rules of ...
2) Convert KDEL terms into OWL classes (Heuristics 1, 3, 5).
3) Convert KDEL relationships into OWL properties (Heuristic 4).
4) Convert KDEL synonyms into OWL classes and create the OWL property Is_Synonym_of between them (Heuristic 2).
5) Create an OWL file in RDF/XML format by integrating the classes, properties and individuals identified in the previous steps.
6) Name the file created in the last step with the name selected for the OWL ontology, including the file extension for OWL.

Figure 2 depicts the OWL building process in a SADT diagram.

Figure 2. OWL ontology building process from KDEL

4.2 Software Tool
The solution-solvers or product developers must learn the domain terms in a short period of time in order to reduce the symmetry of ignorance, improve the cognitive dialogue and find, with the domain specialists, the set of solution or product requirements. However, in Informally Structured Domains, the universe of discourse is frequently too large and specialized. In addition, the process of validating the terms is generally a boring and stressful task. Thus, a software system has been developed to support the handling of KDEL, with the aim of facilitating the building, maintenance and validation of the KDEL. The system is designed to be used by the design team and is able to manage several projects. The terms of KDEL are recorded in a relational database, following the structure shown in Figure 1.

This database is the input of another software system that executes the OWL ontology building process described in the previous section. The software can also represent the RDF/XML file in a graphical format. Thus, it allows the visualization of KDEL with the aim of facilitating the validation process by domain specialists. In addition, if domain specialists realize that the description of a term must be improved, the software allows this change and automatically reconstructs the KDEL and the OWL file.

Figure 3 depicts a screen with the graphical representation of the KDEL of the domain of cognitive diagnosis for multiple sclerosis patients. This project was developed for a real Mexican organization; thus the KDEL was developed in Spanish. However, it is not our intention to give a detailed description of the lexicon, but to demonstrate the utility of the software tool in facilitating the validation of the domain terms and their relationships. The visualization also allows domain specialists to discover relationships that were hidden from them. Therefore, the software system is also appropriate for knowledge discovery.

Figure 3. Screen for the Graphical Representation of an OWL ontology

5. APPLICATION IN ISD REAL CASES
Our proposal has been applied to generate OWL ontologies as part of the process for diverse solutions in real ISD cases, which are listed below:

- Software development of a Cognitive Rehabilitation System for Multiple Sclerosis Patients [10].
- Case-based Reasoning System to Support Heating, Ventilation and Air Conditioning (HVAC) Design Decisions.
- Method to develop Bayesian Networks for evaluation in Intelligent Tutoring Systems for complex domains.
- Analysis of the requirements elicitation process of an HVAC company.

The generation of the ontology in each project allowed the extraction of relevant information from KDEL, which led to a synthesized view of the information. This process also facilitated the correction of the following errors made in KDEL: 1) repeated information, 2) vaguely described ideas, 3) mixed or unfinished ideas and 4) typing errors. In summary, the OWL ontology building process improves KDEL, which facilitates its validation. In addition, the automatic generation of the OWL ontology significantly shortens the time and effort required to generate a graphical representation of the domain and contributes to the understanding of the ISD.

6. CONCLUSIONS AND FUTURE WORK
The application of KDEL and the generation of an OWL ontology from it, in order to provide a certain structure and facilitate the understanding of the domain by the solution-solvers or product developers in real ISD cases, showed that the process reduces the time needed to understand the domain. It also reduces the time it takes the domain specialists to validate the domain structure, thanks to the graphical domain visualization. Finally, the graphical domain visualization also facilitates the discovery of relationships that were hidden from the domain specialists, allowing the detection and correction of errors in KDEL.

As future work, it is necessary to apply the process in other real ISD cases in order to verify its effectiveness and improve it, if necessary.

7. ACKNOWLEDGMENTS
Our thanks to the Unit for Health Research (UIS, for its acronym in Spanish: Unidad de Investigación en Salud) for allowing us to work with them in the development of the rehabilitation system for multiple sclerosis patients.

8. REFERENCES
1. Dean Allemang and James Hendler. 2008. Semantic Web for the Working Ontologist: Effective Modeling in RDFS and OWL. Morgan Kaufmann Publishers Inc.
2. Dines Bjørner. 2010. Rôle of domain engineering in software development: Why current requirements engineering is flawed. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2–34. http://doi.org/10.1007/978-3-642-11486-1_2
3. Karen Koogan Breitman and Julio Cesar Sampaio do Prado Leite. 2003. Ontology as a requirements engineering product. In Proceedings of the 11th IEEE International Requirements Engineering Conference, 2003, 309–319.
4. Verónica Castañeda, Luciana Ballejos, Ma. Laura Caliusco, and Ma. Rosa Galli. 2010. The Use of Ontologies in Requirements Engineering. Global Journal of Research In Engineering 10, 6.
5. Luiz Marcio Cysneiros and Julio Cesar Sampaio do Prado Leite. 2004. Nonfunctional requirements: From elicitation to conceptual models. IEEE Transactions on Software Engineering 30, 5: 328–350.
6. Diego Dermeval, Jassyka Vilela, Ig Ibert Bittencourt, et al. 2015. Applications of ontologies in requirements engineering: a systematic review of the literature. Requirements Engineering, 1–33. http://doi.org/10.1007/s00766-015-0222-6
7. Shivani Goel. 2012. Transformation from LEL to UML. International Journal of Computer Applications 48, 12: 975–888.
8. Julio Cesar Sampaio do Leite, Ana P M Franco, and others. 1993. A strategy for conceptual model acquisition. In Proceedings of IEEE International Symposium on Requirements Engineering 1993, 243–246.
9. Karla Olmos and Jorge Rodas. 2013. Requirements engineering process model for informal structural domains. International Journal of Computer and Communication Engineering 2, 1: 75–77.
10. Karla Olmos and Jorge Rodas. 2014. KMoS-RE Knowledge Management on a Strategy to Requirements Engineering. Special Issue on Requirements Engineering in Software Product Line Engineering, Requirements Engineering Journal 19, 4: 421–440.
11. Julio Cesar Sampaio do Prado Leite, Jorge Horacio Doorn, Graciela D S Hadad, and Gladys N Kaplan. 2005. Scenario inspections. Requirements Engineering 10, 1: 1–21.
12. Pedro O Rossel, María Cecilia Bastarrica, Nancy Hitschfeld-Kahler, Violeta Díaz, and Mario Medina. 2014. Domain modeling as a basis for building a meshing tool software product line. Advances in Engineering Software 70: 77–89. http://doi.org/10.1016/j.advengsoft.2014.01.011
13. Matt Selway, Wolfgang Mayer, and Markus Stumptner. 2014. Semantic interpretation of requirements through cognitive grammar and configuration. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 8862: 496–510. http://doi.org/10.1007/978-3-319-13560-1
14. Katja Siegemund, Edward J Thomas, Yuting Zhao, Jeff Pan, and Uwe Assmann. 2011. Towards ontology-driven requirements engineering. In Proceedings of the Workshop Semantic Web Enabled Software Engineering at 10th International Semantic Web Conference (ISWC), Bonn.
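
As an illustration of the heuristics of Section 4 and of steps 2 to 5 of the building process in Section 4.1, the following minimal Python sketch converts a few KDEL terms into an RDF/XML string. It is not the software tool described in Section 4.2, whose implementation is not given in this paper. The Term record, the kdel_to_owl and add_property helpers, the namespace http://example.org/kdel# and the synonym "Examiner" are all hypothetical names introduced only for this example. Encoding each relationship as an owl:ObjectProperty with rdfs:domain and rdfs:range is likewise an assumption, one possible reading of Heuristics 2 and 4.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/kdel#"  # hypothetical ontology namespace

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


@dataclass
class Term:
    """Simplified, hypothetical KDEL term (only the fields the sketch needs)."""
    name: str                 # synonyms are separated by a slash, as in KDEL
    kind: str                 # "object", "subject", "verb", "definition" or "nf"
    notion: str = ""
    relations: list = field(default_factory=list)  # (verb phrase, target term)


def q(ns, tag):
    """Build a namespace-qualified name in ElementTree notation."""
    return f"{{{ns}}}{tag}"


def iri(name):
    """Turn a term or verb phrase into an IRI in the example namespace."""
    return BASE + name.strip().replace(" ", "_")


def add_property(root, verb, source, target):
    """Assumed encoding of Heuristics 2 and 4: an object property with domain and range."""
    prop = ET.SubElement(root, q(OWL, "ObjectProperty"), {q(RDF, "about"): iri(verb)})
    ET.SubElement(prop, q(RDFS, "domain"), {q(RDF, "resource"): iri(source)})
    ET.SubElement(prop, q(RDFS, "range"), {q(RDF, "resource"): iri(target)})


def kdel_to_owl(terms):
    """Apply Heuristics 1 to 5 and return the ontology as an RDF/XML string."""
    root = ET.Element(q(RDF, "RDF"))
    ET.SubElement(root, q(OWL, "Ontology"), {q(RDF, "about"): BASE.rstrip("#")})
    for term in terms:
        # Heuristic 6: NF-requirements are left out; verbs only appear as properties.
        if term.kind not in ("object", "subject", "definition"):
            continue
        names = term.name.split("/")
        for name in names:
            # Heuristics 1 and 5: subjects, objects and definitions become classes.
            cls = ET.SubElement(root, q(OWL, "Class"), {q(RDF, "about"): iri(name)})
            if term.notion:
                # Heuristic 3: the description becomes a comment.
                ET.SubElement(cls, q(RDFS, "comment")).text = term.notion
        for synonym in names[1:]:
            add_property(root, "Is_Synonym_of", names[0], synonym)  # Heuristic 2
        for verb, target in term.relations:
            add_property(root, verb, names[0], target)              # Heuristic 4
    return ET.tostring(root, encoding="unicode")


terms = [
    Term("Evaluator/Examiner", "subject", "Certified neuropsychologist.",
         [("applies", "Test battery")]),
    Term("Test battery", "object", "Set of neuropsychological tests."),
    Term("Usability", "nf"),
]
xml_text = kdel_to_owl(terms)
```

Writing xml_text to a file whose name ends in the conventional .owl extension would correspond to steps 5 and 6 of the process. Note that if the same property name were reused with several different domains or ranges, OWL semantics would read them as an intersection; a real tool would need to decide how to handle that case, which the sketch does not attempt.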