EnArgus – Ontology Based Search

Michael Dembach
Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE
Fraunhoferstrasse 20, 53343 Wachtberg, Germany
michael.dembach@fkie.fraunhofer.de

Lukas Sikorski
Fraunhofer Institute for Communication, Information Processing and Ergonomics FKIE
Fraunhoferstrasse 20, 53343 Wachtberg, Germany
lukas.sikorski@fkie.fraunhofer.de

Rudolf Ruland
Fraunhofer Institute for Applied Information Technology FIT
Schloss Birlinghoven, 53754 Sankt Augustin, Germany
rudolf.ruland@fit.fraunhofer.de

ABSTRACT
This paper presents the EnArgus Project, which aims to make energy research funding more transparent and includes the development of an ontology. The structure of the paper is as follows: first, we present the EnArgus project, its domain, and its main goals. Next, we describe the domain-specific ontology which we developed for EnArgus. This leads to a description of how that ontology has been constructed and evaluated. Finally, we describe how the whole system has been evaluated.

Keywords
Ontology Research, Ontology Visualization, Semantic Search, Energy Research

1. THE ENARGUS PROJECT
The EnArgus Project is sponsored by the Federal Ministry for Economic Affairs and Energy (BMWi) by decision of the German Bundestag.¹ The project aims to make federal subsidies policy in the field of energy research more transparent by facilitating the assessment of technology development. EnArgus's approach includes the close collaboration of experts from energy research and information science.²

EnArgus is building a central information system for energy research projects funded by the federal and state governments. Via this system, professionals as well as interested members of the public receive consistent and central access to information about energy research in the Federal Republic of Germany. For the general public, the system will be available as "EnArgus Public". Professionals who receive special authorization may access "EnArgus Master". These two versions exist for data security reasons: the system uses official databases for research funding which contain protected data and are therefore only accessible with special privileges.

The system works on a database of the BMWi which contains all energy projects funded by the Federal Republic of Germany. To use the system, the user may enter a single keyword. For example, in order to answer the question "Which of Germany's states fund how many projects for the development of wind power plants?", the keyword "Windkraftanlage" (wind power plant) should be entered. With that keyword alone, about 136 projects will be found. This number might be considered too low [1]. In that case, the system provides ways of enlarging the search result, e.g., the use of synonyms which are already in the system. With the use of synonyms, the query for "Windkraftanlage" will deliver 1682 projects, mostly because a more common synonymous term is then included in the search [2]. To handle this large number of projects, the distribution may be shown in different clusters depending on parameters like "start of project", "amount of money granted" and, as was the original intent, "state".

¹ EnArgus has been sponsored in two phases by the Federal Ministry for Economic Affairs and Energy under the funding code 03ET1064 (Phase 1: July 2011 to June 2013) and under the funding code 03ET4010 (Phase 2: July 2013 to June 2016 as EnArgus 2.0).

² The project partners in the first phase were Fraunhofer Institute for Applied Information Technology (FIT), Fraunhofer Institute for Communication, Information Processing and Ergonomics (FKIE), Fraunhofer Institute for Systems and Innovation Research (ISI), Fraunhofer Institute for Environmental, Safety and Energy Technology (UMSICHT), Ruhr University Bochum's Chair for Energy Systems and Energy Economy (LEE), Forschungszentrum Jülich's Institute of Energy and Climate Research, section Technology Development (IEK-STE), and OrbiTeam Software GmbH (Bonn). IEK-STE dropped out after phase one. In the current second phase, the Institute for Water Supply, Wastewater Technology, Material Flow Management and Resource Economy and Spatial and Infrastructure Planning (IWAR) of the TU Darmstadt, the Materials Testing Institute (MPA) of the University of Stuttgart, the Zentrum für Beratungssysteme in der Technik (ZEDO) and bense.com joined as new project partners. Fraunhofer FIT is responsible for the coordination, and Project Management Jülich supervises the project in the name of the Federal Ministry for Economic Affairs and Energy.

2. THE DOMAIN-SPECIFIC ONTOLOGY
The backbone of any search process in EnArgus is its domain-specific ontology about energy research. In this ontology the knowledge from the fields of energy, energy research and energy research funding is formally stored, so it can be used by the whole system. When using EnArgus Public, the user is able to incorporate alternative terms (synonyms) suggested by the system into the query. EnArgus Master will additionally suggest terms which are semantically related to the keyword (hyponyms, hypernyms and terms resulting from certain other relations, e.g., meronyms). The user can choose which of those terms are to be included in the query. The connection of those terms to the initial keyword is called a semantic relation [4][5].

Especially useful are the taxonomic relations (hyponym and hypernym) and alternative terms (synonyms). With the help of synonyms it is possible to find a relevant element even if it uses a different term than the keyword. The taxonomic relations, which are always given in an ontology, are also useful because a project which refers to B also refers to A when B is a hyponym of A. For example, when searching for projects which refer to "wind power plants", it is also useful to look for "lift-based wind turbines", which are a special kind of "wind power plants". Another kind of alternative term is the translation into a different language (especially from German to English); translations are stored just like synonyms.

The next important semantic relation is meronymy (the part-of relation), which appears in certain parts of the ontology. Meronymy is used in classes of concrete objects and in classes of processes. With respect to the semantic search, meronymy fulfills a function similar to hyponymy. A typical example is a project in which the characteristics of specific membranes are examined and these membranes are parts of batteries. This project belongs to the projects for the improvement of batteries, even if the project title is about the membranes and the concept "battery" isn't found in the project description.

As a rule, further semantic relations are only defined between specific classes and stored in the ontology. One example is "use" (actually "use_as_energy_source"). This relation exists between power stations and energy sources. So the energy source "sun" (also represented in the ontology as a class) is assigned to the classes of the solar power plants. By this representation the energy source used, in this case "sun", is transmitted to all subclasses and individuals of "solar power plant". Thus the concept "sun" can be offered to the user whenever the name of a subclass of solar power plant or the name of an appropriate individual is entered.

3. CONSTRUCTION AND EVALUATION OF THE ENARGUS ONTOLOGY
The construction of the domain-specific ontology is the part where the collaboration between experts from information science and energy research becomes important. There are several different ways of developing an ontology [3]. Our basic idea is that the experts for energy research hold the relevant knowledge while the experts for information science know how this knowledge can be integrated into the system in a helpful way.

Both parties work together in the following way: the experts for energy research write short articles similar to those in Wikipedia. These articles serve as the source of knowledge for the experts from information science. With regard to syntax and vocabulary, these texts make use of simple structures and words of common understanding, because they are also stored in the EnArgus system and will later serve as a source of further information for non-experts. For the information scientists, it is important that those texts include the semantic relations mentioned in section 2 because they create the ontology out of the texts. Additionally, the energy experts draw mind maps containing those concepts they consider crucial for the ontology.

In the next step, the energy experts evaluate the ontology to make sure that the knowledge has been integrated correctly. For that step the ontology is visualized as a hyperbolic tree (hyper tree). The term in question is at the center of the visualization and the related terms emerge from this root like branches. For a wider evaluation, the root can be changed at any point by navigating through the branches. This changes the focus and provides new branches.

The hyper trees allow a flexible search depth: one can choose depth "1" if one only wants to see direct relations, or a higher depth to see and evaluate relations which are more indirect. In other words, it is possible to determine the semantic radius for a term such that the user may adjust the visualization according to her/his own demands and preferences. This is a significant advantage when evaluating the ontology. The visualization is also integrated into the system and will be open to the public.

4. EVALUATION OF THE ENARGUS SYSTEM
At the end of the first project phase, the EnArgus system was evaluated in two expert workshops. During these workshops, external energy experts were asked to test various sorts of queries using the system. Some search problems were pre-defined, and the experts tackled these problems on the one hand with a standard search, and on the other hand with a search supported by the ontology. As expected, it turned out that the searches supported by the ontology found significantly more relevant projects (see [2] for details on these results). In addition, the experts did searches on self-defined problems. In order to get valuable hints and ideas for improvement, the experts were asked to express their suggestions and requirements. Finally, they were asked to answer usability questions on a questionnaire.

Most of the experts' criticism was aimed at the wiki-texts. Some criticized that the wiki-texts were more suitable for laymen than for experts; however, such criticism was unwarranted as the texts were intended for a general audience. Another point of criticism concerned the comparison between the use of the public version and the use of the master version. While the public version was rated as very understandable and intuitive to use, numerous suggestions for improvement were presented for the use of the master version. Most of these suggestions have now been implemented. The general principle of the EnArgus system was rated as worthwhile and sensible. The system was recommended for completion (in order to cover more fields of energy research) and rollout.

As a further indicator of EnArgus' quality, the cover rate was evaluated, i.e., how many current projects assigned formally to the area of energy research in the database of the BMWi were found by queries with the concepts supported by the ontology. This rate was 86%, which may be rated as a positive result, since energy research is a very broad and diverse field, and since, at the time of the above-mentioned evaluation (at the end of the first project phase), not all areas of energy research were represented in detail in the ontology. Again, the evaluation led to additions and improvements, while more notations and abbreviations were added to the concepts. It was decided to carry out cover analyses regularly to ensure the quality of the ontology.

The work on EnArgus has shown that the fields of energy and energy research are extraordinarily wide in scope and complexity, and often require very specific knowledge. Therefore, it will always be necessary for the system to be re-evaluated and adapted to the latest developments. At the same time, developers and users must be aware that not every detail can be represented.

5. OUTLOOK
The EnArgus Project is still running and its ontology, currently holding about 2,400 classes, is expected to grow to over 3,000 by the end of 2016. We are confident that EnArgus is a valuable contribution to the field of energy research, since this area will grow and become even more important in the future [6], and that our approach of having experts from energy research and information science working closely together provides a high level of informational integrity.

6. LITERATURE
[1] Hoppe, T. (2014). Modellierung des Sprachraums von Unternehmen – Was man nicht beschreiben kann, das kann man auch nicht finden. In: Humm, B., Reibold, A. & Ege, B. (Eds.), Corporate Semantic Web. Berlin: Springer.

[2] Schade, U., Bense, B., Dembach, M. & Sikorski, L. (2015). Semantische Suche im Bereich der Energieforschungsförderung. Nutzen, Entwicklung und Evaluation einer Fachontologie. In: Humm, B., Ege, B. & Reibold, A. (Eds.), Corporate Semantic Web. Wie semantische Anwendungen in Unternehmen Nutzen stiften. Berlin: Springer.

[3] Schäfermeier, R. (2015). Verteilte und agile Ontologieentwicklung. In: Humm, B., Ege, B. & Reibold, A. (Eds.), Corporate Semantic Web. Wie semantische Anwendungen in Unternehmen Nutzen stiften. Berlin: Springer.

[4] Staab, S. & Studer, R. (Eds.) (2004). Handbook on Ontologies in Information Systems. Berlin: Springer.

[5] Studer, R., Benjamins, R. & Fensel, D. (1998). Knowledge Engineering: Principles and Methods. Data & Knowledge Engineering, 25, 161-198.

[6] Wietschel, M., Arens, M., Dötsch, C., Herkel, S., Krewitt, W., Markewitz, P., Möst, D. & Scheufen, M. (2010). Energietechnologie 2050 – Schwerpunkte für Forschung und Entwicklung. ISI-Schriftenreihe Innovationspotenziale. Karlsruhe: Fraunhofer ISI.
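
The term suggestion described in Section 2 (synonyms for every user, plus hyponyms and meronyms within a chosen semantic radius for authorized users) can be illustrated with a small sketch. This is not the EnArgus implementation: the dictionary layout, the function name suggest_terms, and the parameters radius and master are assumptions introduced here for illustration. Only the example terms are taken from the text above.

```python
from collections import deque

# Toy ontology fragment. The terms come from the paper's examples;
# the dictionary-based layout is an assumption made for this sketch.
SYNONYMS = {"wind power plant": {"Windkraftanlage"}}          # translations stored like synonyms
HYPONYMS = {"wind power plant": {"lift-based wind turbine"}}  # term -> more specific terms
MERONYMS = {"battery": {"membrane"}}                          # whole -> parts


def suggest_terms(keyword, radius=1, master=False):
    """Return the keyword plus the terms that could be suggested for it.

    Without `master`, only synonyms are suggested (as in the public version).
    With `master`, hyponyms and meronyms up to `radius` steps away are
    collected first, then synonyms of every collected term are added.
    """
    reached = {keyword}
    if master:
        frontier = deque([(keyword, 0)])
        while frontier:
            term, depth = frontier.popleft()
            if depth == radius:
                continue  # do not look beyond the chosen semantic radius
            related = HYPONYMS.get(term, set()) | MERONYMS.get(term, set())
            for nxt in related - reached:
                reached.add(nxt)
                frontier.append((nxt, depth + 1))
    terms = set(reached)
    for term in reached:
        terms |= SYNONYMS.get(term, set())
    return terms


if __name__ == "__main__":
    print(sorted(suggest_terms("wind power plant")))
    print(sorted(suggest_terms("wind power plant", master=True)))
    print(sorted(suggest_terms("battery", master=True)))
```

In this sketch the function only proposes terms; as in the system described above, the user would still decide which of them to include in the query. The breadth-first traversal with a depth limit mirrors the adjustable search depth of the hyper tree visualization.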