OWL-based form generation and structured data acquisition

Rafael S. Gonçalves∗, Csongor I. Nyulas, Samson W. Tu, and Mark A. Musen
Stanford Center for Biomedical Informatics Research
Stanford University, Stanford, California, USA

∗ To whom correspondence should be addressed: rafaelsg@stanford.edu
Copyright © 2015 for this paper by its authors. Copying permitted for private and academic purposes.

ABSTRACT

We present a tool that is capable of generating Web forms from (question and answer) descriptions encoded in an OWL ontology. Unlike a regular form, the input fields of the generated form are associated with ontology concepts, and so the form is a means to acquire data to populate the ontology. The structure of this data is given by the modeling of questions and answers in the ontology, which makes the system flexible to different needs and goals. The tool is open-source, and freely distributed as a Web application.

1 SYSTEM DESCRIPTION

The Web Ontology Language (OWL) [4], being based on description logics (DL) [3], is not as amenable for structured data acquisition as a frame-based language; Protégé-Frames used definitions of classes in an ontology to generate knowledge-acquisition forms, which could be used to acquire instances of the classes [1]. This is not as straightforward with OWL, since class definitions are collections of axioms.

We describe a system that we implemented to: (a) generate Web forms from logical descriptions of questions and answers in an OWL ontology, and (b) acquire data from generated forms that is structured according to concepts in the ontology. We implemented our form generation and data acquisition tool mostly in Java, using the OWL API v4.0.1 [2].¹ The automatically-generated front-end of the form involves HTML, CSS and JavaScript. The source code of the tool is publicly available on GitHub.²

The inputs required from users in order to use this tool are: firstly, an OWL representation of the form structures (questions, sections, etc.), and descriptions of the meaning of those structures (that is, whether the answer should be a string, integer, an OWL individual, etc.). We provide with our system a so-called datamodel ontology that users should extend in order to model their form(s), that is, user-defined questions should be inferred to be instances of datamodel:Question. Secondly, the view specification that is given by an XML file specifying user-interface aspects; for example, the organization of questions into sections, the order of questions, and more advanced options discussed further on. So, in order to use our software, a user will have to model questions and their descriptions in OWL, and then specify the layout and behavior of the resulting form in XML.

The tool takes as input the mentioned user-defined XML configuration (which should contain a pointer to the ontology specifying the content of the form, as well as pointers to imported ontologies), generates a Web form, and then parses and outputs form answers in CSV, RDF and OWL formats. The entire process is further described below.

(1) Form generation – Steps to produce a form:
    (a) Process XML configuration, gathering form layout information, IRIs and bindings to ontology entities
    (b) Extract from the input ontology all relevant information pertaining to each form element:
        (b.1) Text to be displayed (e.g., section header, question text)
        (b.2) Options and their text, where applicable
        (b.3) The focus of each question
    (c) Generate the appropriate HTML and JavaScript code
(2) Form input handling – Once the form is filled in and submitted:
    (a) Process answer data and create appropriate individuals
    (b) Produce a partonomy of the individuals created in (2.a) that mirrors the layout structure given in the configuration
    (c) Return the (structured) answers to the user in a chosen format

A key design choice of our system was to divide the specifications of user-interface aspects of the form (given by the XML file) and the content of the form (given by the OWL ontology). The user-defined XML configuration (1.a) specifies: input and output information of the tool, bindings to ontology entities, and layout of form elements. A document type definition (DTD) defines the building blocks of such configuration files, imposing necessary constraints to ensure the configuration file can be suitably interpreted. The key XML elements are:

input: contains an ontology child element, and optionally a child element named imports
    ◦ ontology: absolute path or URL to the form specification ontology (e.g., DBQ ontology)
    ◦ imports: contains ontology child elements, which have an attribute iri, giving the IRI of the imported ontology
output: contains the following child elements
    ◦ file: defines, via a title attribute, the title of the form. Optionally, a path can be specified within the file element where the HTML form file should be serialized
    ◦ cssStyle: the CSS style class to be used in the output HTML
bindings: defines mappings to ontology entities, such as what data property is used to state the text of a question, or section headings
form: defines the layout and behaviors of the form

More detailed implementation and configuration details can be found in the GitHub project wiki.

¹ http://owlapi.sourceforge.net
² http://github.com/protegeproject/facsimile

2 FEATURE SUMMARY

We briefly present the features of our system below.

Question triggering: a question can encode a key-value pair where the key is “showSubquestionsForAnswer” (or “hideSubquestionsForAnswer”) and the value is an IRI, which informs the view that when the answer corresponding to that IRI is selected, the question’s subquestions should appear (or disappear, respectively).

Question types: the allowed question types in the generated form correspond to the HTML input-element types, with the addition of a pre-styled element: “checkbox-horizontal”. By default checkbox inputs will be laid out vertically, hence the addition of the horizontal option.

Option ordering: answer options for a question can be given by an OWL enumeration, and our tool will order these options alphabetically by default. However, one may want to customize this order, perhaps to shift only one element or to re-order the whole set manually. This can be done in the definition of questions by inserting a key-value pair “orderOption” with the value being the desired order w.r.t. the default one. That is, if we want the (alphabetically-ordered) first element to appear last, we would have a value “*;1”, which states: put the first element last, and everything else as it was.

Repeated question lists: each question list can be repeated a specified number of times, for example, in order to collect details of multiple family members.

Inline question lists: questions within “inline” question lists can be laid out horizontally rather than vertically (the default), by specifying the type of question list as “inline”.

3 FUTURE PLANS

In the future we plan to make our software more versatile with the usage of XML Schema datatypes that are part of the OWL 2 specification datatype map. Another one of our goals is to design and implement a mechanism to facilitate the specification of forms, for instance, an interface to produce the required XML file.

ACKNOWLEDGMENTS

This work is supported in part by contract W81XWH-13-2-0010 from the U.S. Department of Defense, and grants GM086587 and GM103316 from the U.S. National Institutes of Health (NIH).

REFERENCES

[1] Eriksson, H., Puerta, A. R., and Musen, M. A. (1994). Generation of knowledge-acquisition tools from domain ontologies. Int. J. of Human-Computer Studies, 41, 425–453.
[2] Horridge, M. and Bechhofer, S. (2009). The OWL API: A Java API for working with OWL 2 ontologies. In Proc. of OWLED-09.
[3] Horrocks, I., Kutz, O., and Sattler, U. (2006). The even more irresistible SROIQ. In Proc. of KR-06.
[4] Motik, B., Patel-Schneider, P. F., and Parsia, B. (2009). OWL 2 Web Ontology Language: Structural specification and functional-style syntax. W3C recommendation.
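
As an illustration of the option ordering feature, the following is a minimal Python sketch of how an “orderOption” value such as “*;1” might be interpreted. It is not the tool's Java implementation. The function name `order_options` and the general grammar are assumptions made for this sketch: tokens are separated by “;”, a number is a 1-based position in the alphabetical default order, and “*” stands for all remaining options in their default order. The paper itself only documents the “*;1” case.

```python
def order_options(options, order_spec=None):
    """Return answer options in display order.

    Hypothetical helper: the grammar of `order_spec` is an assumption
    generalized from the single documented example "*;1".
    """
    default = sorted(options)  # alphabetical default order
    if not order_spec:
        return default

    tokens = [t.strip() for t in order_spec.split(";") if t.strip()]
    if tokens.count("*") > 1:
        raise ValueError("at most one '*' placeholder is supported")

    # 1-based positions (w.r.t. the default order) that are placed explicitly
    pinned = [int(t) for t in tokens if t != "*"]
    if len(set(pinned)) != len(pinned):
        raise ValueError("duplicate position in order specification")
    for pos in pinned:
        if not 1 <= pos <= len(default):
            raise ValueError(f"position {pos} is out of range")

    # Options not mentioned explicitly keep their default relative order
    rest = [opt for i, opt in enumerate(default, start=1) if i not in pinned]

    result = []
    for token in tokens:
        if token == "*":
            result.extend(rest)
        else:
            result.append(default[int(token) - 1])
    if "*" not in tokens:
        # Assumption: unmentioned options are appended at the end
        result.extend(rest)
    return result


if __name__ == "__main__":
    # Alphabetical default is ["Maybe", "No", "Yes"]; "*;1" moves "Maybe" last
    print(order_options(["No", "Yes", "Maybe"], "*;1"))
```

Under these assumptions, the value “*;1” applied to the options No, Yes and Maybe yields No, Yes, Maybe, since the alphabetically first option (Maybe) is moved to the end and everything else stays as it was.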
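
To make the XML configuration elements listed in the system description more concrete, here is a hedged Python sketch that parses a small example configuration with the standard library. The element names input, ontology, imports, output, file, cssStyle, bindings and form, and the attributes iri and title, come from the description above. The root element name `configuration`, the placement of paths and the style class as element text, all sample values, and the helper `read_config` are assumptions for illustration; the actual DTD in the project wiki is authoritative.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: root name, paths, IRI and style value are made up.
SAMPLE_CONFIG = """\
<configuration>
  <input>
    <ontology>/path/to/form-ontology.owl</ontology>
    <imports>
      <ontology iri="http://example.org/datamodel"/>
    </imports>
  </input>
  <output>
    <file title="Example form">/path/to/form.html</file>
    <cssStyle>form-style</cssStyle>
  </output>
  <bindings/>
  <form/>
</configuration>
"""


def read_config(xml_text):
    """Collect the input and output settings from a configuration document."""
    root = ET.fromstring(xml_text)
    input_el = root.find("input")
    output_el = root.find("output")
    file_el = output_el.find("file")
    return {
        # Direct child of <input>: location of the form specification ontology
        "ontology": input_el.findtext("ontology").strip(),
        # IRIs of imported ontologies, taken from the iri attribute
        "imports": [o.get("iri") for o in input_el.findall("imports/ontology")],
        # Form title from the title attribute of <file>
        "title": file_el.get("title"),
        # Optional serialization path given within the <file> element
        "html_path": (file_el.text or "").strip() or None,
        "css_style": output_el.findtext("cssStyle"),
    }


if __name__ == "__main__":
    print(read_config(SAMPLE_CONFIG))
```

The bindings and form elements are left empty here because their child elements are not described in this paper.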