Question Answering on RDF Data based on Grammars Automatically Generated from Lemon Models

Mohammad Fazleh Elahi¹ [0000-0002-8843-9039], Basil Ell¹,² [0000-0002-8863-3157], Frank Grimm¹ [0000-0002-7045-8055], and Philipp Cimiano¹ [0000-0002-4771-441X]

¹ CITEC, Universität Bielefeld, Germany
{melahi,bell,fgrimm,cimiano}@techfak.uni-bielefeld.de
http://www.sc.cit-ec.uni-bielefeld.de/home/
² Department of Informatics, University of Oslo, Norway
basile@ifi.uio.no

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0)

Abstract. Many question answering (QA) systems over RDF induced from question-query pairs using some machine learning technique suffer from a lack of controllability, making the governance and incremental improvement of the system challenging, not to mention the initial effort of collecting and providing training data. As an alternative, we present a model-based QA approach that uses an ontology lexicon in lemon format and automatically generates a lexicalized grammar used to interpret and parse questions into SPARQL queries. The approach gives maximum control over the QA system to the developer, as every lexicon extension increases the coverage of the grammar, and thus of the QA system, in a predictable way. We describe our approach to generating grammars from lemon lexica and show how these grammars generate specific questions that we index to support fast QA performance in a prototype that answers questions with respect to DBpedia.

Keywords: question answering, RDF, grammar generation

1 Introduction

As the amount of structured data on the Web increases, there is an increasing demand for interfaces that simplify the access and browsing of data by end-users. Approaches to QA over RDF data based on machine learning (see [3] for an overview of deep learning methods applied to QALD and [1] for an overview of recent work on natural language interfaces to databases) face, however, a number of limitations with respect to the governance and maintenance of the QA system. First of all, the provisioning of training data represents a substantial effort and, although transfer learning from an existing dataset to another domain can be applied [5], one typically needs at least a small amount of training data from the target domain to obtain decent performance. Most importantly, QA models induced from training data are not controllable in the sense that it is a priori unclear which questions the model will interpret correctly. Further, the impact of adding a single or a few training examples is not predictable in terms of which additional questions the model will be able to cover.

In order to overcome the problems related to machine learning-based approaches to question answering over RDF, we explore a model-based approach to QA in which a developer of the QA system provides a lexicon in lemon format [7] specifying how the vocabulary elements are realized in natural language. The main benefit of the approach is that it is fully controllable in the sense that the impact of extending the lexicon on the questions covered by the system can be predicted. To realize the system, we build on our previous work showing how question answering grammars can be automatically generated from lemon lexica [2]. Earlier results showed that, if the questions of the QALD-7³ dataset are reformulated in terms of questions that the grammar can cover, F-measures of up to 62.5% can be achieved. In this paper, we present a QA system that builds on the grammar generation functionality described in that work. As the main contribution, we show that our approach can scale to large numbers of questions and that the performance of the system is practically real-time from an end-user perspective.
We apply our approach to DBpedia and describe the implementation of a QA system that can answer more than 1.8 million questions. The system is available at https://scdemo.techfak.uni-bielefeld.de/quegg/.

³ http://qald.aksw.org/

2 Generating Grammars from Ontology Lexica

Our approach automatically generates lexicalized regular grammars from lexical entries in a lemon lexicon [7] for different parts-of-speech and syntactic behaviours. The approach to grammar generation for (relational) nouns (e.g. ‘capital of’), transitive verbs (e.g. ‘(to) direct’), intransitive verbs subcategorizing a prepositional argument (e.g. ‘(to) flow through’), and intersective adjectives (e.g. ‘Spanish’) was described in previous work [2]. For the sake of self-containedness, we describe the generation of grammar rules for (relational) nouns and the newly deployed gradable adjectives (e.g. ‘high/higher/highest’).

[Figure: diagram with the components grammar rule generator (Automatic QA Grammar Generation), grammar rules (Rule 1, Rule 2), question/SPARQL pairs, prefix tree, QA system, SPARQL endpoint, and user input text. Example shown: the question ‘What is the largest city in United States?’ with the query select ?o{res:United_States dbo:largestCity ?o} and the answer ‘New York city’.]
Fig. 1. The architecture of the QueGG question answering system

The lemon entry for the relational noun ‘capital (of)’ states that the entry has a NounPPFrame [4] that corresponds to a copulative construction ‘X is the capital of Y’ (see [2]). The grammar generation approach for the lemon entry generates the following questions: 1) ‘What is the capital of X?’, 2) ‘What was the capital of X?’, 3) ‘Which city is the capital of X?’, 4) ‘Which city was the capital of X?’. The X position can be either a particular country, e.g. ‘Germany’, or a noun phrase, e.g. ‘country where German is spoken’.
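The instantiation of concrete questions from such a rule can be illustrated with a minimal sketch. It is not the authors' implementation: the four sentence templates are taken from the text, while the query shape (modelled on the example in Fig. 1), the property dbo:capital, the resource names, and all Python names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionQueryPair:
    question: str
    sparql: str


# Sentence templates for the NounPPFrame entry 'capital (of)', as listed in the text.
CAPITAL_TEMPLATES = [
    "What is the capital of {x}?",
    "What was the capital of {x}?",
    "Which city is the capital of {x}?",
    "Which city was the capital of {x}?",
]

# Assumed query shape, modelled on the example query shown in Fig. 1.
QUERY_TEMPLATE = "SELECT ?o WHERE {{ {subject} {prop} ?o }}"


def instantiate(templates, prop, bindings):
    """Expand every template with every (label, resource) binding."""
    pairs = []
    for label, resource in bindings:
        sparql = QUERY_TEMPLATE.format(subject=resource, prop=prop)
        for template in templates:
            pairs.append(QuestionQueryPair(template.format(x=label), sparql))
    return pairs


# Hypothetical bindings and property; a real system would read them from the
# lexicon and the dataset rather than hard-coding them.
bindings = [("Germany", "res:Germany"), ("France", "res:France")]
pairs = instantiate(CAPITAL_TEMPLATES, "dbo:capital", bindings)

for pair in pairs[:2]:
    print(pair.question, "->", pair.sparql)
```

In this sketch the number of generated pairs is the number of templates times the number of bindings, which is one way to see why a single lexical entry can yield a large and predictable set of questions.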
A second grammar rule for relational nouns, not shown in detail here (see [2]), generates noun phrases such as ‘the capital of X’. The code for our grammar generation is available on GitHub.⁴

Gradable adjectives are modelled along the proposal of McCrae et al. [6] and are represented using the lemonOILS⁵ ontology. The lexical entry high is expressed through the concept oils:CovariantScalar, indicating that the adjective is covariant with its bound property dbo:elevation. The lexical entry allows our approach to cover the following questions: 1) ‘How high is X?’ and 2) ‘What is the highest X?’. At position X, the label of individuals of type ArchitecturalStructure can be inserted.

⁴ https://github.com/fazleh2010/question-grammar-generator
⁵ http://lemon-model.net/oils

3 System Architecture and Implementation

The core component of the model-based QA system (Figure 1) is the grammar generator, which takes a lemon lexicon as input and automatically creates lexicalized grammar rules, as shown in Section 2, from which pairs of concrete questions and SPARQL queries can be instantiated. The QA component comprises a server-side index of question-query pairs and a user-facing web application. The former builds an efficient data structure in order to index the question data for later retrieval. The latter is able to a) assist the user in finding the right question through auto-complete functionality and b) present results given by the SPARQL endpoint in a comprehensive manner.

Frame type                  #Entries  #Grammar rules  #Questions
NounPPFrame                      211             424     1060234
TransitiveFrame                   32             107      585845
IntransitivePPFrame               52             106      151040
AdjectiveAttributiveFrame         33             130       41425
AdjectiveGradableFrame             8              24        9150
Total                            336             791     1847694

Table 1. Frequencies of entries with a certain frame type

The index of question-query pairs is a server-side prefix tree built from pre-generated questions.
While the initial inserts are expensive (O(n)), the structure allows very quick lookups. The tree is populated with lower-cased character sequences of questions. Costly tree maintenance is alleviated by indexing content in stages: an initial bulk import and subsequent updates. All application launches, as well as new data insertions, then rely on a previously stored state. All input in the application’s query field is periodically pushed to the server, where the tree is then queried for question nodes matching the (lower-cased) input in order to generate auto-complete suggestions. Incomplete questions yield a number of suggestions by means of a breadth-first search limited to a maximum depth of five levels. This produces the most relevant completion paths for the given query. If the maximum number of suggestions is not reached by this search, a second search adds specific questions to the list. When a user reaches a specific question or enters enough information to promote an answerable leaf node to the top of the suggestion list, the system attempts to resolve it. The SPARQL query associated with the active question is sent to the endpoint and various metadata is rendered alongside the answer.

We apply our system to the DBpedia dataset (Release 2016-10; core, links, and English core-i18n) using 336 manually created lexical entries; the spreadsheets are available at https://scdemo.techfak.uni-bielefeld.de/quegg-resources/. Every row added to these spreadsheets increases the coverage of the grammar and generates tens of thousands of new questions. Table 1 shows the number of grammar rules and questions generated for each syntactic type. Altogether, the approach generates 791 grammar rules and about 1.8 million questions. The source code can be obtained via GitHub.⁶
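The prefix-tree index and the two-stage suggestion lookup described above can be sketched as follows. This is a hedged illustration, not the code of the QueGG web application: the class and method names are hypothetical, the suggestion limit of 10 is an assumption, and the "five levels" of the breadth-first search are interpreted here as five characters below the matched prefix, which the text does not specify.

```python
from collections import deque


class TrieNode:
    def __init__(self):
        self.children = {}
        self.entry = None  # (question, sparql) if a complete question ends here


class QuestionIndex:
    def __init__(self, max_depth=5, max_suggestions=10):
        self.root = TrieNode()
        self.max_depth = max_depth              # depth limit of the first search
        self.max_suggestions = max_suggestions  # assumed suggestion limit

    def insert(self, question, sparql):
        """Index a question under its lower-cased character sequence."""
        node = self.root
        for ch in question.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.entry = (question, sparql)

    def _find(self, prefix):
        node = self.root
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def lookup(self, question):
        """Return the (question, sparql) pair for a complete question, if any."""
        node = self._find(question)
        return node.entry if node is not None else None

    def _bfs(self, start, depth_limit):
        """Yield complete questions below start, nearest first."""
        queue = deque([(start, 0)])
        while queue:
            node, depth = queue.popleft()
            if node.entry is not None:
                yield node.entry[0]
            if depth_limit is None or depth < depth_limit:
                for ch in sorted(node.children):
                    queue.append((node.children[ch], depth + 1))

    def suggest(self, prefix):
        """Depth-limited search first, then an unbounded one to fill the list."""
        start = self._find(prefix)
        if start is None:
            return []
        found = []
        for limit in (self.max_depth, None):
            for question in self._bfs(start, limit):
                if question not in found:
                    found.append(question)
                if len(found) == self.max_suggestions:
                    return found
        return found


index = QuestionIndex()
index.insert("What is the capital of Germany?",
             "SELECT ?o WHERE { res:Germany dbo:capital ?o }")
index.insert("What is the capital of Ghana?",
             "SELECT ?o WHERE { res:Ghana dbo:capital ?o }")
print(index.suggest("what is the capital of g"))
```

Because completions found by the depth-limited search are listed first, questions that are only a few characters away from the current input rank ahead of longer ones; the unbounded second pass only fills remaining slots.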
The user-based evaluation of the system involved 161 students (University of Bielefeld) who were asked to enter 5 questions⁷ given in German into the English-language QA interface. We evaluate two performance indicators: a) Effectiveness: the accuracy and completeness with which participants can ask a question to the system and b) Answer Satisfaction: whether participants find the returned answers acceptable. The results show that the tool is intuitively usable, achieving effectiveness and satisfaction rates of 71%–99% and 46%–95%, respectively. The average SUS (System Usability Scale) score obtained is 62.06, which indicates room for improvement.

⁶ https://github.com/ag-sc/QueGG-web
⁷ https://forms.gle/B5cjuX5rncxHi1Bx6

4 Conclusions

We presented an approach to developing QA systems over RDF datasets that relies on the automatic generation of grammars from corresponding lemon lexica that describe how elements of the dataset are verbalized in natural language. In contrast to machine learning based approaches that induce a model from question-query pairs, our approach is declarative in that the developer of the system defines the questions that can be covered by the system by providing a lemon lexicon. The approach is controllable since the introduction of a lexical entry increases the question coverage in a fully predictable way. We have described how an efficient QA system can be implemented on the basis of the automatically generated grammars by indexing the questions and queries using a prefix tree. Our proof-of-concept implementation over DBpedia covers 1.8 million questions generated from 336 lemon entries. In future work, we intend to start a community project in which the community can contribute to extending the lexicon and the set of grammar rules, and to adapting the grammar generation to other languages.

References

1. Affolter, K., Stockinger, K., Bernstein, A.: A comparative survey of recent natural language interfaces for databases. VLDB Journal 28, 793–819 (2019)
2. Benz, V., Cimiano, P., Elahi, M.F., Ell, B.: Generating Grammars from lemon lexica for Questions Answering over Linked Data: a Preliminary Analysis. In: NLIWOD workshop at ISWC. vol. 2722, pp. 40–55. CEUR-WS.org (2020)
3. Chakraborty, N., Lukovnikov, D., Maheshwari, G., Trivedi, P., Lehmann, J., Fischer, A.: Introduction to Neural Network based Approaches for Question Answering over Knowledge Graphs. CoRR abs/1907.09361 (2019)
4. Cimiano, P., Buitelaar, P., McCrae, J.P., Sintek, M.: LexInfo: A declarative model for the lexicon-ontology interface. JWS 9(1), 29–51 (2011)
5. Maheshwari, G., Trivedi, P., Lukovnikov, D., Chakraborty, N., Fischer, A., Lehmann, J.: Learning to Rank Query Graphs for Complex Question Answering over Knowledge Graphs. In: ISWC Conference. pp. 487–504 (2019)
6. McCrae, J.P., Quattri, F., Unger, C., Cimiano, P.: Modelling the Semantics of Adjectives in the Ontology-Lexicon Interface. In: CogALex Workshop (2014)
7. McCrae, J.P., Spohr, D., Cimiano, P.: Linking lexical resources and ontologies on the semantic web with lemon. In: ESWC Conference. pp. 245–259 (2011)