Verifying Algorithm Traces and Fault Reason Determining Using Ontology Reasoning ? Oleg Sychev1[0000−0002−7296−2538] , Mikhail Denisov1[0000−0002−1216−610X] , and Anton Anikin1[0000−0003−0661−4284] Volgograd State Technical University, Lenin Ave, 28, Volgograd, 400005, Russia oasychev@gmail.com Abstract. Domain ontologies that can solve various tasks using its concepts and determine fault reason for students’ answers may serve as a good basis for creating for a testing system with extensive explanatory abilities. But capabilities of modern ontological reasoners may not be enough for this kind of task. In this study, we developed and tested an ontology able to build execution trace for the given algorithm containing sequences and alternatives and find fault reasons for incorrect traces. The study also showed problems with the used reasoner that hinder the developed ontology from becoming fully effective. Keywords: Ontology · Reasoning · Error detection · Algorithm · Exe- cution trace. 1 Introduction and Related Work Ontology-based models and formal reasoning are used for knowledge represen- tation in different domains for a wide range of applications. Ontology Driven Software Engineering (ODSE) approach [5] implies using ontology models for various aspects of the software engineering, (e.g. for modeling different parts of systems, software products, modules, algorithms). The interesting and relevant topic within this aspect is algorithms’ and pro- grams’ analysis and synthesis using ontology-based models [7]. Ontologies al- low capturing program components and their relations as a formal model. The reasoning rules defined within ontologies using inference engines (or reasoners) allow solving a wide range of tasks from system modeling to software generation. These tasks include algorithmic structures analysis (e.g., canonical loop analysis to recognize whether a loop in the program code is in a canonical form), data access pattern analysis; alias, dependence, and static code analysis; program synthesis using an ontology-based algorithm description and formal reasoning rules defined for the specific programming language. For example, in [1] software requirement specifications (SRS) are defined using an ontology that becomes the ? The reported study was funded by RFBR, project number 20-07-00764. Copyright c 2020 for this paper by its authors. Use permitted under Creative Com- mons License Attribution 4.0 International (CC BY 4.0). conceptual model of the developed software and is mapped to the entities in the program code. This allows automating SRS detection and updating. The article [3] describes an ontology combining control-flow graph representation with the static trace of the program for source code analysis, showing a trend in supple- menting the knowledge about the program structure with the knowledge about the program execution. So ontology-based reasoning is a relevant approach for many tasks both in software engineering as a professional activity and in software engineering learn- ing (e.g., to detect and explain the issues in the program code, to check the conformity of the task and conceptual model with the program implementation, etc). Ontologies are used in the learning process for different domains to generate and grade questions [2,6]. According to [2], the simplicity of the questions is one of the main limitations of existing works in question generation that must be addressed in further re- search. However, while for simple questions checking factual knowledge question generation and reasoning the correct answer are practically the same, for more complex questions, (e.g. building a program trace) question generation and ques- tion solving become different problems. Using automatically-generated questions is not practical without automating finding the correct answers, and reasoning on an ontology, capturing domain-related knowledge, is a relevant approach to this task. Using complex questions targeting high cognitive levels also allows generating non-trivial feedback that is also an important problem in question generation. Explicitly defining logical constraints for a correct answer to the given question allows detecting the broken constraints if a student’s answer is incorrect, making possible showing explanations of the domain laws that the student violated in their answer. In our research, we aim at creating a system for generating questions, rea- soning the correct answers to them using domain-dependent formal models like ontologies, determine fault reasons in students’ answers and explaining them to correct misunderstandings of important domain concepts. One of the first steps requires studying whether modern ontological reasoners are capable of perform- ing the necessary tasks. When teaching introductory programming courses, one of the complex tasks involving learning concepts of control structures is building program trace [4]. In this study, we attempted to develop an ontology for build- ing algorithm traces and finding fault reasons for mistakes in students’ traces to improve students understanding of how the algorithm is executed. 2 Method An algorithm is represented as a tree of algorithmic structures: sequences, al- ternatives, and loops. Its trace is a sequence of acts of their execution in time; simple actions are represented as acts of their executions while complex algo- rithmic structures containing nested actions have different acts for the beginning and finishing of their execution. The trace also must contain the values of control conditions for alternatives and loops. The input of the developed ontology includes: – An algorithm tree consisting of algorithmic structures and simple state- ments. – The set of unordered execution acts to be connected by the next_act to embody the correct trace. Acts executing expressions have expr_value datatype property holding the result. – The student’s trace as student_next relations. Student’s trace and correct trace share as many acts as possible. The developed SWRL rules perform the following steps. First, the correct trace is built. The rules doing that are called positive. These rules look for a correct act in the partially built trace to connect the next act to. The connected act gets one of the subclasses of correct_act class, showing the reason why the act should be there (e.g., AltBranchBegin, SequenceEnd). Second, the additional properties parent_of and corresponding_end are computed to easier navigating the correct trace (see fig. 1). Similar properties are calculated for the student’s trace. Indices for acts in both traces are calculated too. Third, the negative rules determine stu- dent’s mistakes and fault reasons. Fault rea- son detection is performed by rules that infer subclasses of Erroneous class showing the fault reason like CorrespondingEndMismatch, ExtraAct, DuplicateOfAct, MissingAct, DisplacedAct, BranchOfFalseCondition, NoBranchWhenConditionIsTrue, etc. After the reasoning, the ontology is sup- plemented with the correct trace and facts about the types of student mistakes. This information can be extracted and shown to the students in a human-readable way us- ing message templates for each fault-reason Fig. 1. Trace structure for two ac- class, for example, ”Act A can’t be executed tions (1, 2) nested to a sequence after the end of Act B because sequence B (seq) contains act A and each act of a sequence execution include executing each its element once” or ”Branch A can’t be executed at this place because the con- dition B is false” or ”Act A must be executed at this place because a do-while loop body must be executed at least once.” 3 Results The current revision of the ontology1 includes: 74 SWRL rules (28 positive, 28 negative, 18 helper rules), 66 classes (24 representing algorithm elements, 7 - 1 https://github.com/den1s0v/c owl trace acts, 18 - mistakes, 27 - explanations for correct acts), and 30 properties. The ontology supports sequences, alternatives, and loops of different types. It guarantees that at least one student’s act is marked incorrect if there are any mistakes, even if all mistakes are omissions of correct acts: that makes providing detailed information about the mistakes’ positions possible. Simultaneous rea- soning over multiple traces injected into the same ontology is supported and can be used to improve the overall performance when processing multiple answers to the same question. The developed SWRL rules require using SWRL built-ins like swrlb:add(), swrlb:notEqual(). Suitable reasoners include Pellet 2.3 and Pellet 3. Other popular reasoners like HermiT do not support these built-ins. The ontology passed 33 tests consisting of short traces with annotated mis- takes. A single run of the ontology with one trace takes about 1 second in Stanford Protégé Editor invoking Pellet 2.3 plugin and about 5 seconds while running Pellet 2.3 as a standalone process. The same ontology using Stardog with Pellet 3 cannot complete the inference in minutes. A few issues of SWRL implementation in Pellet 2 and 3 were found. – The problems with DifferentFrom: when no facts related to individuals distinction was provided a rule that uses DifferentFrom never fires; in Pellet 2 it sometimes causes ”Inconsistent Ontology” error or silently truncates some of the inference results. This problem effectively stops the ontology from working. So it’s better to avoid using DifferentFrom or SameAs unless you explicitly specify the information about the differences of individuals. – In Stardog, using SameAs in the SWRL rule causes a parsing exception. – In Stardog, the unification of ”scalar” variables works only for binding; when matching with an already bound variable, it allows any values. With rule (1), Stardog will connect all possible pairs of Product instances, regardless of whether their prices are equal, instead of selecting only instances that match at a price. This behavior was only observed when has_price is a DatatypeProperty. P roduct(?a), P roduct(?b), has price(?a, ?p), has price(?b, ?p) → → has same price(?a, ?b) (1) – Including transitive properties has a significant impact on the reason- ing time if they are inferred. Two of the determined mistakes - ExtraAct and MissingAct - require looking up any number of acts ahead that is impossible without using transitive properties. As a workaround, we created several dupli- cates of the rules, each with a fixed length of the lookup (up to 6 acts). The developed ontology differs from the ontology-based program analysis system [3] in that it performs dynamic analysis building the program trace and uses SWRL rules instead of Jena. The kinds of mistakes detected by analysis are different because our ontology is aimed at supporting teaching and it compares student-built trace with the ontology-built one instead of seeking errors in the given code. Our work differs from other ontologies used for question generation and solv- ing in that it can be used in developing a question-generating system aimed at more complex questions targeting higher cognitive levels that makes reasoning the correct answer for the generated questions non-trivial as a question-solver and grading block. It also allows for generating explanatory feedback. The proposed approach is limited to closed-ended questions because it re- quires that every possible mistake in the answer can be a result of violating a specific domain law. The developed ontology solves and finds mistakes for questions requiring ordering of a given set of program-trace strings. Open-ended questions are limited to the lower-lever feedback for now [6]. 4 Conclusion and Future Work During our research, we developed an ontology able to build correct traces for the given algorithm consisting of sequences, alternatives, and loops and find mistakes in the given traces with enough information to generate explanatory feedback. This shows that developing an ontology-based system for generating and solving questions and providing explanatory feedback to students about their mistakes is possible. However, some problems of Pellet reasoner prevent efficient implementation of the rules and limit the number of mistakes that can be detected. Future work will concern adding support for procedure calls, including re- cursive calls, and developing the tutor system using the created ontology and logical rules. References 1. Bhatia, M.P.S., Kumar, A., Beniwal, R., Malik, T.: Ontology driven software devel- opment for automatic detection and updation of software requirement specifications. Journal of Discrete Mathematical Sciences and Cryptography 23(1), 197–208 (Jan 2020) 2. Kurdi, G., Leo, J., Parsia, B., Sattler, U., Al-Emari, S.: A systematic review of automatic question generation for educational purposes. International Journal of Artificial Intelligence in Education 30(1), 121–204 (Nov 2019) 3. Pattipati, D.K., Nasre, R., Puligundla, S.K.: OPAL: An extensible framework for ontology-based program analysis. Software: Practice and Experience 50(8), 1425– 1462 (Mar 2020) 4. Risha, Z., Brusilovsky, P.: Making it smart: Converting static code into an interac- tive trace table. Sixth SPLICE Workshop “Building an Infrastructure for Computer Science Education Research and Practice at Scale” at ACM Learning at Scale 2020 (Aug 2020) 5. Strmečki, D., Magdalenić, I., Kermek, D.: An overview on the use of ontologies in software engineering. Journal of Computer Science 12(12), 597–610 (Dec 2016) 6. Sychev, O., Anikin, A., Prokudin, A.: Automatic grading and hinting in open-ended text questions. Cognitive Systems Research 59, 264–272 (Jan 2020) 7. Zhao, Y., Chen, G., Liao, C., Shen, X.: Towards Ontology-Based Program Analysis. In: ECOOP 2016. Leibniz International Proceedings in Informatics, vol. 56, pp. 26:1–26:25 (Jul 2016)