=Paper=
{{Paper
|id=Vol-2941/paper6
|storemode=property
|title=Teaching Semantic Web Technologies through Interactive Jupyter Notebooks
|pdfUrl=https://ceur-ws.org/Vol-2941/paper6.pdf
|volume=Vol-2941
|authors=Lars Pieschel,Sascha Welten,Lars C. Gleim,Stefan Decker
|dblpUrl=https://dblp.org/rec/conf/i-semantics/PieschelWGD21
}}
==Teaching Semantic Web Technologies through Interactive Jupyter Notebooks==
Teaching Semantic Web Technologies through Interactive Jupyter Notebooks Lars Pieschel1 , Sascha Welten1 , Lars Gleim1 , and Stefan Decker1,2 1 Databases and Information Systems, RWTH Aachen University, Germany 2 Fraunhofer FIT, Sankt Augustin, Germany Abstract. Getting new generations of developers excited about the benefits of the Semantic Web and familiar with the underlying technologies is one of the biggest challenges for furthering its adoption. Inspired by the success of structured digital learning materials in other domains of computer science, we present SemWebNote- books, a portfolio of interactive Jupyter Notebook tutorials for teaching Semantic Web technologies interactively. By introducing the Jupyter-RDFify plugin, we seam- lessly integrate support for RDF and related standards into the Jupyter ecosystem. Motivated by the overwhelmingly positive feedback provided by the students using these materials at RWTH Aachen University and correspondingly high technol- ogy acceptance, we release both Jupyter-RDFify and SemWebNotebooks to the community as open source for open reuse and further collaborative improvement. 1 Introduction In order to drive the adoption of Semantic Web technologies, it is essential to educate new generations of developers with the underlying standards, tools, and methodologies as efficiently as possible. While a large variety of open resources such as W3C’s technology primers and massive open online courses (MOOCs) on the subject are already available today, in our own experience with teaching Semantic Web Technologies to Master-level computer science students, learners frequently struggle to apply these tools in practice. It is our understanding that part of the underlying problem is the lack of coherently structured interactive learning materials for a relevant intersection of Semantic Web technologies. In order to support the learning experience of our students, we developed SemWebNote- books, a portfolio of interactive Jupyter Notebook [3] tutorials covering RDF basics such as CURIEs, Blank Nodes, Literals and the serialization formats Turtle and JSON-LD, the query language SPARQL, the shape constraint language ShEx, as well as data modeling using OWL 2. Using Jupyter’s modular architecture, we provide convenient features such as syntax checking and RDF graph plotting directly from within the Jupyter Notebook cells through Jupyter-RDFify, a custom-developed plugin providing a seamless learning environment without any boilerplate code that could divert the attention of the students. Based on the NBGrader framework [1], we enable automatic feedback and grading for students, which reduces the teacher’s workload to a necessary minimum and increases the time for direct student mentoring. The remainder of this paper is structured as follows: Section 2 summarizes relevant related work in both the Semantic Web and in other domains employing Jupyter Notebooks to support eLearning. Section 3 details the features provided by Jupyter-RDFify, while Section 4 summarizes its usage workflow. Section 5 presents a quantitative evaluation of our contribution conducted in the context of our Semantic Web lecture at RWTH Aachen University before we conclude with a short discussion of impact in Section 6. Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). Background. The constant rising trend of Semantic Web technologies in different do- mains has emerged the need for suitable courses and tools to teach the theory and practice of Semantic Web. Semantic Web courses such as the MOOC ”Knowledge Engineering with Semantic Web Technologies” offered by OpenHPI3 deliver the theoretical knowl- edge needed to work with technologies such as RDF, Turtle, and SPARQL. As for the practical knowledge, these courses offer quizzes and exemplary SPARQL endpoints or encourage students to configure their own triple store. Especially, self-hosted triple stores (e.g. Apache Jena Fuseki4 ) allow for a maximum level of customization and adjustments according to the course contents. However, the effectiveness of these teaching strategies suffers from the lack of sufficient interactive feedback or initial barriers induced by the configuration of these tools. Jupyter Notebooks5 have been proposed to lower these barriers by providing a single exercise-sheet-style interface, which wraps a customizable and shippable backend. While read-only text fields can provide task descriptions or further explanations, the executable code cells enable an interactive possibility to enter and validate code according to the task description. The methodology of using Jupyter notebooks as an educational resource6 has been employed numerous times, especially in the domain of data science [6] but also in the Semantic Web domain [2] as a solution for interactive SPARQL queries. Another advantage of using Jupyter notebooks as explained above is, that one can use frameworks like NBGrader to automatically grade notebooks and generate feedback for them [1]. Nevertheless, these Notebooks conventionally rely on convoluted boilerplate code (e.g. Python) and additional packages in order to process, e.g., the entered RDF or SPARQL statements. The arising consequence is that the students do not get familiar with the actual syntax yielding a distraction of the actual task or the implementation of additional features like keyword highlighting is complicated. In our work, we circumvent the incorporation of additional wrapper code and present a plugin for the processing of common Semantic Web languages to enable the advantages of Jupyter Notebooks for the Semantic Web community. The features of this plugin are part of the upcoming sections. 2 Jupyter-RDFify and SemWebNotebooks Jupyter-RDFify (JRDF)7 enables parsing, validating, visualizing, and querying RDF graphs directly within the Jupyter Notebook ecosystem. To ensure compatibility with the default IPython kernel backend of Jupyter and enable the seamless handling of the different Semantic Web technologies, we make use of Jupyter’s modular plugin and its so- called magics system. Magics are special directives, which tell the IPython kernel that the following line or cell does not contain python code and should thus be treated differently. Jupyter-RDFify7 implements support for Semantic Web technologies through the IPython extension mechanism. Once loaded, the RDF magic (%%rdf or %rdf) may be used to control the extension through the IPython kernel. Our extension itself is built modular as well, providing submodules for Turtle, JSON-LD, SPARQL, ShEx, etc. by interpreting the magic like a command-line interface. The submodules use the feature-rich RDFLib7 3 https://open.hpi.de/courses/semanticweb2015 4 https://jena.apache.org/documentation/fuseki2 5 Project Jupyter: https://jupyter.org 6 For example: https://github.com/BigDataAnalyticsGroup/bigdataengineering 7 Code, Documentation, and Evaluation: https://github.com/SemWebNotebooks/Notebooks 2 Run Cell RDF Magic Use RDFLib + Plugins Interacts IPython Jupyter IPython User RDF Interface Kernel Extension Output Output Use Graphviz Fig. 1. Interaction between SemWebNotebooks components: Blue components are client-sided, orange and green components are server-sided. The contributed RDF extension is marked in green. Python package in combination with plugins and extensions like RDFLib-jsonld7 or OWL- RL7 to handle the input. To visualize graphs, Graphviz7 together with its Python interface is used. The output is then sent back and displayed to the user through the IPython kernel. Figure 1 visualizes the interaction between these components and the Jupyter framework. To simplify working with multiple RDF graphs, a graph can be associated with a label upon loading it. The graphs can then be referenced by their aforementioned label at a later stage to query or verify them. Figure 2 depicts a concrete example, in which the parsed graph is labeled ”graph1” and then queried using SPARQL. The supported magics and their usage are documented in the project repository7 . Working with Jupyter-RDFify and SemWebNotebooks. A sample exercise notebook might be structured as follows: Instructors may provide an RDF graph in Turtle notation. Due to the IPython extension, the turtle serialization can be parsed and immediately visual- ized in the Notebook. Students are then asked to create queries, which are executed against the earlier defined and visualized graph, in succeeding cells. Instead of a pre-defined graph from the instructors, an experimental and interactive playground-Notebook might also be a promising method to motivate students to create and query their own graphs without much effort. In our Semantic Web lecture at RWTH Aachen University, we used the above-mentioned methods in conjunction with the NBGrader framework [1] to create a series of automati- cally graded exercises, which we release as SemWebNotebooks. These SemWebNotebooks combine tutorials in the form of markup paragraphs together with hands-on exercises using our extension. A detailed overview of the covered topics can be found in the project repository7 . The learners get immediate feedback either through the output graphs or, if their solution contains syntactic errors, through the parser errors which IPython passes back to the user. SemWebNotebooks additionally make use of NBGrader to run unit-tests and thus automatically grade the exercises and simultaneously generate feedback. Jupyter-RDFify is available as open source software, can be installed through the python package manager (e.g. Pip), and loaded into any Jupyter Python Notebook on-demand using the predefined %load ext magic of the IPython kernel. 3 Fig. 2. Using Jupyter-RDFify: The left cell parses and visualizes an RDF graph given in Turtle format, the right cell queries the graph parsed from the left cell and outputs the variable bindings. 3 Evaluation Before the evaluation, we have applied SemWebNotebooks as mandatory assignments for each participant of our Master-level Semantic Web lecture at RWTH Aachen University. Overall, we have released six exercises7 , which represent the foundation of our evaluation. We have conducted the evaluation of our work at the end of the semester. The evaluation was realized by an online survey distributed to each student and was structured as follows. The first twelve questions are aligned to the Technology Acceptance Model (TAM) ques- tionnaire [4,5]. Firstly, these questions focus on the perceived usefulness and secondly, on the perceived ease-of-use to address the two main factors, which fuel the acceptance of our work [5]. In the second part of the survey, we asked six more specific questions about the auto-grading and the generated feedback. As the last item, we have provided a free text area such that each participant was able to add individual and more advanced feedback. The questions were answered using a Likert scale. At the end of the survey, we received 38 responses out of 104 students who submitted at least one exercise. The complete evaluation results and further dissection can be found in the project repository7 . The quantitative evaluation and the qualitative discussion of the results are presented in the following based on three key questions of our survey. We further summarize qualitative textual feedback given by our students. Here, we concentrate on three questions: I would find SemWebNotebooks in the SemWeb course useful. (A), I would like to use Jupyter Notebooks in other lectures as well. (B), and The automatically generated feedback was fair. (C). Both questions (A) and (B) show remarkable results with an average score of 4.95(σ = 0.23) and 4.87(σ = 0.34). Question (C) however, resulted in the overall lowest average score with only 3.56(σ = 1.20). The high results of questions (A) and (B) show not only that most students found the Jupyter Notebooks useful in the SemWeb course but that they would also like to work with similar notebooks in other lectures as well. The obtained quantitative feedback is also reflected in the participant’s textual qual- itative feedback. These results show that the SemWebNotebooks are very valuable for learners. The students especially benefit from the interactive design of each Notebook. Further, the immediate syntax check supports the students to produce proper and compi- 4 lable code in a practical environment. Despite the mainly positive feedback, the survey has revealed some weak points of our automatic grading system and the corresponding provided automatically generated feedback. Although it was stated that the automatic grading was provided rapidly, the grading was perceived as unfair because small errors lead to bigger point losses (e.g. case sensitivity, changed prefixes) than through man- ual corrections. This perception is particularly presented by the low score of question (C).We have tried to compensate and mitigate these very strict assessments by manual re-corrections for borderline cases. However, we strive to make our tool more flexible and robust against minor syntax and discrepancies as future work. 4 Conclusion In this work, we have presented Jupyter-RDFify, which is an IPython kernel extension for Jupyter Notebooks. Based on the well-established magics functionality of the IPython kernel, our plugin poses a solution to process Semantic Web-related languages in Jupyter Notebooks. We further extend this plugin with additional packages such as Graphviz for visualization purposes or RDFLib for querying the input data. These submodules can be implemented according to the creator’s needs and propose a flexible method to customize Jupyter Notebooks. We have published our plugin as open source with support for Turtle, JSON-LD, SPARQL, and ShEx. To evaluate our work, we have applied our SemWeb- Notebooks in combination with NBGrader to our Semantic Web course to facilitate the interaction with Semantic Web technologies for student learners and also the assessment of student’s assignments. Our evaluation has shown that our SemWebNotebooks increased the learning outcome and obtained strong support from the participants. Therefore, our work poses a promising alternative to improve the teaching of Semantic Web technologies. In future work, we will improve auto-grading and implement support for SHACL tasks. References 1. Blank, D.S., Bourgin, D., Brown, A., Bussonnier, M., Frederic, J., Granger, B., Griffiths, T.L., Hamrick, J., Kelley, K., Pacer, M., et al.: nbgrader: A tool for creating and grading assignments in the Jupyter Notebook. The Journal of Open Source Education 2(11) (2019) 2. Gray, A.J.: Using a Jupyter Notebook to Perform a Reproducible Scientific Analysis Over Semantic Web Sources. In: SemSci@ ISWC. pp. 12–24 (2018) 3. Kluyver, T., Ragan-Kelley, B., Pérez, F., Granger, B., Bussonnier, M., Frederic, J., Kelley, K., Hamrick, J., Grout, J., Corlay, S., Ivanov, P., Avila, D., Abdalla, S., Willing, C., development team, J.: Jupyter Notebooks – a publishing format for reproducible computational workflows. In: Loizides, F., Schmidt, B. (eds.) Positioning and Power in Academic Publishing: Players, Agents and Agendas. pp. 87–90. IOS Press (2016) 4. Lewis, J.R.: Comparison of Four TAM Item Formats: Effect of Response Option Labels and Order. Journal of Usability Studies 14(4) (2019) 5. Park, S.Y.: An analysis of the technology acceptance model in understanding university students’ behavioral intention to use e-learning. Journal of Educational Technology & Society 12(3), 150–162 (2009) 6. Toomey, D.: Jupyter for Data Science. Packt Publishing (2017) 5