New Ontology and Knowledge Graph for University
Curriculum Recommendation
Nicolas Hubert¹,²,∗, Armelle Brun² and Davy Monticolo¹
¹ Université de Lorraine, ERPI, France
² Université de Lorraine, CNRS, LORIA, France
Abstract
Education is a complex domain where students must make their curriculum choices carefully. To model the
intricacies of the educational domain, ontologies have been successfully leveraged in the past. However,
no available ontology or dataset directly addresses the critical transition from high school to university
from a decision-making perspective. Therefore, the contribution of our ongoing work is twofold.
Firstly, we introduce EducOnto – an ontology that aims at modeling university curricula and students’
profiles. Secondly, we introduce EduKG – a knowledge graph inheriting the semantics of EducOnto and
instantiated with data about French students and curricula.
Keywords
Knowledge Graph, Education, Ontology, Recommender System
1. Introduction
Higher education is an intricate domain: it features many distinct concepts, relationships
and structural constraints. Whenever an application domain is too complex to be grasped
directly, ontologies have established themselves as the reference approach for modeling its
richness. As such, they have naturally been leveraged in education, where there is a
real need for structuring information [1, 2]. However, there is a lack of ontologies specifically
concerned with the critical period ranging from the end of high school to the first years of
university, in which students commit to their future.
Besides, owing to the vast amount of information available on university curricula, students
can find it difficult to make the most appropriate decision regarding their educational
pathways [3]. Recommender Systems (RSs) have successfully been deployed to reduce the
overhead associated with decision-making by sifting through massive amounts of choices and
suggesting only the most relevant items to the active user [4]. RSs have proved to be useful
for recommending curricula to students [1]. Nevertheless, most educational datasets related to
curriculum recommendation are university-specific and remain private [5, 6].
International Semantic Web Conference (ISWC) 2022: Posters, Demos, and Industry Tracks, October 23-27, 2022
∗ Corresponding author.
Email: nicolas.hubert@univ-lorraine.fr (N. Hubert); armelle.brun@loria.fr (A. Brun); davy.monticolo@univ-lorraine.fr (D. Monticolo)
Web: https://nicolas-hbt.github.io/ (N. Hubert)
ORCID: 0000-0002-4682-422X (N. Hubert); 0000-0002-9876-6906 (A. Brun); 0000-0002-4244-684X (D. Monticolo)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (http://ceur-ws.org), ISSN 1613-0073
Hence, given the need for public resources related to curriculum recommendation, we present
ongoing work on the development and release of both a new ontology and a
new dataset, referred to as EducOnto and EduKG, respectively. More specifically, EducOnto
focuses on modeling the pivotal period bridging the last year of high school and the first years
of university, while EduKG is a Knowledge Graph (KG) encompassing students’ profiles and
academic training information. EduKG is built on the basis of EducOnto, from which it inherits
its semantics and structural constraints.
2. EducOnto: An Ontology for High School to University
In this section, we elaborate on the methodology used to design EducOnto. We follow the
best-practice guidelines presented in [7]. We distinguish three main phases, each covered in one of
the following subsections: (1) domain and purpose definition; (2) ontology building process; (3)
ontology evaluation. EducOnto is publicly available at the following link: purl.org/educonto.
2.1. Domain and Purpose Definitions
In the educational domain, several ontologies are publicly available¹. However, none of them
directly models the period ranging from the end of high school to the first years of university.
EducOnto aims at modeling this period. Together with education experts, we defined competency
questions whose purpose is to frame the scope of the ontology and act as requirements
that EducOnto should meet. The exhaustive list of competency questions is available online².
Let us introduce two of them as examples:
CQ1. What are the recommended academic backgrounds for a specific university curriculum?
CQ2. What is the most popular university curriculum following a given high school major?
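To make the intent of CQ2 concrete, the following minimal sketch treats it as a grouped count over student records. The record layout and all identifiers (`major_science`, `lg_cs`, and so on) are hypothetical and chosen only for illustration; they are not taken from EducOnto or EduKG.

```python
from collections import Counter

# Hypothetical (student, high school major, chosen curriculum) records.
# All identifiers are illustrative and do not come from EduKG.
records = [
    ("s1", "major_science", "lg_cs"),
    ("s2", "major_science", "lg_cs"),
    ("s3", "major_science", "lg_math"),
    ("s4", "major_literature", "lg_law"),
]


def most_popular_curriculum(records, major):
    """Return (curriculum, count) for the most frequent curriculum among
    students with the given high school major, or None if no student matches."""
    counts = Counter(cur for _, m, cur in records if m == major)
    if not counts:
        return None
    return counts.most_common(1)[0]


print(most_popular_curriculum(records, "major_science"))
```

In a SPARQL setting the same question would typically be expressed with a GROUP BY and COUNT aggregate over the corresponding triples.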
2.2. Ontology Building Process
With education experts, we first enumerate all the terms related to our application domain and
the expected purpose of EducOnto. Two main concepts stand out: student and curriculum. Both
concepts are related to other intermediate concepts such as major and specialty. Then, adopting
a top-down development process, we end up with 71 classes and 30 object properties. A general
overview of EducOnto that contains the main classes and properties is presented in Figure 1.
¹ https://lov.linkeddata.es/dataset/lov/vocabs?&tag=Academy
² https://nicolas-hbt.github.io/educ-ontokg/competencyquestions/
[Figure 1: diagram not reproduced. Parts: University, High School, User Profile. Classes shown: University Curriculum, Keyword, Field of Study, High School Major, High School Specialty, School Subject, Student, Personality Trait, Academic Skill. Properties shown: mentionedCurriculum, recommendsMajor, recommendsSpecialty, curriculumRelatesTo, belongsTo, keywordRelatesTo, specialtyRelatesTo, specialtyBelongsTo, schoolSubjectRelatesTo, schoolSubjectBelongsTo, isInterestedIn, hasMainTopic, (dis)likedMajor, (dis)likedSpecialty, hasFavoriteSubject, hasPersonalityTrait, hasSkill.]
Figure 1: A conceptual overview of EducOnto. The lower, middle and upper parts of the diagram depict
the classes and properties related to the User Profile, High School and University, respectively.
2.3. Ontology Evaluation
EducOnto can be considered successfully designed if it helps to fulfill downstream tasks
related to university curricula information retrieval and recommendation. Following [8],
we rely on a task-based evaluation. To do so, we first take a general use case, after which we
seek to answer the previously mentioned competency questions by querying EducOnto using
SPARQL. The answers to the queries are retrieved from the data contained in EduKG. Below, we
present the query for CQ1. More precisely, we retrieve the recommended high school majors to
start a Bachelor’s degree in Computer Science (hereinafter referred to as curriculum:lg_cs).
Listing 1: SPARQL query to address CQ1
PREFIX ed:
PREFIX major:
PREFIX curriculum:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?major
WHERE {
  ?major rdf:type/rdfs:subClassOf* ed:HighSchoolMajor .
  curriculum:lg_cs ed:recommendsHighSchoolMajor ?major .
}
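To illustrate what the two triple patterns in Listing 1 match, here is a minimal pure-Python sketch that evaluates the same logic over a handful of toy triples. It is not how the authors run the query; the triples, the intermediate class `ed:GeneralMajor` and the individual identifiers are assumptions made up for the example. Only the property and class names used in Listing 1 are reused.

```python
# Toy triples; every individual and the class ed:GeneralMajor are illustrative,
# not taken from EduKG.
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"
RECOMMENDS = "ed:recommendsHighSchoolMajor"

triples = {
    ("ed:GeneralMajor", SUBCLASS, "ed:HighSchoolMajor"),
    ("major:science", TYPE, "ed:GeneralMajor"),
    ("major:economics", TYPE, "ed:HighSchoolMajor"),
    ("kw:python", TYPE, "ed:Keyword"),
    ("curriculum:lg_cs", RECOMMENDS, "major:science"),
    ("curriculum:lg_cs", RECOMMENDS, "kw:python"),
    ("curriculum:lg_law", RECOMMENDS, "major:economics"),
}


def subclasses_of(cls, triples):
    """Return cls plus every class reachable through rdfs:subClassOf,
    mirroring the zero-or-more path rdfs:subClassOf*."""
    found, frontier = {cls}, [cls]
    while frontier:
        parent = frontier.pop()
        for s, p, o in triples:
            if p == SUBCLASS and o == parent and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def recommended_majors(curriculum, triples):
    """Entities recommended by `curriculum` whose type is ed:HighSchoolMajor
    or one of its (transitive) subclasses."""
    major_classes = subclasses_of("ed:HighSchoolMajor", triples)
    typed = {s for s, p, o in triples if p == TYPE and o in major_classes}
    recommended = {o for s, p, o in triples if s == curriculum and p == RECOMMENDS}
    return sorted(typed & recommended)


print(recommended_majors("curriculum:lg_cs", triples))
```

The first pattern of the query corresponds to `typed` (instances of the major class or any subclass), the second to `recommended`; the result is their intersection, which is why the keyword in the toy data is filtered out.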
2.4. Directions for Future Work
We will enrich EducOnto by linking it with other ontologies. As noted above, few ontologies
model the period from the end of high school to the first years of university. However, some
ontologies may be partially reused, e.g. the EduProgression Ontology³.
In addition, we intend to collaborate with other European institutions and
education experts to broaden the scope of EducOnto to countries belonging to the European
Higher Education Area⁴.
³ https://lov.linkeddata.es/dataset/lov/vocabs/edupro
3. EduKG: An Educational Knowledge Graph
In this section, we detail the whole pipeline for the construction of EduKG. EduKG is publicly
available at the following link: purl.org/edukg.
3.1. Data Collection
Data were collected in tabular format through a self-administered online survey whose content
was developed with the education experts who helped define the competency questions
(see Section 2.1). Several French high schools and universities were asked to disseminate
the survey among their students. A wide spectrum of geographical areas was targeted to
ensure representativeness among survey respondents. Between January and April 2022, 3,583
respondents fully answered the survey; their answers are part of EduKG.
3.2. Construction and Description
Before constructing EduKG, some modeling choices were made, such as:
• Students rated their curriculum on a scale from 0 to 5. Ratings were binarized to yield
the predicates likedCurriculum (rating ≥ 3) and dislikedCurriculum (rating < 3);
• Students were allowed to provide keywords related to their educational interests. Because
these were entered in a free-form text field, they had to be harmonized across respondents.
For instance, keywords that appeared too rarely were removed, and overly precise keywords
were replaced by broader concepts;
• Publicly available datasets provided by government agencies were used to enrich the
knowledge graph with additional curriculum information.
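The first two modeling choices above can be sketched as follows. The rating threshold (≥ 3) comes from the text; the frequency cut-off `MIN_KEYWORD_COUNT` and the `BROADER` mapping are assumptions introduced for illustration, since the paper does not state the values or the mapping actually used.

```python
from collections import Counter

MIN_KEYWORD_COUNT = 2  # assumed cut-off; the paper does not state one
# Assumed mapping from overly precise keywords to broader concepts.
BROADER = {"pytorch": "machine learning", "organic chemistry": "chemistry"}


def rating_to_predicate(rating):
    """Binarize a 0-5 rating: >= 3 is a like, < 3 is a dislike."""
    if not 0 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    return "likedCurriculum" if rating >= 3 else "dislikedCurriculum"


def clean_keywords(answers):
    """Lowercase and trim free-text keywords, replace precise ones by their
    broader concept, then drop keywords used by too few respondents."""
    normalized = [
        [BROADER.get(k.strip().lower(), k.strip().lower()) for k in answer]
        for answer in answers
    ]
    # Count each keyword at most once per respondent.
    counts = Counter(k for answer in normalized for k in set(answer))
    return [[k for k in answer if counts[k] >= MIN_KEYWORD_COUNT] for answer in normalized]


answers = [["PyTorch", "Chemistry"], ["machine learning", "organic chemistry"], ["astronomy"]]
print(rating_to_predicate(4), rating_to_predicate(2))
print(clean_keywords(answers))
```

Generalizing before counting matters here: a precise keyword may be rare on its own but frequent once merged into its broader concept.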
We then used Python 3.9 to create RDF triples meeting the conditions specified by EducOnto.
Together, these triples form our knowledge graph EduKG. An overview of EduKG is
provided in Table 1: the number of distinct entities for each intermediate class (Table 1.a),
information about the number of Student-Curriculum interactions and the relative sparsity of
EduKG (Table 1.b) and the total number of entities, relations and triples in EduKG (Table 1.c).
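As an illustration of the tabular-to-RDF step, the sketch below turns survey rows into triples and serializes them as N-Triples using only the Python standard library. The paper does not say which library or IRIs were used: the two namespaces are placeholders, the CSV column names are hypothetical, and the class names `Student` and `UniversityCurriculum` are assumed spellings of the classes shown in Figure 1. In practice a dedicated RDF library would normally handle the serialization.

```python
import csv
import io

# Placeholder namespaces: the actual EducOnto/EduKG IRIs are not reproduced here.
ED = "http://example.org/educonto#"
KG = "http://example.org/edukg/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical survey export; column names are assumptions.
SURVEY_CSV = """student_id,curriculum_id,rating
s1,lg_cs,4
s2,lg_cs,1
"""


def row_to_triples(row):
    """Turn one survey row into (subject, predicate, object) IRI triples."""
    student = KG + "student/" + row["student_id"]
    curriculum = KG + "curriculum/" + row["curriculum_id"]
    predicate = "likedCurriculum" if int(row["rating"]) >= 3 else "dislikedCurriculum"
    return [
        (student, RDF_TYPE, ED + "Student"),
        (curriculum, RDF_TYPE, ED + "UniversityCurriculum"),
        (student, ED + predicate, curriculum),
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines, deduplicated and sorted."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(set(triples)))


rows = csv.DictReader(io.StringIO(SURVEY_CSV))
all_triples = [t for row in rows for t in row_to_triples(row)]
print(to_ntriples(all_triples))
```

Deduplicating before serialization keeps the type assertion for a curriculum from being repeated once per student who rated it.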
3.3. Directions for Future Work
We will enrich EduKG by mapping its keywords to DBpedia or Wikidata entities. This
would make EduKG more generic and facilitate its use in downstream tasks. As with
EducOnto, we intend to extend EduKG by adding instances from other European countries,
although the feasibility of this extension will depend on privacy-related and data-sharing concerns.
⁴ https://en.wikipedia.org/wiki/Bologna_Process
Table 1
Statistics of EduKG
(a)  #Fields Of Study | #Keywords | #School Subjects | #Majors | #Specialties
     12               | 321       | 15               | 92      | 13
(b)  #Users (Students) | #Items (Curricula) | #Interactions
     3,583             | 286                | 7,021
(c)  #Entities | #Relations | #KG Triples
     5,452     | 27         | 36,301
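As a quick check on Table 1.b, the sparsity of the Student-Curriculum interaction matrix can be derived from the reported counts. The sketch assumes the usual recommender-system definition, sparsity = 1 − interactions / (users × items); the paper does not state which definition it uses. Under that assumption the matrix is about 99.3% empty.

```python
# Figures taken from Table 1.b.
n_users, n_items, n_interactions = 3583, 286, 7021

# Every (student, curriculum) pair is a potential interaction.
possible_pairs = n_users * n_items
density = n_interactions / possible_pairs
sparsity = 1.0 - density
avg_interactions_per_user = n_interactions / n_users

print(f"possible pairs: {possible_pairs}")
print(f"sparsity: {sparsity:.4%}")
print(f"interactions per student: {avg_interactions_per_user:.2f}")
```

On average each student contributes roughly two rated curricula, which is the regime where side information from the knowledge graph is typically expected to help a recommender.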
On the technical side, a public SPARQL endpoint will be provided, so that information can be
retrieved by running SPARQL queries against EduKG.
Finally, recall that the ultimate goal of building EduKG is to provide a public dataset for
curriculum recommendation. Thus, the natural next step of this work is to design an RS that
specifically relies on EduKG, by refining the work initiated in [9].
Acknowledgments
This work is supported by the AILES PIA3 project (https://www.projetailes.com/) that seeks
to support French students in their educational choices. We thank all the project partners:
the University of Reims Champagne-Ardenne, the University of Lorraine, the Technological
University of Troyes and the two Academies of Reims and Nancy-Metz.
References
[1] M. E. Ibrahim, Y. Yang, D. L. Ndzi, G. Yang, M. Al-Maliki, Ontology-based personalized
course recommendation framework, IEEE Access 7 (2019) 5180–5199.
[2] E. Ilkou, H. Abu-Rasheed, M. Tavakoli, S. Hakimov, G. Kismihók, S. Auer, W. Nejdl, Educor:
An educational and career-oriented recommendation ontology, LNCS (2021) 546–562.
[3] N. Hubert, A. Brun, D. Monticolo, Vers un système de recommandation explicable pour
l’orientation scolaire, in: Workshop EXPLAIN’AI - EGC, Blois, France, 2022.
[4] F. Ricci, L. Rokach, B. Shapira, Introduction to Recommender Systems Handbook, Springer
US, Boston, MA, 2011, pp. 1–35.
[5] B. Ma, Y. Taniguchi, S. Konomi, Course recommendation for university environment, in:
EDM, 2020.
[6] Z. A. Pardos, W. Jiang, Designing for serendipity in a university course recommendation
system, in: Proc. 10th ACM LAK, Association for Computing Machinery, 2020, pp. 350–359.
[7] N. F. Noy, D. L. McGuinness, Ontology development 101: A guide to creating your first
ontology, 2001.
[8] K. Dellschaft, S. Staab, Strategies for the evaluation of ontology learning, in: Proc. of the
Conf. on Ontology Learning and Population, IOS Press, 2008, pp. 253–272.
[9] N. Hubert, P. Monnin, A. Brun, D. Monticolo, New Strategies for Learning Knowledge
Graph Embeddings: the Recommendation Case, in: EKAW - 23rd International Conf. on
Knowledge Engineering and Knowledge Management, Bolzano, Italy, 2022.