<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Grade Level Filtering for Learning Object Search using Entity Linking</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ratan J. Sebastian</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ralph Ewerth</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anett Hoppe</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>L3S Research Center, Leibniz Universität Hannover</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>TIB Leibniz Information Centre for Science and Technology</institution>
        </aff>
      </contrib-group>
      <fpage>2</fpage>
      <lpage>16</lpage>
      <abstract>
        <p>More and more Learning Objects like lessons, exercises, worksheets and lesson plans are available online. Finding them, however, is a challenge as they often lack metadata concerning format, content and, in the K-12 context, the grade levels or age ranges for which they are appropriate. This work studies the automatic content-based assignment of this last aspect of Learning Object metadata. For this purpose, we (a) collected a dataset of physics lessons, (b) explored a set of text-based features for their automatic analysis (derived from both dense vector representations and entity linking methods) and (c) trained a machine learning model with different subsets of these features to predict a resource's target grade level. We compare and discuss the results.</p>
      </abstract>
      <kwd-group>
        <kwd>information retrieval</kwd>
        <kwd>learning object</kwd>
        <kwd>search</kwd>
        <kwd>metadata enrichment</kwd>
        <kwd>machine learning</kwd>
        <kwd>classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        metadata fields using knowledge graphs [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ].
      </p>
      <p>In this study we address a metadata field that is commonly absent from, yet very important
to, Learning Object search: grade level, i.e. the school grade for which a given object is
appropriate. We found that only 32% of physics-related material on Elixier (a central referatory run by
the German federal government) has grade level information. For an initial study, we limit
ourselves to Learning Objects that provide some sort of textual content. We acquire a dataset by
scraping LeifiPhysik.de, a popular learning resource for German school students which provides
textual instructional content covering a large portion of the physics syllabus for each grade.</p>
      <p>The dataset has 539 lessons labelled with a grade level for each state. We experiment with
different types of features, machine learning algorithms and machine learning problem
formulations to compare their performance. We discuss in detail the features provided by entity
linking with a focus on their interpretability.</p>
      <p>In summary, the contributions of this paper are three-fold:
1. We introduce the grade level prediction task with the goal of enriching Learning Object
metadata with a grade level.
2. We use entity linking to generate interpretable and filterable features and examine their
effectiveness.
3. We report empirical results on the accuracy of text-based grade level prediction using a
dataset collected from LeifiPhysik.de.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <sec id="sec-2-1">
        <title>2.1. Learning Object Search</title>
        <p>
          The lack of effective search systems has been identified as a barrier to teachers finding good
Learning Objects online [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ][
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. Anderson et al. [
          <xref ref-type="bibr" rid="ref2">2</xref>
          ] identify specifically the "poor incorporation
of educational standards and learning outcomes into metadata" as a reason for existing solutions
being ineffective. A study in the German context [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ] finds that most teachers use the search
functions in Learning Object Repositories with keywords but often find this insufficient and fall
back to using Google. A 2018 survey of the field [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] identifies several search systems that can
use metadata like educational standards or grade level for search, but these systems are limited
by missing metadata. Ochoa et al. (2008) [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] chart a history of learning object search in which
metadata-enabled, precision-favouring faceted search gives way to full-text keyword search,
which favours recall. They point out pros and cons of both approaches and propose a learning
to rank approach that addresses the latter, but do not discount the effectiveness of complete
metadata in delivering the precision of results promised by the former.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Metadata Enrichment</title>
        <p>
          Many learning material platforms allow authors and curators to describe material with metadata.
Typical fields used are: content type, subject area, source, grade level and tags. For learning
objects, there are a number of commonly used metadata standards such as IEEE LOM, ADL
SCORM and LRMI [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>
          Ochoa et al. (2009) [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] identify metadata quality in Learning Object Repositories to be a
problem and propose quality measures for them. A quantitative analysis [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] of all resources
annotated with a popular metadata standard, LRMI, found that most repositories that use this standard
do not provide fields like educationalAlignment, alignmentType or typicalAgeRange
all of which indicate the age bracket or grade level a resource is appropriate for.
        </p>
        <p>
          To address the problem of missing metadata, Ochoa et al. (2005) [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] propose a framework
for creating Learning Object metadata from course metadata. However, the problem of inferring
metadata values in cases when no clear source exists in the course metadata is out of the paper’s
scope. Classifying Learning Objects using Machine Learning was explored by Hassan et al. [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
However, the study aimed to predict a Learning Object’s "educational value". The authors find
this can be reliably assigned using ML methods. Other work that tries to solve the missing
metadata problem focuses on linking to knowledge graphs to improve keyword search [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
or error correction based on existing metadata [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
        </p>
        <p>All of these works either focus on enriching metadata fields other than grade level or propose
general methods that might infer a grade level but (a) do not study their performance on this task in
detail and (b) do not take advantage of all the sources of information available in a learning
object to perform this inference.</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Entity Linking</title>
        <p>
          Entity Linking has been used to help teachers author learning material in [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. The authors
use DBpedia Spotlight [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] to link fragments of text to DBpedia Resources and in turn use
those Resources to recommend similar content from other textbooks to teachers. Similarly,
Alpizar-Chacon et al. [17] use DBpedia to link textbooks to each other. Entity linking has been
used in other domains [18] as a component in a larger machine learning system for document
retrieval and classification.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Study Design</title>
      <sec id="sec-3-1">
        <title>3.1. Problem Definition</title>
        <p>In grade level prediction, the input is formatted textual content and the output is an ordinal
variable or a list of ordinal variables that indicate which grade level(s) this text is suited for.
The grade level is an integer that ranges from the lowest to the highest grade applicable for the
educational context.</p>
        <p>In this work, we restrict the context to Physics Education in grades 5 and up in all 16 German
states. The states vary widely in their curricula. Lessons are taught at different grades in
different states, and some lessons are taught across more grades in some states than in others.
For instance, a lesson on string vibration<sup>4</sup> is taught at grades 5, 6, 9 and 10 in Berlin but only at
grade 8 in Hessen.
<sup>4</sup> https://www.leifiphysik.de/akustik/akustische-phaenomene/grundwissen/saitenschwingung</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data Collection</title>
        <p>Data is collected by crawling the LeifiPhysik website. The website organises lessons by state
and then by grade level for that state. These two attributes as well as the HTML content of the
page are collected during the crawl. There are 539 lessons with 11,667 annotations.</p>
        <p>The main text of each web page is extracted using Trafilatura [19], a software package
intended to create text corpora from web crawls. Equations are stripped out using a simple
regex, as the presented analysis is limited to the textual content of the lessons. Of the remaining
lessons, only those with at least 25 words are included in the dataset. This leaves 515 lessons
with a median of 181 words and a maximum of 1,227. Per state, the number of lessons ranges between
308 and 474. Further details about the data broken down by state and grade level are available
in Tables 1 and 2.</p>
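        <p>The cleaning and length filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the equation pattern (inline spans delimited by dollar signs) and all function names are hypothetical, since the text does not give the regex; only the 25-word threshold is taken from the text.</p>
        <preformat><![CDATA[
```python
import re
import statistics

# Hypothetical pattern: treats $...$ spans as equations. The regex actually
# used in the study is not given in the text.
EQUATION_PATTERN = re.compile(r"\$[^$]*\$")
MIN_WORDS = 25  # minimum lesson length stated in the text


def clean_lesson(text):
    """Remove equation spans and collapse runs of whitespace."""
    without_equations = EQUATION_PATTERN.sub(" ", text)
    return " ".join(without_equations.split())


def build_corpus(raw_lessons):
    """Clean every lesson and keep those with at least MIN_WORDS words."""
    cleaned = [clean_lesson(text) for text in raw_lessons]
    return [text for text in cleaned if len(text.split()) >= MIN_WORDS]


def word_count_summary(lessons):
    """Return (median, maximum) word count of the given lessons."""
    counts = [len(text.split()) for text in lessons]
    return statistics.median(counts), max(counts)
```
]]></preformat>
        <p>Applied to the extracted main text of every crawled page, build_corpus would yield the filtered corpus, and word_count_summary the kind of summary statistics reported above.</p>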
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Entity Linking</title>
        <p>Entity linking is used in order to (a) filter out entities based on their Wikipedia categories and
(b) obtain interpretable features.</p>
        <p>Two different entity linkers are explored in this study:
1. Wikifier [20] is a web service<sup>5</sup> that annotates a document with relevant Wikipedia
concepts. Wikifier uses PageRank [21] to find a coherent set of inter-related concepts to
annotate the document. Entities returned have a PageRank score which indicates how
central they are in the network of links between concepts. Default values are used for all
parameters except language, which is set to German. Various settings of the PageRank
threshold are experimented with and an optimal one is chosen (details in Section 4.2).
2. Babelfy [22] is a web service<sup>6</sup> that similarly annotates text with entities in the BabelNet
knowledge base which in turn, when possible, link to DBpedia Resources. It focuses on
multi-lingual word disambiguation and was chosen since it is able to link more German
words to entities than Wikifier.</p>
        <table-wrap>
          <label>Table 3</label>
          <table>
            <tbody>
              <tr><td>dbc:Subfields_of_physics</td><td>dbc:Physical_quantities</td><td>dbc:Physics</td></tr>
              <tr><td>dbc:Concepts_in_physics</td><td>dbc:Classical_mechanics</td><td>dbc:Metrology</td></tr>
              <tr><td>dbc:Physical_sciences</td><td>dbc:Engineering_disciplines</td><td>dbc:Universe</td></tr>
              <tr><td>dbc:Electrical_engineering</td><td>dbc:Applied_sciences</td><td>dbc:Electromagnetism</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>In the context of the LeifiPhysik dataset, relevant entities are ones that are linked to one of the
physics-related categories in Table 3. The link can be a direct one between the resource and
the category through the http://purl.org/dc/terms/subject property. It can also be
an indirect link, in which one of the resource's categories reaches a physics-related ancestor
category through a chain (up to 5 links long) of skos:broader properties. skos:broader connects
a category to a broader parent category.</p>
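        <p>The relevance check described above amounts to a bounded breadth-first search over the category graph. The following is a minimal sketch, assuming the dcterms:subject categories of an entity and the skos:broader relation have already been loaded into plain Python collections; the function name, its arguments and the in-memory representation are hypothetical and not taken from the study.</p>
        <preformat><![CDATA[
```python
from collections import deque

MAX_BROADER_HOPS = 5  # maximum chain length stated in the text


def is_relevant(entity_categories, broader, target_categories,
                max_hops=MAX_BROADER_HOPS):
    """Return True if any category of the entity is a target category or
    reaches one by following at most max_hops skos:broader links.

    entity_categories: categories attached to the entity via dcterms:subject
    broader: dict mapping a category to a list of its broader categories
    target_categories: set of physics-related categories (as in Table 3)
    """
    queue = deque((category, 0) for category in entity_categories)
    seen = set(entity_categories)
    while queue:
        category, hops = queue.popleft()
        if category in target_categories:
            return True
        if hops == max_hops:
            continue  # do not follow the chain any further
        for parent in broader.get(category, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, hops + 1))
    return False
```
]]></preformat>
        <p>Because the search proceeds level by level, each category is first reached via its shortest chain, so marking categories as seen does not hide a path that would fit within the hop limit.</p>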
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Modelling</title>
        <p>Lesson documents are first encoded into a vector representation. The following vector
representations are used:</p>
        <p>3.4.1. Features</p>
        <p>1. TF-IDF: Term Frequency-Inverse Document Frequency vectors generated using sklearn's<sup>7</sup>
TfidfVectorizer with the standard German tokenizer and stop word list.
2. SentenceTransformers (SBERT): Dense vector representations generated using SBERT<sup>8</sup>.
3. YAKE: Keywords are extracted from documents using YAKE [23], and a one-hot encoding of
these keywords is used to represent the document. YAKE only uses syntactic features to identify
keywords and does not rely on an external knowledge base. It is used as a baseline against
which the entity linker features are compared.
4. Babelfy &amp; Wikifier: The services described in Section 3.3 are used to perform entity
linking, and the linked entities for each document are then encoded into vectors using a
one-hot encoding.</p>
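        <p>The one-hot encoding of keywords or linked entities can be illustrated with a short sketch. It assumes each document is available as a list of entity identifiers; the function names are hypothetical, and the handling of entities unseen during fitting (they are ignored) is an assumption rather than a detail given in the text.</p>
        <preformat><![CDATA[
```python
def fit_vocabulary(documents):
    """Map each distinct entity in the given documents to a column index."""
    entities = sorted({entity for doc in documents for entity in doc})
    return {entity: index for index, entity in enumerate(entities)}


def encode(document, vocabulary):
    """Return a 0/1 vector with a 1 for every vocabulary entity that occurs
    in the document. Entities missing from the vocabulary are ignored
    (an assumption; the text does not say how they are handled)."""
    vector = [0] * len(vocabulary)
    for entity in document:
        index = vocabulary.get(entity)
        if index is not None:
            vector[index] = 1
    return vector
```
]]></preformat>
        <p>Each column of the resulting vectors corresponds to one named entity or keyword, which is what allows feature importances to be read directly as concepts.</p>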
        <sec id="sec-3-4-1">
          <title>3.4.2. Experimental Settings</title>
          <p>The XGBoost [24] library was used for all model training. XGBoost was used since performance
was comparable with or better than other modelling methods like Random Forests and Support
Vector Machines (see Table 4) and the focus of this paper is on examining the impact of various
features and formulations rather than ML algorithms. Hyperparameter tuning is employed in
each case to decide an optimal n_depth hyperparameter for the model. In all experiments,
the train-validation-test split is 66.66%-16.66%-16.66%. Hyperparameter tuning is done on the
validation set and accuracies are reported on the test set. All experiments were run with 3-fold
cross validation and results were averaged across folds.</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Experiments</title>
        <sec id="sec-3-5-1">
          <title>3.5.1. Feature Comparison</title>
          <p>In order to see which features capture information relevant to the grade level prediction task,
the performances of the feature sets described in Section 3.4.1 are compared. Results are shown
in Table 5. The evaluation metric is described together with the problem formulations
in Section 3.5.2.
<sup>7</sup> https://scikit-learn.org/stable/index.html
<sup>8</sup> https://www.sbert.net/</p>
        </sec>
        <sec id="sec-3-5-2">
          <title>3.5.2. Problem formulation comparison</title>
          <p>Since the grade level prediction task can be formulated as different machine learning tasks, we
explore the mapping from grade level prediction task to machine learning task and the trade-offs
of each choice. We then compare them in terms of performance to see if one formulation
models the problem better than the others. The machine learning formulations considered are:
• Categorical: This formulation treats grade level prediction as a single-label categorical
classification. Each of the grade levels is treated as a separate category with no ordering
relationship. Since each document can have only one label in this formulation, lessons
that are assigned to more than one grade are filtered out. Since this formulation requires
omission of data and does not use ordering constraints on the classes, it is intended as a
baseline against which the ordinal and multi-label formulations can be compared.
• Ordinal: This formulation takes into account the ordering relationship between grades.
A series of <inline-formula><tex-math>k - 1</tex-math></inline-formula> binary classifiers is combined to predict an ordinal variable with <inline-formula><tex-math>k</tex-math></inline-formula> levels,
as outlined in [25].
• Multi-Label: This formulation accommodates the fact that lessons can be assigned to
multiple grades. It treats grade level prediction as a multi-label categorical classification
problem by training one classifier for each label using SKLearn's MultiOutputClassifier<sup>9</sup>.
• Coarse-grained: Practical deployment of grade level classification could benefit from
higher accuracy at the cost of a coarser definition of grade level. Hence, in this formulation,
5th grade is considered Primary, 6th-10th is considered Secondary-I and 11th-13th is
Secondary-II, making this a 3-label categorical classification problem rather than a 9-label
one.
• W/ states: In all previous settings, the input is just the vector encoding of the lesson;
a model is trained separately for each state and the average performance across states is
reported. In this setting, the state is encoded as part of the input using a one-hot vector and the
problem is treated as a multi-label categorical classification.</p>
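          <p>To make the formulations concrete, the following sketch shows how targets could be derived for the ordinal, multi-label and coarse-grained settings, and how the outputs of the binary classifiers in the ordinal setting are combined following the scheme of Frank and Hall [25]. It is an illustration under stated assumptions, not the code used in the study: all function names are hypothetical, and the probabilities P(grade &gt; threshold) are assumed to come from separately trained binary classifiers.</p>
          <preformat><![CDATA[
```python
def ordinal_targets(grade, grades):
    """Targets for the k - 1 binary problems 'is the grade above grades[i]?'."""
    return [int(grade > threshold) for threshold in grades[:-1]]


def class_probabilities(prob_greater):
    """Combine estimates of P(grade > grades[i]), i = 0..k-2, into k class
    probabilities (Frank and Hall [25])."""
    probabilities = [1.0 - prob_greater[0]]          # lowest grade
    for previous, current in zip(prob_greater, prob_greater[1:]):
        probabilities.append(previous - current)     # intermediate grades
    probabilities.append(prob_greater[-1])           # highest grade
    return probabilities


def predict_grade(prob_greater, grades):
    """Return the grade with the highest combined probability."""
    probabilities = class_probabilities(prob_greater)
    best = max(range(len(grades)), key=probabilities.__getitem__)
    return grades[best]


def multi_label_targets(lesson_grades, grades):
    """Binary indicator vector with one entry per grade (multi-label setting)."""
    return [int(grade in lesson_grades) for grade in grades]


def coarse_label(grade):
    """Map a grade to the three coarse-grained levels named in the text."""
    if grade == 5:
        return "Primary"
    if grade <= 10:
        return "Secondary-I"
    return "Secondary-II"
```
]]></preformat>
          <p>In the ordinal setting, a lesson taught at grade 7 out of grades 5 to 8 yields the binary targets [1, 1, 0]; at prediction time the three classifier outputs are turned back into four class probabilities and the most probable grade is returned.</p>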
          <p>The performance metric in all formulations is class-weighted accuracy. This rewards higher
accuracy for grades that appear more frequently in the data. This is chosen for two reasons:
(a) we found that accuracy is correlated with the number of annotations available for a class
(see Section 4.5) and (b) the focus in this experiment is on comparing various feature sets and
formulations. Hence, it did not make sense to penalize an experiment setting for a lack of data.
<sup>9</sup> https://scikit-learn.org/stable/modules/generated/sklearn.multioutput.MultiOutputClassifier.html</p>
          <p>[Table: Features (SBERT, Babelfy, Wikifier, YAKE, TF-IDF); values not recoverable]</p>
        </sec>
        <sec id="sec-3-5-3">
          <title>3.5.3. Feature Interpretability</title>
          <p>In order to see what kinds of features the model uses to distinguish between grades, the 10 most
important features as learnt by the XGBoost model are extracted and reported for the YAKE,
Babelfy and Wikifier feature sets in Table 6. Further, the occurrence of these features at various
grade levels is plotted in Figure 2 to make it possible to manually verify that the features that
the model considers important are indeed key concepts taught at certain grade levels.</p>
        </sec>
        <sec id="sec-3-5-4">
          <title>3.5.4. Entity Filtering</title>
          <p>The confidence scores attached to linked entities are in some sense a measure of the centrality
of the annotated entity in the graph of candidate entities for a given document. Since centrality
might not directly relate to utility in grade level prediction we would like to see which confidence
score threshold yields the best grade level prediction accuracy. The relation between these two
quantities is plotted in Figure 1.</p>
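          <p>The threshold experiment can be sketched as a simple sweep. The sketch assumes each document is a list of (entity, score) pairs returned by a linker and that a caller-supplied evaluate function trains and scores a model on the filtered documents; both function names and this interface are hypothetical.</p>
          <preformat><![CDATA[
```python
def filter_by_score(linked_entities, threshold):
    """Keep the entities whose linker score is at least the threshold."""
    return [entity for entity, score in linked_entities if score >= threshold]


def sweep_thresholds(documents, thresholds, evaluate):
    """Return {threshold: metric} for each candidate threshold.

    `evaluate` is a caller-supplied function that receives the filtered
    documents (lists of entities) and returns a performance metric.
    """
    results = {}
    for threshold in thresholds:
        filtered = [filter_by_score(doc, threshold) for doc in documents]
        results[threshold] = evaluate(filtered)
    return results
```
]]></preformat>
          <p>Plotting the returned metric against the threshold gives a curve of the kind shown in Figure 1, from which the best-performing threshold can be read off.</p>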
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <sec id="sec-4-1">
        <title>4.1. Feature Comparison</title>
        <p>We see from Table 5 that Sentence Transformers consistently outperform entity linking features.
Sentence transformers capture signals like authorial style and a broader array of content
non-specific complexity features [26]. There is even some evidence to support the idea that
entity relationships from open knowledge bases are also captured by these embeddings [27];
these are the relationships that we hoped to capture and exploit by filtering entity linking features by domain.
This result shows that entity linking and domain-adjusted filtering do not capture and boost
enough information about domain terms in lessons to outweigh the general language model
understanding of the same sentences in these lessons.</p>
        <p>SBERT features, however, remain hard to interpret. Future work that incorporates more
features of related entities, such as the number of relations between entities [28] and prerequisite
relationships between them [29], could improve the performance of entity linking features while
helping us construct a more understandable mapping from concepts to grade level.</p>
        <p>The entity-linked feature sets (Babelfy and Wikifier) nevertheless perform better than the TF-IDF
or YAKE baselines for most problem formulations. TF-IDF uses all tokens as features and ranks
them based on frequency, and YAKE finds keywords based on frequency, sentence and document
structure. The entity linking features represent an additional source of information by finding
DBpedia nodes that can be linked to those keywords and ranking them by coherence with the
main topic of the lesson. Further, entity linking allows entities to be filtered by domain. This additional
step has an overall positive but complicated effect on grade level prediction performance that is
discussed in the next section.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Entity Linking &amp; Filtering</title>
        <p>It is clear from Table 7 that the entity filtering step described above helps with overall
performance. This is expected since it strips out noisy features that the model would otherwise have
to deal with.</p>
        <p>However, using thresholds on confidence scores reported by the entity linkers to filter out
incoherent entities works less well. Figure 1 shows that prediction accuracy largely decreases
in the case of both Wikifier and Babelfy as the threshold is increased. Optimal values found
were 0.01 for the PageRank of Wikifier and 0.7 for the confidence score reported by Babelfy.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Comparison of interpretable features</title>
        <p>The best performing XGBoost model for each of the feature sets was queried for its most
important features. The ranks of the features across each of the cross-validation folds were
averaged and the top 10 features for each set are shown in Table 6.</p>
        <p>• YAKE shows features unrelated to physics concepts such as "basic knowledge" and "Fig."
(from figure titles), while Babelfy and Wikifier yield top features that are more obviously
physics-related, even though these features were retrieved from a feature set that was not
filtered for domain-specific features.
• Between Babelfy and Wikifier, more specific physics concepts are found among the
Babelfy features. This could be explained by Babelfy performing better on multilingual
entity linking benchmarks [22].
• Figure 2 shows how frequently different features show up at different grade levels.
Visualizations such as these could help domain experts verify that the model is learning useful
distinctions and improve confidence in such a metadata enrichment step. For instance,
the figure shows that, according to the model, Electrons are discussed at higher grade levels
and Heat Conduction is discussed earlier, which can be easily verified by an expert.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Problem formulation comparison</title>
        <p>Ordinal classification performs better than the baseline categorical as expected since it takes
into account the ordering of classes.</p>
        <p>Multi-label classification shows very interesting results. There is a wide divergence in the
performance of Babelfy and Wikifier which is not the case in any of the other formulations.
Wikifier performs the worst in fact. This could reflect the fact that Babelfy performs better at
German entity linking than Wikifier [22].</p>
        <p>The use of coarse-grained categories unsurprisingly leads to the best overall accuracy and
might be the most viable solution in a practical metadata enrichment system. A more detailed
discussion about this model’s fitness for purpose follows in Section 4.5.</p>
        <p>The W/ states formulation shows considerably worse performance and leads us to believe
that the interactions between the content based features and the state encoding features are
complex and are best not handled within a machine learning solution but rather separately.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.5. Error Analysis</title>
        <p>To better understand which grades the model has a hard time predicting, we took
a closer look at how the model performed for each grade in the ordinal setting with
SBERT features (Table 8). The precision and recall values are well above those of random-guess
or weighted-guess classifiers, showing that our model is clearly learning the task.</p>
        <p>The F1 scores correlate very well with the number of annotations available for each grade
(Pearson coefficient: 0.942), which indicates that more data would be needed for the lower and
higher grades to improve performance.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>Learning Object metadata enrichment is important for precise Learning Object search. A key but
under-investigated part of this is grade level prediction, which is studied here. The effectiveness
of various feature representations of the textual content and of various ML formulations, as well as the
impact of entity linking on this task, are covered. We show that dense vector representations
of textual content lead to the best performance. However, if interpretability is desired, entity
linking approaches perform better than TF-IDF or keyword extraction. We show that filtering
entities by domain improves performance at this task. This is a first attempt at a solution
for this task. There are many avenues for improvement, including enhancing entity linking
features with pedagogically relevant information, assembling a more balanced dataset and
representing more of the lesson, such as equations, images, animations and lesson structure. These
improvements would ultimately raise metadata quality, which directly contributes to search
features in Learning Object Repositories being more effective.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgments</title>
      <p>This work has been partly supported by the Ministry of Science and Education of Lower Saxony,
Germany, through the Graduate training network “LernMINT: Data-assisted classroom teaching
in the MINT subjects”. We would like to thank the reviewers for their valuable feedback.</p>
      <p>multilingual entity extraction, in: Proceedings of the 9th International Conference on Semantic Systems (I-Semantics), 2013.
[17] I. A. Chacon, S. A. Sosnovsky, Knowledge models from PDF textbooks, New Rev. Hypermedia Multim. 27 (2021) 128–176. URL: https://doi.org/10.1080/13614568.2021.1889692. doi:10.1080/13614568.2021.1889692.
[18] R. Reinanda, E. Meij, M. de Rijke, Knowledge graphs: An information retrieval perspective, Found. Trends Inf. Retr. 14 (2020) 289–444. URL: https://doi.org/10.1561/1500000063. doi:10.1561/1500000063.
[19] A. Barbaresi, Generic web content extraction with open-source software, in: Proceedings of the 15th Conference on Natural Language Processing, KONVENS 2019, Erlangen, Germany, October 9-11, 2019, 2019. URL: https://corpora.linguistik.uni-erlangen.de/data/konvens/proceedings/papers/kaleidoskop/camera_ready_barbaresi.pdf.
[20] R. Mihalcea, A. Csomai, Wikify!: linking documents to encyclopedic knowledge, in: M. J. Silva, A. H. F. Laender, R. A. Baeza-Yates, D. L. McGuinness, B. Olstad, Ø. H. Olsen, A. O. Falcão (Eds.), Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, CIKM 2007, Lisbon, Portugal, November 6-10, 2007, ACM, 2007, pp. 233–242. URL: https://doi.org/10.1145/1321440.1321475. doi:10.1145/1321440.1321475.
[21] L. Page, S. Brin, R. Motwani, T. Winograd, The PageRank citation ranking: Bringing order to the web, Technical Report, Stanford InfoLab, 1999.
[22] A. Moro, F. Cecconi, R. Navigli, Multilingual word sense disambiguation and entity linking for everybody, in: M. Horridge, M. Rospocher, J. van Ossenbruggen (Eds.), Proceedings of the ISWC 2014 Posters &amp; Demonstrations Track, a track within the 13th International Semantic Web Conference, ISWC 2014, Riva del Garda, Italy, October 21, 2014, volume 1272 of CEUR Workshop Proceedings, CEUR-WS.org, 2014, pp. 25–28. URL: http://ceur-ws.org/Vol-1272/paper_30.pdf.
[23] R. Campos, V. Mangaravite, A. Pasquali, A. Jorge, C. Nunes, A. Jatowt, Yake! keyword extraction from single documents using multiple local features, Inf. Sci. 509 (2020) 257–289. URL: https://doi.org/10.1016/j.ins.2019.09.013. doi:10.1016/j.ins.2019.09.013.
[24] T. Chen, C. Guestrin, Xgboost: A scalable tree boosting system, in: B. Krishnapuram, M. Shah, A. J. Smola, C. C. Aggarwal, D. Shen, R. Rastogi (Eds.), Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016, ACM, 2016, pp. 785–794. URL: https://doi.org/10.1145/2939672.2939785. doi:10.1145/2939672.2939785.
[25] E. Frank, M. A. Hall, A simple approach to ordinal classification, in: L. D. Raedt, P. A. Flach (Eds.), Machine Learning: ECML 2001, 12th European Conference on Machine Learning, Freiburg, Germany, September 5-7, 2001, Proceedings, volume 2167 of Lecture Notes in Computer Science, Springer, 2001, pp. 145–156. URL: https://doi.org/10.1007/3-540-44795-4_13. doi:10.1007/3-540-44795-4_13.
[26] E. Terreau, A. Gourru, J. Velcin, Writing style author embedding evaluation, in: Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems, 2021, pp. 84–93.
[27] F. Petroni, T. Rocktäschel, P. S. H. Lewis, A. Bakhtin, Y. Wu, A. H. Miller, S. Riedel, Language models as knowledge bases?, CoRR abs/1909.01066 (2019). URL: http://arxiv.org/abs/1909.01066. arXiv:1909.01066.
[28] C. K. Pereira, B. P. Nunes, S. W. M. Siqueira, R. Manrique, J. F. Medeiros, 'A little knowledge is a dangerous thing': A method to automatically detect knowledge compartmentalization and oversimplification, in: 20th IEEE International Conference on Advanced Learning Technologies, ICALT 2020, Tartu, Estonia, July 6-9, 2020, IEEE, 2020, pp. 140–144. URL: https://doi.org/10.1109/ICALT49669.2020.00048. doi:10.1109/ICALT49669.2020.00048.
[29] R. Manrique, B. P. Nunes, O. Mariño, Exploring knowledge graphs for the identification of concept prerequisites, Smart Learn. Environ. 6 (2019) 21. URL: https://doi.org/10.1186/s40561-019-0104-3. doi:10.1186/s40561-019-0104-3.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <article-title>The Learning Object Metadata standard - IEEE Learning Technology Standards Committee</article-title>
          . URL: https://www.ieeeltsc.org/working-groups/wg12LOM/lomDescription/.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Leachman</surname>
          </string-name>
          ,
          <article-title>Strategies for supporting oer adoption through faculty and instructor use of a federated search tool</article-title>
          ,
          <source>Journal of Librarianship and Scholarly Communication</source>
          <volume>7</volume>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.-H. G. . F.-F. N. . A.</given-names>
            <surname>Ernest</surname>
          </string-name>
          ,
          <article-title>Repositories of open educational resources: An assessment of reuse and educational aspects</article-title>
          ,
          <source>International Review of Research in Open and Distributed Learning</source>
          <volume>18</volume>
          (
          <year>2017</year>
          )
          <fpage>84</fpage>
          -
          <lpage>120</lpage>
          . doi:10.19173/irrodl.v18i5.3063.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tavakoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Elias</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Kismihók</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Auer</surname>
          </string-name>
          ,
          <article-title>Quality prediction of open educational resources: A metadata-based approach</article-title>
          , in:
          <source>20th IEEE International Conference on Advanced Learning Technologies, ICALT 2020, Tartu, Estonia, July 6-9, 2020</source>
          , IEEE,
          <year>2020</year>
          , pp.
          <fpage>29</fpage>
          -
          <lpage>31</lpage>
          . URL: https://doi.org/10.1109/ICALT49669.2020.00007. doi:10.1109/ICALT49669.2020.00007.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ochoa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Duval</surname>
          </string-name>
          ,
          <article-title>Automatic evaluation of metadata quality in digital repositories</article-title>
          ,
          <source>Int. J. Digit. Libr</source>
          .
          <volume>10</volume>
          (
          <year>2009</year>
          )
          <fpage>67</fpage>
          -
          <lpage>91</lpage>
          . URL: https://doi.org/10.1007/s00799-009-0054-4. doi:10.1007/s00799-009-0054-4.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dietze</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Taibi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>d'Aquin</surname>
          </string-name>
          ,
          <article-title>Analysing and improving embedded markup of learning resources on the web</article-title>
          , in:
          <string-name>
            <given-names>R.</given-names>
            <surname>Barrett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cummings</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Agichtein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gabrilovich</surname>
          </string-name>
          (Eds.),
          <source>Proceedings of the 26th International Conference on World Wide Web Companion, Perth, Australia, April 3-7</source>
          ,
          <year>2017</year>
          , ACM,
          <year>2017</year>
          , pp.
          <fpage>283</fpage>
          -
          <lpage>292</lpage>
          . URL: https://doi.org/10.1145/3041021.3054160. doi:10.1145/3041021.3054160.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Behr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Trigo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cascalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Guerra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Costa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Parente</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Botelho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Vicari</surname>
          </string-name>
          ,
          <article-title>Recommending metadata contents for learning objects through linked data</article-title>
          , in:
          <string-name>
            <given-names>F.</given-names>
            <surname>de la Prieta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Bolock</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Durães</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Carneiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Lopes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Julián</surname>
          </string-name>
          (Eds.),
          <source>Highlights in Practical Applications of Agents, Multi-Agent Systems, and Social Good. The PAAMS Collection - International Workshops of PAAMS 2021, Salamanca, Spain, October 6-9, 2021, Proceedings</source>
          , volume
          <volume>1472</volume>
          of Communications in Computer and Information Science, Springer,
          <year>2021</year>
          , pp.
          <fpage>115</fpage>
          -
          <lpage>126</lpage>
          . URL: https://doi.org/10.1007/978-3-030-85710-3_10. doi:10.1007/978-3-030-85710-3_10.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J. F. H.</given-names>
            <surname>Cubides</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. A. G.</given-names>
            <surname>García</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. E. M.</given-names>
            <surname>Marín</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sánchez-Alonso</surname>
          </string-name>
          ,
          <article-title>Improving OER descriptions to enhance their availability, reuse, and enrichment</article-title>
          ,
          <source>Educ. Inf. Technol</source>
          .
          <volume>27</volume>
          (
          <year>2022</year>
          )
          <fpage>1811</fpage>
          -
          <lpage>1839</lpage>
          . URL: https://doi.org/10.1007/s10639-021-10641-w. doi:10.1007/s10639-021-10641-w.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>T.</given-names>
            <surname>Richter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U. D.</given-names>
            <surname>Ehlers</surname>
          </string-name>
          ,
          <article-title>Barriers and motivators for using open educational resources in schools</article-title>
          ,
          <source>Open Ed</source>
          (
          <year>2010</year>
          )
          <fpage>1</fpage>
          -
          <lpage>12</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Osorio-Zuluaga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. D.</given-names>
            <surname>Duque-Mendez</surname>
          </string-name>
          ,
          <article-title>Search and selection of learning objects in repositories: A review</article-title>
          , in:
          <source>2018 XIII Latin American Conference on Learning Technologies (LACLO)</source>
          , IEEE,
          <year>2018</year>
          , pp.
          <fpage>513</fpage>
          -
          <lpage>520</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ochoa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Duval</surname>
          </string-name>
          ,
          <article-title>Relevance ranking metrics for learning objects</article-title>
          ,
          <source>IEEE Trans. Learn. Technol</source>
          .
          <volume>1</volume>
          (
          <year>2008</year>
          )
          <fpage>34</fpage>
          -
          <lpage>48</lpage>
          . URL: https://doi.org/10.1109/TLT.2008.1. doi:10.1109/TLT.2008.1.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>R. D. C. M.</given-names>
            <surname>Rueda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lujan</surname>
          </string-name>
          ,
          <article-title>A quantitative analysis of the use of microdata for semantic annotations on educational resources</article-title>
          ,
          <source>J. Web Eng</source>
          .
          <volume>17</volume>
          (
          <year>2018</year>
          )
          <fpage>45</fpage>
          -
          <lpage>72</lpage>
          . URL: http://www.rintonpress.com/xjwe17/jwe-17-12/045-072.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ochoa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Cardinaels</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Meire</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Duval</surname>
          </string-name>
          ,
          <article-title>Frameworks for the automatic indexation of learning management systems content into learning object repositories</article-title>
          , in:
          <source>EdMedia+ Innovate Learning</source>
          , Association for the Advancement of Computing in Education (AACE),
          <year>2005</year>
          , pp.
          <fpage>1407</fpage>
          -
          <lpage>1414</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Mihalcea</surname>
          </string-name>
          ,
          <article-title>Learning to identify educational materials</article-title>
          ,
          <source>ACM Trans. Speech Lang. Process</source>
          .
          <volume>8</volume>
          (
          <year>2011</year>
          )
          <fpage>2:1</fpage>
          -
          <lpage>2:18</lpage>
          . URL: https://doi.org/10.1145/2050100.2050101. doi:10.1145/2050100.2050101.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Grévisse</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Manrique</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Mariño</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rothkugel</surname>
          </string-name>
          ,
          <article-title>Knowledge graph-based teacher support for learning material authoring</article-title>
          , in:
          <source>Colombian Conference on Computing</source>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>177</fpage>
          -
          <lpage>191</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Daiber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Jakob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Hokamp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. N.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <article-title>Improving efficiency and accuracy in</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>