<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Workshop on Adaptive Lifelong Learning, July</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Personalized Learning in K-12 Education: Exploring Weak-Labels for a Random Forest-based Collaborative Filtering Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pedro Ilídio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alireza Gharahighehi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Felipe Kenji Nakano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Celine Vens</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Itec, imec research group at KU Leuven</institution>
          ,
          <addr-line>Etienne Sabbelaan 53, 8500 Kortrijk</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>KU Leuven</institution>
          ,
          <addr-line>Campus Kulak</addr-line>
          ,
          <institution>Department of Public Health and Primary Care</institution>
          ,
          <addr-line>Etienne Sabbelaan 53, 8500 Kortrijk</addr-line>
          ,
          <country country="BE">Belgium</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <volume>0</volume>
      <fpage>8</fpage>
      <lpage>12</lpage>
      <abstract>
<p>Education, a cornerstone of human development, increasingly leverages digital learning tools, generating valuable data from student interactions. This data can enhance learning efficiency through adaptive and personalized systems, moving beyond the traditional "one-size-fits-all" model. Recommendation systems learn user profiles to suggest relevant items and personalize students' learning experiences. In the context of implicit feedback, binary interactions are used in weak-label learning, where negative label annotations are unreliable. This paper proposes a weak-label learning method for recommending learning materials and trajectories, combining local and global Random Forests in a multi-step collaborative filtering process. The proposed approach is named PentaForest, and outperforms other popular collaborative filtering methods in terms of NDCG and recall.</p>
      </abstract>
      <kwd-group>
        <kwd>Random Forest</kwd>
        <kwd>weak-label learning</kwd>
        <kwd>K-12 education</kwd>
        <kwd>educational recommendation</kwd>
        <kwd>collaborative filtering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        One of the most important aspects of human development and sustainable development goals
is education. In the age of digitization, the use of digital learning tools has become more
popular among students, either as a primary means of completing learning assignments or as an
auxiliary learning environment. This facilitates the creation of data based on learners’ behaviors
and interactions with these learning platforms. Such data can be used to provide data-driven
interventions, making learning more efficient and effective. There is a growing interest in
adaptive and personalized learning systems due to their presumed benefits on cognitive and
non-cognitive learning outcomes [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. Education is continuously moving from the classic
"one-size-fits-all" model to more adaptive and personalized learning approaches. As learners
have different needs and preferences, they can be served accordingly.
      </p>
      <p>Recommendation systems are machine learning methods that learn users’ profiles based on
their previous interactions and recommend items that best fit these profiles. These systems
are generally categorized into two main types: content-based filtering and collaborative
filtering. The former aims to match item features with user profiles, recommending items whose
features best match the profiles, while the latter models user preferences and needs based on
collaborative information between users and items—i.e., their interactions—to recommend items.
User feedback on items is usually implicit, meaning that, in most of the cases, we do not have
explicit ratings from users. In such cases, binary feedback (user interaction with an item) is
the only available signal to learn user profiles. This setting is known as one-class collaborative
filtering and is also referred to as positive-unlabeled (PU) learning or weak-label learning in
machine learning. In this context, the given labels are all positive and missing labels are either
from the negative class (i.e., the user observed the item but deliberately did not interact with
it) or are missing positive labels (i.e., the user did not observe the item, otherwise (s)he would
have interacted with it). Furthermore, because the given labels are based solely on simple
interactions between users and items, they may be unreliable and weak positive labels. In this
paper, we propose a weak-label learning method to recommend learning materials and learning
trajectories to students based on their past interactions within the system. We combine local and
global Random Forests in a multi-step procedure for collaborative filtering, utilizing self-learned
label probabilities to address label unreliability. The resulting procedure is named PentaForest.</p>
      <p>The following sections are organized as follows. Section 2 provides an overview of related
studies. Section 3 then defines the specific learning problem being approached, and Section 4
presents our method proposal. We present our experimental setup in Section 5, and experimental
results are provided in Section 6. Finally, Section 7 concludes our work and proposes future
research directions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature Review</title>
      <p>
        We propose using ensembles of randomized decision trees, called Random Forests [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], to perform
collaborative filtering in weakly-supervised settings. Decision tree-based methods for
weakly-supervised tasks have not received much attention in recent years [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Two general approaches
are usually proposed: i) impute new labels in a self-supervised manner; or ii) consider the
structure of the feature space when growing each tree. Here, we focus on the first approach,
where the label matrix is either completed to yield a dense representation or new positive
annotations are added before training the final estimator. In this context, Tanha et al. [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
proposed an iterative approach, where the most confident predictions are imputed before
building the next tree. Wang et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] focused specifically on the problem of weak-labels, where
only negative annotations are unreliable. Their proposal is based on a deep forest model, in
which multiple decision forests are trained sequentially and the predicted probabilities at each
layer are appended as new features for the next one. The authors adapted this procedure to
weak-labels by performing label imputation after each layer of the deep forest. We extended
this idea and addressed limitations of the original proposal in a recent work [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In the context
of PU interaction prediction, Pliakos and Vens [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] used Neighborhood-Regularized Logistic
Matrix Factorization to complete the label annotations, converting the binary label matrix to a
dense representation. This representation was then used to build a global multi-output forest
to serve as the final estimator. Global estimators consider both item-related and user-related
information during training [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In contrast, local models take either users or items as input
instances, predicting their interactions as multiple outputs [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Gharahighehi et al. [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] proposed
a two-step approach to address the cold-start problem in a PU learning context. First, the
interaction matrix between users and warm items is reconstructed using SLIM [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Then, an
inductive multi-target regressor is trained on this reconstructed interaction matrix to predict
interactions for new items that enter the system. In the context of Massive Open Online Course
(MOOC) recommendations, Gharahighehi et al. [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] considered censored time-to-event data
(time to dropout from MOOCs) as weak-labels and extended Bayesian Personalized Ranking [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ],
which is a learning-to-rank collaborative filtering approach, to incorporate these weak-labels in
training the model.
      </p>
      <p>
        In the field of recommendation systems, Random Forests were employed by Li et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] as a
dimensionality reduction strategy, preprocessing the data before employing similarity-based
collaborative filtering. Panagiotakis et al. [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], on the other hand, completed the label matrix
with Synthetic Coordinate Recommendations and then trained a local Random Forest as an
item recommender given a user as input. However, in both cases, Random Forests are employed
to explore side-information or context-information on the problem, in contrast to the current
scenario where only interactions are utilized.
      </p>
      <p>We combine multi-output local forests to complete the label matrix, and then leverage the
completed annotations to train a global single output forest. We now define the learning problem
under study (Section 3), and a detailed description of our method is presented in Section 4.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Problem Definition</title>
      <p>Let 𝒰 be a set of users {u_1, u_2, · · · , u_n} and ℐ be a set of items {i_1, i_2, · · · , i_m}. The n × m
binary matrix Y = (y_ui) then represents known interactions between the user u and the item
i. In this matrix, a 1 is assumed to be a confirmed interaction. 0-valued entries, however, can
be either non-occurring relationships or unannotated positive values, which characterizes the
weak-label scenario. Each item is considered a different label and each user is considered a
different sample.</p>
      <p>Our goal is to indicate N new items for each user u in 𝒰, representing the annotations
that are most likely to be missing or the interactions that are most likely to occur in the future.
The set of N indicated items for u is called the recommendations for the user u. To generate
recommendations, we only receive the label matrix Y; no side-information is assumed. Having
only Y characterizes collaborative filtering techniques based on implicit feedback.</p>
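      <p>For concreteness, the binary matrix described above can be assembled from raw access logs as in the following minimal NumPy sketch (the function name and the toy log are illustrative, not from the paper):</p>

```python
import numpy as np

def build_interaction_matrix(interactions, n_users, n_items):
    """Build the binary user-item matrix Y from (user, item) interaction pairs.

    A 1 marks a confirmed interaction; a 0 is ambiguous (a true negative or a
    missing positive), which is what makes the labels weak.
    """
    Y = np.zeros((n_users, n_items), dtype=np.int8)
    for user, item in interactions:
        Y[user, item] = 1  # repeated accesses still yield a single 1
    return Y

# toy example: 3 users, 4 items, one duplicated access
logs = [(0, 1), (0, 2), (1, 2), (2, 0), (2, 3), (0, 1)]
Y = build_interaction_matrix(logs, n_users=3, n_items=4)  # Y.sum() == 5
```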
      <p>In the present setting, users represent students in an online learning platform. For items, two
scenarios are separately explored:
• Learning materials: items are learning activities available at the platform;
• Learning trajectories: items are sets of learning materials. Each trajectory represents a
path of materials manually defined by a teacher to be followed by the students.
We say a user interacted with a learning material if the material was accessed by the user,
regardless of whether the activity was concluded and regardless of the number of
accesses. For learning trajectories, a positive value indicates the trajectory was concluded.
Weak-labels arise from the user not knowing the item, or having not interacted with it yet at
the time that the dataset was built.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Proposed Method: PentaForest</title>
      <p>
        The proposed procedure employs five Random
Forest (RF) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] models to perform collaborative filtering.
Usually, training each RF requires a feature matrix
describing the input samples and a label matrix to be
modeled. In this case, however, we use Y as both the
feature and the label matrix. The algorithm is divided
into three main steps:
      </p>
      <sec id="sec-4-1">
        <title>1. Train primary item and user recommenders</title>
        <p>• Forest 1: In the first step, a RF is trained to predict the probabilities of a given user interacting with each of the items. We call this RF an "item recommender".
• Forest 2: Also in this step, another RF is built to solve the transposed problem: it predicts the probabilities of a given item being interacted with by each of the users. It is called a "user recommender".</p>
      </sec>
      <sec id="sec-4-1b">
        <title>2. Train secondary item and user recommenders</title>
        <p>• Forest 3: In the second step, the probabilities predicted by the item recommender are used as targets for a second user recommender.
• Forest 4: We also train a second item recommender using the probabilities predicted by the user recommender.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3. Train single output predictor</title>
        <p>• Forest 5: In the third step, the probabilities predicted by the two secondary
recommenders are averaged, and the result is used as the target for a final global
single-output RF. Given a user interaction profile and an item interaction profile as
input, this forest outputs the final probability of the corresponding user-item interaction.</p>
        <p>Algorithm 1: PentaForest: Random Forests for Collaborative Filtering
Input: user-item interaction matrix Y
Output: completed matrix Ỹ_final
Step 1 (train primary recommenders): Ỹ_users,1 ← RF_1.fit_predict(Y, Y);  Ỹ_items,1 ← RF_2.fit_predict(Y^T, Y^T)
Step 2 (train secondary recommenders): Ỹ_items,2 ← RF_3.fit_predict(Y^T, Ỹ_users,1^T);  Ỹ_users,2 ← RF_4.fit_predict(Y, Ỹ_items,1^T)
Step 3 (final prediction): Ỹ_avg ← average(Ỹ_users,2, Ỹ_items,2^T);  Ỹ_final ← RF_5.fit_predict(Y, Y^T, Ỹ_avg)
Return Ỹ_final</p>
      </sec>
      <sec id="sec-4-5">
        <p>
          We note that the concatenations in the third step are not performed as such. Instead, the
Bipartite Global Single Output procedure [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] is used to generate the forests more efficiently
from both user and item "feature matrices" directly. In each step, the user feature matrix is
always the original binary label matrix itself ( ), and the transposed label matrix (  ) is taken
as the item feature matrix. We call local the forests that utilize either users or items as input
samples, predicting multiple outputs for each of them [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. The global forest, on the other hand,
considers both at the same time [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. The whole procedure is summarized in Algorithm 1 and
illustrated in Figure 1.
        </p>
        <p>The reason for using multiple steps of reconstruction is to generate several diverse models.
We then encourage the different types of learned information to be exchanged between the
forests, controlling overfitting and improving their ability to generalize. Further discussion is
presented in Section 6. We also provide four reasons for utilizing the original label matrix as
features instead of its completed version:
1. the original (confirmed) positive annotations are prioritized, being used to define the
decision boundaries;
2. the lower cardinality of the binary labels reduces the tendency of one forest to overfit the
predictions of the previous;
3. the lower cardinality also allows for faster training of the forests;
4. relying on the original annotations makes the model more easily interpretable, allowing
us to look only at the final estimator’s structure to gain insights on the learning task.</p>
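        <p>The full pipeline can be sketched with scikit-learn as follows. This is a simplified illustration rather than the exact implementation: fit_predict here returns in-sample predictions, and an explicit user-item concatenation stands in for the more efficient Bipartite Global Single Output construction used for Forest 5:</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def fit_predict(X, T, n_estimators=50, seed=0):
    """Train a multi-output local forest on features X and targets T,
    then return its predictions for the training samples."""
    rf = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
    rf.fit(X, T)
    return rf.predict(X)

def pentaforest(Y, n_estimators=50):
    """Sketch of the three-step PentaForest procedure (Forests 1-5)."""
    Yt = Y.T
    # Step 1: primary item recommender (user rows) and user recommender (item rows)
    Y_users1 = fit_predict(Y, Y, n_estimators)             # Forest 1
    Y_items1 = fit_predict(Yt, Yt, n_estimators)           # Forest 2
    # Step 2: secondary recommenders trained on the other view's predictions
    Y_items2 = fit_predict(Yt, Y_users1.T, n_estimators)   # Forest 3
    Y_users2 = fit_predict(Y, Y_items1.T, n_estimators)    # Forest 4
    # Step 3: average both reconstructions, then fit a global single-output forest
    Y_avg = (Y_users2 + Y_items2.T) / 2.0
    n_users, n_items = Y.shape
    # pair every (user profile, item profile) as one global sample
    X_global = np.hstack([np.repeat(Y, n_items, axis=0),
                          np.tile(Yt, (n_users, 1))])
    rf5 = RandomForestRegressor(n_estimators=n_estimators, random_state=0)
    rf5.fit(X_global, Y_avg.ravel())                       # Forest 5
    return rf5.predict(X_global).reshape(Y.shape)
```

Because every intermediate target lies in [0, 1] and a regression forest predicts averages of its training targets, the final probabilities also stay in [0, 1].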
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Dataset and Experimental Setup</title>
      <p>We used two datasets from an educational K-12 platform in Belgium. The first dataset contains
students’ interactions with learning materials, while the second dataset includes students’
interactions with learning trajectories, each comprising a series of learning materials defined
by teachers. We excluded students and materials with fewer than 10 interactions from the
first dataset and students and tracks with fewer than 5 interactions from the second dataset.
This is done to ensure that a reasonable amount of information remains in the training set
after masking the interactions to be used for scoring. Table 1 provides a description of the
two datasets after preprocessing. To form our training, validation, and test sets, we kept one
interaction per user for the test set, one interaction per user for the validation set, and all
remaining interactions for the training set. This way of splitting ensures that the same set of
users and items appears in the training, test, and validation sets.</p>
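      <p>The splitting scheme above can be sketched as follows (assuming every remaining user has at least three interactions, which the preprocessing thresholds guarantee; names and the fixed seed are illustrative):</p>

```python
import numpy as np

rng = np.random.default_rng(0)

def leave_one_out_split(Y):
    """Per user, mask one interaction for test and one for validation;
    everything else stays in the training matrix."""
    train = Y.copy()
    test = np.zeros_like(Y)
    val = np.zeros_like(Y)
    for u in range(Y.shape[0]):
        pos = np.flatnonzero(Y[u])            # this user's interacted items
        held = rng.choice(pos, size=2, replace=False)
        test[u, held[0]] = 1
        val[u, held[1]] = 1
        train[u, held] = 0
    return train, val, test
```

Since the held-out entries are drawn from each user's existing interactions, the three matrices partition Y exactly.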
      <p>For evaluation, we applied two measures: normalized discounted cumulative gain (NDCG)
(Eq. 1) and recall (Eq. 2).</p>
      <p>NDCG = (1/|𝒰|) Σ_{u ∈ 𝒰} DCG_u / iDCG_u,   where   DCG_u = Σ_{i=1}^{|R_u|} (2^{rel_i} − 1) / log_2(i + 1)   (1)</p>
      <p>Recall = (1/|𝒰|) Σ_{u ∈ 𝒰} |R_u ∩ T_u| / |T_u|   (2)</p>
      <p>where 𝒰 is the set of users, T_u is the set of test items for user u, R_u is the recommendation
list for user u, rel_i is the real rating value of the i-th item in R_u, and iDCG_u is the ideal DCG
value, computed from the ideal recommendation list that one can create for user u based on the
ground truth. The DCG value for each user is therefore normalized with the ideal value (iDCG_u)
to get the NDCG value for that user.</p>
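      <p>Per user, the two measures can be computed as in the following small sketch (binary relevance; function names are illustrative):</p>

```python
import numpy as np

def ndcg_at_k(recommended, test_items, k):
    """NDCG@k for one user: gain (2^rel - 1), log2 position discount."""
    rel = [1.0 if item in test_items else 0.0 for item in recommended[:k]]
    dcg = sum((2.0 ** r - 1.0) / np.log2(pos + 1)
              for pos, r in enumerate(rel, start=1))
    # ideal DCG: all relevant items ranked first
    idcg = sum(1.0 / np.log2(pos + 1)
               for pos in range(1, min(len(test_items), k) + 1))
    return dcg / idcg if idcg else 0.0

def recall_at_k(recommended, test_items, k):
    """Recall@k for one user: hit fraction of the test items."""
    hits = len(set(recommended[:k]).intersection(test_items))
    return hits / len(test_items)
```

With a single held-out item per user, recall@k reduces to the hit rate, and the averaged per-user values give the reported scores.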
      <p>
        As competing methods, we included five approaches:
• UKNN: user-based (UKNN) collaborative filtering (CF) [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] is a memory-based CF
method that imputes missing interactions between users and items based on the interactions
of neighboring users.
• IKNN: item-based (IKNN) collaborative filtering (CF) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is a memory-based CF method
that imputes missing interactions between users and items based on the interactions of
neighboring items.
• WRMF: weighted regularized matrix factorization (WRMF) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] is a model-based CF
method that utilizes the alternating-least-squares optimization algorithm to learn its
parameters.
• EASE: Embarrassingly Shallow Autoencoders (EASE) [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] is a linear collaborative filtering
model for implicit feedback datasets based on shallow autoencoders [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
• MVAE: multi-variational autoencoders (MVAE) [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] is a CF-based recommender system
for implicit feedback, based on variational autoencoders, with the main assumption that
user interactions follow a multinomial distribution.
      </p>
      <p>
        We selected these methods because they demonstrated high performance in collaborative
ifltering tasks, as reported in the award-winning paper by Dacrema et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], which showed
that memory-based approaches (UKNN and IKNN) and MVAE outperform recent complex deep
neural network-based approaches.
      </p>
      <p>As for the hyperparameters of our method, each forest was composed of 1000 trees. The
forests employed bootstrapping to resample the set of input instances, and were grown until
their maximum depth. The remaining hyperparameters were left to their default values in the
scikit-learn package. For instance, the objective criterion was set to the mean squared error,
and no feature sampling was performed.</p>
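      <p>With scikit-learn, this configuration corresponds to something like the following sketch (parameter names follow recent scikit-learn versions and are spelled out here for clarity, even where they match the defaults):</p>

```python
from sklearn.ensemble import RandomForestRegressor

def make_forest(seed=0):
    """Forest configuration used in the experiments: 1000 trees, bootstrap
    resampling, unrestricted depth, squared-error criterion and no feature
    sampling."""
    return RandomForestRegressor(
        n_estimators=1000,          # 1000 trees per forest
        bootstrap=True,             # resample the set of input instances
        max_depth=None,             # grow each tree to its maximum depth
        criterion="squared_error",  # mean squared error objective
        max_features=1.0,           # no feature sampling
        random_state=seed,
    )
```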
    </sec>
    <sec id="sec-6">
      <title>6. Results and Discussion</title>
      <p>The results of applying the proposed method and the competing approaches are reported in
Tables 2 and 3 for the learning material and trajectory recommendation tasks, respectively. The
proposed PentaForest method is shown to clearly outperform the other competing approaches
in both tasks. In terms of recall, PentaForest is especially superior when fewer recommendations
are selected, which represents the most difficult tasks under study (see top 3 in Table 2 and
Table 3). Regarding NDCG, the superiority of our method is evident and consistent across all
scenarios. Again, this suggests that the proposed technique is especially proficient for fewer
recommendations, because NDCG prioritizes the recommendations predicted to be
most likely. As such, NDCG is less sensitive to the number of recommendations we select for
evaluation.</p>
      <p>The promising results corroborate the hypothesized benefits of weak-label techniques in
recommendation contexts. Note that using the label matrix as features for the Random Forests
means that each tree will cluster instances with similar labels. When we use self-learned label
probabilities to train a new forest, we are instructing this forest to cluster labels that are expected
to be similar, even if they are not similar according to the original label annotations. This is what mitigates the
effect of uncertain annotations.</p>
      <p>Furthermore, we argue that our model proficiency is in great part due to the diversity of the
forests employed. This diversity might not be apparent at a first glance, since all the component
models are based on the same Random Forest algorithm. Notwithstanding, notice that the
secondary user recommender is trained on the outputs of the primary item recommender, and
vice-versa. This compels the secondary estimators to learn a representation of the problem
that is different from the primary models. That is, the predictions made based on the user-wise
information now need to be explained from the item information alone, and vice-versa. Similarly,
the representation learned at the final step by the global Random Forest is also different. This
forest will induce a biclustering of the interaction matrix, grouping interactions that are similar
both in terms of user and item profiles. This differs from the local forests used in the first two
steps, that only consider either user-user or item-item label similarity.</p>
      <p>Finally, using a global forest that receives the original binary labels yields a crucial advantage
for the deployment of these systems: it enables acknowledging new interactions, that were
added to the dataset after training the models. This is due to the fact that the final estimator
only needs the user and item interaction profiles to estimate new interaction probabilities.
As such, updated interaction profiles could be provided to obtain updated recommendations
without rebuilding the models. Even interactions between unseen items and unseen users can
be inferred, based on their interactions with the users and items in the training set.</p>
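      <p>As a sketch of this deployment scenario, the fitted global forest (here called rf_global, whose input is assumed to be the concatenation of a user row of Y and an item row of Y^T) can score arbitrary user-item pairs from up-to-date profiles without any retraining:</p>

```python
import numpy as np

def score_pairs(rf_global, user_profiles, item_profiles, pairs):
    """Score (user, item) pairs with the trained global forest.

    user_profiles: array (n_users, n_items), rows of the up-to-date Y
    item_profiles: array (n_items, n_users), rows of the up-to-date Y^T
    pairs: iterable of (user_index, item_index) to score
    """
    X = np.array([np.concatenate([user_profiles[u], item_profiles[i]])
                  for u, i in pairs])
    return rf_global.predict(X)
```

Scoring only a chosen subset of pairs, as here, is also how recommendations can be restricted to a user-specified category of activities.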
      <p>The global forest also allows estimating probabilities only for a subset of items and users.
This can be used, for example, to provide recommendations within a user-specified category of
activities, without generating probabilities for all activities in the training set.</p>
      <p>An educational system could also benefit from the known transparency of forest estimators.
It could, for instance, prioritize items with higher importance when making recommendations,
instead of only relying on the predicted probabilities. This could enable the system to discover
new interests of a user, making recommendations that are not necessarily similar to the user’s
previous interactions, but are useful for profiling the user’s preference.</p>
      <p>In summary, the trained final forest of a PentaForest model can be employed to create highly
adaptive recommendation systems, for a dynamic and personalized learning experience.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>
        In this paper we proposed PentaForest, a new collaborative filtering approach for recommending
learning material and learning trajectories to students in a web-based educational platform.
Our method combines local and global Random Forests in a way that acknowledges the
weakly-supervised nature of our problem. Our results suggest state-of-the-art performance in our
current learning settings, prompting future studies to further investigate decision forests for
recommendation tasks in general. In subsequent work, we would like to explore semi-supervised
split evaluation metrics, since they have been shown to be useful for other PU-learning
scenarios [
        <xref ref-type="bibr" rid="ref16">24, 16</xref>
        ]. We would also like to evaluate the potential of deeper ensembles of decision
forests as recommender systems, adapting deep forests presented by Wang et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], Ilídio et al.
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. In contrast, shallower versions of our estimators might also be an interesting topic of study
in a broader variety of problems, performing more detailed ablation analysis to elucidate the
effectiveness of the reconstruction steps. Furthermore, exploring the explainability of decision
trees could be a valuable method for selecting the most informative activities to recommend
when profiling new users. Finally, the use of side-features is also a topic for further exploration,
as a way of mitigating the cold start problem or possibly improving transductive predictions.
      </p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>The work received funding from the Flemish Government (AI Research Program). The authors
would also like to thank FWO (grant 1235924N).</p>
      <p>[24] A. Alves, P. Ilidio, R. Cerri, Semi-supervised hybrid predictive bi-clustering trees for
drug-target interaction prediction, in: Proceedings of the 38th ACM/SIGAPP Symposium
on Applied Computing, ACM/SIGAPP, Tallinn, 2023, pp. 1163–1170.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gharahighehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Van Schoors</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Topali</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ooge</surname>
          </string-name>
          ,
          <article-title>Adaptive lifelong learning (ALL)</article-title>
          ,
          <source>in: International Conference on Artificial Intelligence in Education</source>
          , Springer,
          <year>2024</year>
          , pp.
          <fpage>452</fpage>
          -
          <lpage>459</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. S.</given-names>
            <surname>Chai</surname>
          </string-name>
          , M. S. Y. Jong, A. Istenic,
          <string-name>
            <given-names>M.</given-names>
            <surname>Spector</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-B.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yuan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>A review of artificial intelligence (ai) in education from 2010 to 2020</article-title>
          ,
          <source>Complexity</source>
          <volume>2021</volume>
          (
          <year>2021</year>
          )
          <fpage>1</fpage>
          -
          <lpage>18</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Aslan</surname>
          </string-name>
          ,
          <article-title>Ai technologies for education: Recent research &amp; future directions</article-title>
          ,
          <source>Computers and Education: Artificial Intelligence</source>
          <volume>2</volume>
          (
          <year>2021</year>
          )
          <fpage>100025</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          ,
          <article-title>Random forests</article-title>
          ,
          <source>Machine Learning</source>
          <volume>45</volume>
          (
          <year>2001</year>
          )
          <fpage>5</fpage>
          -
          <lpage>32</lpage>
          . Publisher: Springer.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>V. G.</given-names>
            <surname>Costa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. E.</given-names>
            <surname>Pedreira</surname>
          </string-name>
          ,
          <article-title>Recent advances in decision trees: an updated survey</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          <volume>56</volume>
          (
          <year>2023</year>
          )
          <fpage>4765</fpage>
          -
          <lpage>4800</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tanha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>van Someren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Afsarmanesh</surname>
          </string-name>
          ,
          <article-title>Semi-supervised self-training for decision tree classifiers</article-title>
          ,
          <source>International Journal of Machine Learning and Cybernetics</source>
          <volume>8</volume>
          (
          <year>2017</year>
          )
          <fpage>355</fpage>
          -
          <lpage>370</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Q.-W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.-F.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <article-title>Learning from weak-label data: a deep forest expedition</article-title>
          ,
          <source>in: Proceedings of the 34th AAAI Conference on Artificial Intelligence</source>
          , New York,
          <year>2020</year>
          , pp.
          <fpage>6251</fpage>
          -
          <lpage>6258</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ilídio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cerri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F. K.</given-names>
            <surname>Nakano</surname>
          </string-name>
          ,
          <article-title>Deep forests with tree-embeddings and label imputation for weak-label learning</article-title>
          ,
          <source>in: Proceedings of the 2024 International Joint Conference on Neural Networks, IJCNN</source>
          ,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Pliakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vens</surname>
          </string-name>
          ,
          <article-title>Drug-target interaction prediction with tree-ensemble learning and output space reconstruction</article-title>
          ,
          <source>BMC Bioinformatics</source>
          <volume>21</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          . Publisher: Springer.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gharahighehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Pliakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vens</surname>
          </string-name>
          ,
          <article-title>Addressing the cold-start problem in collaborative filtering through positive-unlabeled learning and multi-target prediction</article-title>
          ,
          <source>IEEE Access</source>
          <volume>10</volume>
          (
          <year>2022</year>
          )
          <fpage>117189</fpage>
          -
          <lpage>117198</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>X.</given-names>
            <surname>Ning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karypis</surname>
          </string-name>
          ,
          <article-title>SLIM: Sparse linear methods for top-n recommender systems</article-title>
          ,
          <source>in: 2011 IEEE 11th international conference on data mining, IEEE</source>
          ,
          <year>2011</year>
          , pp.
          <fpage>497</fpage>
          -
          <lpage>506</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gharahighehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Venturini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ghinis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Cornillie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vens</surname>
          </string-name>
          ,
          <article-title>Extending bayesian personalized ranking with survival analysis for mooc recommendation</article-title>
          ,
          <source>in: Adjunct Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>56</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>S.</given-names>
            <surname>Rendle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Freudenthaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Gantner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Schmidt-Thieme</surname>
          </string-name>
          ,
          <article-title>BPR: Bayesian personalized ranking from implicit feedback</article-title>
          ,
          <source>arXiv preprint arXiv:1205.2618</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <article-title>A multi-dimensional context-aware recommendation approach based on improved random forest algorithm</article-title>
          ,
          <source>IEEE Access</source>
          <volume>6</volume>
          (
          <year>2018</year>
          )
          <fpage>45071</fpage>
          -
          <lpage>45085</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Panagiotakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Papadakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Fragopoulou</surname>
          </string-name>
          ,
          <article-title>A dual hybrid recommender system based on SCoR and the random forest</article-title>
          ,
          <source>Computer Science and Information Systems</source>
          <volume>18</volume>
          (
          <year>2021</year>
          )
          <fpage>115</fpage>
          -
          <lpage>128</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Ilídio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Alves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cerri</surname>
          </string-name>
          ,
          <article-title>Fast Bipartite Forests for Semi-supervised Interaction Prediction</article-title>
          ,
          <source>in: Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing</source>
          , SAC '24, Association for Computing Machinery, New York, NY, USA,
          <year>2024</year>
          , pp.
          <fpage>979</fpage>
          -
          <lpage>986</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>B.</given-names>
            <surname>Sarwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karypis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Konstan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Riedl</surname>
          </string-name>
          ,
          <article-title>Item-based collaborative filtering recommendation algorithms</article-title>
          ,
          <source>in: Proceedings of the 10th international conference on World Wide Web</source>
          ,
          <year>2001</year>
          , pp.
          <fpage>285</fpage>
          -
          <lpage>295</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lops</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>De Gemmis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Semeraro</surname>
          </string-name>
          ,
          <article-title>Content-based recommender systems: State of the art and trends</article-title>
          ,
          <source>Recommender systems handbook</source>
          (
          <year>2011</year>
          )
          <fpage>73</fpage>
          -
          <lpage>105</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. N.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Lukose</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Scholz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>One-class collaborative filtering</article-title>
          ,
          <source>in: 2008 Eighth IEEE International Conference on Data Mining, IEEE</source>
          ,
          <year>2008</year>
          , pp.
          <fpage>502</fpage>
          -
          <lpage>511</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>H.</given-names>
            <surname>Steck</surname>
          </string-name>
          ,
          <article-title>Embarrassingly shallow autoencoders for sparse data</article-title>
          ,
          <source>in: The World Wide Web Conference</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>3251</fpage>
          -
          <lpage>3257</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>H.-T.</given-names>
            <surname>Cheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Koc</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Harmsen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Shaked</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chandra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Aradhye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Anderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ispir</surname>
          </string-name>
          , et al.,
          <article-title>Wide &amp; deep learning for recommender systems</article-title>
          ,
          <source>in: Proceedings of the 1st workshop on deep learning for recommender systems</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>7</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>D.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. G.</given-names>
            <surname>Krishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Hoffman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Jebara</surname>
          </string-name>
          ,
          <article-title>Variational autoencoders for collaborative filtering</article-title>
          ,
          <source>in: Proceedings of the 2018 world wide web conference</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>689</fpage>
          -
          <lpage>698</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Dacrema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Jannach</surname>
          </string-name>
          ,
          <article-title>Are we really making much progress? a worrying analysis of recent neural recommendation approaches</article-title>
          ,
          <source>in: Proceedings of the 13th ACM conference on recommender systems</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>101</fpage>
          -
          <lpage>109</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>