<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Actionable Explanations for Student Success Prediction Models: A Benchmark Study on the Quality of Counterfactual Methods</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mustafa Cavus</string-name>
          <email>mustafacavus@eskisehir.edu.tr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jakub Kuzilek</string-name>
          <email>jakub.kuzilek@hu-berlin.de</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Eskisehir Technical University, Department of Statistics</institution>
          ,
          <addr-line>Eskisehir</addr-line>
          ,
          <country country="TR">Turkiye</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Humboldt University of Berlin</institution>
          ,
          <addr-line>Unter den Linden 6, Berlin</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>14</volume>
      <issue>2024</issue>
      <abstract>
        <p>Digital transformation in higher education has resulted in a surge of information technology solutions suited to the needs of academia. The massive use of digital technology in education leads to the production of vast amounts of education and learner-related data, enabling advanced data analysis methods to explore and support the learning processes. When focusing on supporting at-risk students, the dominant research focuses on predicting student success. Enabling prediction models to help at-risk students requires both a reliable technical solution and a transparent, explainable solution to build trust among the target learners and educators. Counterfactual explanations (aka counterfactuals) from explainable machine learning tools promise to enable trustworthy explainable models, provided the features are actionable and causal. However, determining the most suitable counterfactual generation method for student success prediction models remains unexplored. This study evaluates standard counterfactual methods: Multi-Objective Counterfactual Explanations, Nearest Instance Counterfactual Explanations, and What-If Counterfactual Explanations. The methods are evaluated using a black-box machine learning model trained on the Open University Learning Analytics Dataset, demonstrating their practical usefulness and suggesting concrete steps for altering model predictions. Our results indicate that the Nearest Instance Counterfactual Explanation method based on the sparsity metric provides the best results regarding several quality criteria. Detailed statistical analysis finds statistically significant differences between all methods except between the Nearest Instance Counterfactual Explanation and the Multi-Objective Counterfactual Explanation methods, which suggests that these methods might be interchangeable in the context of the given dataset.</p>
      </abstract>
      <kwd-group>
        <kwd>counterfactual explanations</kwd>
        <kwd>explainable artificial intelligence</kwd>
        <kwd>contrastive explanations</kwd>
        <kwd>learning analytics</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The pace of digital transformation in higher education
has increased over the past decade. With this increase, the data
generated by learners, lecturers, and educational institutions
have multiplied. This data growth enabled the use of advanced
Data Science methods for analysis within the field of
Learning Analytics [<xref ref-type="bibr" rid="ref1">1</xref>]. With the extensive use of analytical
tools in all areas of human life, concerns about security and
privacy emerged, resulting in new data protection
regulations (e.g., the GDPR in the EU) [<xref ref-type="bibr" rid="ref2">2</xref>]. Consequently, trust in advanced
analytical tools and Machine Learning methods in higher
education has been reduced. To overcome this distrust, a
new approach called Trusted Learning Analytics (TLA) emerged
[<xref ref-type="bibr" rid="ref3">3</xref>]. The TLA approach emphasizes using &#8216;white-box&#8217;
Machine Learning (ML) methods and systems. Within this
focus, Explainable Artificial Intelligence (XAI) methods
play a crucial role because they unlock the potential of
&#8216;black-box&#8217; models for use within TLA systems [<xref ref-type="bibr" rid="ref3">3</xref>].
      </p>
      <p>
        A typical task in Learning Analytics (LA) is the predictive
modelling of learner success, which enables identifying the
learners needing help with their studies [<xref ref-type="bibr" rid="ref4">4</xref>]. The ML model
is trained on historical data collected within the same
educational context. This model is then used as a trigger
for educational interventions to support learners in need (e.g.,
[<xref ref-type="bibr" rid="ref5">5</xref>], [<xref ref-type="bibr" rid="ref6">6</xref>], or [<xref ref-type="bibr" rid="ref7">7</xref>]).
      </p>
      <p>
        In the ML modelling process, black-box models, known
for their high predictive accuracy, are often preferred over
interpretable models [<xref ref-type="bibr" rid="ref10 ref8 ref9">8, 9, 10</xref>]. XAI tools are
primarily used to make such black-box models explainable.
      </p>
      <p>
        The use of counterfactual explanations in LA has been
explored in several studies [<xref ref-type="bibr" rid="ref15 ref16 ref17">15, 16, 17</xref>]. Yet the focus of
these studies is on delivering actionable insights to the
relevant stakeholders; none of them has investigated the quality
of the generated counterfactual explanations. Facing numerous
counterfactual explanations, due to the nature of the underlying
optimization problems, requires selecting those explanations
that fulfil specific criteria beneficial for the stakeholder.
Because of differences in their backgrounds, challenges, and
needs, each learner requires a personalized counterfactual
[<xref ref-type="bibr" rid="ref18">18</xref>]. Thus, there are several desired quality
measures that a counterfactual explanation must satisfy.
      </p>
      <p>
        To explore how a typical ML black-box model trained
for the predictive modelling of student success can be explained
within the frame of TLA, we employed the open-access Open
University Learning Analytics Dataset (OULAD) [<xref ref-type="bibr" rid="ref19">19</xref>] to
answer the following research questions:
      </p>
      <p>RQ1: What is the most appropriate method for generating
the counterfactual explanations?</p>
      <p>RQ2: What is the most relevant quality measure of the
methods for generating counterfactual explanations?</p>
      <p>
        This study compares the quality of different
counterfactual generation methods for students whom the success
prediction model developed on the OULAD predicts to fail. It
is essential in two ways: (1) a missing evaluation
of counterfactual quality can lead to inefficient
explanations, which may compromise their trustworthiness [<xref ref-type="bibr" rid="ref20">20</xref>],
and (2) there is no uniformly better method across domains
[<xref ref-type="bibr" rid="ref21">21</xref>], and this is the first such benchmark in the domain of LA.
      </p>
      <p>The remainder of the paper introduces our approach to
analysing and selecting the most appropriate counterfactual
generation method, followed by the results and their
discussion. Finally, the conclusions are presented.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methods</title>
      <sec id="sec-2-0">
        <title>2.1. Data</title>
        <p>Dataset. We employed the OULAD dataset released by the
Open University, the largest distance learning institution in
the United Kingdom, to analyse counterfactual generation
methods. Typical courses at the OU take approximately
nine months and consist of multiple assignments and a
final exam. The most crucial assignments are Tutor Marked
Assignments (TMAs), which represent milestones in the
course schedule. The dataset contains data about
learners&#8217; demographics, assessment results, and interactions with a
Moodle-like Learning Management System (LMS). For the
analysis, we selected the STEM course FFF and its presentation
2013J, studied by 2283 students. The course contains five
TMAs in weeks 2, 5, 13, 18, and 24. The last TMA was used
as the target variable for model training. Learners can achieve
scores from 0 to 100; we set the threshold for passing to 40
points. The following groups of students were excluded
from the dataset: actively withdrawn students (n = 675)
and students who did not submit all TMAs (n = 500). The
resulting dataset contains the data of 1108 students. It
consists of 14 predictors, of which 6 are categorical variables
encoded numerically. The online interactions of
learners with the LMS (i.e., the &#8216;n_clicks_xy&#8217; variables) have been
computed for the top five most common activity types in
the VLE, and they represent 95% of all student click-stream
data. Table 1 presents the details of the selected variables.</p>
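        <p>To make the cohort construction concrete, the following is a minimal,
hypothetical R sketch of the filtering described above; the column names
(withdrawn, n_submitted_tmas, last_tma_score) are illustrative placeholders,
not the actual OULAD schema.</p>
        <preformat>
library(dplyr)

# Hypothetical cohort filtering; column names are illustrative only.
students = students_raw |>
  filter(!withdrawn,                # drop actively withdrawn students (n = 675)
         n_submitted_tmas == 5) |>  # drop students who did not submit all TMAs (n = 500)
  mutate(target = ifelse(last_tma_score >= 40, "pass", "fail"))
        </preformat>
      </sec>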
      <sec id="sec-2-1">
        <title>2.2. Counterfactual Explanations</title>
        <p>Let X = [x_1, x_2, ..., x_n] be a data matrix of n observations
of p variables, and let y be the response vector. The goal in
predictive modelling is to find a model f : X &#8594; y that minimizes
the expected value of a loss function L. A counterfactual
x&#8242; &#8712; &#8477;^p of an observation x &#8712; &#8477;^p is calculated through an
optimization problem:
x&#8242; = argmin_{x&#8242; &#8712; &#8477;^p} L(f(x&#8242;), y&#8242;) + d(x, x&#8242;)   (1)
where &#8477;^p denotes the p-dimensional real space, L denotes
a loss function that penalizes the deviation of the prediction
f(x&#8242;) from the desired outcome y&#8242;, and d represents a
distance function between the observation and its
counterfactual. A counterfactual explanation can be briefly defined
as the necessary changes in one or more variables to
flip the model prediction. The distance function d controls
the distance between the target observation and the
counterfactual. Figure 1 illustrates a counterfactual generation
example: the value of the variable x_3 must be changed to
x&#8242;_3 to flip the model&#8217;s prediction y to y&#8242;. To illustrate this in
the context of the OULAD dataset, an at-risk student can
pass the course if the student increases assessment results or
the total number of clicks in the discussion forum before the
final exam.</p>
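        <p>As an illustration of Eq. (1), the following minimal R sketch evaluates
the counterfactual objective for a candidate x&#8242; using a squared loss and the
Gower distance; the function name, the choice of loss, and the weighting term
lambda are our own illustrative assumptions, not the implementation used in
this paper.</p>
        <preformat>
library(cluster)  # daisy() provides the Gower distance

# Illustrative evaluation of Eq. (1): L(f(x'), y') + d(x, x').
# `x` and `x_prime` are single-row data frames; `f` is any prediction
# function returning a numeric score.
counterfactual_objective = function(x_prime, x, f, y_target, lambda = 1) {
  pred_loss = (f(x_prime) - y_target)^2                                # L(f(x'), y')
  dist_term = as.numeric(daisy(rbind(x, x_prime), metric = "gower"))   # d(x, x')
  pred_loss + lambda * dist_term
}
        </preformat>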
        <p>
          Counterfactuals aim to minimize the distance between
the target observation and the counterfactual; however,
there are further desired properties for a counterfactual explanation
[<xref ref-type="bibr" rid="ref22 ref23">22, 23</xref>]. Sparsity advocates for a minimal number of
variable alterations, thereby maintaining simplicity.
Minimality focuses on the smallest possible changes in
variable values. Validity is maintained by minimizing the
disparity between the counterfactual instance, denoted
as x&#8242;, and the observation x while ensuring the model
output aligns with the desired label y&#8242;. Proximity denotes
the necessity of a slight divergence between the factual
and counterfactual features. Plausibility mandates that
counterfactual explanations remain realistic and adhere
closely to the underlying data distribution. There are more
than 120 known counterfactual generation methods; see
[<xref ref-type="bibr" rid="ref24">24</xref>] for details. However, we considered three commonly
used counterfactual methods to make comparing the quality
of counterfactuals feasible.
        </p>
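        <p>For intuition, simplified versions of some of these properties can be
written as small scoring functions; the definitions below are our own
simplifications for numeric feature vectors, not the exact metrics computed
by the evaluation package used later.</p>
        <preformat>
# Illustrative, simplified quality measures for a counterfactual x' of x
# (numeric feature vectors); lower is better for each score.
sparsity  = function(x, x_prime) sum(x != x_prime)       # number of changed variables
proximity = function(x, x_prime) sum(abs(x - x_prime))   # L1 distance between x and x'
validity  = function(f, x_prime, y_target) as.integer(f(x_prime) != y_target)  # 0 if valid
        </preformat>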
        <p>
          What-if counterfactual explanations. The what-if method
(WhatIf) finds the observation closest to the observation
x among the other observations in terms of the Gower
distance d, solving the following optimization problem [<xref ref-type="bibr" rid="ref25">25</xref>]:
x&#8242; = argmin_{x&#771; &#8712; X} d(x, x&#771;)   (2)
        </p>
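        <p>A minimal sketch of this idea follows, assuming the candidate pool is
restricted to observed rows already predicted as the desired class; all
function and object names are illustrative.</p>
        <preformat>
library(cluster)

# WhatIf-style search: among observed rows predicted as the desired
# class, return the single row closest to x in Gower distance.
whatif_cf = function(x, X_obs, preds, desired) {
  candidates = X_obs[preds == desired, , drop = FALSE]
  d = as.matrix(daisy(rbind(x, candidates), metric = "gower"))[1, -1]
  candidates[which.min(d), , drop = FALSE]
}
        </preformat>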
        <p>
          Multi-objective counterfactual explanations. The
multi-objective counterfactual explanations method (MOC)
aims to find counterfactuals that jointly satisfy
validity, proximity, sparsity, and plausibility by solving a
multi-objective optimization problem [<xref ref-type="bibr" rid="ref26">26</xref>]:
x&#8242; = argmin_{x&#8242;} [o_1(f&#770;(x&#8242;), y&#8242;), o_2(x, x&#8242;), o_3(x, x&#8242;), o_4(x&#8242;, X)]   (3)
where the objectives o_1, ..., o_4 correspond to the desired
properties validity, proximity, sparsity, and plausibility,
respectively. Thus, it generates valid, proximal, sparse, and
plausible counterfactuals.
        </p>
        <p>
          Nearest instance counterfactual explanations. The
nearest instance counterfactual explanations method (NICE)
finds the observations most similar to the observation in
terms of the Heterogeneous Euclidean-Overlap Metric [<xref ref-type="bibr" rid="ref27">27</xref>].
The NICE method offers two options for its objective
function, based on the properties proximity and sparsity,
and we use it in both of these ways.
        </p>
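        <p>As a rough illustration, the Heterogeneous Euclidean-Overlap Metric
combines a 0/1 overlap for categorical features with a range-normalized
absolute difference for numeric ones; the sketch below is our own simplified
rendering, not the package implementation.</p>
        <preformat>
# Simplified HEOM: range-normalized absolute difference for numeric
# features, 0/1 overlap for categorical ones, combined Euclidean-style.
# `a` and `b` are lists of feature values; `ranges` gives numeric ranges.
heom = function(a, b, ranges) {
  per_feature = mapply(function(ai, bi, ri) {
    if (is.numeric(ai)) abs(ai - bi) / ri else as.numeric(ai != bi)
  }, a, b, ranges)
  sqrt(sum(per_feature^2))
}
        </preformat>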
        <p>
          The WhatIf method generates valid, proximal, and
plausible counterfactuals. It has been shown that the MOC method
generates more counterfactuals than other counterfactual
methods, counterfactuals that are closer to the training data
and require fewer feature changes [<xref ref-type="bibr" rid="ref26">26</xref>]. Moreover, NICE generates
proximal counterfactuals. However, there is no uniformly
better method across datasets from different domains [<xref ref-type="bibr" rid="ref21">21</xref>].
Thus, evaluating the quality of the generated
counterfactuals is necessary, and we conduct the experiments in the
following section.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.3. Experiment design</title>
        <p>
          This study focuses on which method provides the highest
quality counterfactual explanations for the student success
prediction model trained using the OULAD dataset. Thus,
our approach is (1) selecting the most appropriate ML model,
(2) generating the counterfactuals, and (3) producing the
evaluation criteria. Modeling. We used forester [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] for
model selection and hyperparameter optimization. It is an
AutoML tool that adjusts the hyperparameters of tree-based
models using Bayesian optimization. The reason for
using this tool instead of manual modelling is its ability to
make Bayesian optimization highly practical with its
relevant parameters. Additionally, the fact that tree-based
models exhibit lower prediction performance than alternative
complex models in classifying tabular datasets [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] supports
the idea that using this tool does not limit model selection.
The number of optimization rounds bayes_iter is taken
as 5, and the number of trained models random_evals is
taken as 10 in the AutoML tool, respectively. forester
returns 28 models, including decision trees, random forests,
XGBoost, LightGBM, and their fine-tuned versions with
Bayesian optimization and random search in Table 2.
Because the best-performing one is a fine-tuned random forest
model with random search —accuracy 0.900, AUC 0.771, and
F1 0.946— the counterfactuals are generated on it.
        </p>
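        <p>A minimal sketch of this AutoML step is shown below; the bayes_iter
and random_evals values follow the text, while the data object and target
column name are hypothetical placeholders, and the exact forester interface
may differ across package versions.</p>
        <preformat>
library(forester)

# Hedged sketch of the model-selection step; `oulad_data` and the
# target column name are hypothetical placeholders.
output = train(data         = oulad_data,
               y            = "last_tma_pass",
               bayes_iter   = 5,    # number of Bayesian optimization rounds
               random_evals = 10)   # number of randomly evaluated models
        </preformat>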
        <p>
          Counterfactual generation. We used the
counterfactuals package [<xref ref-type="bibr" rid="ref21">21</xref>] to generate the
counterfactual explanations for the at-risk students using the
counterfactual generation methods WhatIf,
proximity-based NICE (NICE_pr), sparsity-based NICE (NICE_sp), and
MOC. The non-actionable variables that are impossible to
change are kept constant, namely gender, disability,
region, age_band, education, imd_band,
num_of_prev_attempts, and cummulative_assessment_results.
The MOC, NICE_pr, NICE_sp, and WhatIf methods generate
191, 39, 19, and 120 counterfactuals, respectively, for the
12 students predicted to fail by the student success prediction
model. It is essential to compare the counterfactual generation
methods in terms of the number of generated counterfactuals
because it shows the diversity of alternative ways to flip the
model decision; a higher number of counterfactuals is better.
The materials for reproducing the experiments
and the dataset are accessible in the following
repository: https://github.com/mcavs/HEXED2024_paper.
        </p>
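        <p>The generation step can be sketched with the counterfactuals package
as follows; the interface shown (an iml Predictor wrapped by WhatIfClassif,
NICEClassif, and MOCClassif objects with a find_counterfactuals() method)
follows the package vignette, but the argument values and the objects model,
train_data, and x_interest are placeholders.</p>
        <preformat>
library(counterfactuals)
library(iml)

# Hedged sketch following the counterfactuals-package vignette;
# `model`, `train_data`, and `x_interest` are placeholders.
predictor = Predictor$new(model, data = train_data, y = "target")

whatif  = WhatIfClassif$new(predictor, n_counterfactuals = 5)
nice_sp = NICEClassif$new(predictor, optimization = "sparsity")
moc     = MOCClassif$new(predictor)

cfs = nice_sp$find_counterfactuals(x_interest,
                                   desired_class = "pass",
                                   desired_prob  = c(0.5, 1))
cfs$data  # candidate counterfactuals for the at-risk student
        </preformat>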
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Results and discussion</title>
      <p>
        The quality metrics minimality, plausibility, proximity,
sparsity, validity are calculated to evaluate the generated
counterfactuals by the methods WhatIf, NICE_pr, NICE_sp, and
MOC. It should be highlighted that the lower values are
better for each metric. Some user studies have shown that
the users prefer to use the counterfactuals, which perform
well on the criteria in [
        <xref ref-type="bibr" rid="ref30 ref31">30, 31</xref>
        ]. Thus, we compared their
qualities in two steps. First, we used the average values and
the standard deviations of these metrics given in Table 3,
and second, we compared the distribution of the results in
Figure 2.
      </p>
      <p>It can be seen that the quality of the counterfactuals is quite good
in terms of proximity, plausibility, and validity. However,
the results are not promising for WhatIf in minimality and
sparsity. This is expected because the WhatIf
method is known to generate valid, proximal, and plausible
counterfactuals. Therefore, we do not recommend using this method
in this domain. On the other hand, the counterfactuals
generated by the NICE method that optimizes based on sparsity
showed better results in sparsity and the other quality metrics
than the one that optimizes based on proximity. There are
differences between NICE_pr and NICE_sp in terms of
minimality and sparsity. NICE_sp shows better performance
because it optimizes based on sparsity, and the metrics
sparsity and minimality are closely related: sparsity refers
to the number of changed variables, while
minimality refers to the smallest possible changes in the variable
values. Therefore, using the NICE_sp method may be
preferred to obtain better-quality explanations in this domain.
Although the MOC method shows results competing with
NICE_sp, it performs worse on average.</p>
      <p>Figure 2 shows the distributions of the quality metrics of
the counterfactuals, providing deeper insights. The WhatIf
method appears to produce explanations that are not
minimal compared to the others. Although NICE_pr was
better than the WhatIf method in this regard, it performed
worse than the other methods. When the methods are
compared in terms of plausibility, WhatIf
is better than the others, but the difference is small. While
the WhatIf method produced less proximal explanations,
the other methods produced proximal explanations at a similar
level. A similar pattern against WhatIf is also
observed for sparsity. As expected, the NICE_sp method
shows the best performance in terms of sparsity.
Surprisingly, no method other than MOC produced non-valid
explanations; this is the most problematic quality feature
for MOC. An intriguing observation is that the quality of the
counterfactuals generated by MOC is better than that of
NICE_pr in terms of proximity, even though the NICE_pr
method aims to create proximal counterfactuals.</p>
      <p>In summary, the qualities of the explanations produced
by the methods compete with each other in terms of both
average and distributional properties, and it is not possible
to say that the NICE_sp method produces the best quality
explanations based on visual outputs alone. Therefore,
using the Kruskal-Wallis test and pairwise Wilcoxon tests,
we statistically test whether the explanations produced by the
methods differ. A Kruskal-Wallis test was performed on the
quality metric values of the four methods (MOC, NICE_pr,
NICE_sp, and WhatIf). The differences between the rank
totals of the methods were significant, &#967;&#178;(24) = 48.823, p &lt; .001.
Post hoc comparisons were conducted using Wilcoxon tests
with a Benjamini-Hochberg adjusted alpha level of .016. The
difference between MOC and NICE_pr was not
statistically significant (p = .115). The other comparisons were
significant. The results of the statistical tests support the
previous findings.</p>
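      <p>In R, this testing procedure can be sketched as follows, assuming a
hypothetical long-format data frame res with one row per generated
counterfactual and columns value (the quality-metric value) and method.</p>
      <preformat>
# Omnibus test across the four methods.
kruskal.test(value ~ method, data = res)

# Post hoc pairwise Wilcoxon tests with Benjamini-Hochberg adjustment.
pairwise.wilcox.test(res$value, res$method, p.adjust.method = "BH")
      </preformat>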
    </sec>
    <sec id="sec-4">
      <title>4. Conclusions</title>
      <p>In this study, we explored the possibilities of using XAI tools
in the frame of TLA research. Our research focused on
deploying counterfactual explanation methods on the
OULAD dataset, containing the demographics, results, and
learner interactions with the LMS, to answer the following
research questions: 1) What is the most appropriate method
for generating the counterfactual explanations? The selection
of the most suitable method depends on the stakeholder
requirements and the educational context. However,
selecting the most appropriate method is generally guided
by evaluating standard counterfactual properties: sparsity,
validity, proximity, and plausibility. The evaluation of our
approach on the OULAD dataset resulted in the finding that
explanations generated using the NICE method based on
sparsity are of higher quality in terms of all considered
metrics than explanations generated through the other methods
(Table 3). 2) What is the most relevant quality measure of
the methods for generating counterfactual explanations? As
mentioned before, selecting a method depends highly on the
educational setting. Yet, it might be defined by the relevant
stakeholder as the most essential criterion chosen from those
used as standard evaluation measures. In addition, the
statistical hypothesis testing results indicate no statistically
significant difference between the Nearest Instance
Counterfactual Explanation and the Multi-Objective Counterfactual
Explanation methods, which indicates the need for
deep validation of the generated counterfactual
explanations for at-risk students to avoid misconceptions. This
suggests that a human-in-the-loop is needed even when
selecting the optimal method in technical validation.
In addition, the counterfactuals provide a simple way to
understand and uncover issues in learners&#8217; learning and
open the path to recommendations for possible educational
interventions. Finally, the study has some limitations. Due
to the focus of the study, data drift was not considered, and
only the most common counterfactual explanation methods
were used. Furthermore, we believe that conducting
qualitative studies, rather than evaluating the explanations solely
on quality metrics, would provide further validation for the
findings.</p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgments</title>
      <p>The work in this paper is supported by the German
Federal Ministry of Education and Research (BMBF), grant no.
16DHBKI045.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>G.</given-names>
            <surname>Siemens</surname>
          </string-name>
          , R. S. d. Baker,
          <article-title>Learning analytics and educational data mining: towards communication and collaboration</article-title>
          ,
          <source>in: Proceedings of the 2nd international conference on learning analytics and knowledge</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>252</fpage>
          -
          <lpage>254</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Hoel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Griffiths</surname>
          </string-name>
          , W. Chen,
          <article-title>The influence of data protection and privacy frameworks on the design of learning analytics systems</article-title>
          ,
          <source>in: Proceedings of the seventh international learning analytics &amp; knowledge conference</source>
          ,
          <year>2017</year>
          , pp.
          <fpage>243</fpage>
          -
          <lpage>252</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Drachsler</surname>
          </string-name>
          ,
          <article-title>Trusted learning analytics</article-title>
          ,
          <source>Universität Hamburg</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Papamitsiou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Economides</surname>
          </string-name>
          ,
          <article-title>Learning analytics and educational data mining in practice: A systematic literature review of empirical evidence</article-title>
          ,
          <source>Journal of Educational Technology &amp; Society</source>
          <volume>17</volume>
          (
          <year>2014</year>
          )
          <fpage>49</fpage>
          -
          <lpage>64</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>K. E.</given-names>
            <surname>Arnold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Pistilli</surname>
          </string-name>
          ,
          <article-title>Course signals at purdue: Using learning analytics to increase student success</article-title>
          ,
          <source>in: Proceedings of the 2nd international conference on learning analytics and knowledge</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>267</fpage>
          -
          <lpage>270</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H.</given-names>
            <surname>Waheed</surname>
          </string-name>
          , S.-U. Hassan,
          <string-name>
            <given-names>N. R.</given-names>
            <surname>Aljohani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hardman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Alelyani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nawaz</surname>
          </string-name>
          ,
          <article-title>Predicting the academic performance of students from vle big data using deep learning models</article-title>
          ,
          <source>Computers in Human behavior 104</source>
          (
          <year>2020</year>
          )
          <fpage>106189</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Adnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Habib</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ashraf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mussadiq</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Raza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bashir</surname>
          </string-name>
          , S. U. Khan,
          <article-title>Predicting at-risk students at different percentages of course length for early intervention using machine learning models</article-title>
          ,
          <source>Ieee Access</source>
          <volume>9</volume>
          (
          <year>2021</year>
          )
          <fpage>7519</fpage>
          -
          <lpage>7539</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R.</given-names>
            <surname>Guidotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Monreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ruggieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Turini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Giannotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pedreschi</surname>
          </string-name>
          ,
          <article-title>A survey of methods for explaining black box models</article-title>
          ,
          <source>ACM Computing Surveys (CSUR) 51</source>
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>42</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>P.</given-names>
            <surname>Biecek</surname>
          </string-name>
          , T. Burzykowski,
          <article-title>Explanatory model analysis: explore, explain, and examine predictive models</article-title>
          ,
          <source>Chapman and Hall/CRC</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Holzinger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Saranti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Biecek</surname>
          </string-name>
          , W. Samek,
          <article-title>Explainable ai methods-a brief overview</article-title>
          , in: International workshop on extending explainable
          <source>AI beyond deep models and classifiers</source>
          , Springer,
          <year>2022</year>
          , pp.
          <fpage>13</fpage>
          -
          <lpage>38</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          ,
          <article-title>Interpretable machine learning</article-title>
          ,
          <source>Lulu.com</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <article-title>Applied Machine Learning Explainability Techniques: Make ML models explainable and trustworthy for practical applications using LIME, SHAP, and more</article-title>
          ,
          <source>Packt Publishing Ltd</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Cavus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Stando</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Biecek</surname>
          </string-name>
          ,
          <article-title>Glocal explanations of expected goal models in soccer</article-title>
          ,
          <source>arXiv preprint arXiv:2308.15559</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Artelt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hammer</surname>
          </string-name>
          ,
          <article-title>On the computation of counterfactual explanations-a survey</article-title>
          ,
          <source>arXiv preprint arXiv:1911.07749</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M.</given-names>
            <surname>Tsiakmaki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Ragos</surname>
          </string-name>
          ,
          <article-title>A case study of interpretable counterfactual explanations for the task of predicting student academic performance</article-title>
          ,
          <source>in: 2021 25th International Conference on Circuits, Systems, Communications, and Computers (CSCC)</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>120</fpage>
          -
          <lpage>125</lpage>
          . doi:10.1109/CSCC53858.2021.00029.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Lv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bai</surname>
          </string-name>
          ,
          <article-title>Visual analytics of potential dropout behavior patterns in online learning based on counterfactual explanation</article-title>
          ,
          <source>Journal of Visualization</source>
          <volume>26</volume>
          (
          <year>2023</year>
          )
          <fpage>723</fpage>
          -
          <lpage>741</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>F.</given-names>
            <surname>Afrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hamilton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Thevathyan</surname>
          </string-name>
          ,
          <article-title>Exploring counterfactual explanations for predicting student success</article-title>
          ,
          <source>in: International Conference on Computational Science</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>413</fpage>
          -
          <lpage>420</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>B. I.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chimedza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Bührmann</surname>
          </string-name>
          ,
          <article-title>Individualized help for at-risk students using model-agnostic and counterfactual explanations</article-title>
          ,
          <source>Education and Information Technologies</source>
          (
          <year>2022</year>
          )
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kuzilek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hlosta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zdrahal</surname>
          </string-name>
          , Open university learning analytics dataset,
          <source>Scientific data 4</source>
          (
          <year>2017</year>
          )
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.</given-names>
            <surname>Artelt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vaquet</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Velioglu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hinder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Brinkrolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schilling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hammer</surname>
          </string-name>
          ,
          <article-title>Evaluating robustness of counterfactual explanations</article-title>
          ,
          <source>in: 2021 IEEE Symposium Series on Computational Intelligence (SSCI)</source>
          , IEEE,
          <year>2021</year>
          , pp.
          <fpage>01</fpage>
          -
          <lpage>09</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hofheinz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Binder</surname>
          </string-name>
          , G. Casalicchio,
          <source>counterfactuals: An R Package for Counterfactual Explanation Methods</source>
          ,
          <year>2023</year>
          .
          <source>R package version 0.1.2.</source>
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>S.</given-names>
            <surname>Wachter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Mittelstadt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <article-title>Counterfactual explanations without opening the black box: Automated decisions and the gdpr</article-title>
          ,
          <source>Harv. JL &amp; Tech. 31</source>
          (
          <year>2017</year>
          )
          <fpage>841</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>A.-H.</given-names>
            <surname>Karimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Barthe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Balle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Valera</surname>
          </string-name>
          ,
          <article-title>Model-agnostic counterfactual explanations for consequential decisions</article-title>
          ,
          <source>in: International conference on artificial intelligence and statistics</source>
          , PMLR,
          <year>2020</year>
          , pp.
          <fpage>895</fpage>
          -
          <lpage>905</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>G.</given-names>
            <surname>Warren</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Keane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gueret</surname>
          </string-name>
          , E. Delaney,
          <article-title>Explaining groups of instances counterfactually for xai: a use case, algorithm and user study for group counterfactuals</article-title>
          ,
          <source>arXiv preprint arXiv:2303.09297</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>J.</given-names>
            <surname>Wexler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pushkarna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bolukbasi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wattenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Viégas</surname>
          </string-name>
          , J. Wilson,
          <article-title>The what-if tool: Interactive probing of machine learning models</article-title>
          ,
          <source>IEEE Transactions on Visualization and Computer Graphics</source>
          <volume>26</volume>
          (
          <year>2019</year>
          )
          <fpage>56</fpage>
          -
          <lpage>65</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dandl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Molnar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Binder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Bischl</surname>
          </string-name>
          ,
          <article-title>Multiobjective counterfactual explanations</article-title>
          ,
          <source>in: International Conference on Parallel Problem Solving from Nature</source>
          , Springer,
          <year>2020</year>
          , pp.
          <fpage>448</fpage>
          -
          <lpage>469</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>D.</given-names>
            <surname>Brughmans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Leyman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Martens</surname>
          </string-name>
          ,
          <article-title>NICE: an algorithm for nearest instance counterfactual explanations</article-title>
          ,
          <source>Data Mining and Knowledge Discovery</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>39</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kozak</surname>
          </string-name>
          , H. Ruczyński,
          <article-title>forester: A novel approach to accessible and interpretable automl for tree-based modeling</article-title>
          ,
          <source>in: AutoML Conference 2023 (ABCD Track)</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>L.</given-names>
            <surname>Grinsztajn</surname>
          </string-name>
          , E. Oyallon, G. Varoquaux,
          <article-title>Why do tree-based models still outperform deep learning on typical tabular data?</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          <volume>35</volume>
          (
          <year>2022</year>
          )
          <fpage>507</fpage>
          -
          <lpage>520</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>N.</given-names>
            <surname>Spreitzer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Haned</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. van der Linden</surname>
          </string-name>
          ,
          <article-title>Evaluating the practicality of counterfactual explanations</article-title>
          ,
          <source>in: Workshop on Trustworthy and Socially Responsible Machine Learning</source>
          ,
          <source>NeurIPS</source>
          <year>2022</year>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>M.</given-names>
            <surname>Förster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Hühn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Klier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kluge</surname>
          </string-name>
          ,
          <article-title>Capturing users' reality: A novel approach to generate coherent counterfactual explanations</article-title>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>