<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Early Warning System that combines Machine Learning and a Rule-Based Approach for the Prediction of Cancer Patients' Unplanned Visits</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>H. F. Witschel</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>E. Laurenzi</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>S. Jüngling</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Y. Kadvany</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>A. Trojan</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>FHNW University of Applied Sciences and Arts Northwestern Switzerland</institution>
          ,
          <addr-line>Riggenbachstrasse 16, CH-4600 Olten</addr-line>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>mobile Health AG</institution>
          ,
          <addr-line>Falkenstrasse 21, CH-8008 Zürich</addr-line>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we present the early results of a hybrid intelligent approach that consists of an interpretable rule-based machine learning model for the prediction of unplanned visits of cancer patients. The approach is contextualized within the area of personalized medicine and will contribute to the development of an early-warning system (EWS) whose goal is to support cancer patients to cope with their daily symptoms remotely, by avoiding as much as possible physician visits. The interpretability of rules makes it possible to involve medical experts in the learning process who can accept, reject or modify rules, e.g. by adding conditions to increase their precision. The results appear to be promising as the discovered rules provide value in the identification of critical situations of patients. Experts also suggested the modification of rules for recommending not only visits to a physician, but also other (less costly) actions, such as increasing the dosage of pain killers - an extension to the EWS that would not have been possible without our hybrid approach. Overall, our first experiments showed how a new form of “dialogue” between the experts and the machine learning algorithm started to emerge.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>In this position paper, we describe ongoing work in the area of personalized medicine. Our
goal is to develop an early-warning system that can help cancer patients get additional medical
insights – above all judge the criticality of their current health status – making them aware of
when exactly contacting or visiting a physician is required.</p>
      <p>We intend to reach this goal by implementing a rule-based warning system, where rules are
constructed via a combination of knowledge engineering, based on the expertise of physicians,
and machine learning applied to patient diary data.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        There is ample work around the automated prediction of medical events and conditions, e.g.
septic shock [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], cardiac arrest [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], suicide attempts [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] or hospital re-admission [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. A survey
of clinical risk prediction approaches in general [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], including a wide variety of approaches
based on data analyses, concludes that machine learning (ML) is generally the most successful
of these approaches.
      </p>
      <p>
        While ML seems to be successful in improving predictions, early-warning systems (EWS)
based on such predictions are not necessarily considered helpful by clinicians [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], e.g. because the
reasons for raised alerts are not understood. The study revealed a lack of trust, caused primarily
by poor transparency of ML models. As argued by Rudin [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], one should not rely on explanation
models applied to black-box ML models, but rather on genuinely human-interpretable models
to create such transparency. Rudin also points out that such interpretable models can be
competitive with e.g. deep learning when meaningful structured features exist and that their
transparency can result in better models through improved insights from the testing phase.
However, other researchers point out that the value of transparent or directly interpretable ML
models might be limited, or even harmful, when human experts try to challenge an ML
model based on a (sometimes overwhelming) explanation [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. Instead, as Ghassemi et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
point out, “these methods are incredibly useful for model troubleshooting and systems audit,
both of which can be used to improve model performance.” That is, there is a strong belief
that studying an ML model can be an inspiration for experts and expert feedback can lead to
improved models.
      </p>
      <p>
        Expert knowledge is frequently used to improve ML models, e.g. by weighting features [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] or
augmenting training data [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Gennatas et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] have shown how to improve a rule-based ML
model by integrating human expertise, namely by studying discrepancies between human and ML assessments
of clinical risk.
      </p>
      <p>
        Our study is based on a similar idea. However, contrary to [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and following the arguments
of Rudin [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], we choose to work with a learning algorithm that produces directly interpretable
rules. With this, a rich set of possibilities for interaction between human and ML emerges. Of
course, other directly human-interpretable models exist that would lend themselves to such
interactions, above all Bayesian Networks, where experts may intervene by estimating prior
probabilities, and which have been successfully used for such purpose [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ]. However, as
pointed out e.g. by Botsas et al. [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], Bayesian approaches address and require knowledge
about (internal) parameters of the ML model whereas rules can be interpreted, formulated and
enhanced by experts based solely on an understanding of input-output associations. This is
why we focus on rule-based models.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Background</title>
      <p>The focus of our early-warning system (EWS) is on cancer patients, undergoing diverse
treatments with sometimes considerable side effects. While some of these side effects are “normal”, we
want to predict actually critical situations, which may be due to side effects or other conditions
in which the patients should quickly consult a physician.</p>
      <sec id="sec-3-1">
        <title>3.1. Data set</title>
        <p>
          Our starting point is a diary kept by cancer patients in the form of a mobile app where they
enter, on a daily basis, their general well-being, current symptoms, treatments applied and
free-text notes. When patients entered the study, their initial diagnosis was captured as free text,
together with their date of birth, sex, primary tumor, diagnosis and therapy start date, as well
as frequency of therapy (from daily to 4-weekly, see also Table 1). For more details regarding
the origin of the data, see [
          <xref ref-type="bibr" rid="ref15">15, 16</xref>
          ]. On the whole, our dataset comprises 16,670 diary entries of
266 patients, most of them suffering from breast cancer.
[Table 1, column “Attribute(s)”: Birth year; Sex; Primary tumor; Wellbeing; Therapy form; Drugs; Symptom strengths; Diagnosis terms; Note terms; Unplanned visit.]
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Task definition</title>
        <p>The data also contains information regarding unplanned visits of these patients to their physician
or to the hospital, which we equate with situations where a warning should have been raised
and which we thus want to learn to predict. The ground truth for training our early-warning
system (EWS) comes from these unplanned visits of patients.</p>
        <p>In our data, each combination of patient and day represents an instance, i.e. a training
example. Based on our knowledge of unplanned visits, we associate a class attribute “unplanned
visit” with each such instance. We set the value of this attribute to “yes” not only on the day the
unplanned visit occurred but also on the three days before – based on the goal of constructing
an early warning system that should foresee problems at least some days ahead. The medical
expert involved in our study estimated that a time horizon of three days should be realistic.</p>
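        <p>The labelling scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the preprocessing actually used in the study: the function name <monospace>label_instances</monospace>, the data layout and the example dates are hypothetical.</p>
        <preformat>
```python
from datetime import date, timedelta


def label_instances(diary_days, visit_days, horizon=3):
    """Label each (patient, day) instance with the class attribute.

    diary_days: iterable of (patient_id, date) pairs, one per diary entry.
    visit_days: set of (patient_id, date) pairs with an unplanned visit.
    horizon: number of days before a visit that are also labelled "yes".
    """
    positive = set()
    for patient, day in visit_days:
        # The visit day itself plus the `horizon` preceding days.
        for offset in range(horizon + 1):
            positive.add((patient, day - timedelta(days=offset)))
    return {
        (patient, day): "yes" if (patient, day) in positive else "no"
        for patient, day in diary_days
    }


# Hypothetical example: patient "p1" keeps a diary for ten days
# and has an unplanned visit on the eighth day.
diary = [("p1", date(2021, 3, d)) for d in range(1, 11)]
visits = {("p1", date(2021, 3, 8))}
labels = label_instances(diary, visits)
```
        </preformat>
        <p>In this sketch, a single unplanned visit yields at most four positive instances (the visit day and the three days before it), provided the patient made a diary entry on each of those days.</p>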
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Attributes</title>
        <sec id="sec-3-3-1">
          <p>All attributes are summarised in Table 1.</p>
          <p>Note that drugs and symptom strengths are encoded via a series of attributes, each
representing the presence or strength on the given day. For encoding symptom strengths on a scale
from 0 to 100, patients receive a guideline with definitions and descriptions of values that have
been carefully developed by oncologists and are based on the Common Terminology Criteria for
Adverse Events (CTCAE<sup>1</sup>).</p>
        </sec>
        <sec id="sec-3-3-2">
          <p><sup>1</sup> https://ctep.cancer.gov/protocolDevelopment/electronic_applications/ctc.htm</p>
          <p>Since patients will only actively enter symptoms and medication that have actually occurred
on a given day, symptom strengths and values for drugs are mostly unavailable. We have chosen
to represent these as missing values instead of assigning a value of 0 because discovered rules
will otherwise contain conditions such as “stomach ache = 0”. While one can envision situations
in which such conditions might be useful, representing the non-presence of symptoms or drugs
by 0s resulted in too many useless rule conditions in our first experiments.</p>
          <p>We have used the free-text attributes “diagnosis” and “note”. The former represents a patient’s
diagnosis details, written by a physician upon entering the study (it has the same value for all
days on which the given patient was part of the study); the latter contains notes (optionally) captured by the
patients on a daily basis. We have vectorised these string attributes using TF-IDF weights;
the resulting attributes were prefixed with “ _” and “_”, respectively.</p>
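          <p>The vectorisation step can be illustrated with a small, self-contained sketch. It assumes whitespace tokenisation and the common weighting tf × log(N / df); the tool and exact TF-IDF variant used in the study are not specified here, and the prefix <monospace>note_</monospace>, the function name and the example notes are hypothetical.</p>
          <preformat>
```python
import math
from collections import Counter


def tfidf_features(texts, prefix):
    """Turn free-text strings into {attribute_name: weight} dictionaries.

    Uses tf * log(N / df), one common TF-IDF variant (an assumption;
    the weighting used in the study may differ).
    """
    tokenised = [text.lower().split() for text in texts]
    n_docs = len(tokenised)
    # Document frequency: in how many texts does each term occur?
    df = Counter(term for tokens in tokenised for term in set(tokens))
    rows = []
    for tokens in tokenised:
        tf = Counter(tokens)
        rows.append({
            prefix + term: count * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return rows


# Hypothetical diary notes; "note_" is an illustrative prefix.
notes = ["strong nausea today", "nausea and fatigue", "feeling fine today"]
features = tfidf_features(notes, prefix="note_")
```
          </preformat>
          <p>Because every term remains a separate, named attribute, a learned rule condition on such an attribute (for instance, a positive weight for the hypothetical attribute <monospace>note_nausea</monospace>) stays directly readable for a medical expert.</p>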
          <p>Although enhancing semantics through e.g. word embeddings might have improved the
results, such approaches were deliberately not used to ensure a maximum of readability of
discovered rules.</p>
          <p>The set of attributes used here is only a first starting point for the analysis.
Advanced feature engineering may, for example, also incorporate the history of wellbeing or
symptom development over time, or consider missing patient entries on previous days. However,
such feature engineering is not the focus of the current study and will be done at a later
stage.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Approach</title>
      <p>While our work is ongoing, we have already gathered insights during a workshop with a
medical expert and while analysing the diary data with a rule-based classifier [17]. In our initial
experiments and discussions, we found that
• expert-defined rules tend to be more generic than ML-provided ones, i.e. ML may be able
to contribute specific patterns that experts do not readily think of;
• it is possible to derive a manageable set of interpretable rules with the above-mentioned
ML algorithm, including textual features from e.g. notes in patients’ diaries. Some of
these rules are shown and discussed in Section 5 below;
• building an early-warning system usually implies working with a heavily imbalanced
data set. This is also true in our case: critical situations are rare, i.e. out of the available
16,670 patient-date combinations, only 166 (1%) represent cases where the patient had an
unplanned visit within the next 3 days, i.e. a situation where a warning should be raised.</p>
      <p>To account for the last of these points, we use cost-sensitive classification [18] to make the
rule learner more sensitive, assigning a far higher cost to false negatives (i.e. unplanned visit
not recommended when it is necessary) than to false positives (i.e. unplanned visit recommended
when it is not necessary). At the same time, we configure the rule learner to generate rules with a
minimum support of 5, such that rules originating from just one unplanned visit of one patient
(which contributes at most four positive instances: the visit day and the three days before it)
are ruled out.</p>
      <p>However, this will still result in a possibly high number of rules that have rather low support
and may not generalize well, requiring human inspection. For rules with medium or low support,
one of the major contributions of the medical expert is thus to judge whether a rule describes a
critical situation that is rare, but valid (i.e. may occur also in other patients and should lead to
raising a warning) or whether the rule is based on peculiarities of the training data that will not
generalize to other patients.</p>
      <p>Based on these findings, we propose the following division of labor between ML and expert:
1. The human expert states a set of rules.
2. We apply the rule learner [17] to generate a second set of rules.
3. Rules from both sets are evaluated based on a cost matrix where false negatives
(FNs) have higher cost than false positives (FPs, false alarms). Rules are ranked by cost.
4. ML rules are inspected by the human expert in the ranked order. The expert can suggest
to drop a rule, but also to modify it, e.g. dropping or adding a condition. Modified rules
will be evaluated and accepted if their cost on the test set is acceptable.
5. A detailed error analysis is performed on each resulting rule, eliciting causes of both FNs
and FPs. We expect that this can result in e.g. creation of new features or re-sampling of
training data.
6. Modifications are made based on insights from step 5 and the entire process is repeated
from step 2 until no further improvement results in step 3. Given that the human expert
might receive new insights in the process, iteration may even start from step 1.</p>
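      <p>Step 3 of the process above can be sketched in a few lines. The sketch assumes that the cost of a single rule is obtained by treating that rule as the only trigger for a warning; the function names, the toy instances and the three rules are invented for illustration and do not come from our data.</p>
      <preformat>
```python
def rule_cost(rule, instances, fp_cost=1, fn_cost=10):
    """Cost of using a single rule as the only warning trigger.

    rule: function mapping an instance (dict) to True if a warning fires.
    instances: list of (instance, label) pairs with label "yes" or "no".
    """
    fp = sum(1 for x, y in instances if rule(x) and y == "no")
    fn = sum(1 for x, y in instances if not rule(x) and y == "yes")
    return fp * fp_cost + fn * fn_cost


def rank_rules(rules, instances, **costs):
    """Return (name, cost) pairs sorted by ascending cost."""
    scored = [(name, rule_cost(rule, instances, **costs))
              for name, rule in rules.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Toy data and rules, invented purely for illustration.
data = [
    ({"wellbeing": 40, "fever": 1}, "yes"),
    ({"wellbeing": 60, "fever": 0}, "yes"),
    ({"wellbeing": 70, "fever": 0}, "no"),
    ({"wellbeing": 90, "fever": 0}, "no"),
]
rules = {
    "expert: fever present": lambda x: x["fever"] == 1,
    "learned: wellbeing at most 60": lambda x: 60 >= x["wellbeing"],
    "learned: wellbeing at most 70": lambda x: 70 >= x["wellbeing"],
}
ranking = rank_rules(rules, data)
```
      </preformat>
      <p>Expert-stated and machine-learned rules share one representation here (a condition over the attributes of an instance), so both can be scored with the same cost matrix and presented to the expert in a single ranked list.</p>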
      <p>While several steps in this process are common in interactive ML, the novelty of our approach
lies in the explicitness and degree of interpretability of the ML model, enabling fine-grained
interventions in steps 4 and 5, where ML and human can exchange knowledge using the same
language (of rule conditions). These interventions will be illustrated in the next sections.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Preliminary Findings</title>
      <p>As a first step towards validating our suggested approach, we performed its first 4 steps. Step
1 of the process was done by means of a workshop where the medical expert was invited to
first look at a number of example cases where unplanned visits had happened. In a second step,
the expert formulated some rules, which are shown in Figure 5. The expert developed these
rules by thinking of the most frequent categories of problems that his patients tend to have, e.g.
lung, cardiac, respiratory, kidney problems, infections or side effects of cancer drugs – see last
column of the table. Interestingly, although the rules were at least in some part inspired by the
examples that we previously discussed, the rules themselves – when applied to the data – do
not match any of the situations where an unplanned visit occurred within 3 days. This may
indicate several issues, e.g. that there might be critical situations in the data that did not lead to
an unplanned visit – but where raising a warning might have nevertheless been beneficial. On
the other hand, it may also indicate how hard it can be to formulate rules that are capable of
predicting unplanned visits.</p>
      <p>In the second step of our process, we applied the rule learner to our data set. We applied
cost-sensitive classification with a range of different cost matrices. The results we report here
were obtained by assigning a cost of 1 to false positive predictions (“false alarms”) and a cost of
10 to false negatives, i.e. critical situations that are not recognised. This approach was confirmed
by our medical experts who stated that having several more false alarms is acceptable when one
can discover additional critical situations and thus alleviate patients’ problems more effectively.</p>
      <p>While using the same cost for both types of errors (i.e. a ratio of 1:1) did not produce any
rules – i.e. the rule learner would always predict “no” – the number and properties of rules
were rather similar when using e.g. a ratio of 1:20 instead of 1:10.</p>
      <p>We then computed the cost of each single rule, as well as its precision, recall (which is, as expected,
small for each individual rule) and F-measure. Following step 3 of our process, we then sorted
the rules by cost. Figure 5 shows the 6 rules with lowest cost.</p>
      <sec id="sec-5-1">
        <title>5.1. Performance of machine-learned rule set</title>
        <p>The results in Figure 5 were obtained by learning rules from and applying them to the entire
training set. To get an impression of the performance of the corresponding model, we
additionally performed a 10-fold cross-validation. The confusion matrix obtained in this evaluation is
shown in the lower left corner of Figure 5.1.</p>
        <p>Overall, our machine-learned rule set discovered 47 out of the 166 critical situations (28.3%
recall), while generating 263 false alarms (15.2% precision). Applying the 1:10 cost matrix to
the confusion matrix yields a cost of 1453 for our rule set (263 false positives plus 10 × 119 false negatives), compared to a cost of
1660 for the baseline; this value corresponds to never raising a warning (10 × 166 false negatives). When we use a ratio of 1:20 for the cost-sensitive rule learner,
the model recognises 54 critical situations (i.e. 7 more than with the 1:10 model), but at a
cost of an extra 104 false alarms, i.e. 367 false positives overall.</p>
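        <p>The figures above can be re-derived from the confusion-matrix counts. The following sketch only reproduces the arithmetic; the helper name <monospace>summarise</monospace> is hypothetical, and the baseline is assumed to be a model that never raises a warning, which is consistent with the reported cost of 1660.</p>
        <preformat>
```python
def summarise(tp, fp, fn, fp_cost=1, fn_cost=10):
    """Precision, recall and total cost from confusion-matrix counts."""
    predicted = tp + fp
    actual = tp + fn
    return {
        "precision": tp / predicted if predicted else 0.0,
        "recall": tp / actual if actual else 0.0,
        "cost": fp * fp_cost + fn * fn_cost,
    }


POSITIVES = 166  # critical patient-day instances in the data set

# Rule set learned with the 1:10 cost matrix: 47 hits, 263 false alarms.
model = summarise(tp=47, fp=263, fn=POSITIVES - 47)
# Assumed baseline that never raises a warning: every critical case is missed.
baseline = summarise(tp=0, fp=0, fn=POSITIVES)

print(round(model["precision"], 3), round(model["recall"], 3))  # 0.152 0.283
print(model["cost"], baseline["cost"])                          # 1453 1660
```
        </preformat>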
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Interpretation and modification by experts</title>
        <p>Finally, we performed step 4 of our process by discussing the 12 best rules (according to their
cost) with two medical experts. The findings of our discussion can be summarised as follows:
• Out of the 12 rules, 3 were accepted in their original version.
• Another 3 rules were rejected because the symptoms involved were not deemed critical and,
in some cases, because their interpretation was unclear.
• The remaining 6 rules were accepted with modifications. In 2 cases, these modifications
were additional conditions (e.g. additional symptoms that would make a situation truly
critical). In both cases, the additional conditions involved new attributes related to the
trend, i.e. they will require constructing new features that take
the historical development of symptom strengths into account. In the remaining 4 cases,
we discovered an interesting new insight: initially, we built rules that predict unplanned
patient visits to their physician or a hospital. However, the medical experts suggested
that sometimes rules do predict a situation that requires action, but not necessarily a
visit. Thus, a different kind of alert could be raised, advising a patient or the home care
institution to e.g. increase the dose of pain killers – instead of seeing a doctor. It could
also advise patients to further monitor certain symptoms and only contact the doctor
when they get worse. This is an adjustment to the system that would not be possible
when working with black-box machine learning models.</p>
        <p>To illustrate these two types of modifications (additional conditions and a different kind of alert), let us look more
closely at the first two rules in Figure 5:
• The first rule suggests that an unplanned visit will occur if the term “clip marker” appears
in the diagnosis details of a patient and if her wellbeing drops below a level of 75 (which
is still relatively high). Our medical experts explained that breast
cancer patients receive neoadjuvant chemotherapy before surgery, during which the
tumor shrinks (which is why its position is marked with a “clip marker” – i.e. that term
correlates with a specific treatment). The chemotherapy impacts the wellbeing negatively.
Since this is to be expected, the rule was judged as only partially useful. However, the
experts remarked that a warning should be raised if two additional conditions apply,
namely when a) the trend of wellbeing is negative over several days and when b) nausea
or fatigue appear as accompanying symptoms. This serves as an example of
human-recommended additional conditions in ML-discovered rules.
• The second rule recommends raising a warning when the drug “Endoxan” is taken by a
patient, the term “lymphangiosis” appears in the diagnosis and the patient’s wellbeing
drops below 47. The experts identified this as a situation of palliative care. Again, a
visit to the physician did not seem necessary. However, raising an alert may make sense,
indicating – e.g. to the home care service – that measures to alleviate the suffering
and ensure a higher wellbeing should be intensified. This is an example of a newly discovered type of alert.</p>
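        <p>To make the first kind of modification concrete, the sketch below encodes the first rule and its expert-refined variant as simple predicates. It is an illustration only: the entry layout, the function names, the choice of three preceding days for “several days” and the requirement of a strictly falling wellbeing are assumptions, not decisions taken by the experts.</p>
        <preformat>
```python
def learned_rule(entry):
    """ML-discovered rule: 'clip marker' in diagnosis and wellbeing below 75."""
    return "clip marker" in entry["diagnosis"] and 75 > entry["wellbeing"]


def expert_rule(entry, history, trend_days=3):
    """Learned rule plus the two conditions suggested by the experts.

    history: wellbeing values of the preceding days, oldest first.
    trend_days: how many preceding days must show a decline
                (an assumption; the experts only said "several days").
    """
    recent = history[-trend_days:] + [entry["wellbeing"]]
    # Condition a): wellbeing fell strictly from each day to the next.
    declining = (len(recent) == trend_days + 1 and
                 all(a > b for a, b in zip(recent, recent[1:])))
    # Condition b): nausea or fatigue was reported on this day.
    # Symptoms that were not entered are simply absent (missing values).
    symptoms = entry.get("symptoms", {})
    accompanying = "nausea" in symptoms or "fatigue" in symptoms
    return learned_rule(entry) and declining and accompanying


# Hypothetical diary entry of one patient on one day.
entry = {
    "diagnosis": "breast cancer, clip marker placed",
    "wellbeing": 60,
    "symptoms": {"nausea": 40},
}
warn_falling = expert_rule(entry, history=[80, 72, 66])
warn_stable = expert_rule(entry, history=[60, 72, 66])
```
        </preformat>
        <p>In this example the original rule fires in both cases, whereas the refined rule raises a warning only for the steadily falling wellbeing history. Note that condition a) relies on a history feature that is not yet part of our attribute set.</p>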
        <p>In summary, we can see that a) there is hope that a machine learning approach to the
discovery of rules may provide value and that a corresponding model will be able to discover
several critical situations, that b) the utility of a machine-learned rule set will be limited because
increasing its coverage (recall) is possible, but comes at the cost of lower precision, i.e. more false
alarms and that c) the analysis of rules by medical experts not only results in rule modifications
and suggestions for feature engineering, but also – in our case – in specific types of actions
entailed by predictions that allow for more fine-grained recommendations to be made by the
Early-Warning System.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>In this paper, we presented our work-in-progress toward the development of an Early-Warning
System (EWS) that aims to detect critical situations for cancer patients, based on a diary of
symptoms. For this, we developed a hybrid intelligent approach that takes the form of an
interpretable rule-based ML model. On the one hand, rules are learned based on the correlation
between symptoms and unplanned visits. On the other hand, the rules are checked and improved
or discarded by medical experts. Results are promising: the discussions with medical experts
have shown that ML-based rule discovery makes experts think of more specific contexts than
when prompted for rules in a general way. Several of the discovered rules are not only capable
of predicting unplanned visits in our data, but were also accepted by experts as being generally
valid. Other rules had to be modified by adding further conditions. The most interesting
discovery was that some rules were considered not to predict truly critical situations, but
nevertheless useful to trigger the recommendation of certain actions other than a visit to the
physician or hospital. This shows how a “knowledge exchange” between humans and machine
can lead to a better overall understanding of how an EWS can be optimised.</p>
      <p>Next steps comprise advanced feature engineering in which we will e.g. consider the
history of well-being, symptom development over time, and missing patient entries of previous
days. It will also be interesting to investigate the effect of further knowledge engineering
activities, such as grouping symptoms in a meaningful way and including the resulting symptom
categories as new features.</p>
      <p>systemic therapy: Prospective, multicenter, observational clinical trial, Journal of Medical
Internet Research 23 (2021) e29271.</p>
      <p>[16] A. Trojan, B. Bättig, M. Mannhart, B. Seifert, M. N. Brauchbar, M. Egbring, et al., Effect
of collaborative review of electronic patient-reported outcomes for shared reporting in
breast cancer patients: descriptive comparative study, JMIR cancer 7 (2021) e26950.</p>
      <p>[17] W. W. Cohen, Repeated incremental pruning to produce error reduction, in: Machine
Learning Proceedings of the Twelfth International Conference ML95, 1995.</p>
      <p>[18] P. Domingos, Metacost: A general method for making classifiers cost-sensitive, in:
Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery
and data mining, 1999, pp. 155–164.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Ginestra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Giannini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. D.</given-names>
            <surname>Schweickert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Meadows</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Lynch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Pavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. J.</given-names>
            <surname>Chivers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Draugelis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. J.</given-names>
            <surname>Donnelly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. D.</given-names>
            <surname>Fuchs</surname>
          </string-name>
          , et al.,
          <article-title>Clinician perception of a machine learning-based early warning system designed to predict severe sepsis and septic shock</article-title>
          ,
          <source>Critical care medicine</source>
          <volume>47</volume>
          (
          <year>2019</year>
          )
          <fpage>1477</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chae</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-W.</given-names>
            <surname>Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.-J.</given-names>
            <surname>Cho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Machine learning-based cardiac arrest prediction for early warning system</article-title>
          ,
          <source>Mathematics</source>
          <volume>10</volume>
          (
          <year>2022</year>
          )
          <fpage>2049</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Xia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Sabo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Markovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Stearns</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kanov</surname>
          </string-name>
          , et al.,
          <article-title>Development of an early-warning system for high-risk patients for suicide attempt using deep learning and electronic health records</article-title>
          ,
          <source>Translational psychiatry</source>
          <volume>10</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Lodhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ansari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Keenan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Wilkie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. A.</given-names>
            <surname>Khokhar</surname>
          </string-name>
          ,
          <article-title>Predicting hospital re-admissions from nursing care data of hospitalized patients</article-title>
          ,
          <source>in: Industrial Conference on Data Mining</source>
          , Springer,
          <year>2017</year>
          , pp.
          <fpage>181</fpage>
          -
          <lpage>193</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L. M.</given-names>
            <surname>Bull</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lunt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. P.</given-names>
            <surname>Martin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Hyrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Sergeant</surname>
          </string-name>
          ,
          <article-title>Harnessing repeated measurements of predictor variables for clinical risk prediction: a review of existing methods</article-title>
          ,
          <source>Diagnostic and prognostic research</source>
          <volume>4</volume>
          (
          <year>2020</year>
          )
          <fpage>1</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          ,
          <article-title>Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</article-title>
          ,
          <source>Nature Machine Intelligence</source>
          <volume>1</volume>
          (
          <year>2019</year>
          )
          <fpage>206</fpage>
          -
          <lpage>215</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ghassemi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Oakden-Rayner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. L.</given-names>
            <surname>Beam</surname>
          </string-name>
          ,
          <article-title>The false hope of current approaches to explainable artificial intelligence in health care</article-title>
          ,
          <source>The Lancet Digital Health</source>
          <volume>3</volume>
          (
          <year>2021</year>
          )
          <fpage>e745</fpage>
          -
          <lpage>e750</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Poursabzi-Sangdeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. G.</given-names>
            <surname>Goldstein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Hofman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. W.</given-names>
            <surname>Wortman Vaughan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <article-title>Manipulating and measuring model interpretability</article-title>
          ,
          <source>in: Proceedings of the 2021 CHI conference on human factors in computing systems</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>52</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Shi</surname>
          </string-name>
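          ,
          <string-name>
            <given-names>S.-y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.-m.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <article-title>Credit scoring by feature-weighted support vector machines</article-title>
          ,
          <source>Journal of Zhejiang University SCIENCE C</source>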
          <volume>14</volume>
          (
          <year>2013</year>
          )
          <fpage>197</fpage>
          -
          <lpage>204</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mollaysa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kalousis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bruno</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Diephuis</surname>
          </string-name>
          ,
          <article-title>Learning to augment with feature side-information</article-title>
          ,
          <source>in: Asian Conference on Machine Learning, PMLR</source>
          ,
          <year>2019</year>
          , pp.
          <fpage>173</fpage>
          -
          <lpage>187</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E. D.</given-names>
            <surname>Gennatas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. H.</given-names>
            <surname>Friedman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. H.</given-names>
            <surname>Ungar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Pirracchio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Eaton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Reichmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Interian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Luna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. B.</given-names>
            <surname>Simone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Auerbach</surname>
          </string-name>
          , et al.,
          <article-title>Expert-augmented machine learning</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences</source>
          <volume>117</volume>
          (
          <year>2020</year>
          )
          <fpage>4571</fpage>
          -
          <lpage>4577</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Flores</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nicholson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Brunskill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Korb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mascaro</surname>
          </string-name>
          ,
          <article-title>Incorporating expert knowledge when learning Bayesian network structure: Heart failure as a case study</article-title>
          ,
          <source>Technical Report 2010/3</source>
          , Bayesian Intelligence,
          <year>2010</year>
          , http://dx. doi. org/10 . . .
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Deng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rainey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <article-title>Integrating machine learning with human knowledge</article-title>
          ,
          <source>Iscience</source>
          <volume>23</volume>
          (
          <year>2020</year>
          )
          <fpage>101656</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Botsas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. R.</given-names>
            <surname>Mason</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. K.</given-names>
            <surname>Matar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>Rule-based evolutionary bayesian learning</article-title>
          ,
          <source>arXiv preprint arXiv:2202.13778</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>A.</given-names>
            <surname>Trojan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Leuthold</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Thomssen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rody</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Winder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Jakob</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Egger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Held</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jackisch</surname>
          </string-name>
          ,
          <article-title>The effect of collaborative reviews of electronic patient-reported outcomes on the congruence of patient- and clinician-reported toxicity in cancer patients receiving</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>