<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Large language models versus static word embeddings to predict Acute Kidney Injury in the Intensive Care Unit: Does context matter?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rick van Slobbe</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Drahomira Herrmannova</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Elia S. Lima-Walton</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ameen Abu-Hanna</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Iacopo Vagliano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Amsterdam Public Health Research Institute</institution>
          ,
          <addr-line>Amsterdam</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Amsterdam UMC, University of Amsterdam</institution>
          ,
          <addr-line>Amsterdam</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Elsevier B.V.</institution>
          ,
          <addr-line>Amsterdam</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Vrije Universiteit Amsterdam</institution>
          ,
          <addr-line>De Boelelaan 1105, 1081 HV Amsterdam</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <author-notes>
        <corresp>Corresponding authors: a.abu-hanna@amsterdamumc.nl (A. Abu-Hanna); i.vagliano@amsterdamumc.nl (I. Vagliano)</corresp>
      </author-notes>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>We investigated the effect of representing clinical notes using contextualized and static word embeddings. We focus on the early prediction of Acute Kidney Injury (AKI) in the intensive care unit (ICU). We also studied the impact of combining clinical notes with clinical variables. We developed six models based on convolutional neural networks. Our models achieved good predictive performance (AUROC 0.77–0.89). Surprisingly, contextualized and static word embeddings yielded similar performance in all our experiments. Combining text with clinical variables improved the results. Our models can support clinicians in promptly recognizing and treating patients with deteriorating AKI, thereby improving patient outcomes in the ICU.</p>
      </abstract>
      <kwd-group>
        <kwd>Acute Kidney Injury</kwd>
        <kwd>Contextualized Word Embeddings</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Natural Language Processing</kwd>
        <kwd>Clinical Prediction Models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Increased monitoring of intensive care unit (ICU) patients holds great potential for the early prediction
of medical outcomes. Deep learning models have emerged to predict clinical outcomes based on
high-dimensional data. In addition to collecting clinical variables’ measurements, physicians also
write free-text clinical notes during an ICU stay. Clinical notes contain information not recorded
via clinical variables, such as previous diagnoses, medication treatments, and disease progression.
Traditionally, clinical text has been analyzed using static word embeddings, such as GloVe [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. More
recently, contextualized word embeddings have been developed, also known as large language models
(LLMs), such as the Bidirectional Encoder Representations from Transformers (BERT).
      </p>
      <p>
        Acute kidney injury (AKI) is a sudden reduction in kidney function, measured by increased serum
creatinine (SCr) or decreased urine output. AKI is common in the ICU, where it can occur in up to 30%
of the stays [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Convolutional neural networks (CNNs) proved effective at predicting AKI [
        <xref ref-type="bibr" rid="ref3 ref4 ref5">3, 4, 5</xref>
        ].
      </p>
      <p>AKI is a particularly relevant case. According to widespread clinical guidelines, AKI is defined by SCr
and/or urine output [6]. However, using these variables as predictors in a prediction model is a form of
incorporation bias and can potentially lead to inflated performance. Furthermore, SCr tests are typically
ordered when clinicians already suspect a problem related to the kidneys, while urine output is not
commonly measured outside the ICU. In the context of early diagnosis, the model is most useful if it
can signal the risk of AKI to clinicians in a timely manner, before they develop their own suspicions.
Thus, a better evaluation of AKI models would also assess performance without incorporating SCr and
urine output. When such variables are not included, the models would profit most from relying on
clinical notes rather than structured clinical variables, and/or from combining clinical notes with clinical
variables. To the best of our knowledge, an in-depth analysis of AKI models under these conditions has
not yet been conducted.</p>
      <p>To address all these problems and conduct such an analysis, we focus on predicting whether a patient
has AKI or will develop AKI during the ICU stay. The prediction occurs after the first 48 hours of a
stay. More specifically, we address the following research question: What effect do contextualized word
embeddings have on the predictive performance, compared to static word embeddings, when predicting AKI
within the ICU stay based on the first 48 hours of the stay?</p>
      <p>To answer this question, we develop six CNN models to process individual and multiple data modalities
(unstructured clinical notes and structured clinical variables) for predicting AKI within the ICU stay.
We compare static and contextualized embeddings using GloVe and BERT, respectively. In a previous
work, we investigated multi-modal models to predict AKI [7]. This study provides the first in-depth
comparison between models using different word representations. Despite LLMs’ potential, smaller
embeddings might be a promising alternative in clinical settings due to their adaptability, efficiency, and
lower resource demands, especially in resource-constrained environments and/or when the performance
gap is small. We also both include and exclude SCr and urine output.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>
        Various studies on AKI prediction in the ICU using machine learning have been published [8]. Most
used only clinical variables as input for the proposed models. Sato et al. proposed a CNN architecture
based on one-dimensional convolutional filters [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] using structured clinical data from the first 48 hours
of the ICU stay. Vagliano et al. developed and externally validated interpretable and continuous models
for AKI in the ICU [9, 10].
      </p>
      <p>
        Other studies have leveraged unstructured clinical notes. Le et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] developed a CNN that used
data from the first 53 hours of the ICU stay to predict AKI up to 48 hours before onset. Li et al. proposed
multiple models to predict AKI based on clinical notes from the first 24 hours of the ICU stay [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Some studies also incorporated domain knowledge into their text representations. Vagliano et al. [11]
compared using only clinical variables versus clinical variables and notes for predicting AKI. Notes
were also enhanced with external knowledge from the Unified Medical Language System (UMLS) and
SNOMED CT knowledge graphs. Brancato et al. [12] further extended the model proposed in [11] to
handle clinical notes enriched with multi-word concepts (in contrast to the single-word concepts used
before).</p>
      <p>The studies listed above relied on different tasks (e.g., predicting AKI after the first 24 vs. after the
first 48 hours), preprocessing, and internal validation strategies (e.g., splitting into training, validation,
and test sets versus cross-validation versus bootstrapping, different splits, folds, etc.). Given this
methodological variation, their performances are not directly comparable even when using the same
data. In contrast, we provide directly comparable results for various modalities. We do so both when
including and excluding SCr and urine output, while previous studies included them. To the best of our
knowledge, such an in-depth comparison for the prediction of AKI has not been performed.</p>
      <p>Recently, among LLMs, transformer architectures have become increasingly popular for many general
text classification tasks, with substantial performance increases compared to general machine learning
methods. Mao et al. and Li et al. investigated the efficacy of using BERT in the disease-specific domain
of early AKI prediction [13, 14]. To the best of our knowledge, (1) there is currently no research that
compared static and contextualized embeddings of clinical notes for the prediction of AKI, and (2) no
previous work has proposed a multi-modal model based on BERT to predict AKI.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Data, patient cohort and outcome definition</title>
        <p>The data used to develop the model was from the MIMIC-III database [15], which holds de-identified
data of ICU stays in a US hospital between 2001 and 2012. Data includes vital sign measurements,
laboratory results, and clinical notes. We included patients older than 18 years whose stays contained
at least one measurement of SCr or urine output. After applying the inclusion criteria, 44,303 stays
of 32,664 unique patients were selected. The patients’ characteristics can be found in Table 1. The
outcome was whether patients developed AKI during their stay in the ICU. AKI was defined via the
KDIGO guidelines [6].</p>
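        <p>To make the labeling concrete, the following is a minimal sketch of the KDIGO SCr criteria in Python; the column names, the simplified baseline (first measured value), and the omission of the urine output criterion are our assumptions, not the exact implementation used in this study.</p>
        <preformat><![CDATA[
# Minimal sketch (an assumption, not the study's exact code) of the KDIGO
# serum creatinine criteria; the urine output criterion is omitted and the
# baseline is simplified to the first measured value.
import pandas as pd

def kdigo_scr_label(scr: pd.DataFrame) -> bool:
    """scr: one stay's SCr values, columns ['charttime' (datetime), 'value' (mg/dL)]."""
    scr = scr.sort_values("charttime").set_index("charttime")
    baseline = scr["value"].iloc[0]
    # Rise of at least 0.3 mg/dL within any trailing 48-hour window.
    abs_rise = (scr["value"] - scr["value"].rolling("48h").min()).max() >= 0.3
    # Rise to at least 1.5 times the baseline value.
    rel_rise = (scr["value"] >= 1.5 * baseline).any()
    return bool(abs_rise or rel_rise)
]]></preformat>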
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Text preprocessing and word embeddings</title>
        <p>Clinical notes include clinician progress notes, nursing assessments, radiology reports, and laboratory
results. Discharge summaries were discarded to prevent information leakage from outside the first 48
hours of the stay. Only text from the first 48 hours of the stay was used. Table 2 outlines the descriptive
statistics of the selected clinical notes.</p>
        <p>GloVe embeddings were trained on all selected clinical notes. To ensure a fixed-length input, shorter
clinical notes were padded and longer ones were truncated, keeping only the first 24 words of each
of the first 150 sentences. About 90% of the sentences contained fewer than 24 words, and the vast
majority of the notes contained fewer than 150 sentences. Using more sentences and words did
not improve performance. The resulting vectors represent words in a 100-dimensional vector space,
capturing their semantic relationships. Only words with more than 5 occurrences were kept, to limit
unreliable representations of rare words.</p>
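        <p>A small sketch of this padding and truncation scheme follows; the vocabulary mapping and the reserved padding index 0 are hypothetical.</p>
        <preformat><![CDATA[
# Sketch of the fixed-size note representation: first 24 words of each of the
# first 150 sentences; 'vocab' and the padding index 0 are assumptions.
import numpy as np

MAX_SENTS, MAX_WORDS, PAD_IDX = 150, 24, 0

def note_to_ids(sentences: list[list[str]], vocab: dict[str, int]) -> np.ndarray:
    ids = np.full((MAX_SENTS, MAX_WORDS), PAD_IDX, dtype=np.int64)
    for i, sent in enumerate(sentences[:MAX_SENTS]):  # truncate to 150 sentences
        for j, word in enumerate(sent[:MAX_WORDS]):   # truncate to 24 words
            ids[i, j] = vocab.get(word, PAD_IDX)      # rare/unknown words -> pad
    return ids                                        # shorter notes remain padded
]]></preformat>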
        <p>BERT models require tokenization of words before they are input to the model. This process involves
breaking down a sequence of words into smaller units called tokens, which can be words, subwords, or
characters. For our BERT model, we used at most 1,536 tokens. We considered multiples of 512, as BERT
imposes a maximum input sequence length of 512 tokens. We split the 1,536-token-long clinical notes
into three 512-token chunks. About 85% of notes contained fewer than 1,536 words. Using more tokens
did not improve performance. To ensure a fixed-length input, shorter clinical notes were padded and
longer notes were truncated, keeping the first 1,536 tokens.</p>
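        <p>A sketch of this chunking is given below, assuming the publicly available Clinical BioBERT tokenizer [17]; per-chunk special tokens are omitted for brevity.</p>
        <preformat><![CDATA[
# Sketch of the chunking, assuming the publicly available Clinical BioBERT
# tokenizer; per-chunk special tokens are omitted for brevity.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")

def chunk_note(text: str, n_chunks: int = 3, chunk_len: int = 512) -> list[list[int]]:
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    ids = ids[: n_chunks * chunk_len]                                    # truncate to 1,536 tokens
    ids += [tokenizer.pad_token_id] * (n_chunks * chunk_len - len(ids))  # pad shorter notes
    return [ids[k * chunk_len : (k + 1) * chunk_len] for k in range(n_chunks)]
]]></preformat>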
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Preprocessing of structured data</title>
        <p>We selected the same 35 clinical variables and applied the same preprocessing as in our previous
work [7]. These variables covered a variety of factors, including temporal measurements such as heart
rate and platelet count, as well as categorical information such as sex, ethnicity, and the urgency of
stay in the ICU. The full list of variables can be found in another study that used the same population,
variables and preprocessing [7].</p>
        <p>
          The categorical variables were encoded using the one-hot encoding approach, forming a separate
binary feature for each value of the original categorical feature. Temporal features were resampled to
1-hour intervals with a mean aggregation. Variables with less than 50% missing values were imputed
by carrying forward the last available value. Variables with 50% missing values or more were imputed
using mean imputation. Data was capped at the 1st and 99th percentiles to filter out outliers, and
normalized to the [0, 1] range using min-max scaling: x′ = (x − min(x)) / (max(x) − min(x)), where x is
the original value and x′ is the normalized value. Only selected predictors from the first 48 hours of
each stay were used, independent of whether AKI occurred afterward.
        </p>
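        <p>A condensed sketch of these preprocessing steps in pandas follows; the DatetimeIndex layout and the precomputed per-variable missingness fractions are assumptions.</p>
        <preformat><![CDATA[
# Condensed sketch of the structured-data preprocessing; the DatetimeIndex
# layout and the precomputed 'missing_frac' series are assumptions.
import pandas as pd

def preprocess(df: pd.DataFrame, missing_frac: pd.Series) -> pd.DataFrame:
    df = df.resample("1h").mean()                               # 1-hour bins, mean aggregation
    low_missing = missing_frac[missing_frac < 0.5].index
    df[low_missing] = df[low_missing].ffill()                   # carry forward last value
    df = df.fillna(df.mean())                                   # mean imputation for the rest
    df = df.clip(df.quantile(0.01), df.quantile(0.99), axis=1)  # cap at 1st/99th percentile
    return (df - df.min()) / (df.max() - df.min())              # min-max scaling to [0, 1]
]]></preformat>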
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Model development</title>
        <sec id="sec-3-4-1">
          <title>3.4.1. Models with clinical notes</title>
          <p>To predict AKI using clinical notes, a CNN architecture (GloVe-CNN) was developed (Figure 1) [7].
GloVe-CNN incorporated a GloVe embedding layer to represent words as 100-dimensional vectors. Two
convolutional layers for feature extraction followed this embedding layer. The first layer independently
maps sentences to sentence embeddings, while the second layer combines sentences into a single patient
representation. For both levels, we used max pooling. A dropout layer for regularization follows the
two convolutional layers, and then a classification layer predicts the probability that a patient has or
will develop AKI. GloVe-CNN adopted target replication, which allows the computation of the loss at
the sentence level [16]. Through it, GloVe-CNN effectively learned sentence representations tailored to
the AKI prediction task.</p>
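          <p>A condensed PyTorch sketch of this hierarchical architecture is shown below; the dimensions follow the text, but the wiring is our reconstruction and the target replication loss is omitted.</p>
          <preformat><![CDATA[
# Condensed PyTorch sketch of the hierarchical GloVe-CNN; dimensions follow the
# text, but the wiring is a reconstruction and target replication is omitted.
import torch
import torch.nn as nn

class GloVeCNN(nn.Module):
    def __init__(self, emb_matrix: torch.Tensor, n_filters: int = 50, kernel: int = 3):
        super().__init__()
        self.emb = nn.Embedding.from_pretrained(emb_matrix, freeze=False)  # 100-dim GloVe
        self.word_conv = nn.Conv1d(100, n_filters, kernel)        # words -> sentence embedding
        self.sent_conv = nn.Conv1d(n_filters, n_filters, kernel)  # sentences -> patient repr.
        self.drop = nn.Dropout(0.5)
        self.clf = nn.Linear(n_filters, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, 150 sents, 24 words)
        b, s, w = x.shape
        e = self.emb(x).view(b * s, w, -1).transpose(1, 2)
        sents = torch.relu(self.word_conv(e)).max(dim=2).values       # max pool over words
        sents = sents.view(b, s, -1).transpose(1, 2)
        patient = torch.relu(self.sent_conv(sents)).max(dim=2).values # max pool over sentences
        return torch.sigmoid(self.clf(self.drop(patient)))            # P(has/will develop AKI)
]]></preformat>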
          <p>Clinical BioBERT was fine-tuned to predict AKI [17]. The Clinical BioBERT model followed the
standard BERT architecture [18] and included an additional classification layer. Max pooling was applied
to the 768-dimensional outputs to retain the most important features. A final 768-dimensional vector
was obtained and used to fine-tune the Clinical BioBERT model for our task. We also tried using BERT
embeddings with the same CNN architecture used for GloVe, but it performed worse than the standard
BERT architecture.</p>
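          <p>One plausible reading of this setup, sketched in PyTorch with the transformers library, is given below; pooling over the three chunk-level [CLS] vectors is our assumption.</p>
          <preformat><![CDATA[
# Sketch of one plausible wiring: each 512-token chunk passes through Clinical
# BioBERT, the three 768-dim [CLS] outputs are max-pooled, then classified.
import torch
import torch.nn as nn
from transformers import AutoModel

class BertAKI(nn.Module):
    def __init__(self):
        super().__init__()
        self.bert = AutoModel.from_pretrained("emilyalsentzer/Bio_ClinicalBERT")
        self.clf = nn.Linear(768, 1)

    def forward(self, input_ids, attention_mask):  # both: (batch, 3 chunks, 512)
        b, c, l = input_ids.shape
        out = self.bert(input_ids.view(b * c, l),
                        attention_mask=attention_mask.view(b * c, l))
        cls = out.last_hidden_state[:, 0].view(b, c, -1)  # per-chunk [CLS] vectors
        pooled = cls.max(dim=1).values                    # max pool over the chunks
        return torch.sigmoid(self.clf(pooled))
]]></preformat>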
        </sec>
        <sec id="sec-3-4-2">
          <title>3.4.2. Extraction of clinical variables</title>
          <p>To extract clinical variables and generate a (partial) patient representation from them, a CNN (Var-CNN)
was developed (Figure 2) [7]. Var-CNN employed two-dimensional filters and incorporated three stacked
convolutional layers, with a subsequent 3x3 max-pooling operation following the final convolutional
layer. We relied on two-dimensional filters instead of the one-dimensional filters more typically used
with tabular data. This decision was based on initial tests that revealed that two-dimensional filters
consistently outperformed one-dimensional filters for this task [7].</p>
          <p>The multi-modal network (Figure 3) builds upon the single-modality models with an intermediate
fusion approach. We combined modalities after CNN feature extraction and the first linear layer, or
at the last layer of the BERT architecture, excluding the additional classification layer. Each modality
branch contributed an output representation, which was then concatenated with the others and passed
through a fully connected layer to obtain the final prediction.</p>
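          <p>A minimal sketch of this intermediate fusion follows, assuming each branch already returns a fixed-size representation; the branch modules and their output dimensions are placeholders.</p>
          <preformat><![CDATA[
# Minimal sketch of intermediate fusion; each branch is assumed to return a
# fixed-size representation (e.g., a text model for notes, Var-CNN for variables).
import torch
import torch.nn as nn

class FusionModel(nn.Module):
    def __init__(self, text_branch: nn.Module, var_branch: nn.Module,
                 text_dim: int, var_dim: int):
        super().__init__()
        self.text_branch, self.var_branch = text_branch, var_branch
        self.clf = nn.Linear(text_dim + var_dim, 1)  # fully connected fusion layer

    def forward(self, notes, variables):
        z = torch.cat([self.text_branch(notes), self.var_branch(variables)], dim=1)
        return torch.sigmoid(self.clf(z))            # final AKI probability
]]></preformat>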
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Internal validation</title>
        <p>We split the data into 80%, 10%, and 10% for the training, validation, and test sets, respectively. Both
the GloVe-CNN and Clinical BioBERT were evaluated when using only clinical notes and when using
clinical notes in combination with clinical variables, to assess the performance in a single- vs. multi-modal
setting. The predictive performance was measured by the area under the receiver operating characteristic
curve (AUROC), the area under the precision-recall curve (AUPRC), and the Brier score.</p>
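        <p>These three metrics can be computed with scikit-learn as sketched below; average precision is used here as the usual summary estimate of the precision-recall curve.</p>
        <preformat><![CDATA[
# The three reported metrics, computed with scikit-learn; average precision
# serves as the usual summary of the precision-recall curve.
from sklearn.metrics import average_precision_score, brier_score_loss, roc_auc_score

def evaluate(y_true, y_prob) -> dict:
    return {
        "AUROC": roc_auc_score(y_true, y_prob),
        "AUPRC": average_precision_score(y_true, y_prob),
        "Brier": brier_score_loss(y_true, y_prob),  # lower is better
    }
]]></preformat>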
        <p>The hyperparameters of the models were tuned on the validation set via grid search. The GloVe-CNN
model was set to have 50 filters, as in [16]. For all models, we tested max pooling kernels of size 2, 3,
and 4, as well as learning rates of 0.01, 0.001, and 0.0001. The experiments with clinical variables were
performed with the number of initial filters of the Var-CNN varying from 8 to 64 in incremental
steps of 8. The learning rate was set to 0.0001 for the Var-CNN and all multi-modal models, and to 0.001
for the GloVe-CNN.</p>
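        <p>A sketch of this grid search is shown below; train_and_score is a hypothetical helper that trains a model with the given setting and returns its validation AUROC.</p>
        <preformat><![CDATA[
# Sketch of the grid search over the kernel sizes and learning rates named
# above; 'train_and_score' is a hypothetical helper returning validation AUROC.
from itertools import product

kernels = [2, 3, 4]
learning_rates = [0.01, 0.001, 0.0001]

results = [
    {"kernel": k, "lr": lr, "auroc": train_and_score(kernel=k, lr=lr)}
    for k, lr in product(kernels, learning_rates)
]
best = max(results, key=lambda r: r["auroc"])  # setting with the best validation AUROC
]]></preformat>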
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>The performance metrics for predictive models using BERT and GloVe embedding methods are displayed
in Table 3. The findings are shown for models that exclusively utilize clinical notes, as well as for models
that incorporate clinical notes in conjunction with clinical variables, with and without the inclusion of
SCr and urine output. The results in Table 3 indicate that similar performance metrics are obtained for
models using contextualized and static word embeddings for all combinations of modalities. GloVe was
selected for further experiments based on these results, because training those embeddings requires
about 20 times less computation time than training BERT embeddings.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Discussion</title>
      <p>This study concerned the effects on predictive performance of using contextualized word embeddings
(LLMs) compared to static word embeddings. Our results showed no differences between the GloVe and
BERT approaches (Table 3). Gao et al. came to a similar conclusion: models such as word-level CNNs
and hierarchical self-attention networks often perform equally well or better compared to BERT models
for medical classification tasks [19]. They argue that these results can be attributed to two main reasons.
First, in the context of clinical text classification, a small subset of words is highly important for the
specific task, while the remaining text can be considered as noise. However, BERT models inherently
struggle to focus on specific biomedical keywords, because they instead attend to complex word
relationships [19]. This focus on word relationships shifts BERT’s attention away from these
keywords [19]. The second reason is that the tokenizer employed for BERT models splits unknown
words into subwords. While static word embeddings can learn specific keywords important for the
given task, BERT is unable to do so because keywords are often split into subwords. BERT models need
to discern accurate labels for each subword rather than for single keywords, increasing the complexity
of the task.</p>
      <p>Previous studies have also shown that as the amount of training data increases, the performance gap
between static and contextualized word embeddings shrinks [20]. This might indicate that the amount
of training data used in this study was sufficient for the static word embeddings to perform equally
well compared to the contextualized word embeddings.</p>
      <p>The main strength of this study is the internal validation of all modality combinations without the SCr
and urine output variables. These variables were used to define the AKI label, which introduced a form of
incorporation bias. Evaluation without these variables offers information on the comparison of different
modalities without this incorporation bias. There are also limitations. First, internal validation was based
on a simple split into training, validation, and test sets. While cross-validation and bootstrapping are better
validation methods, using these methods to train Clinical BioBERT was unfeasible in our environment.
Second, the GloVe embeddings and the Clinical BioBERT model are pre-trained on the MIMIC-III dataset.
The issue of data leakage is mitigated since no labels were exposed during the pre-training phase. Still,
since GloVe and Clinical BioBERT had encountered the test data before, the models’ performance might
be overestimated. We trained GloVe on all the clinical notes to fairly compare it with Clinical BioBERT.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions and future work</title>
      <p>We compared using LLMs versus static word embeddings to predict AKI in the ICU, when using only
clinical notes and when combining clinical variables and notes. We also showed the effect of including
versus excluding SCr and urine output. Surprisingly, static word embeddings performed similarly to
LLMs, both when including and excluding clinical variables, as well as when including and excluding
SCr and urine output. Combining text with clinical variables improved the results.</p>
      <p>Our models can support clinicians in promptly recognizing and treating patients with deteriorating AKI
and consequently improve patient outcomes in the ICU. Our extensive comparison of modalities and
text representations may further guide researchers and practitioners in leveraging multi-modal models
to predict AKI, and inspire them to investigate multi-modality and LLMs for other tasks.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>The computational resources used were financed by the NWO programme Computing Time on National
Computer Facilities (grant 2024.15).</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pennington</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          , GloVe:
          <article-title>Global vectors for word representation</article-title>
          ,
          <source>in: EMNLP</source>
          , volume
          <volume>14</volume>
          ,
          <year>2014</year>
          , pp.
          <fpage>1532</fpage>
          -
          <lpage>1543</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Case</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Khalid</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Khan</surname>
          </string-name>
          , et al.,
          <article-title>Epidemiology of acute kidney injury in the intensive care unit</article-title>
          ,
          <source>Critical care research and practice</source>
          <year>2013</year>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>N.</given-names>
            <surname>Sato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Uchino</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kojima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hiragi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Yanagita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Okuno</surname>
          </string-name>
          ,
          <article-title>Prediction and visualization of acute kidney injury in intensive care unit using one-dimensional convolutional neural networks based on routinely collected data</article-title>
          ,
          <source>Computer Methods and Programs in Biomedicine</source>
          <volume>206</volume>
          (
          <year>2021</year>
          )
          <fpage>106129</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Allen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Calvert</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. M.</given-names>
            <surname>Palevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Braden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Patel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pellegrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Green-Saxena</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hoffman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Das</surname>
          </string-name>
          ,
          <article-title>Convolutional neural network model for intensive care unit acute kidney injury prediction</article-title>
          ,
          <source>Kid Int Rep</source>
          <volume>6</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Luo</surname>
          </string-name>
          ,
          <article-title>Early prediction of acute kidney injury in critical care setting using clinical notes</article-title>
          , in: BIBM,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] A. Khwaja, KDIGO clinical practice guidelines for acute kidney injury, Nephron Clinical Practice 120 (2012) c179–c184.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] R. van Slobbe, D. Herrmannova, D. Boeke, E. Lima-Walton, A. Abu-Hanna, I. Vagliano, Multimodal convolutional neural networks for the prediction of acute kidney injury in the intensive care, International Journal of Medical Informatics 196 (2025) 105815.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] I. Vagliano, N. C. Chesnaye, J. H. Leopold, K. J. Jager, A. Abu-Hanna, M. C. Schut, Machine learning models for predicting acute kidney injury: a systematic review and critical appraisal, Clinical Kidney Journal 15 (2022) 2266–2280.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] I. Vagliano, O. Lvova, M. C. Schut, Interpretable and continuous prediction of acute kidney injury in the intensive care, in: Public Health and Informatics, volume 281, 2021, pp. 103–107.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] I. Vagliano, C. Byrne Salsas, T. Wünn, M. C. Schut, External validation and transportability of models to predict acute kidney injury in the intensive care unit, in: Inf and Tech in Clin Care and Public Health, volume 295, 2022, pp. 148–151.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] I. Vagliano, W.-H. Hsu, M. C. Schut, Machine learning, clinical notes and knowledge graphs for early prediction of acute kidney injury in the intensive care, Stud. Health Technol. Inform. 289 (2022) 329–332.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] L. Brancato, I. Calixto, A. Abu-Hanna, I. Vagliano, Leveraging multi-word concepts to predict acute kidney injury in intensive care, in: Healthcare Transformation with Informatics and Artificial Intelligence, volume 305, 2023, pp. 10–13.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] C. Mao, L. Yao, Y. Luo, AKI-BERT: a pre-trained clinical language model for early prediction of acute kidney injury, arXiv:2205.03695 (2022).</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] Y. Li, R. M. Wehbe, F. S. Ahmad, H. Wang, Y. Luo, Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences, arXiv:2201.11838 (2022).</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] A. E. Johnson, T. J. Pollard, L. Shen, L.-w. H. Lehman, M. Feng, M. Ghassemi, B. Moody, P. Szolovits, L. Anthony Celi, R. G. Mark, MIMIC-III, a freely accessible critical care database, Scientific Data 3 (2016) 1–9.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] P. Grnarova, F. Schmidt, S. L. Hyland, C. Eickhoff, Neural document embeddings for intensive care patient mortality prediction, arXiv:1612.00467 (2016).</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] E. Alsentzer, J. R. Murphy, W. Boag, W.-H. Weng, D. Jin, T. Naumann, M. McDermott, Publicly available clinical BERT embeddings, arXiv:1904.03323 (2019).</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Association for Computational Linguistics, Minneapolis, Minnesota, 2019, pp. 4171–4186. URL: https://aclanthology.org/N19-1423/. doi:10.18653/v1/N19-1423.</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] S. Gao, M. Alawad, M. T. Young, J. Gounley, N. Schaefferkoetter, H. J. Yoon, X.-C. Wu, E. B. Durbin, J. Doherty, A. Stroup, et al., Limitations of transformers on clinical text classification, J Biomed and Health Inform 25 (2021) 3596–3607.</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] S. Arora, A. May, J. Zhang, C. Ré, Contextual embeddings: When are they worth it?, arXiv:2005.09117 (2020).</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>