<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Integrating XAI for Predictive Conflict Analytics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Luca Macis</string-name>
          <email>luca.macis@unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Marco Tagliapietra</string-name>
          <email>marco.tagliapietra@unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alessandro Castelnovo</string-name>
          <email>alessandro.castelnovo@intesasanpaolo.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Daniele Regoli</string-name>
          <email>daniele.regoli@intesasanpaolo.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Greta Greco</string-name>
          <email>greta.greco@intesasanpaolo.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea Claudio Cosentini</string-name>
          <email>andrea.cosentini@intesasanpaolo.com</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paola Pisano</string-name>
          <email>paola.pisano@unito.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Edoardo Carroccetto</string-name>
          <email>edoardo.carroccetto@edu.unito.it</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Economics and Statistics, University of Turin</institution>
          ,
          <addr-line>Via Lungo Dora Siena 100, 10153 Torino</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Intesa Sanpaolo S.p.A.</institution>
          ,
          <addr-line>C.so Inghilterra 3, 10138 Torino</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Turin</institution>
          ,
          <addr-line>Via Giuseppe Verdi 8, 10124 Torino</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Predicting global conflicts through data-driven approaches has the potential to help political decision-makers formulate more effective and targeted policies. However, high-performance models that derive patterns from data often become highly complex, making it challenging to extract understandable rationales behind their outcomes. In this paper, we propose integrating a transformer-based Artificial Intelligence Early Warning System (AI-EWS) with integrated gradients, an eXplainable Artificial Intelligence (XAI) technique that attributes model predictions to specific features at a given time in the input data, thereby enhancing interpretability. To validate our methodology, we conduct experiments on a prominent geopolitical dataset: ACLED. This dataset provides comprehensive insights into global conflict events, facilitating effective pattern learning and generalization by our model. By leveraging these explainability techniques, we aim to bridge the gap between complex, high-performance models and the practical needs of policymakers in conflict prevention and resolution. Predictive analytics algorithms in conjunction with an XAI approach can foresee the impact of decisions on various population segments, fostering equity and inclusion and supporting a data-driven approach, along with a culture of openness and accountability within the public administration.</p>
      </abstract>
      <kwd-group>
        <kwd>eXplainable Artificial Intelligence</kwd>
        <kwd>Transformers</kwd>
        <kwd>Time Series Forecasting</kwd>
        <kwd>Integrated Gradients</kwd>
        <kwd>Conflict Prediction</kwd>
        <kwd>Early Warning System</kwd>
        <kwd>Public Policy</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Predicting potential conflicts has played a crucial role in peace research since
Singer’s work in the early 1970s [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This historical backdrop sets the stage for understanding the
evolution of conflict forecasting methodologies, encompassing diverse approaches including
algorithms for event data coding [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Ward and co-authors marked a significant turning point,
bringing prediction methodologies into the mainstream of peace research [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Since then,
various organizations have contributed to the field by developing comprehensive data
analysis and interpretation systems. The most current and prominent example is the Violence
and Impacts Early-Warning System (VIEWS) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], developed at Uppsala University. For our part,
in collaboration with the Italian Ministry of Foreign Affairs, we implemented a new AI-EWS
that employs transformer models, which are built upon a multi-head attention mechanism [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. However,
although such sophisticated techniques bring more accurate conflict predictions,
they also raise new challenges. One such challenge lies in the inherent complexity
of these models: if conflict predictions are not accompanied by detailed explanations
of what led the model to make them, policymakers are unlikely to trust such models
when building robust policies and effective actions. Against this background, we
introduce XAI approaches, in particular those based on integrated gradients [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], to enhance the transparency and comprehensibility of the AI-EWS’s
outcomes. To construct our conflict prediction dataset, we opted for a publicly available,
disaggregated, and regularly updated source: the Armed Conflict Location and Event Data
Project (ACLED), which collects real-time data on the locations, dates, actors, fatalities, and types
of all reported political violence and protest events around the world [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>The field of conflict prediction has explored various machine learning models,
including Random Forest [8], naive Bayes classifiers [9], and Neural Networks [10]. In
the realm of time series forecasting, researchers have recently applied transformer architectures
to univariate forecasting tasks. For instance, the solution proposed by Li and co-authors
outperformed classical statistical methods such as ARIMA, as well as more recent
approaches such as TRMF, DeepAR, and DeepState, on four public forecasting datasets [11].
In our work, we extend the application of transformers to multivariate time-series tasks,
following Zerveas and co-authors [12], who use only the encoder part of the original
transformer architecture. We adopt the same approach with some modifications: in particular,
we include residual connections between input and output, ensuring that a purely
linear model is always a subclass of our model [13]. The reason is that simple linear models
can, surprisingly, outperform sophisticated transformer-based models on long
time-series forecasting problems [14]. Given the intricate nature of the model, XAI approaches
are needed to provide trustworthy explanations of its output. Regarding the use of
integrated gradients within transformers, a self-attention attribution method was proposed and
demonstrated on BERT [15]. Integrated Hessians, an extension of integrated gradients, explains
pairwise feature interactions and has been demonstrated on DistilBERT for sentiment
analysis [16]. Following this trend, a recent work applies model-agnostic XAI
techniques [17], such as SHAP [18] and LIME [19], to interpret predictions from
transformer-based models for mental healthcare monitoring on social networks. That study underscores the
social and public importance of explainability for the adoption of AI-based diagnostic systems.</p>
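      <p>To illustrate why a residual connection from input to output makes a purely linear model a special case of the full model, a minimal sketch in plain Python follows. The function names, the toy weights, and the single tanh layer standing in for the encoder are assumptions made for illustration only; they are not the architecture used in this work.</p>
      <preformat>
```python
import math

def linear_skip(x, w_skip, b_skip):
    """Linear map applied directly from the input window to the output."""
    return sum(wi * xi for wi, xi in zip(w_skip, x)) + b_skip

def nonlinear_branch(x, w_hidden, w_out):
    """Toy stand-in for the encoder: one tanh hidden layer without biases."""
    hidden = [math.tanh(sum(wij * xj for wij, xj in zip(row, x))) for row in w_hidden]
    return sum(wo * h for wo, h in zip(w_out, hidden))

def forecast(x, w_skip, b_skip, w_hidden, w_out):
    """Residual design: output = nonlinear branch + linear skip path."""
    return nonlinear_branch(x, w_hidden, w_out) + linear_skip(x, w_skip, b_skip)

# Hypothetical input window and weights.
x = [0.2, 0.5, 0.1]
w_skip, b_skip = [0.3, -0.1, 0.8], 0.05
w_hidden = [[0.4, -0.2, 0.1], [0.0, 0.3, -0.5]]

# With the output weights of the nonlinear branch set to zero,
# the model collapses to the linear skip path alone.
zero_out = [0.0, 0.0]
linear_only = forecast(x, w_skip, b_skip, w_hidden, zero_out)
```
      </preformat>
      <p>Because the nonlinear branch can be switched off by its own weights, training can fall back to a linear forecaster whenever the data do not support anything more complex.</p>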
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. Geopolitical data collection and preparation</title>
        <p>Data selection is widely recognized as a key determinant of the performance of a
data-driven conflict prediction model. Such models usually depend on either social media or diplomatic datasets.
Social media datasets, specifically Twitter, were historically used because of their convenience
and their utility in examining factors influencing civil unrest [20, 21]. However, restrictions
on violent content and monitoring by authoritarian regimes have reduced their efficacy
[22]. Consequently, our study leans towards diplomatic datasets. Our chosen dataset, ACLED,
is disaggregated, regularly updated, and focused on disorder events. ACLED data has proven
valuable in predicting conflict [23] and is publicly accessible via its API<sup>1</sup>. The data record
conflict events with their descriptions, locations, and dates, and reliability is ensured
through a rigorous verification process. Although coverage periods vary across nations, the
detailed structure and transparency foster academic research and informed decision-making.
We aggregate these data weekly and categorize them by event type, so that each
observation corresponds to the number of events of a specific type within a given
week in a particular country.</p>
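        <p>The weekly aggregation described above can be sketched as follows. This is a minimal illustration using only the Python standard library; the record layout, the Monday week start, and the sample events are assumptions and do not reproduce the actual ACLED schema or the preprocessing pipeline used in this study.</p>
        <preformat>
```python
from collections import Counter
from datetime import date, timedelta

def week_start(day):
    """Return the Monday of the week containing `day` (assumed convention)."""
    return day - timedelta(days=day.weekday())

def aggregate_weekly(events):
    """Count events per (country, week start, event type)."""
    counts = Counter()
    for country, event_date, event_type in events:
        counts[(country, week_start(event_date), event_type)] += 1
    return counts

# Hypothetical records: (country, event date, event type).
events = [
    ("Italy", date(2024, 1, 1), "Protests"),
    ("Italy", date(2024, 1, 3), "Protests"),
    ("Italy", date(2024, 1, 4), "Riots"),
    ("Italy", date(2024, 1, 9), "Protests"),
    ("France", date(2024, 1, 2), "Protests"),
]

weekly = aggregate_weekly(events)
```
        </preformat>
        <p>Each key of <italic>weekly</italic> identifies one observation in the sense used above: a count of one event type, in one week, in one country.</p>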
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Transformer Model</title>
        <p>Our AI-EWS employs a transformer model that predicts the number of fatalities
across all countries over a twelve-week horizon. The design is largely inspired by the Time-series Dense Encoder
(TiDE) model, prominent in long-term forecasting [13], but replaces
the dense encoder used in TiDE with an attention-based encoder, which yielded better
results on the dataset used in our study. As in the original TiDE
implementation, the model incorporates residual connections from input to output that preserve a
linear path, an approach backed by empirical evidence of its efficiency in time-series
forecasting [14]. For a comprehensive view of the model’s operation and data flow,
please refer to the detailed explanation provided with Figure 1. Overall, the design
targets robust long-term prediction while keeping the architecture fundamentally simple,
balancing advanced modeling techniques with practical forecasting reliability.
The model is trained with the Negative Log Likelihood (NLL) loss function to optimize its
probabilistic forecasts. For each country, 12 weeks were
retained for testing, 24 for validation, and the remaining weeks were used for training.</p>
        <p><sup>1</sup> Armed Conflict Location and Event Data Project (ACLED); https://acleddata.com/</p>
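        <p>The per-country split into training, validation, and test weeks can be sketched as below. The function name is hypothetical, and the ordering (validation weeks immediately preceding the test weeks, both at the end of the series) is an assumption; the text only states the sizes of the three segments.</p>
        <preformat>
```python
def chronological_split(series, n_test=12, n_val=24):
    """Split a weekly series into train, validation, and test segments.

    Assumed ordering: the last `n_test` weeks form the test set, the
    `n_val` weeks before them form the validation set, and everything
    earlier is used for training.
    """
    if n_test + n_val >= len(series):
        raise ValueError("series too short for the requested split")
    test = series[-n_test:]
    val = series[-(n_test + n_val):-n_test]
    train = series[:-(n_test + n_val)]
    return train, val, test

# Hypothetical series of 100 weekly values for one country.
weekly_fatalities = list(range(100))
train, val, test = chronological_split(weekly_fatalities)
```
        </preformat>
        <p>Splitting by time rather than at random keeps future weeks out of the training data, which matters for any forecasting evaluation.</p>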
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Integrated Gradients</title>
        <p>
          Integrated Gradients (IG) is a technique utilized for attributing predictions of deep neural
networks to their input features, facilitating a deeper understanding of their behavior [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ].
Because attribution techniques are difficult to evaluate empirically, IG adopts
an axiomatic approach. Two fundamental axioms guide attribution methods:
        </p>
        <p>Sensitivity: This axiom dictates that, for an input and a baseline that differ in a single feature yet
yield different predictions, the differing feature must receive a non-zero attribution.</p>
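        <p>Implementation Invariance: Attributions should be identical for functionally
equivalent networks. Two networks are functionally equivalent if their outputs coincide for all inputs,
despite potential differences in their implementations. Failure to satisfy this axiom may indicate
sensitivity to insignificant aspects of the model.</p>
        <p>IG computes attributions by following a straight-line
path from a baseline input <italic>x</italic>′ to the input <italic>x</italic>, evaluating gradients along this path. Specifically,
the IG along the <italic>i</italic>-th dimension for an input <italic>x</italic> and a baseline <italic>x</italic>′ is calculated as:
<disp-formula id="eq-1"><label>(1)</label><tex-math>\mathrm{IG}_i(x) = (x_i - x'_i) \times \int_{\alpha=0}^{1} \frac{\partial F\left(x' + \alpha (x - x')\right)}{\partial x_i} \, \mathrm{d}\alpha</tex-math></disp-formula>
where ∂<italic>F</italic>(<italic>x</italic>)/∂<italic>x</italic><sub><italic>i</italic></sub> represents the gradient of <italic>F</italic>(<italic>x</italic>) along the <italic>i</italic>-th dimension.</p>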
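        <p>In practice, the path integral in Equation (1) is evaluated numerically. A common choice, described in the original IG paper [<xref ref-type="bibr" rid="ref6">6</xref>], is a Riemann sum over equally spaced points on the straight-line path. In the sketch below, <italic>F</italic> is the model, <italic>x</italic> the input, <italic>x</italic>′ the baseline, and <italic>m</italic> (the number of steps) is a symbol introduced here for illustration:</p>
        <p><disp-formula><tex-math>\mathrm{IG}^{\mathrm{approx}}_i(x) = (x_i - x'_i) \times \frac{1}{m} \sum_{k=1}^{m} \frac{\partial F\left(x' + \tfrac{k}{m} (x - x')\right)}{\partial x_i}</tex-math></disp-formula></p>
        <p>A larger <italic>m</italic> gives a closer approximation at a higher computational cost, since each step requires one gradient evaluation.</p>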
        <p>Furthermore, IG adheres to an additional axiom:</p>
        <p>Completeness: Attributions sum to the difference between the output of <italic>F</italic> at the input <italic>x</italic>
and at the baseline <italic>x</italic>′. This axiom serves as a sanity check, ensuring that the method fully
accounts for the difference in output.</p>
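        <p>The completeness axiom also gives a practical check on a numerical implementation of IG. The following sketch approximates IG with a midpoint Riemann sum and compares the sum of attributions with the change in model output. The toy sigmoid model, its weights, and the number of steps are assumptions chosen for illustration; they are unrelated to the AI-EWS.</p>
        <preformat>
```python
import math

WEIGHTS = [1.5, -2.0, 0.5]  # hypothetical model weights

def model(x):
    """Toy differentiable model F: sigmoid of a weighted sum."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x))
    return 1.0 / (1.0 + math.exp(-z))

def gradient(x):
    """Analytic gradient of `model` with respect to each input feature."""
    s = model(x)
    return [s * (1.0 - s) * w for w in WEIGHTS]

def integrated_gradients(x, baseline, steps=200):
    """Approximate IG with a midpoint Riemann sum along the straight path."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(gradient(point)):
            totals[i] += g / steps
    return [(xi - b) * t for xi, b, t in zip(x, baseline, totals)]

x = [0.8, 0.1, 0.4]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(x, baseline)

# Completeness: attributions should sum to F(x) - F(baseline).
gap = abs(sum(attributions) - (model(x) - model(baseline)))
```
        </preformat>
        <p>If <italic>gap</italic> is not close to zero, the number of steps is too small or the gradient computation is wrong.</p>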
        <p>In our study, the baseline was set to the mean matrix, a critical decision given
the MinMax scaling applied to our dataset. Under MinMax scaling, a count feature with no events
is typically mapped to zero, so with a zero baseline the difference between input and baseline
for that feature vanishes and IG assigns it no attribution. A baseline filled with the mean value
of each rescaled feature, by contrast, can assign significance to count features with zero values.
This choice reflects the assumption that the
absence of unrest events may be relevant to the model’s prediction, and it is why we opted
for the mean baseline matrix rather than a zero baseline matrix. We chose IG for its
transparency, which simplifies the comprehension of attributions to input features. While SHAP and
LIME, leading and commonly used XAI methods, provide in-depth explorations of model
behavior, IG’s clear computational approach is easier to understand, making it an
effective preliminary step before advancing to more complex explanatory methods.</p>
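        <p>The effect of the baseline choice can be seen on a linear model, for which IG has the exact closed form <italic>w</italic><sub><italic>i</italic></sub>(<italic>x</italic><sub><italic>i</italic></sub> − <italic>x</italic>′<sub><italic>i</italic></sub>). The sketch below is illustrative only: the linear model, its weights, and the scaled history are hypothetical and are not taken from the AI-EWS.</p>
        <preformat>
```python
def ig_linear(weights, x, baseline):
    """Exact IG for a linear model F(x) = sum(w_i * x_i)."""
    return [w * (xi - b) for w, xi, b in zip(weights, x, baseline)]

def column_means(rows):
    """Per-feature mean over a list of observations."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

# Hypothetical MinMax-scaled weekly counts for two features.
history = [[0.0, 0.2], [0.5, 0.4], [1.0, 0.0], [0.5, 0.2]]
weights = [0.7, -0.3]   # illustrative linear model
x = [0.0, 0.2]          # first feature: no events this week

# Zero baseline: the zero-valued feature cannot receive any attribution.
zero_attr = ig_linear(weights, x, [0.0, 0.0])

# Mean baseline: the absence of events is measured against the average week.
mean_attr = ig_linear(weights, x, column_means(history))
```
        </preformat>
        <p>With the mean baseline, the first feature receives a negative attribution even though its value is zero. The trade-off is that a feature sitting exactly at its mean, like the second one here, receives approximately zero attribution.</p>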
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>This study sets out to forecast the potential number of fatalities from February 13, 2024, to
March 30, 2024<sup>2</sup>, focusing on 168 countries that have recorded at least one fatality throughout
their historical time series<sup>3</sup>.</p>
      <p>The primary aim of this research is not to benchmark prediction accuracy against other
leading models but to explore the insights provided by the predictions of the AI-EWS. Our
investigation centers on the application of integrated gradients, the chosen XAI methodology,
to reveal the reasons behind these forecasts. The analysis begins by pinpointing the
variables that most influence the forecasts during the testing period. As depicted in Figure 3, these
variables are visualized using boxplots and arranged in descending order of influence.
For a given country, a value of 1 marks the highest absolute
Integrated Gradients value, indicating that the variable is critically influential for the predictions;
a value of 0 indicates no attribution, meaning the variable plays no part in the prediction;
any value between 0 and 1 expresses the variable’s relevance relative to the
most impactful variable in that country.</p>
      <p><sup>2</sup> This timeframe spans the most recent twelve weeks of ACLED data available for each country up to April 9, 2024.</p>
      <p><sup>3</sup> This criterion is crucial as predicting deviations from zero fatalities where no prior occurrences exist is statistically
improbable. Therefore, countries are assessed individually based on their historical data.</p>
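      <p>The per-country scaling described above can be sketched as follows. The function name, the variable names, and the attribution values are hypothetical, and the sketch assumes that attributions have already been aggregated to one number per variable; how that aggregation is performed in the study is not specified here.</p>
      <preformat>
```python
def normalize_importance(attributions):
    """Scale absolute attributions by the largest one, giving values in [0, 1]."""
    magnitudes = {name: abs(value) for name, value in attributions.items()}
    peak = max(magnitudes.values())
    if peak == 0:
        # No variable has any attribution for this country.
        return {name: 0.0 for name in magnitudes}
    return {name: m / peak for name, m in magnitudes.items()}

# Hypothetical aggregated IG values for one country.
country_attr = {"battles": -0.8, "protests": 0.2, "riots": 0.0}
scores = normalize_importance(country_attr)
```
      </preformat>
      <p>Here the most influential variable scores 1, a variable with a quarter of its absolute attribution scores 0.25, and a variable without attribution scores 0, regardless of the sign of the original values.</p>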
      <p>Additionally, we examine the relevance of different time intervals within the forecasting
model. The model uses a lookback period of 48 weeks to integrate the data leading
up to a prediction. Figure 4 shows the weight assigned to each week of this period,
arranged chronologically, which helps in understanding how importance varies across
the lookback window.</p>
      <p>To summarize, the initial findings reveal clear patterns concerning the importance of
variables and the role of time intervals within the prediction framework. As shown in
Figure 3, certain variables consistently carry more weight across all analyzed countries.
Figure 4, however, portrays a more complex landscape. Although there is a mild
preference for recent weeks, variable importance is relatively uniform regardless
of the time elapsed since the event. This suggests that the AI-EWS
weighs variables largely uniformly, irrespective of their temporal proximity to the predicted event.</p>
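      <p>One way to obtain a per-week importance profile from an attribution matrix is sketched below. This is an assumption about how such a profile could be computed, not a description of how Figure 4 was produced; the function name and the small 4-week example are hypothetical, whereas the study uses a 48-week lookback.</p>
      <preformat>
```python
def weekly_importance(attr_matrix):
    """Sum absolute attributions over features for each lookback week.

    `attr_matrix[t][f]` is the attribution of feature f at lookback week t,
    with t = 0 the oldest week. The result is normalized to sum to 1.
    """
    per_week = [sum(abs(a) for a in week) for week in attr_matrix]
    total = sum(per_week)
    if total == 0:
        return [0.0] * len(per_week)
    return [w / total for w in per_week]

# Hypothetical attributions: 4 lookback weeks x 2 features.
attr_matrix = [[0.1, -0.1], [0.2, 0.0], [-0.1, 0.1], [0.3, -0.1]]
shares = weekly_importance(attr_matrix)
```
      </preformat>
      <p>A flat profile would indicate that the model draws on the whole lookback window, while a profile rising towards the last entries would indicate a preference for recent weeks.</p>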
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>
        In this study, we proposed a novel approach to conflict prediction on a global scale,
leveraging advanced transformer models and XAI methodologies. We applied our approach to a
comprehensive geopolitical dataset built from data obtained through the ACLED API.
The proposed transformer model is inspired by the original transformer architecture [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and incorporates
insights from the TiDE model [13]. The goal of the AI-EWS is to forecast the number of fatalities
over a 12-week horizon, a metric that provides an immediate and objective index for gauging
a country’s unrest situation. Integrated gradients were employed as the XAI methodology
to enhance interpretability, offering insights into how specific features affect the
model’s predictions and how their influence varies over time. Our analysis of the conflict dataset
yielded two key insights. First, certain features consistently hold importance
across different countries. Second, a detailed examination of the importance attributed
to different time frames indicates only a subtle preference for recent data, suggesting that the AI-EWS
maintains a consistent variable prioritization regardless of temporal proximity. Going
forward, IG may serve as a foundational tool, enabling clear initial explanations that pave the
way for more advanced XAI techniques in future research while mitigating
the complexities often encountered with newer methods. In future studies, to evaluate how
comprehensible our feature rankings are for users such as policymakers, we plan two key
activities: user studies, which collect feedback through surveys and interviews to assess users’
understanding of the model’s feature rankings; and usability testing, in which users make decisions
based on the model’s outputs so that we can evaluate how effectively they use the provided feature
rankings. The findings provide insights into the interpretability and performance of
advanced machine learning techniques in addressing high-stakes global challenges. As the
public sector increasingly relies on AI for decision-making, there will be a growing need for
mechanisms that can explain AI decisions in a transparent and understandable way. In summary,
XAI can make a significant contribution to more responsive and accountable public services.
Not only can it deliver accessible and meaningful explanations to non-expert audiences,
including the general public and policymakers, but it can also help ensure greater compliance with
evolving regulations and with principles such as fairness, accountability, and privacy.
      </p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This study was funded by the European Union (EU) - NextGenerationEU, in the framework of
the GRINS - Growing Resilient, INclusive and Sustainable project (GRINS PE00000018 – CUP
D13C22002160001). The views/opinions expressed are solely those of the authors and do not
necessarily reflect those of the EU, nor can the EU be held responsible for them.</p>
      <ref-list>
        <ref id="ref8">
          <mixed-citation>[8] D. Muchlinski, D. Siroky, J. He, M. Kocher, Comparing random forest with logistic regression for predicting class-imbalanced civil war onset data, Political Analysis 24 (2016) 87–103. doi:10.1093/pan/mpv024.</mixed-citation>
        </ref>
        <ref id="ref9">
          <mixed-citation>[9] C. Perry, Machine learning and conflict prediction: A use case, Stability: International Journal of Security &amp; Development 2 (2013) 56. doi:10.5334/sta.cr.</mixed-citation>
        </ref>
        <ref id="ref10">
          <mixed-citation>[10] N. Beck, G. King, L. Zeng, Improving quantitative studies of international conflict: A conjecture, The American Political Science Review 94 (2000) 21. doi:10.2307/2586378.</mixed-citation>
        </ref>
        <ref id="ref11">
          <mixed-citation>[11] S. Li, X. Jin, Y. Xuan, X. Zhou, W. Chen, Y.-X. Wang, X. Yan, Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, R. Garnett (Eds.), Advances in Neural Information Processing Systems, volume 32, Curran Associates, Inc., 2019.</mixed-citation>
        </ref>
        <ref id="ref12">
          <mixed-citation>[12] G. Zerveas, S. Jayaraman, D. Patel, A. Bhamidipaty, C. Eickhoff, A transformer-based framework for multivariate time series representation learning, in: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery &amp; Data Mining, KDD ’21, Association for Computing Machinery, 2021, pp. 2114–2124. doi:10.1145/3447548.3467401.</mixed-citation>
        </ref>
        <ref id="ref13">
          <mixed-citation>[13] A. Das, W. Kong, A. Leach, S. Mathur, R. Sen, R. Yu, Long-term forecasting with TiDE: Time-series dense encoder, 2023. arXiv:2304.08424.</mixed-citation>
        </ref>
        <ref id="ref14">
          <mixed-citation>[14] A. Zeng, M. Chen, L. Zhang, Q. Xu, Are transformers effective for time series forecasting?, AAAI Conference on Artificial Intelligence 37 (2023) 11121–11128. doi:10.1609/aaai.v37i9.26317.</mixed-citation>
        </ref>
        <ref id="ref15">
          <mixed-citation>[15] Y. Hao, L. Dong, F. Wei, K. Xu, Self-attention attribution: Interpreting information interactions inside transformer, 2021. arXiv:2004.11207.</mixed-citation>
        </ref>
        <ref id="ref16">
          <mixed-citation>[16] J. D. Janizek, P. Sturmfels, S.-I. Lee, Explaining explanations: Axiomatic feature interactions for deep networks, The Journal of Machine Learning Research 22 (2021) 4687–4740.</mixed-citation>
        </ref>
        <ref id="ref17">
          <mixed-citation>[17] A. Malhotra, R. Jindal, XAI transformer based approach for interpreting depressed and suicidal user behavior on online social networks, Cognitive Systems Research (2023) 101186. doi:10.1016/j.cogsys.2023.101186.</mixed-citation>
        </ref>
        <ref id="ref18">
          <mixed-citation>[18] S. Lundberg, S.-I. Lee, A unified approach to interpreting model predictions, 2017. arXiv:1705.07874.</mixed-citation>
        </ref>
        <ref id="ref19">
          <mixed-citation>[19] M. T. Ribeiro, S. Singh, C. Guestrin, "Why should I trust you?": Explaining the predictions of any classifier, 2016. arXiv:1602.04938.</mixed-citation>
        </ref>
        <ref id="ref20">
          <mixed-citation>[20] G. Korkmaz, J. Cadena, C. J. Kuhlman, A. Marathe, A. Vullikanti, N. Ramakrishnan, Combining heterogeneous data sources for civil unrest forecasting, in: Proceedings of the 2015 IEEE/ACM Int. Conf. on Advances in Social Networks Analysis and Mining 2015, ASONAM ’15, ACM, 2015, pp. 258–265. doi:10.1145/2808797.2808847.</mixed-citation>
        </ref>
        <ref id="ref21">
          <mixed-citation>[21] R. Compton, C. Lee, J. Xu, A. M. Luis, T.-C. Lu, Using publicly visible social media to build detailed forecasts of civil unrest, 2014. doi:10.1186/s13388-014-0004-6.</mixed-citation>
        </ref>
        <ref id="ref22">
          <mixed-citation>[22] M. Junior, P. Melo, A. P. C. da Silva, F. Benevenuto, J. Almeida, Towards understanding the use of Telegram by political groups in Brazil, in: Brazilian Symposium on Multimedia and the Web, WebMedia ’21, ACM, 2021, pp. 237–244. doi:10.1145/3470482.3479640.</mixed-citation>
        </ref>
        <ref id="ref23">
          <mixed-citation>[23] M. Halkia, S. Ferri, M. K. Schellens, M. Papazoglou, D. Thomakos, The global conflict risk index: A quantitative tool for policy support on conflict prevention, Progress in Disaster Science 6 (2020) 100069. doi:10.1016/j.pdisas.2020.100069.</mixed-citation>
        </ref>
      </ref-list>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Singer</surname>
          </string-name>
          ,
          <article-title>The peace researcher and foreign policy prediction</article-title>
          , Peace Science Society (International)
          <volume>21</volume>
          (
          <year>1973</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P. A.</given-names>
            <surname>Schrodt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. G.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Weddle</surname>
          </string-name>
          ,
          <article-title>Political science: KEDS - a program for the machine coding of event data</article-title>
          ,
          <source>Social Science Computer Review</source>
          <volume>12</volume>
          (
          <year>1994</year>
          )
          <fpage>561</fpage>
          -
          <lpage>587</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M. D.</given-names>
            <surname>Ward</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. D.</given-names>
            <surname>Greenhill</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Bakke</surname>
          </string-name>
          ,
          <article-title>The perils of policy by p-value: Predicting civil conflicts</article-title>
          ,
          <source>Journal of Peace Research</source>
          <volume>47</volume>
          (
          <year>2010</year>
          )
          <fpage>363</fpage>
          -
          <lpage>375</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Hegre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Allansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Basedau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Colaresi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Croicu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Fjelde</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hoyles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Hultman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Högbladh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jansen</surname>
          </string-name>
          , et al.,
          <article-title>Views: A political violence early-warning system</article-title>
          ,
          <source>Journal of Peace Research</source>
          <volume>56</volume>
          (
          <year>2019</year>
          )
          <fpage>155</fpage>
          -
          <lpage>174</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Vaswani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Shazeer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Parmar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Uszkoreit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. N.</given-names>
            <surname>Gomez</surname>
          </string-name>
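          ,
          <string-name>
            <given-names>Ł.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Polosukhin</surname>
          </string-name>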
          ,
          <article-title>Attention is all you need</article-title>
          , in: I. Guyon,
          <string-name>
            <given-names>U. V.</given-names>
            <surname>Luxburg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fergus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Vishwanathan</surname>
          </string-name>
          , R. Garnett (Eds.),
          <source>Advances in Neural Information Processing Systems</source>
          , volume
          <volume>30</volume>
          ,
          <publisher-name>Curran Associates, Inc.</publisher-name>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Sundararajan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Taly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <article-title>Axiomatic attribution for deep networks</article-title>
          ,
          <source>CoRR</source>
          abs/1703.01365 (
          <year>2017</year>
          ). arXiv:1703.01365.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>C.</given-names>
            <surname>Raleigh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kishi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Linke</surname>
          </string-name>
          ,
          <article-title>Political instability patterns are obscured by conflict dataset scope conditions, sources, and coding choices</article-title>
          ,
          <source>Humanities and Social Sciences Communications</source>
          <volume>10</volume>
          (
          <year>2023</year>
          ).
          doi:10.1057/s41599-023-01559-4.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>