<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Enhancing Trustworthiness in NLP Systems Through Comprehensive Explainability Approaches</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Santiago González-Silot</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centro de Estudios Avanzados en TIC, Universidad de Jaén</institution>
          ,
          <addr-line>Campus Las Lagunillas s/n, 23007, Jaén</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Doctoral Symposium on Natural Language Processing</institution>
          ,
          <addr-line>25</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>LLM</institution>
          ,
          <addr-line>Language Models, XAI, Trustworthy AI, Explainable AI, Interpretability</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
        <p>Natural Language Processing (NLP) systems have made significant strides in recent years, achieving remarkable success in applications such as machine translation, sentiment analysis, and question answering. However, the black-box nature of many advanced NLP models raises concerns about their trustworthiness and reliability, especially in critical domains such as healthcare, law, and disinformation detection. This doctoral thesis addresses the need to enhance trustworthiness in NLP systems by integrating explainability through three main approaches: Feature Importance Methods, Natural Language Generation (NLG) Explanations, and Probing Techniques. The research aims to bridge the gap between complex NLP models and their end-users by developing and evaluating methods that provide transparent and interpretable insights throughout the Machine Learning production cycle: data acquisition, preprocessing, training, and inference. The thesis hypothesizes that achieving reliable, explainable, and unbiased language models through these three complementary approaches will lead to more human-friendly and usable Artificial Intelligence.</p>
      </abstract>
      <kwd-group>
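        <kwd>LLM</kwd>
        <kwd>Language Models</kwd>
        <kwd>XAI</kwd>
        <kwd>Trustworthy AI</kwd>
        <kwd>Explainable AI</kwd>
        <kwd>Interpretability</kwd>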
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>CEUR Workshop Proceedings, ISSN 1613-0073</p>
      <p>[...] to ask how it works and why, but this is not really the case. The trustworthiness of Artificial
Intelligence is key to its having a positive impact on society and to users accepting it and using it
correctly, without fear or prejudice. For example, people are more open to using AI if they
know how it works and why it makes certain decisions [3].</p>
      <p>If we do not know why an AI system makes a decision, produces a response, or acts in a certain
way, we cannot know whether that decision is really correct, since in many cases the response
is highly subjective, variable, and multifactorial. Many papers [4, 5, 6] have shown that AI is
plagued by biases of all kinds, e.g., gender, ethnicity, and religion, which are inherent in the data
used for training and can lead the system to make decisions that are dangerous to humans. That is,
these biases sometimes come from humans themselves. The degree of opacity, and how limited the
available explanations are, depends on the type of model: unlike neural networks, rule-based expert
systems do achieve an acceptable ability to explain their decisions [7].</p>
      <p>In addition, explainability is not only a way to see why a model makes a decision and how
it behaves; it also serves to justify that decision and to help users investigate
uncertain or inconsistent predictions. For example, in my previous work [8], I applied SHAP and
observed that state-of-the-art fake news detection models relied on spurious
features and named entities, which violates impartiality. Thanks to this application of
explainability, I was able to develop a methodology that reduces bias in this task
and makes the model more robust to adversarial attacks, more generalizable, and
generally more trustworthy. A paper on the application of this
methodology has been written and is currently under review in a journal.</p>
      <p>Trustworthy AI has become increasingly crucial due to the growing landscape of regulations
designed to ensure ethical, transparent, and accountable use of Artificial Intelligence, as reflected
in the European Commission’s Ethics Guidelines for Trustworthy AI
[9]. As governments and international bodies establish guidelines to protect individual rights
and societal interests, AI researchers and organizations must prioritize trustworthiness to
comply with these standards. Trustworthy AI not only helps in avoiding legal repercussions
and financial penalties but also fosters public confidence and adoption of AI technologies. It
encompasses principles such as fairness, privacy, security, robustness, and explainability, which
are essential to mitigate biases, prevent misuse, and promote transparency. Adhering to these
regulations ensures that AI systems operate responsibly and equitably, reinforcing their positive
impact on society while maintaining public trust and safeguarding against potential harm.</p>
      <p>For these reasons, the objective of this doctoral thesis is to bridge the gap between
black-box, biased, and opaque models and a more secure, transparent, unbiased, and generally more
trustworthy Artificial Intelligence in the Natural Language Processing domain, focusing on three
distinct yet complementary explainability approaches: Feature Importance Methods, Natural
Language Generation Explanations, and Probing Techniques.</p>
      <p>These three explainability approaches offer complementary perspectives on model behavior:
feature importance methods highlight influential input elements, NLG explanations provide accessible
rationales, and probing reveals internal representations and linguistic capabilities. Together,
they form a comprehensive toolkit for enhancing the trustworthiness and interpretability of
NLP systems.</p>
      <p>The remaining sections of this paper are organized as follows: Section 2 covers the background
and related work on Trustworthiness and Explainability in NLP; Section 3 presents the main hypothesis and
objectives of the doctoral thesis; Section 4 describes the research methodology and experiments for this
thesis; Section 5 introduces the specific research elements proposed for discussion; and Section 6 presents
the conclusions.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background and related work</title>
      <p>Trustworthy and explainable natural language processing (NLP) has become a critical area of
research in recent years. With the increasing focus on ethical challenges within NLP, such
as bias mitigation, identifying objectionable content, and enhancing system design and data
handling practices [10], researchers have delved into various aspects to ensure trustworthy
NLP models. Recent efforts have been made to enhance the trustworthiness of models through
aspects like robustness, explainability, privacy, fairness, accountability, and environmental
well-being [11].</p>
      <p>The field of explainable NLP has evolved to encompass various methodologies and techniques
aimed at enhancing model interpretability. Based on the categorization presented in [12], we
identify three principal approaches to explainability in NLP: Feature Importance Methods,
Natural Language Generation (NLG) Explanations, and Probing Techniques. Each of
these approaches offers unique insights into model behavior and decision-making processes,
contributing to the broader goal of trustworthy AI.</p>
      <sec id="sec-2-1">
        <title>2.1. Feature Importance Methods</title>
        <p>Feature importance methods focus on identifying and quantifying the contribution of input
features to model predictions. These techniques aim to answer the question “Which parts of
the input were most influential for the model’s decision?” by generating attribution scores for
individual tokens, words, or phrases. Several prominent approaches have emerged in this
category:</p>
        <list list-type="bullet">
          <list-item>
            <p>Gradient-based methods such as Integrated Gradients [13] and SmoothGrad [14] utilize
the gradient of the model output with respect to input features to determine feature
importance. These methods provide fine-grained explanations but can be computationally
intensive and may produce noisy attributions.</p>
          </list-item>
          <list-item>
            <p>Perturbation-based methods like LIME (Local Interpretable Model-agnostic
Explanations) [15] and SHAP (SHapley Additive exPlanations) [16] observe changes in model
predictions when input features are perturbed or removed. LIME approximates complex
models locally with interpretable surrogates, while SHAP draws from cooperative game
theory to assign contribution values to features.</p>
          </list-item>
          <list-item>
            <p>Attention-based interpretations leverage the attention mechanisms inherent in
Transformer models, providing visualization of which parts of the input the model “focuses”
on during prediction [17]. However, research by Serrano and Smith [18] has questioned
whether attention weights directly translate to feature importance.</p>
          </list-item>
        </list>
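        <p>The perturbation idea described above can be illustrated with a minimal leave-one-out (occlusion) sketch. It is a simpler relative of LIME and SHAP rather than an implementation of either: each token is removed in turn and the change in the model score is taken as its attribution. The scoring function <monospace>toy_score</monospace> and its lexicon <monospace>TOY_WEIGHTS</monospace> are hypothetical stand-ins for a trained classifier and are assumptions made only for this example.</p>
        <preformat><![CDATA[
```python
from typing import Callable, List, Tuple

# Hypothetical lexicon standing in for the learned weights of a trained classifier.
TOY_WEIGHTS = {"shocking": 2.0, "miracle": 1.5, "reportedly": -0.5, "study": -1.0}


def toy_score(tokens: List[str]) -> float:
    """Toy 'fake news' score: sum of lexicon weights (stand-in for a model output)."""
    return sum(TOY_WEIGHTS.get(t.lower(), 0.0) for t in tokens)


def occlusion_attributions(
    tokens: List[str], score_fn: Callable[[List[str]], float]
) -> List[Tuple[str, float]]:
    """Attribute to each token the drop in score observed when that token is removed."""
    base = score_fn(tokens)
    result = []
    for i, tok in enumerate(tokens):
        perturbed = tokens[:i] + tokens[i + 1:]  # input with token i occluded
        result.append((tok, base - score_fn(perturbed)))
    return result


tokens = "Shocking miracle cure confirmed by study".split()
attributions = occlusion_attributions(tokens, toy_score)

# Show the most influential tokens first, by absolute attribution.
for tok, attr in sorted(attributions, key=lambda p: abs(p[1]), reverse=True):
    print(f"{tok:>10s} {attr:+.2f}")
```
]]></preformat>
        <p>With a real model, <monospace>score_fn</monospace> would wrap the classifier’s predicted probability for the class of interest. Removing one token at a time ignores interactions between tokens, which is one of the limitations that methods such as SHAP aim to address.</p>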
        <p>Recent research has explored how these methods can be adapted specifically for NLP tasks.
For instance, Jin et al. [19] proposed hierarchical explanations for text classification that account
for both word-level and phrase-level contributions. Similarly, Wallace et al. [20] introduced
AllenNLP Interpret, which integrates various feature attribution methods for NLP models.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Natural Language Generation (NLG) Explanations</title>
        <p>Natural Language Generation approaches produce textual explanations that describe the
reasoning process or decision factors of a model. These explanations are often more accessible
to non-technical users compared to numerical scores or visualizations. The key advantage
of NLG explanations is their ability to communicate complex decision processes in a familiar
format—natural language.</p>
        <p>Self-explanatory models incorporate explanation generation as an intrinsic component of
their architecture. Models like ExplanationLP [21] and CoS-E [22] are trained to generate both
predictions and explanations simultaneously. These approaches often use multitask learning
frameworks where explanation generation is an auxiliary task alongside the primary NLP task.</p>
        <p>Post-hoc explanation generators, on the other hand, produce explanations after the model has
made its prediction. Such systems may be trained on human-authored explanations to mimic
human reasoning patterns [23], or they may utilize large language models to generate plausible
rationales for predictions [24].</p>
        <p>Rationalization techniques aim to extract segments of the input text that justify the model’s
prediction [25, 26]. These methods typically employ selective or extractive approaches to
identify crucial portions of the input that influence the output decision.</p>
        <p>Recent advancements in this area include the development of faithfulness metrics to evaluate
how accurately natural language explanations reflect the model’s true decision process [27].
Additionally, researchers have explored generating contrastive explanations that highlight why
one prediction was made over another potential outcome [28].</p>
      </sec>
      <sec id="sec-2-3">
        <title>2.3. Probing Techniques</title>
        <p>Probing techniques, also known as diagnostic classifiers or linguistic probing, investigate what
linguistic properties or structures are captured by different components of a model. These
methods help researchers understand the internal representations learned by NLP models and
analyze what information is encoded at different processing stages.</p>
        <p>Structural probing methods examine how well models capture syntactic and hierarchical
linguistic structures. For instance, Hewitt and Manning [29] demonstrated that BERT’s representations encode parse
tree distances, suggesting the model implicitly learns syntactic information during pretraining.</p>
        <p>Semantic probing assesses a model’s understanding of meaning-related properties. This
includes probing for semantic roles, lexical relations, entity types, and compositional semantics.
Tenney et al. [30] showed how different layers in BERT capture different levels of linguistic information,
from surface features in early layers to semantic information in later layers.</p>
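        <p>As a rough illustration of a diagnostic classifier, the sketch below trains a logistic-regression probe on synthetic vectors that stand in for hidden states, and compares it with a baseline trained on shuffled labels. Everything here is an assumption made for the example: the representations are generated artificially (dimension 0 carries the property, the remaining dimensions are noise), and a real probe would instead be fitted on frozen activations extracted from a language model.</p>
        <preformat><![CDATA[
```python
import math
import random
from typing import List, Sequence, Tuple

random.seed(0)
DIM = 4  # toy hidden-state size; real encoders use hundreds of dimensions


def make_representation(label: int) -> List[float]:
    """Synthetic hidden state: dimension 0 encodes the property, the rest is noise."""
    signal = 1.0 if label == 1 else -1.0
    return [signal + random.gauss(0, 0.3)] + [random.gauss(0, 1) for _ in range(DIM - 1)]


def logit(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_probe(xs, ys, epochs: int = 50, lr: float = 0.1) -> Tuple[List[float], float]:
    """Fit a logistic-regression probe with plain stochastic gradient descent."""
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            z = max(-30.0, min(30.0, logit(w, b, x)))  # clamp to keep exp() stable
            grad = 1.0 / (1.0 + math.exp(-z)) - y      # derivative of the loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def accuracy(w, b, xs, ys) -> float:
    hits = sum((logit(w, b, x) > 0) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(ys)


labels = [i % 2 for i in range(300)]
reps = [make_representation(y) for y in labels]
w, b = train_probe(reps[:200], labels[:200])
probe_acc = accuracy(w, b, reps[200:], labels[200:])

# Baseline: same representations, randomly shuffled labels.
shuffled = labels[:]
random.shuffle(shuffled)
cw, cb = train_probe(reps[:200], shuffled[:200])
control_acc = accuracy(cw, cb, reps[200:], shuffled[200:])

print(f"probe accuracy:   {probe_acc:.2f}")
print(f"control accuracy: {control_acc:.2f}")
```
]]></preformat>
        <p>A large gap between the two accuracies suggests that the property is linearly recoverable from the representation rather than memorized by the probe itself. Keeping the probe deliberately simple is a design choice that limits how much of the result can be attributed to the probe instead of the model.</p>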
        <p>Behavioral probing examines how models respond to specific challenges or manipulations of
the input [31]. Methods like CheckList [31] provide a framework for testing specific linguistic
capabilities through carefully crafted test cases.</p>
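        <p>A behavioral test of this kind can be sketched without any specific library. The example below is a hand-rolled invariance check in the spirit of such test suites, not the CheckList API: it fills a template with different person names and reports the names for which the prediction changes. The classifier <monospace>toy_sentiment</monospace> is a hypothetical keyword model with a deliberately injected flaw, so that the check has something to detect.</p>
        <preformat><![CDATA[
```python
from typing import Callable, List

POSITIVE = {"great", "good", "excellent"}
NEGATIVE = {"bad", "terrible", "awful"}


def toy_sentiment(text: str) -> str:
    """Hypothetical keyword model; a real test would wrap a trained classifier."""
    words = text.lower().replace(".", "").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    # Deliberate flaw: one name nudges the score, mimicking a spurious entity cue.
    if "maria" in words:
        score += 1
    return "positive" if score > 0 else "negative"


def invariance_failures(
    model: Callable[[str], str], template: str, fillers: List[str]
) -> List[str]:
    """Return the fillers whose prediction differs from the first filler's prediction."""
    reference = model(template.format(name=fillers[0]))
    return [f for f in fillers[1:] if model(template.format(name=f)) != reference]


template = "{name} said the service was bad and then good."
failures = invariance_failures(toy_sentiment, template, ["John", "Maria", "Ahmed"])
print("names that change the prediction:", failures)
```
]]></preformat>
        <p>Since only the name varies between inputs, any reported failure points to a dependence on the named entity rather than on the sentiment-bearing words.</p>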
        <p>Advanced probing techniques include controlled interventions [32], where specific neurons
or attention heads are manipulated to observe their impact on model behavior, and
cross-architectural comparisons [33], which analyze how different model architectures represent
similar linguistic phenomena.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Main Hypothesis and Objectives</title>
      <sec id="sec-3-1">
        <title>3.1. Main Hypothesis</title>
        <p>The hypothesis behind this line of research is that if we develop explainable, interpretable, and
less-biased models, we can create a more Trustworthy AI that is more usable, human-friendly,
and responsible.</p>
        <p>This doctoral thesis aims to bridge the gap between black-box, biased, and opaque models
and a more secure, transparent, unbiased, robust, and generally more trustworthy Artificial
Intelligence in the Natural Language Processing domain.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Objectives</title>
        <list list-type="order">
          <list-item>
            <p>Analyze the state of the art of Explainability and Trustworthiness in AI, and specifically
in NLP.</p>
          </list-item>
          <list-item>
            <p>Analyze current and upcoming AI regulations in order to adapt the line of research
and its applications to them.</p>
          </list-item>
          <list-item>
            <p>Develop and evaluate Feature Importance Methods for NLP models that provide
transparent insights into which input features influence model predictions, with particular focus
on addressing biases and improving model robustness.</p>
          </list-item>
          <list-item>
            <p>Design and implement Natural Language Generation approaches that produce accessible
and faithful explanations of model behavior, enabling users to understand model decisions
in a human-readable format.</p>
          </list-item>
          <list-item>
            <p>Apply and extend Probing Techniques to systematically investigate what linguistic
properties are captured by NLP models and how these representations relate to model
performance and biases.</p>
          </list-item>
          <list-item>
            <p>Design an evaluation framework that takes into account the different perspectives
of trustworthiness, comparing and integrating insights from all three explainability
approaches to provide a comprehensive understanding of model behavior.</p>
          </list-item>
          <list-item>
            <p>Create a methodology for applying appropriate explainability techniques based on specific
domain requirements and user needs, particularly for sensitive applications such as fake
news detection, medical text analysis, and legal document processing.</p>
          </list-item>
        </list>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Research Methodology and Proposed Experiments</title>
      <p>To achieve the objectives and validate the hypothesis, the research will proceed in four stages:</p>
      <list list-type="order">
        <list-item>
          <p>Analysis of relevant literature sources: To achieve the objectives of the thesis, an
exhaustive analysis of relevant sources has to be performed. This includes the review of
scientific literature related to language models, explainability techniques (Feature
Importance Methods, NLG Explanations, and Probing), trustworthiness, and the methodologies
that may lead to a more Trustworthy AI.</p>
        </list-item>
        <list-item>
          <p>Experimental design: Development of techniques and methodologies across the three
explainability approaches to bring language models closer to a more reliable AI. The
experimental design includes:</p>
          <list list-type="bullet">
            <list-item>
              <p>Application of Feature Importance Methods to identify influential features in model
decisions and detect biases</p>
            </list-item>
            <list-item>
              <p>Development of Natural Language Generation techniques for explaining model
decisions in human-readable format</p>
            </list-item>
            <list-item>
              <p>Implementation of Probing methods to understand what linguistic information is
encoded in model representations</p>
            </list-item>
          </list>
        </list-item>
        <list-item>
          <p>Trustworthy Data Creation and Curation: Development of datasets specifically
designed to drive explainable behavior of language models and to evaluate the effectiveness
of different explainability approaches. Additionally, data preprocessing techniques will
be developed to ensure privacy and unbiasedness throughout the data lifecycle.</p>
        </list-item>
        <list-item>
          <p>Evaluation of results: Application and development of different evaluation metrics that
measure how reliable an AI model is across different aspects of trustworthiness (absence
of biases, robustness, interpretability, etc.). The evaluation will focus on comparing the
insights gained from each explainability approach and assessing their complementarity.</p>
        </list-item>
      </list>
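      <p>One simple way to quantify the “absence of biases” aspect mentioned in the evaluation stage is a counterfactual substitution test: the same templates are filled with terms from different groups, and the mean model score per group is compared. The sketch below is only an illustration under stated assumptions. The function <monospace>toy_score</monospace> is a hypothetical stand-in for a model’s probability of a negative label, with a bias injected on purpose, and the templates and group terms are invented for the example.</p>
      <preformat><![CDATA[
```python
from statistics import mean
from typing import Callable, Dict, List


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a model's probability of a 'negative' label."""
    score = 0.25
    if "she" in text.lower().split():
        score += 0.25  # bias injected deliberately for the demonstration
    return score


def group_means(
    score_fn: Callable[[str], float],
    templates: List[str],
    groups: Dict[str, List[str]],
) -> Dict[str, float]:
    """Mean score per group when only the group term in each template changes."""
    return {
        group: mean(score_fn(t.format(term=term)) for t in templates for term in terms)
        for group, terms in groups.items()
    }


templates = ["{term} is a doctor", "{term} works late"]
groups = {"female": ["She"], "male": ["He"]}

per_group = group_means(toy_score, templates, groups)
gap = max(per_group.values()) - min(per_group.values())

print(per_group)
print(f"largest gap between groups: {gap:.2f}")
```
]]></preformat>
      <p>A gap close to zero means the score does not depend on the substituted term for these templates. It does not establish that the model is unbiased in general, since the result depends on which templates and terms are chosen.</p>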
    </sec>
    <sec id="sec-5">
      <title>5. Research Elements for Discussion</title>
      <p>In a field as broad and incipient as trustworthy AI, a wide range of issues is open to
discussion. Below, I present the three discussion elements that I am currently considering at
this stage of the doctoral thesis.</p>
      <list list-type="order">
        <list-item>
          <p>Integration of Multiple Explainability Approaches: While each explainability
approach (Feature Importance, NLG Explanations, and Probing) offers valuable insights, how
can we effectively integrate these diverse perspectives into a cohesive understanding of
model behavior? Do these approaches sometimes provide contradictory explanations, and
if so, how should such contradictions be resolved? Furthermore, how can we determine
which explainability method is most appropriate for specific stakeholders, domains, or
use cases?</p>
        </list-item>
        <list-item>
          <p>Evaluation Techniques for Measuring the Quality of an Explanation: A model’s
quality should be evaluated not only by its accuracy and performance but also by how well
it provides explanations for its predictions [12]. Should we use Informal Examination,
Comparison to Ground Truth, or Human Evaluation? What are the advantages and
disadvantages of using metrics such as BLEU [34], ROUGE [35], or Perplexity? Can we
rely on what attention mechanisms mark as relevant [36]?</p>
        </list-item>
        <list-item>
          <p>Effective evaluation of the degree of bias of a language model: The degree of
trustworthiness of a language model depends on several factors such as its robustness,
interpretability, or absence of bias, among others. How can we effectively measure the
degree of bias of a language model? How can we know if there is a real bias in the model
output? How can we identify from which part of the model development cycle the bias
comes?</p>
        </list-item>
      </list>
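      <p>To make the question about overlap metrics more concrete, the following sketch computes a clipped unigram precision, recall, and F1 between a generated explanation and a reference explanation. It is a simplified measure in the spirit of ROUGE-1, not an implementation of BLEU or ROUGE (there are no higher-order n-grams, no brevity penalty, and no stemming), and both example sentences are invented for illustration.</p>
      <preformat><![CDATA[
```python
from collections import Counter
from typing import Tuple


def unigram_overlap(candidate: str, reference: str) -> Tuple[float, float, float]:
    """Clipped unigram precision, recall, and F1 between two whitespace-tokenized texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # multiset intersection clips repeated words
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return precision, recall, 2 * precision * recall / (precision + recall)


reference = "negative because the food was cold"
candidate = "negative because the food was not cold"  # the meaning is inverted

precision, recall, f1 = unigram_overlap(candidate, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```
]]></preformat>
      <p>The candidate contradicts the reference yet shares almost all of its words, so surface overlap alone says little about whether an explanation is faithful. This is one reason to combine such metrics with comparison to ground truth or human evaluation.</p>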
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>This paper has outlined the initial phase of my doctoral research, which focuses on creating
models that are more explainable, interpretable, and fair, with the goal of narrowing the gap
between opaque black-box systems and the principles of Trustworthy Artificial Intelligence
within the field of Natural Language Processing.</p>
      <p>For this purpose, the state of the art has been analyzed, the objectives to be achieved have
been presented, the methodology to achieve them has been described, and finally, different
elements for discussion have been introduced.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author used DeepL and,
occasionally, ChatGPT in order to check grammar, spelling, and translation. After using these
tools, the author reviewed and edited the content as needed and takes full responsibility for the
publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Bommasani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Klyman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kapoor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Longpre</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Maslej</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <source>The foundation model transparency index v1.1: May 2024</source>
          , arXiv preprint arXiv:2407.12929 (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] K. W. Church, V. Kordoni, Emerging trends: SOTA-chasing, Natural Language Engineering 28 (2022) 249–269. doi:10.1017/S1351324922000043.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] A. Barredo Arrieta, N. Díaz-Rodríguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. Garcia, S. Gil-Lopez, D. Molina, R. Benjamins, R. Chatila, F. Herrera, Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI, Information Fusion 58 (2020) 82–115. URL: https://www.sciencedirect.com/science/article/pii/S1566253519308103. doi:10.1016/j.inffus.2019.12.012.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] T. Bolukbasi, K. Chang, J. Y. Zou, V. Saligrama, A. Kalai, Man is to computer programmer as woman is to homemaker? Debiasing word embeddings, CoRR abs/1607.06520 (2016). URL: http://arxiv.org/abs/1607.06520. arXiv:1607.06520.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] N. Garg, L. Schiebinger, D. Jurafsky, J. Zou, Word embeddings quantify 100 years of gender and ethnic stereotypes, Proceedings of the National Academy of Sciences 115 (2018) E3635–E3644.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] I. Garrido-Muñoz, F. Martínez-Santiago, A. Montejo-Ráez, MarIA and BETO are sexist: evaluating gender bias in large language models for Spanish, Language Resources and Evaluation (2023) 1–31.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] C. Yáñez-Márquez, Toward the bleaching of the black boxes: minimalist machine learning, IT Professional 22 (2020) 51–56.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] S. González-Silot, Procesamiento de Lenguaje Natural Explicable para Análisis de Desinformación, Master’s thesis, Universidad de Granada, 2023.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] European Commission, Ethics guidelines for trustworthy AI, Technical Report, 2019. URL: https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] S. Prabhumoye, B. Boldt, R. Salakhutdinov, A. W. Black, Case study: Deontological ethics in NLP, in: K. Toutanova, A. Rumshisky, L. Zettlemoyer, D. Hakkani-Tur, I. Beltagy, S. Bethard, R. Cotterell, T. Chakraborty, Y. Zhou (Eds.), Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics, Online, 2021, pp. 3784–3798. URL: https://aclanthology.org/2021.naacl-main.297/. doi:10.18653/v1/2021.naacl-main.297.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] H. Zhang, B. Y. Wu, X. Yuan, S. Pan, H. Tong, J. Pei, Trustworthy graph neural networks: Aspects, methods and trends (2022). doi:10.48550/arxiv.2205.07424.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] M. Danilevsky, K. Qian, R. Aharonov, Y. Katsis, B. Kawas, P. Sen, A Survey of the State of Explainable AI for Natural Language Processing, in: K.-F. Wong, K. Knight, H. Wu (Eds.), Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, Association for Computational Linguistics, Suzhou, China, 2020, pp. 447–459. URL: https://aclanthology.org/2020.aacl-main.46.</mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>[13] M. Sundararajan, A. Taly, Q. Yan, Axiomatic attribution for deep networks, CoRR abs/1703.01365 (2017). URL: http://arxiv.org/abs/1703.01365. arXiv:1703.01365.</mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>[14] D. Smilkov, N. Thorat, B. Kim, F. Viégas, M. Wattenberg, SmoothGrad: removing noise by adding noise, arXiv preprint arXiv:1706.03825 (2017).</mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>[15] M. T. Ribeiro, S. Singh, C. Guestrin, “Why should I trust you?”: Explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.</mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>[16] S. M. Lundberg, S.-I. Lee, A unified approach to interpreting model predictions, Advances in Neural Information Processing Systems 30 (2017).</mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>[17] K. Clark, U. Khandelwal, O. Levy, C. D. Manning, What does BERT look at? An analysis of BERT’s attention, arXiv preprint arXiv:1906.04341 (2019).</mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>[18] S. Serrano, N. A. Smith, Is attention interpretable?, arXiv preprint arXiv:1906.03731 (2019).</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>[19] X. Jin, Z. Wei, J. Du, X. Xue, X. Ren, Towards hierarchical importance attribution: Explaining compositional semantics for neural sequence models, arXiv preprint arXiv:1911.06194 (2019).</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>[20] E. Wallace, J. Tuyls, J. Wang, S. Subramanian, M. Gardner, S. Singh, AllenNLP Interpret: A framework for explaining predictions of NLP models, arXiv preprint arXiv:1909.09251 (2019).</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>[21] B. Hancock, M. Bringmann, P. Varma, P. Liang, S. Wang, C. Ré, Training classifiers with natural language explanations, in: Proceedings of the conference. Association for Computational Linguistics. Meeting, volume 2018, 2018, p. 1884.</mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>[22] N. F. Rajani, B. McCann, C. Xiong, R. Socher, Explain yourself! Leveraging language models for commonsense reasoning, arXiv preprint arXiv:1906.02361 (2019).</mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>[23] O.-M. Camburu, T. Rocktäschel, T. Lukasiewicz, P. Blunsom, e-SNLI: Natural language inference with natural language explanations, Advances in Neural Information Processing Systems 31 (2018).</mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>[24] S. Wiegreffe, A. Marasović, N. A. Smith, Measuring association between labels and free-text rationales, arXiv preprint arXiv:2010.12762 (2020).</mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>[25] T. Lei, R. Barzilay, T. Jaakkola, Rationalizing neural predictions, arXiv preprint arXiv:1606.04155 (2016).</mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>[26] S. Gurrapu, A. Kulkarni, L. Huang, I. Lourentzou, L. Freeman, F. Batarseh, Rationalization for explainable NLP: A survey, arXiv preprint arXiv:2301.08912 (2023).</mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>[27] A. Jacovi, Y. Goldberg, Towards faithfully interpretable NLP systems: How should we define and evaluate faithfulness?, arXiv preprint arXiv:2004.03685 (2020).</mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>[28] T. Wu, M. T. Ribeiro, J. Heer, D. S. Weld, Polyjuice: Generating counterfactuals for explaining, evaluating, and improving models, arXiv preprint arXiv:2101.00288 (2021).</mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>[29] J. Hewitt, C. D. Manning, A structural probe for finding syntax in word representations, in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4129–4138.</mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>[30] I. Tenney, D. Das, E. Pavlick, BERT rediscovers the classical NLP pipeline, arXiv preprint arXiv:1905.05950 (2019).</mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>[31] M. T. Ribeiro, T. Wu, C. Guestrin, S. Singh, Beyond accuracy: Behavioral testing of NLP models with CheckList, arXiv preprint arXiv:2005.04118 (2020).</mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>[32] R. Rudinger, A. Teichert, R. Culkin, S. Zhang, B. Van Durme, Neural-Davidsonian semantic proto-role labeling, arXiv preprint arXiv:1804.07976 (2018).</mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>[33] Z. Wu, Y. Chen, B. Kao, Q. Liu, Perturbed masking: Parameter-free probing for analyzing and interpreting BERT, arXiv preprint arXiv:2004.14786 (2020).</mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>[34] K. Papineni, S. Roukos, T. Ward, W.-J. Zhu, BLEU: a method for automatic evaluation of machine translation, in: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002, pp. 311–318.</mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>[35] C.-Y. Lin, ROUGE: A package for automatic evaluation of summaries, in: Text Summarization Branches Out, 2004, pp. 74–81.</mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>[36] S. Serrano, N. A. Smith, Is attention interpretable?, in: A. Korhonen, D. Traum, L. Màrquez (Eds.), Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Association for Computational Linguistics, Florence, Italy, 2019, pp. 2931–2951. URL: https://aclanthology.org/P19-1282. doi:10.18653/v1/P19-1282.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>