<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Topological Data Analysis for Trustworthy AI</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Victor Toscano Durán</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Applied Mathematics I, University of Sevilla</institution>
          ,
          <addr-line>Sevilla</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <abstract>
<p>Artificial Intelligence (AI) is transforming industries by analyzing large amounts of data to find patterns and make decisions more efficiently than ever before. Neural networks, which are inspired by the human brain, are a key part of AI but often work in ways that are hard to understand, leading to concerns about their reliability. This doctoral research proposal, titled “Topological Data Analysis for Trustworthy AI”, aims to tackle these issues using Topological Data Analysis (TDA) and Computational Topology. The research will develop a new method to compare piecewise neural networks with ReLU activation functions using topological entropy, which could help make these networks more transparent. It will also apply TDA techniques to improve the analysis of time series data in neural networks, aiming to enhance prediction accuracy and the understanding of how these networks behave over time. Additionally, the study will look at applying TDA to recurrent neural networks such as LSTMs and potentially to Transformer models. This research aims to make AI systems more reliable and understandable, with benefits for areas like healthcare and autonomous systems. The proposal also includes plans for attending conferences and publishing research findings.</p>
      </abstract>
      <kwd-group>
        <kwd>Artificial Intelligence</kwd>
        <kwd>Neural Networks</kwd>
        <kwd>Topological Data Analysis</kwd>
        <kwd>Time series</kwd>
        <kwd>Reliability</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Context and Motivation</title>
      <p>
        Artificial Intelligence (AI) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] stands at the forefront of technological advancement, reshaping
industries, economies, and our interaction with the world. From virtual assistants like Siri and
Alexa to self-driving cars and advanced medical diagnostics, AI is pervasive in modern life. Its
growth trajectory has been nothing short of remarkable, with exponential leaps in capability
and application.
      </p>
      <p>The importance of AI lies in its ability to process vast amounts of data, identify patterns,
and make decisions with a level of efficiency and accuracy unmatched by human counterparts.
This capacity has propelled AI into domains once deemed exclusive to human intelligence,
from natural language processing and image recognition to strategic decision-making and
problem-solving.</p>
      <p>
        Central to the functionality of AI systems are neural networks [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], inspired by the biological
neural networks of the human brain. These interconnected layers of nodes, or artificial neurons,
simulate the processes of learning and adaptation through iterative training on labeled datasets.
As more data is fed into the network, it adjusts its internal parameters to optimize performance,
enabling it to recognize complex patterns and make predictions with increasing accuracy.
      </p>
      <p>
        However, alongside its tremendous potential, AI also presents significant challenges, chief
among them being the issue of reliability, particularly concerning so-called “black box”
algorithms [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. While neural networks excel at solving complex problems, their inner workings often
remain opaque to human understanding. This lack of transparency raises concerns regarding
the reliability and trustworthiness of AI systems, especially in high-stakes applications such as
autonomous vehicles, medical diagnosis, and financial forecasting.
      </p>
      <p>The problem of reliability in AI underscores the need for transparency and accountability
in algorithmic decision-making. Efforts to address this issue include research into explainable
AI, which aims to develop models that not only produce accurate results but also provide
insights into the reasoning behind those decisions. By making AI systems more interpretable
and understandable to human users, researchers hope to build trust and mitigate the risks
associated with opaque algorithms.</p>
      <p>Despite these challenges, the potential benefits of AI are undeniable, with far-reaching
implications for virtually every sector of society. From improving healthcare outcomes and
optimizing resource allocation to enhancing cybersecurity and mitigating climate change, AI
offers solutions to some of humanity’s most pressing problems. As we continue to harness
the power of artificial intelligence, it is essential to remain vigilant, balancing innovation with
ethical considerations and ensuring that AI serves the collective good.</p>
      <p>
        In the ever-evolving landscape of artificial intelligence (AI), where innovation is the norm
and breakthroughs are constant, emerging fields like Topological Data Analysis (TDA) [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]
and Computational Topology [6] are gaining recognition for their potential to augment the
efficiency and capabilities of neural networks and AI systems as a whole.
      </p>
      <p>At its core, TDA is a branch of mathematics that leverages tools from algebraic topology to
analyze the shape, structure, and connectivity of complex data sets. By applying topological
principles to high-dimensional data, TDA seeks to extract meaningful insights that may be
obscured by traditional statistical or geometric methods. This approach allows researchers
and practitioners to uncover hidden patterns, identify critical features, and gain a deeper
understanding of the underlying structure inherent in the data.</p>
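<p>As a minimal illustration of these ideas, the 0-dimensional persistent homology of a point cloud (the scales at which connected components merge) can be computed with a union-find structure over edges sorted by length. This is a toy sketch for exposition, not a production TDA library:</p>

```python
# Toy sketch: 0-dimensional persistent homology (connected components)
# of a point cloud, via a union-find over edges sorted by length. All
# components are born at scale 0; a component dies at the distance where
# it first merges into an older one. Illustrative only.
import itertools
import math

def h0_persistence(points):
    """Return the finite death times of the H0 barcode of `points`.

    For n points there are n - 1 finite bars (the essential component
    that never dies is omitted)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges of the complete graph, shortest first (single-linkage order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:               # two components merge: one bar dies at d
            parent[ri] = rj
            deaths.append(d)
    return deaths

# Two 1-D clusters: short bars inside each cluster, one long bar whose
# death equals the gap between the clusters.
print(h0_persistence([(0.0,), (1.0,), (10.0,), (11.0,)]))  # [1.0, 1.0, 9.0]
```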
      <p>
        Computational Topology, on the other hand, focuses on the development and implementation
of algorithms and computational techniques for solving topological problems. It bridges the gap
between theoretical concepts in topology and practical applications in fields such as computer
science, engineering, and data analysis. Through the use of advanced computational tools,
Computational Topology enables researchers to tackle complex problems in data analysis,
visualization, and machine learning [
        <xref ref-type="bibr" rid="ref2">2, 7</xref>
        ].
      </p>
      <p>One of the most promising aspects of TDA and Computational Topology is their potential to
enhance the efficiency and effectiveness of neural networks and AI algorithms. By incorporating
topological insights into the design and training of neural networks, researchers can develop
more robust and adaptive models capable of handling diverse and complex data sets. TDA
techniques such as persistent homology have been successfully applied to tasks such as image
recognition, natural language processing, and time-series analysis, demonstrating their efficacy
in extracting meaningful features and improving classification accuracy.</p>
      <p>In recent years, there has been a growing interest in interdisciplinary research at the
intersection of TDA, Computational Topology, and artificial intelligence. Collaborative efforts between
mathematicians, computer scientists, and domain experts have yielded novel approaches and
techniques for solving complex problems in data analysis and machine learning. This
convergence of disciplines holds great promise for advancing the capabilities of AI systems and
unlocking new opportunities for innovation across a wide range of applications.</p>
      <p>As we continue to explore the synergies between TDA, Computational Topology, and artificial
intelligence, it is clear that these fields will play an increasingly important role in shaping the
future of data-driven decision-making, enabling more efficient, reliable, and interpretable AI
systems.</p>
      <p>In summary, in this doctoral proposal named “Topological Data Analysis for Trustworthy AI”,
I will focus on the application of Topological Data Analysis and Computational Topology as a
fundamental tool for improving the reliability of artificial intelligence in challenging contexts.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>In recent years, there has been a surge of interest in integrating artificial intelligence (AI) with
topological data analysis (TDA) to enhance the efficiency, robustness, and interpretability of
AI systems. This section explores some of the key contributions and advancements in this
interdisciplinary research area.</p>
      <sec id="sec-2-1">
        <title>Enhanced Data Analysis with Topological Summaries</title>
        <p>An emerging focus is the application of topological summaries to improve data analysis itself.
These summaries are mathematical tools from topological data analysis (TDA) that capture and
characterize the underlying structure of complex data sets by focusing on their intrinsic structure
rather than relying on traditional geometric methods. Rooted in concepts like homology and
persistent homology, they capture the fundamental shapes and features of data, providing stable
representations that resist noise and variations. By integrating these topological descriptors,
researchers can gain deeper insights into the data, leading to more informed decisions and
enhanced analysis outcomes.</p>
      </sec>
      <sec id="sec-2-2">
        <title>TDA for Feature Extraction in AI</title>
        <p>Topological Data Analysis (TDA) has proven to be a powerful method for extracting meaningful
features from high-dimensional data, which traditional techniques often overlook. Persistent
homology, a key TDA tool, captures stable topological features like connected components
and loops across multiple scales, making it effective for AI tasks such as image recognition. By
identifying essential features that enhance classification accuracy, TDA has been successfully
applied in areas like object detection and texture classification, where the data’s inherent shape
is critical for distinguishing categories.</p>
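<p>A concrete example of such a feature is persistent entropy, the Shannon entropy of the normalized bar lengths of a persistence diagram. The following minimal sketch shows how a diagram can be reduced to a single scalar suitable as classifier input:</p>

```python
# Sketch: persistent entropy, a scalar summary of a persistence diagram.
# Bar lengths are normalized into a probability distribution and their
# Shannon entropy is taken; the result can be fed directly to a classifier.
import math

def persistent_entropy(diagram):
    """diagram: list of (birth, death) pairs with finite death > birth."""
    lengths = [death - birth for birth, death in diagram]
    total = sum(lengths)
    probs = [length / total for length in lengths]
    return -sum(p * math.log(p) for p in probs)

# Four equal bars give the maximum value log(4); a diagram dominated by
# a single long bar gives a value near 0.
e_uniform = persistent_entropy([(0, 1), (0, 1), (0, 1), (0, 1)])  # log 4
e_skewed = persistent_entropy([(0, 100.0), (0, 1e-9)])            # ~ 0
```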
      </sec>
      <sec id="sec-2-3">
        <title>Topological Representations for Neural Networks</title>
        <p>Incorporating topological representations into neural network design and training offers
promising improvements in generalization, overfitting reduction, and interpretability. Topological
regularization, which imposes topological constraints during learning, helps neural networks
capture essential data structures, stabilizes training, and increases resilience against adversarial
attacks. Additionally, using topological insights to refine decision boundaries enhances the
robustness and reliability of AI models, contributing to the development of more effective and
interpretable neural networks.</p>
      </sec>
      <sec id="sec-2-4">
        <title>Interpretable AI with TDA</title>
        <p>The integration of Topological Data Analysis (TDA) into AI models has advanced the field
of interpretable AI by making complex systems more transparent and accountable.
TDA-based methods provide topological explanations for AI decisions, offering insights into how
data features influence predictions, particularly in critical areas like healthcare, finance, and
autonomous systems. This aligns with the goals of explainable AI (XAI), where the focus
extends beyond performance to include the interpretability and trustworthiness of AI outputs,
addressing the increasing demand for transparency in AI technologies.</p>
        <p>Through these advancements, TDA not only contributes to the development of more
efficient and robust AI systems but also addresses the growing demand for interpretability and
transparency in AI technologies.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Research Questions, Hypotheses, and Objectives</title>
      <p>In this project, I apply topology as a fundamental tool to enhance the reliability of neural
networks in challenging contexts. I will first build on previous research carried out by my
advisors, and then apply Topological Data Analysis (TDA) techniques to analyze time series
data in neural networks, aiming to improve prediction accuracy and to understand temporal
dynamics.</p>
      <sec id="sec-3-1">
        <title>Questions</title>
        <p>1. How can topology be leveraged to improve the reliability of neural networks in challenging contexts?</p>
        <p>2. What role does topological entropy play in measuring similarities between piecewise neural networks using activation functions like ReLU?</p>
        <p>3. How can TDA be extended to analyze time series data in neural networks, and what insights can be gained from this analysis?</p>
      </sec>
      <sec id="sec-3-1-hyp">
        <title>Hypotheses</title>
        <p>1. Piecewise neural networks employing ReLU activation functions can be evaluated for similarity using topological entropy, leading to greater transparency in their operation.</p>
        <p>2. The application of TDA to time series analysis in neural networks will yield valuable insights into the temporal dynamics of network behavior and improve predictive performance.</p>
      </sec>
      <sec id="sec-3-2">
        <title>Objectives</title>
        <p>Firstly, I will extend previous research carried out by my advisors in [8] on piecewise neural
networks [9], particularly those using ReLU activation functions, by developing a new approach
based on topological entropy to measure similarities between these networks. In addition, I will
evaluate the effectiveness of the proposed approach in improving the transparency and reliability
of piecewise neural networks. In summary, the research conducted by my advisors in [8] will
be the starting point of my thesis.</p>
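<p>To illustrate the kind of computation involved (a toy sketch in the spirit of [8], not their exact algorithm), the sublevel-set H0 persistence of a 1-D piecewise-linear function sampled on a grid, such as the scalar output of a small ReLU network, can be computed with the elder rule; the persistent entropy of two such diagrams could then serve as a crude similarity score between two networks:</p>

```python
# Toy sketch (in the spirit of [8], not their exact algorithm): sublevel-set
# H0 persistence of a 1-D function given by samples on a grid, paired by the
# elder rule. For two ReLU networks with scalar output sampled on the same
# grid, the persistent entropy of their diagrams could give a crude
# similarity score.

def sublevel_h0(values):
    """Finite (birth, death) pairs of the sublevel-set filtration of a
    1-D function sampled on a path graph; the global minimum's component
    never dies and is omitted."""
    n = len(values)
    comp = [None] * n          # union-find parent of each activated vertex
    birth = {}                 # component root -> birth value (its minimum)

    def find(i):
        while comp[i] != i:
            comp[i] = comp[comp[i]]   # path compression
            i = comp[i]
        return i

    bars = []
    for i in sorted(range(n), key=lambda k: values[k]):
        comp[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):              # path-graph neighbors
            if 0 <= j < n and comp[j] is not None:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # elder rule: the younger component (larger birth) dies
                    young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                    if birth[young] < values[i]:   # skip zero-length bars
                        bars.append((birth[young], values[i]))
                    comp[young] = old
    return bars

# The local minimum at height 1 is killed at the local maximum of height 2.
print(sublevel_h0([0.0, 2.0, 1.0, 3.0]))   # [(1.0, 2.0)]
```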
        <p>Secondly, my research will extend into the application of Topological Data Analysis (TDA)
techniques to the analysis of time series data within neural networks. This will involve leveraging
existing knowledge in time series analysis and integrating recent advancements in the field.
A significant aspect of this part of the research will be to explore how incorporating TDA
can improve prediction accuracy and provide deeper insights into temporal dynamics. To that
end, I will use topological descriptors (also known as topological summaries), mathematical
tools from topological data analysis that capture and characterize the underlying structure of
complex data sets. Unlike traditional data analysis methods that rely on specific geometry and
Euclidean metrics, topological descriptors focus on the intrinsic properties of the data space,
providing a representation that is robust and stable in the face of noise and variations in
the data. Additionally, I will evaluate the applicability of TDA techniques to recurrent neural
networks, such as Long Short-Term Memory (LSTM) networks [10, 11, 12, 13, 14, 15, 16, 17, 18,
19].</p>
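<p>A standard first step when applying these descriptors to a time series is a sliding-window (Takens-style) delay embedding, which turns the series into a point cloud whose loops reflect periodicity. A minimal sketch follows; the function name and parameters are chosen here for illustration, not taken from a specific library:</p>

```python
# Sketch: sliding-window (Takens-style) delay embedding, a common first
# step for applying persistent homology to a time series. The function
# name and parameters are illustrative, not from a specific library.
import numpy as np

def sliding_window(series, dim, tau):
    """Embed a 1-D series into R^dim with delay tau.

    Row k is [x[k], x[k + tau], ..., x[k + (dim - 1) * tau]]."""
    series = np.asarray(series, dtype=float)
    span = (dim - 1) * tau
    n_windows = len(series) - span
    return np.stack([series[k : k + span + 1 : tau] for k in range(n_windows)])

# A periodic signal embeds to a loop in the point cloud, which H1
# persistence can then detect.
t = np.linspace(0, 4 * np.pi, 200)
cloud = sliding_window(np.sin(t), dim=2, tau=10)
print(cloud.shape)   # (190, 2)
```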
        <p>Finally, contingent on achieving positive results and having sufficient time, there may be an
opportunity to expand the research to include “Transformer” models [20, 21].</p>
        <p>Moreover, as part of my professional development, I intend to attend numerous conferences
to stay abreast of the latest advancements and network with peers in the field. I also plan to
write articles related to my thesis and present them at conferences to contribute to the academic
community.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Research Stays</title>
        <p>I have arranged two research stays as part of my thesis. The first will take place in October 2024 at
the Institute of Electronic, Informatics and Telecommunications Engineering, National Research
Council in Genoa, under the guidance of Prof. Maurizio Mongelli, who specializes in machine
learning applied to bioinformatics and cyber-physical systems. During this stay, I will focus
on advancing my understanding of machine learning techniques, particularly in the context of
explainable AI. The second stay is planned for Summer 2025 at Bastian Grossenbacher Rieck’s
laboratory in Helmholtz Munich, a leading center for machine learning research, especially
in computational healthcare. There, I will work with the AIDOS Lab under the guidance of
Prof. Bastian Rieck, whose focus is geometry and topology in machine learning with a keen
interest in biomedical applications, to deepen my knowledge of topological machine learning
techniques and their application in healthcare.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Research Approach and Methods</title>
      <sec id="sec-4-1">
        <title>Approach</title>
        <p>The research approach for this study combines theoretical exploration, algorithm development,
and empirical validation to investigate the application of topological data analysis (TDA) in
enhancing the reliability and transparency of neural networks, particularly in challenging
contexts.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Methods</title>
        <p>The methodological approach of this research involves three main phases:
• First, a theoretical exploration will be conducted through a comprehensive review of
existing literature on TDA, neural networks, and topological concepts like persistent
homology and topological entropy, aiming to develop new methodologies for assessing
neural network reliability and enhancing interpretability.
• The second phase focuses on the development of novel algorithms to apply TDA to neural
network architectures, designing techniques for measuring similarities using topological
summaries like persistent entropy, with particular emphasis on ReLU networks. Moreover,
this phase will focus on extending TDA techniques to analyze time series data and on integrating
them into neural networks for time series, such as recurrent neural networks.
• Finally, empirical validation will be carried out by implementing these algorithms on
real-world datasets to evaluate their effectiveness in improving the reliability, interpretability,
and performance of neural networks, and comparing the outcomes with traditional
approaches.</p>
        <p>The underlying hypothesis is that integrating TDA techniques, which offer a unique
perspective on data structure and relationships, can overcome the limitations of traditional neural
networks, especially in complex and high-dimensional data domains, thereby enhancing their
performance and transparency and improving the analysis of time series.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Preliminary Results and Contributions</title>
      <p>The research is currently in its early stages, focusing on a comprehensive literature review to
identify relevant methodologies and theoretical frameworks for integrating Topological Data
Analysis (TDA) with neural networks. While concrete results are not yet available, initial findings
suggest promising avenues for enhancing AI interpretability and reliability through topological
methods, establishing a strong foundation for future empirical research and experimentation.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Expected Next Research Steps</title>
      <p>We plan to refine our approach for measuring similarities between piecewise neural networks
using topological entropy. This involves enhancing mathematical models that quantify the
relationships between different neural network components based on their topological properties,
aiming to develop a robust metric that better captures the complexity and behavior of these
networks. Additionally, we will explore advanced techniques for applying topological data
analysis (TDA) to time series within neural networks, with the goal of improving the predictive
power and interpretability of AI models.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Expected final contribution to knowledge</title>
      <p>The expected outcome of this research is a significant contribution to the field of AI, particularly
in the areas of reliability, transparency, and interpretability. By integrating Topological Data
Analysis with neural networks, the research aims to produce AI systems that are not only more
accurate but also more understandable to human users. This integration has the potential to
revolutionize how neural networks are designed and applied, particularly in high-stakes areas
where trust and transparency are paramount. Ultimately, the research aspires to bridge the gap
between complex AI models and human interpretability, contributing to the development of AI
systems that are both powerful and ethically sound.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>Thanks to my thesis tutor, Rocío González Díaz, for her invaluable help, advice, and for this
incredible opportunity, and to my thesis supervisors, Miguel Ángel Gutiérrez Naranjo and Matteo
Rucco, for their guidance and support. This work was supported in part by the European Union
HORIZON-CL4-2021-HUMAN-01-01 under grant agreement 101070028 (REXASI-PRO) and by
TED2021-129438B-I00 / AEI/10.13039/501100011033 / Unión Europea NextGenerationEU/PRTR.</p>
      <p>[6] H. Edelsbrunner, J. Harer, Computational Topology: An Introduction, Springer, 2010. doi:10.1007/978-3-540-33259-6_7.</p>
      <p>[7] G. F. Marcus, Deep learning: A critical appraisal, ArXiv abs/1801.00631 (2018). doi:10.48550/arXiv.1801.00631.</p>
      <p>[8] M. Rucco, R. Gonzalez-Diaz, M.-J. Jimenez, N. Atienza, C. Cristalli, E. Concettoni, A. Ferrante, E. Merelli, A new topological entropy-based approach for measuring similarities among piecewise linear functions, Signal Processing 134 (2017) 130–138. doi:10.1016/j.sigpro.2016.12.006.</p>
      <p>[9] Q. Tao, L. Li, X. Huang, X. Xi, S. Wang, J. Suykens, Piecewise linear neural networks and deep learning, Nature Reviews Methods Primers 2 (2022) 42. doi:10.1038/s43586-022-00125-7.</p>
      <p>[10] Y. Zhou, Persistent homology on time series, 2016. doi:10.7939/R3K931F13.</p>
      <p>[11] N. Ravishanker, R. Chen, An introduction to persistent homology for time series, WIREs Computational Statistics 13 (2021). doi:10.1002/wics.1548.</p>
      <p>[12] S. C. di Montesano, H. Edelsbrunner, M. Henzinger, L. Ost, Dynamically maintaining the persistent homology of time series, ArXiv abs/2311.01115 (2023). doi:10.48550/arXiv.2311.01115.</p>
      <p>[13] C. M. Pereira, R. F. de Mello, Persistent homology for time series and spatial data clustering, Expert Systems with Applications 42 (2015) 6026–6038. doi:10.1016/j.eswa.2015.04.010.</p>
      <p>[14] T. Ichinomiya, Time series analysis using persistent homology of distance matrix, Nonlinear Theory and Its Applications, IEICE 14 (2023) 79–91. doi:10.1587/nolta.14.79.</p>
      <p>[15] Y.-M. Chung, W. Cruse, A. Lawson, A persistent homology approach to time series classification (2020). doi:10.48550/arXiv.2003.06462. arXiv:2003.06462.</p>
      <p>[16] L. Leaverton, Analysis of financial time series using TDA: theoretical and empirical results, 2020. URL: http://hdl.handle.net/2445/163638.</p>
      <p>[17] G. Ma, Using topological data analysis to process time-series data: A persistent homology way, Journal of Physics: Conference Series 1550 (2020) 032082. doi:10.1088/1742-6596/1550/3/032082.</p>
      <p>[18] Y. Umeda, J. Kaneko, H. Kikuchi, Topological data analysis and its application to time-series data analysis, Fujitsu Scientific and Technical Journal 55 (2019) 65–71.</p>
      <p>[19] M. L. Sánchez, Persistent homology: Functional summaries of persistence diagrams for time series analysis, 2021. URL: http://hdl.handle.net/2445/181324.</p>
      <p>[20] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need, in: Proceedings of the 31st International Conference on Neural Information Processing Systems, Curran Associates Inc., 2017, pp. 6000–6010. doi:10.48550/arXiv.1706.03762.</p>
      <p>[21] R. Reinauer, M. Caorsi, N. Berkouk, Persformer: A transformer architecture for topological machine learning (2022). doi:10.48550/arXiv.2112.15210. arXiv:2112.15210.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] <string-name><given-names>P.</given-names> <surname>Winston</surname></string-name>, Artificial Intelligence, A-W Series in Computer Science, Addison-Wesley Publishing Company, <year>1992</year>. URL: https://books.google.es/books?id=b4owngEACAAJ.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] <string-name><given-names>Y.</given-names> <surname>LeCun</surname></string-name>, Y. Bengio, G. Hinton, <article-title>Deep learning</article-title>, <source>Nature</source> <volume>521</volume> (<year>2015</year>) <fpage>436</fpage>-<lpage>444</lpage>. doi:10.1038/nature14539.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] <string-name><given-names>C.</given-names> <surname>Rudin</surname></string-name>, <article-title>Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</article-title>, <source>Nature Machine Intelligence</source> (<year>2019</year>). doi:10.1038/s42256-019-0048-x.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] <string-name><given-names>G.</given-names> <surname>Carlsson</surname></string-name>, <article-title>Topology and data</article-title>, <source>Bulletin of the American Mathematical Society</source> <volume>46</volume> (<year>2009</year>) <fpage>255</fpage>-<lpage>308</lpage>. doi:10.1090/S0273-0979-09-01249-X.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] <string-name><given-names>F.</given-names> <surname>Chazal</surname></string-name>, <string-name><given-names>B.</given-names> <surname>Michel</surname></string-name>, <article-title>An introduction to topological data analysis: Fundamental and practical aspects for data scientists</article-title>, <source>Frontiers in Artificial Intelligence</source> <volume>4</volume> (<year>2021</year>). doi:10.3389/frai.2021.667963.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>