What Is an AI Vulnerability, and Why Should We Care? Unpacking the Relationship Between AI Security and AI Ethics

Lauri Tuovinen1,∗, Kimmo Halunen1,2

1 Biomimetics and Intelligent Systems Group, P.O. Box 4500, FI-90014 University of Oulu, Finland
2 Department of Military Technology, National Defence University, P.O. Box 7, FI-00861 Helsinki, Finland

7th Conference on Technology Ethics (TETHICS2024), November 6–7, 2024, Tampere, Finland
∗ Corresponding author: lauri.tuovinen@oulu.fi (L. Tuovinen); kimmo.halunen@oulu.fi (K. Halunen)
ORCID: 0000-0002-7916-0255 (L. Tuovinen); 0000-0003-1169-5920 (K. Halunen)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract
Artificial intelligence (AI) systems are vulnerable to new types of attack such as adversarial examples and prompt injection, which can cause the system to behave in unintended ways and potentially lead to harm. Taking care of the security of AI systems is therefore viewed in AI ethics as an important part of ensuring that central values such as safety or personal privacy are not jeopardised by AI systems. However, this view of AI security oversimplifies its relationship with AI ethics, as there are also situations where security may need to be traded off against another ethics requirement or where the exploitation of an AI vulnerability can be argued to be ethically justified. To provide a more nuanced view, the paper reviews the conception of security as an ethics principle in a selection of AI ethics guides and examines some notable cases where a tension exists between security and some other AI ethics principle. Furthermore, the existence of vulnerabilities in AI systems does not directly translate into harm, so it is important to distinguish theoretical scenarios involving AI vulnerabilities from their actual real-world impact. To gauge the impact, a search targeting six different incident repositories was carried out; it was observed that such efforts are hindered by the concept of AI vulnerability being vaguely defined and by the lack of a good repository that would allow exploration of AI incidents specifically involving exploitation of vulnerabilities. The search yielded only a very small number of relevant incidents, which is taken to indicate that the scale of the problem is currently small. However, it is also recognised that there is likely to be some number of incidents that the search missed, either because they were not included in the databases searched or because the search method failed to find them.

Keywords
artificial intelligence, cybersecurity, adversarial attack, AI incident, AI ethics principle, non-maleficence

1. Introduction

One of the common safety concerns regarding artificial intelligence (AI) has to do with the robustness of AI systems against adversarial attacks. In addition to traditional types of cyberattack, AI systems may be vulnerable to ones that specifically target the AI algorithms, such as the creation of adversarial examples to induce errors in the outputs of machine learning (ML) models. For example, in 2018 it was demonstrated how ML models trained for road-sign classification could be attacked by physically altering the signs in relatively subtle ways that leave the sign legible to a human viewer but confuse the classification model [1].
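To make the notion of an adversarial example concrete, the following is a minimal sketch of the classic fast gradient sign method (FGSM), one simple way to craft such inputs. It assumes a generic PyTorch image classifier and is illustrative only; it is not the specific physical-world attack demonstrated in [1], which additionally has to survive printing, viewing angles and lighting conditions.

```python
import torch
import torch.nn.functional as F

def fgsm_adversarial_example(model, image, true_label, epsilon=0.03):
    """Craft an adversarial example with the fast gradient sign method (FGSM).

    Adds a small perturbation in the direction that increases the model's loss,
    so the change is barely visible to a human but can flip the predicted class.
    Assumes `model` is a PyTorch classifier, `image` is a tensor of shape
    (1, C, H, W) with values in [0, 1], and `true_label` is a tensor of shape (1,).
    """
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    # Step in the direction of the sign of the gradient, then clamp back
    # to the valid pixel range so the result is still a plausible image.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()
```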
An obvious example of an application where algorithmic recognition of road signs is needed is autonomous vehicles. It is similarly obvious that if the ability of a self-driving car to perform such tasks reliably can be sabotaged by tampering with objects that a malicious actor can relatively easily gain physical access to, this represents a considerable safety hazard. However, there is a difference between the demonstration of a vulnerability in academic research or organisational red teaming and the exploitation of that vulnerability by a malicious actor. Especially when the mainstream media reports on discovered vulnerabilities, there is a natural tendency to focus on worst-case scenarios, but it is important to separate these from the actual real-world impact of the vulnerabilities.

A problem that emerges here is that the lack of established reporting and cataloguing practices makes it difficult to accurately gauge that real-world impact. There are various online databases collecting and publishing reports of AI vulnerabilities and incidents, but these vary considerably in terms of inclusion criteria, information stored, coverage and general quality control. Non-AI-specific cyber incident databases, on the other hand, do not provide any easy way to discover reports pertaining to AI vulnerabilities specifically.

It is generally accepted that ensuring the security of AI systems is an important part of responsible AI development and use. However, the relationship between AI security and AI ethics is by no means straightforward. One aspect of this is that the connection between AI vulnerabilities and real-world AI harm remains difficult to define as long as the reporting of incidents remains inconsistent and there is not even a consensus on what exactly constitutes an AI vulnerability. For example, the possibility of bypassing the built-in safeguards of general-purpose AI (GPAI) tools such as ChatGPT to generate unethical content may be classified as a vulnerability, but from another point of view, the possibility of malicious use is part of the fundamental nature of GPAI and any safeguards that may have been implemented are just arbitrary constraints on their functionality.

Furthermore, the translation of AI ethics principles into practical requirements is also not straightforward, as they must be interpreted within a specific real-world context [2]. Security is no exception here: while one might intuitively think that stronger security measures are always unambiguously better, circumstances may in fact arise where it is necessary to weigh security against another AI ethics principle (e.g. explainability) and seek an acceptable trade-off. There are also situations where an actor exploiting a vulnerability in an AI system, while malicious from the perspective of the operator of the system, arguably has a legitimate reason for their actions (e.g. defending their privacy against mass surveillance) and is not doing anything unethical or illegal.

In this paper we explore the cluster of vaguely defined concepts at the intersection of AI security and AI ethics, aiming to at least partially answer the following questions:

• What characterises security as an AI ethics principle?
• Based on incidents recorded in public databases, what is the real-world impact of known vulnerabilities in AI systems?
• How well does this picture correspond to reality?
• Under what circumstances is security in conflict with other AI ethics principles?
The remainder of the paper is organised as follows: In Section 2, we review some well-known sets of AI ethics principles and examine the conception of security emerging from them. In Section 3, we discuss the confusion regarding the definition of AI vulnerability and establish a set of criteria for incidents where the exploitation of an AI vulnerability resulted in demonstrable real-world harm. In Section 4, we present the results of a search for such incidents in six public databases. In Section 5, we review situations where security is misaligned with other important principles. In Section 6 we discuss our findings, and in Section 7 we present our conclusions.

2. Security as an AI Ethics Principle

In [3], 84 documents containing AI ethics principles or guidelines are reviewed and analysed. The authors of the paper synthesise their findings into a list of eleven values and principles. In order of the frequency at which they occur in the various documents studied, these are transparency, justice and fairness, non-maleficence, responsibility, privacy, beneficence, freedom and autonomy, trust, dignity, sustainability, and solidarity. Notably, security is not identified as an ethics principle in its own right. Instead, security is mentioned in the more detailed discussion in the context of two of the principles: non-maleficence and privacy. This is not the whole picture, since the documents from which the principles have been synthesised represent a range of granularities: some feature security more explicitly, while others do not even include privacy as an independent principle but subsume it under the more general principle of avoiding harm. Overall, however, it would appear that security is usually not considered an ethics principle as such but rather an enabling requirement implied by some higher-level value or principle.

A look at some well-known sets of AI ethics principles will serve to illustrate this. In the Asilomar AI Principles [4], the safety principle simply states that AI systems should be “safe and secure throughout their operational lifetime”, without discussing the meaning of the terms in any detail. UNESCO’s Recommendation on the Ethics of Artificial Intelligence [5] is slightly more detailed in its safety and security principle, defining safety as the avoidance of “unwanted harms” and security as the avoidance of “vulnerabilities to attack”. In the Montréal Declaration for a Responsible Development of Artificial Intelligence [6], the roughly equivalent principle is the one of prudence, which states (among other things) that AI systems must “meet strict reliability, security, and integrity requirements”, “not put people’s lives in danger, harm their quality of life, or negatively impact their reputation or psychological integrity”, and “protect the integrity and confidentiality of personal data”. The Ethically Aligned Design document of the IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems [7] mentions the requirement of safety and security under its human rights principle and discusses the risk of adversarial attacks in more detail under the awareness of misuse principle.
In the Ethics Guidelines for Trustworthy AI proposed by the European Commission’s High-Level Expert Group on Artificial Intelligence (AI HLEG) [8], the principle of prevention of harm states that AI systems “and the environments in which they operate must be safe and secure”, and also that they must be “technically robust and it should be ensured that they are not open to malicious use”. The AI4People framework [9], drawing on several of the above, similarly subsumes security under its non-maleficence principle.

It is not hard to come up with examples of how adversarial attacks against AI systems may compromise values other than safety and privacy. For instance, by poisoning the training data of an AI system, an attacker could induce biases into its decision-making algorithms, causing it to violate the principle of fairness. It therefore makes sense to view security as a necessary-but-not-sufficient prerequisite to several – perhaps even all – ethics principles rather than an ethics principle per se. Some of the collections of ethics principles explicitly acknowledge the possibility of data poisoning or other AI-specific attacks such as “gaming”, a term used in [7] to refer to exploiting the learned behaviour of an AI system to trigger responses not intended by the deployer.

Something that is not necessarily stated explicitly but is at least implicit in these discussions of security as an AI ethics principle is that they view security primarily as an obligation that the deployer of an AI system has to those affected by the system. The deployer, in contrast, would traditionally view security first and foremost through the lens of protecting the confidentiality, integrity and availability of its own assets. Often the interests of different stakeholders are mutually aligned or at least compatible, but not always; some examples of the latter are discussed in Section 5.

3. AI Vulnerabilities and Incidents

The current state of AI incident documentation is reviewed in [10], where four online repositories of AI incidents are examined: the AI, Algorithmic, and Automation Incidents and Controversies Repository (AIAAIC) [11], the AI Incident Database (AIID) [12], the AI Vulnerability Database (AVID) [13] and Where in the World is AI? [14]. Among these, AIAAIC, AIID and AVID present a traditional tabular view, while Where in the World presents a map view with the locations of incidents marked with different colours representing different application domains. The latter also differs from the others in that it collects both harmful and helpful cases, whereas the others focus on harmful ones.

The only one among the four repositories that explicitly aims to document AI vulnerabilities is AVID, which makes a distinction between vulnerabilities and reports. A vulnerability is defined as “a high-level evidence of an AI failure mode”, whereas a report is “one example of a particular vulnerability occurring, supported by qualitative or quantitative evaluation”. AVID reports are thus equivalent to what are termed incidents by AIAAIC and AIID and cases by Where in the World. On the face of it, AVID would seem to be the most promising option for finding cases of real-world harm resulting specifically from the exploitation of an AI vulnerability, but in practice this is not the case, for several reasons:

• The distinction between vulnerabilities and reports is not consistently observed; many of the entries listed as vulnerabilities describe individual examples and should therefore be classified as reports.
In some cases there is one entry listed as a vulnerability and another, identical entry listed as a report (e.g. entries AVID-2023-V025 and AVID-2023-R0001, respectively).
• The quality of the entries is likewise inconsistent. In extreme cases, the details section of the entry contains only some placeholder text (AVID-2022-V003) or text referring to a completely different case, apparently as a result of copy-and-paste (AVID-2023-V022).
• The definition of vulnerability is very broad, resulting in the inclusion of failure modes whose relevance to security seems tenuous, such as ChatGPT failing to follow lexical constraints in user prompts (AVID-2023-V025). On the other hand, the database also includes entries describing uses of AI that are ethically questionable but arguably do not represent a failure mode, such as the generation and distribution of a deepfaked video of the president of Ukraine (AVID-2022-V009).
• Despite the loose inclusion criteria, the coverage of the database is modest, with fewer than 50 entries (vulnerabilities and reports) in total at the time of writing. AIID covers some 650 incidents and AIAAIC, the largest one of the four, some 1400, although it is worth noting that the latter is not limited to incidents involving AI.

Regarding the third point, it should be noted that whether or not a failure mode can be adversarially exploited depends ultimately on the context of use. Given the vast range of potential uses of GPAI systems, we should be cautious about categorically declaring that a given failure mode does not have any security implications, even if the idea of it being exploited seems far-fetched. However, if any observed failure mode may or may not represent a vulnerability depending on context, this suggests that looking for AI vulnerabilities without considering the context of use is not very fruitful in the first place.

In addition to the four databases mentioned in [10], it is worth looking into generic databases of cybersecurity vulnerabilities and incidents. In the United States, the Cybersecurity and Infrastructure Security Agency (CISA) maintains the Known Exploited Vulnerabilities (KEV) catalogue [15], and in Europe, the European Repository of Cyber Incidents (EuRepoC) maintains the EuRepoC database [16]. These have the advantage of being more pertinent to vulnerabilities as the term is normally understood in the security domain, but also the major disadvantage that the vast majority of the database entries are not related to AI vulnerabilities, and there is no easy way to filter these out to reveal those entries that are relevant to the question at hand.

There seems to be no single database that could give a straightforward and satisfactory answer to the question of the real-world impact of AI vulnerabilities. However, each of the six databases mentioned above could be reasonably expected to shed some light on the question, if searched using a suitable approach. To design such an approach, it is first necessary to specify precise criteria for entries considered relevant. These are as follows:

• Incident: A relevant entry must be based on a reliable report of a real-world event. Hypothetical scenarios and unconfirmed allegations are not considered relevant.
• Intrusion: A relevant entry must involve deliberate exploitation of a weakness in the system. Instances where the system fails without the involvement of an intruder are not considered relevant.
• Impact: A relevant entry must involve an incident where demonstrable real-world harm was inflicted. Theoretically possible consequences not proven in practice are not considered relevant.
• Intelligence: A relevant entry must involve a vulnerability specifically in an AI component of the system. Instances where a traditional vulnerability in an AI system is exploited are not considered relevant.

A search of the databases to identify entries that satisfy the criteria was carried out in November 2023. The search methodology and the results of the search are described in the next section.

4. Real-World Impact of AI Vulnerabilities

To narrow down the list of potentially relevant entries, different search approaches were applied to different databases. For AIAAIC and AIID, a list of search terms that could indicate malicious activity (e.g. “adversarial”, “attack”, “breach”, “exploit”, “hack”) was compiled and entries containing one or more of the terms on the list were flagged as candidates. For KEV and EuRepoC, a similar approach was used, but the search terms used were ones that could indicate that an AI/ML system was targeted, including general terms related to such systems (e.g. “model”) as well as more specific ones such as “poison” (for data poisoning, a type of attack against ML systems) and “prompt” (for prompt injection, a type of attack against systems based on language models). For AVID, since the number of entries was manageable and since there was (again, on the face of it) reason to expect a high ratio of relevant to irrelevant entries, all entries in the database were treated as candidates. Where in the World does not provide a search function, so a filter was applied to hide the cases where AI was helpful and the remaining cases were treated as candidates.

All of the candidates in all six databases were then scanned manually and evaluated against the criteria specified above to classify each candidate as fully relevant, partially relevant, possibly relevant or irrelevant.
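To make the candidate-flagging step concrete, the following is a minimal sketch of the kind of keyword filter described above. The file name, column name and helper function are hypothetical placeholders, and the term list only echoes the examples given in the text rather than reproducing the full lists actually used in the search.

```python
import csv

# Hypothetical combined term list; in the search described above, separate
# lists were compiled for the AI-specific and the generic databases.
ATTACK_TERMS = ["adversarial", "attack", "breach", "exploit", "hack", "poison", "prompt"]

def flag_candidates(csv_path, text_column="description"):
    """Return entries whose free-text description mentions any of the search terms.

    Flagged entries are only candidates; each one still has to be reviewed
    manually against the four relevance criteria (incident, intrusion,
    impact, intelligence).
    """
    candidates = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            text = row.get(text_column, "").lower()
            if any(term in text for term in ATTACK_TERMS):
                candidates.append(row)
    return candidates

# Example usage with a hypothetical export of an incident database:
# hits = flag_candidates("incident_db_export.csv")
# print(len(hits), "candidate entries to review manually")
```

As discussed in Section 6, such keyword matching can only work if incidents involving AI are consistently described using language that makes the nature of the incident clear, which is one reason why the manual review step remains essential.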
As a result of this process, six incidents were identified that were deemed to fully satisfy the relevance criteria. These are shown in Table 1; the columns of the table correspond to the criteria, providing a summary of what happened, what kind of intrusion was involved, what the consequences were and what the role of AI in the incident was.

Table 1. Summary of Fully Relevant AI Incidents

Incident | Intrusion | Impact | Intelligence
Twitter bot Tay poisoned | Trolls deliberately sought to trigger inflammatory responses | Twitter users exposed to hate speech | Bot used AI to learn from interactions with human users
Facebook fact checks evaded during COVID pandemic | Deliberate small changes made to posts to deceive filtering algorithms | Facebook users exposed to COVID-19 disinformation | Facebook uses AI to detect posts that violate terms and conditions
Twitter bot remoteli.io tricked into carrying out arbitrary instructions | Prompt injection vulnerability exploited to hijack bot | Bot behaving in ways not intended by the owner | Bot used language model to generate responses to messages
TikTok content moderation evaded by suicide video | Massive coordinated uploading of different versions of the video | TikTok users exposed to traumatising content | TikTok uses AI to detect videos that violate terms and conditions
Chinese local government tax system compromised | Camera hijack attack used to gain access to system | $77 million acquired through fraudulent invoices | System used facial recognition for access control
PyTorch dependency chain compromised | Dependency replaced with binary containing malicious code | Malicious binary downloaded by unknown number of PyPI users | PyTorch is a widely used Python library for ML applications

In addition to the six fully relevant ones, there were 27 incidents classified as partially relevant, indicating that they were considered to satisfy three of the criteria fully and the fourth partially. The incidents in this category were mostly demonstrations of vulnerabilities by non-malicious actors, but there were also some cases where a vulnerability was exploited with malicious intent but either the harm inflicted was unclear or the role of AI was marginal. Eight incidents were classified as possibly relevant, indicating that they could be relevant but the source database did not provide sufficient details for a definitive verdict. For example, Where in the World would sometimes only provide a link to an article hidden behind a paywall, leaving the case ambiguous.

The number of fully relevant incidents is too small to permit any meaningful analysis, but if we include the partially relevant and possibly relevant incidents, a few notable themes emerge:

• Deception of biometric identification systems: Basic facial recognition models may be deceived by something as simple as a 2D photograph, whereas more advanced ones can be attacked by crafting a special mask. Identical twins have been reported to deceive both facial and voice recognition systems.
• Evasion / manipulation of filtering algorithms: Targeted algorithms include malware detection algorithms of cybersecurity software suites, spam filters and content moderation algorithms of social media platforms.
• Manipulation of conversational AI: In the archetypal case, the user circumvents the restrictions built into the system, causing it to generate unsafe content. In some incidents, prompt injection was used to achieve remote code execution on the host computer.
• Manipulation / hijacking of cyber-physical systems: Included in this category are various attacks on autonomous vehicles, but also some that target customer service robots, allowing passive spying through the robot’s sensors or taking active control of its functions.

While these may give some indication of the types of AI vulnerabilities and attacks that are likely to result in real-world harm, the number of incidents is still small, even with the partially and possibly relevant ones included. In Section 6, we will discuss some possible reasons for the small number of relevant incidents found in the search, as well as some possible biases in the incidents that were found.

5. When Security and Ethics Clash

In ethics, it is not uncommon to face a situation where it is necessary to make a value-based trade-off between conflicting requirements, and AI ethics is no exception. For example, explainability of the decisions made by an AI system is widely considered an important ethics requirement, as demonstrated by transparency being the most frequently occurring AI ethics principle in the documents reviewed in [3]. However, as discussed in [17], a high level of explainability may be difficult to achieve if a high level of accuracy is also required, although recent empirical work has challenged the traditional view that there is some kind of inverse relationship between the two [18, 19]. In comparison with e.g. explainability or privacy, security is not an aspect of AI ethics that figures very prominently in discussions of value trade-offs.
However, as discussed in [20], the use of explainable AI (XAI) methods can make AI systems more susceptible to attacks that aim to degrade their performance or to infer sensitive information. Arguably this is not so much a trade-off as a reminder that with XAI, particular care should be taken to defend the system against such attacks; nevertheless, it is important to recognise that even though security is instrumental by nature and often not considered an ethics principle as such, it can be at odds with ethics principles and is not simply an enabler.

The fact that different stakeholders may have conflicting, yet legitimate – or at least not unambiguously illegitimate – interests regarding an AI system introduces further nuances into the relationship between security and ethics. Similarly to what the authors of [1] did to road sign recognition, facial recognition systems can also be attacked through the use of physical objects. An obvious method is to simply hide one’s face to such a degree that there is not enough information available for the system to reliably identify the person, but there are also methods that leave the face in view and target the specific weaknesses of facial recognition models instead. Examples include 3D printing of spectacle frames [21] and placing of stickers on a hat [22] or directly on the face [23]. In the broader societal and ethical discourse, methods for thwarting facial recognition systems are typically viewed in the context of surveillance systems in public places and the age-old debate where public safety is pitted against civil liberties. On the one hand, deliberately confusing a system adopted and operated by legitimate authorities under democratic supervision could be construed as a malicious act. On the other hand, if the purpose of the disguise is not to commit a criminal act, the individual can argue that they are enforcing their fundamental right to privacy – in a sense, opting out of data collection to which they do not consent.

A debate that has emerged much more recently has to do with the legal and ethical justification of training generative AI models with creative works scraped from the internet without licensing them from the copyright holders. Related to this, an interesting new development is Nightshade, a special type of training data poisoning attack targeting text-to-image models proposed in [24]. The authors of the paper explicitly discuss their method as a tool that content creators could use to protect their intellectual property and rectify the power asymmetry between themselves and the developers of AI models, who face no consequences from using scraping algorithms that do not comply with voluntary opt-out requests. Legally, this appears to be a grey area at the moment, but it is at least possible that using copyright-protected works as training material for AI models is covered by exemptions such as the fair use doctrine in United States copyright law [25]. However, regardless of how courts of law rule on the issue or how lawmakers choose to approach it, creators may still see it as their right to opt out of their works being harvested and to use tools such as Nightshade to enforce that right if necessary. Certainly, even if scraping is considered lawful, this does not imply that creators are under any obligation to cooperate with scrapers, so we can view this case as another example of how the relationship between AI security and AI ethics is not as straightforward as it might first seem.
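Nightshade itself is a sophisticated, prompt-specific attack, but the general idea of training data poisoning that it builds on (and that was mentioned in Section 2) can be illustrated with a deliberately simple toy sketch: an attacker who controls part of the training data flips some labels and thereby degrades the model trained on that data. The sketch assumes scikit-learn and a synthetic dataset, and is not intended to reflect how Nightshade actually works.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for scraped training material.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def poison_labels(labels, fraction, rng):
    """Flip the labels of a randomly chosen fraction of training points."""
    poisoned = labels.copy()
    idx = rng.choice(len(labels), size=int(fraction * len(labels)), replace=False)
    poisoned[idx] = 1 - poisoned[idx]
    return poisoned

rng = np.random.default_rng(0)
for fraction in [0.0, 0.1, 0.3]:
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, poison_labels(y_train, fraction, rng))
    print(f"poisoned fraction {fraction:.1f}: "
          f"test accuracy {model.score(X_test, y_test):.3f}")
```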
6. Discussion

What explains the small number of AI incidents found that even partially or possibly satisfy the “4I” relevance criteria defined in Section 3? We have no evidence at the moment that would enable us to say anything conclusive, but we can identify a number of possible factors:

1. Exploitation of AI vulnerabilities does not happen. While it is unlikely that the discovered incidents represent the full picture, it seems plausible that compared to traditional vulnerabilities, the real-world impact of exploitation of AI vulnerabilities by malicious actors is still small. This does not mean, of course, that the impact will not grow, possibly even very rapidly.
2. Exploitation does happen, but does not get reported. There may be some number of incidents that would have satisfied the relevance criteria but could not be found because no report was made of them. This could be the case if, for example, the targeted organisation decided to cover up the incident to protect its reputation.
3. Exploitation gets reported, but reports do not get included in the databases. There may be some number of relevant incidents that were reported but could not be found because the reports are deposited somewhere other than the databases we searched. Such reports could reside, for example, in the internal databases of various security organisations or even on public websites.
4. Reports get included, but the search method used fails to find them. Finally, it is possible that some number of relevant incidents stored in the databases were not found because we failed to identify the right keywords or because the whole keyword-based search approach was flawed. Such an approach can only be successful if cyber incidents involving AI are consistently described using language that makes the nature of the incident clear, which is not necessarily the case.

For the time being, we are operating on the assumption that each of these contributed to some extent to the small number of search results. Further exploration of this question is a matter for future work; we will remark, however, that because of the variety of factors that may cause relevant AI incidents to be overlooked by the search, the search results are likely to be biased in various ways. For example, the relative prevalence of reports involving self-driving cars is arguably not so much an indicator of their particular vulnerability to adversarial attacks as an artefact of the considerable attention they attract in both academic research and the mainstream media. We should therefore be wary of drawing even very general conclusions about what the search results can tell us.

Cybercrime has become a well-organised industry with a range of business models [26], and its annual cost is estimated to be in the order of ten trillion USD globally [27]. Based on the available evidence, it would appear that exploitation of AI vulnerabilities does not yet account for any appreciable fraction of this, and it may well be that effective models for generating revenue through AI-specific attacks have yet to emerge, as opposed to more traditional approaches such as ransomware attacks. However, given the magnitude of the financial incentive, cybercriminals will undoubtedly seek to capitalise on the new opportunities created by the proliferation of AI systems, so it is important to monitor the situation, and to develop better instruments of monitoring if the currently existing ones prove inadequate.
Notably, neither of the two non-AI-specific databases, KEV and EuRepoC, yielded any incidents deemed (partially/possibly) relevant according to the 4I criteria. This may be at least partially because of item number 4 in the list above – i.e. there are some relevant reports in those databases but the search failed to find them – but it is also possible that (some) AI vulnerabilities are not even recognised as vulnerabilities in the traditional sense, and are therefore not being included in traditional cybersecurity repositories. Given this, combined with the elusive nature of the concept of AI vulnerability and the inconsistency of AI incident reporting practices, we argue that there is a need for more dialogue between the security and ethics communities on this topic. Without proper understanding and cataloguing of AI vulnerabilities, it will not be possible to understand the true scale and nature of their real-world impact.

7. Conclusion

In this paper we examined the security of AI systems from the perspective of AI ethics, aiming to clarify the relationship between the two and to highlight some of its nuances. By studying research literature and some well-known sets of proposed AI ethics principles, we found that while security is primarily viewed as an enabler for higher-level ethics requirements such as safety, in reality the situation is more complicated because of various trade-offs and value conflicts involving AI security. The security of AI systems therefore cannot be properly understood without considering the real-world sociotechnical context in which they are deployed.

We focused particularly on assessing the scale and nature of real-world harm resulting from the exploitation of AI security vulnerabilities, finding this difficult because of shortcomings in the way vulnerabilities and incidents involving AI are currently being recorded in public repositories. We defined a set of relevance criteria and carried out a search of four AI incident repositories and two cybersecurity incident repositories, resulting in only about 40 incidents that could be considered at least partially relevant to the research question. We concluded that while the real-world impact of AI vulnerabilities is probably still comparatively small, the set of incidents discovered by the search is likely to suffer from biases induced by a number of factors. Obtaining a more accurate picture requires further research as well as cross-community discourse between divergent perspectives on AI security.

Acknowledgments

The research reported in this paper was carried out with funding awarded by the Scientific Advisory Board for Defence (MATINE). We would also like to acknowledge the helpful suggestions of our colleague Arttu Pispa during the preparation of the manuscript.

Declaration on Generative AI

The authors have not employed any Generative AI tools.

References

[1] K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, C. Xiao, A. Prakash, T. Kohno, D. Song, Robust physical-world attacks on deep learning visual classification, in: Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 1625–1634.
[2] B. Mittelstadt, Principles alone cannot guarantee ethical AI, Nature Machine Intelligence 1 (2019) 501–507.
[3] A. Jobin, M. Ienca, E. Vayena, The global landscape of AI ethics guidelines, Nature Machine Intelligence 1 (2019) 389–399.
[4] Future of Life Institute, Asilomar AI principles, 2017. URL: https://futureoflife.org/open-letter/ai-principles/, accessed April 3, 2024.
[5] UNESCO, Recommendation on the ethics of artificial intelligence, 2021. URL: https://unesdoc.unesco.org/ark:/48223/pf0000381137, accessed April 3, 2024.
[6] Université de Montréal, Montréal declaration for a responsible development of artificial intelligence, 2018. URL: https://montrealdeclaration-responsibleai.com/the-declaration/, accessed April 3, 2024.
[7] IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems, Ethically aligned design: A vision for prioritizing human well-being with autonomous and intelligent systems, version 2, 2017. URL: https://standards.ieee.org/industry-connections/ec/ead-v1/, accessed April 3, 2024.
[8] High-Level Expert Group on Artificial Intelligence, Ethics guidelines for trustworthy AI, 2019. URL: https://op.europa.eu/en/publication-detail/-/publication/d3988569-0434-11ea-8c1f-01aa75ed71a1, accessed April 3, 2024.
[9] L. Floridi, J. Cowls, M. Beltrametti, R. Chatila, P. Chazerand, V. Dignum, C. Luetge, R. Madelin, U. Pagallo, F. Rossi, B. Schafer, P. Valcke, E. Vayena, AI4People—an ethical framework for a good AI society: Opportunities, risks, principles, and recommendations, Minds and Machines 28 (2018) 689–707.
[10] V. Turri, R. Dzombak, Why we need to know more: Exploring the state of AI incident documentation practices, in: Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society, 2023, pp. 576–583.
[11] AIAAIC repository, 2024. URL: https://www.aiaaic.org/home, accessed March 20, 2024.
[12] AI incident database, 2024. URL: https://incidentdatabase.ai/, accessed March 20, 2024.
[13] AI vulnerability database, 2024. URL: https://avidml.org/, accessed March 20, 2024.
[14] Where in the world is AI?, 2024. URL: https://map.ai-global.org/, accessed March 20, 2024.
[15] Known exploited vulnerabilities catalog, 2024. URL: https://www.cisa.gov/known-exploited-vulnerabilities-catalog, accessed March 20, 2024.
[16] EuRepoC database, 2024. URL: https://eurepoc.eu/database/, accessed March 20, 2024.
[17] P. P. Angelov, E. A. Soares, R. Jiang, N. I. Arnold, P. M. Atkinson, Explainable artificial intelligence: an analytical review, WIREs Data Mining and Knowledge Discovery 11 (2021) e1424.
[18] A. Bell, I. Solano-Kamaiko, O. Nov, J. Stoyanovich, It’s just not that simple: An empirical study of the accuracy-explainability trade-off in machine learning for public policy, in: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022, pp. 248–266.
[19] L.-V. Herm, K. Heinrich, J. Wanner, C. Janiesch, Stop ordering machine learning algorithms by their explainability! A user-centered investigation of performance and explainability, International Journal of Information Management 69 (2023) 102538.
[20] C. N. Spartalis, T. Semertzidis, P. Daras, Balancing XAI with privacy and security considerations, in: Computer Security. ESORICS 2023 International Workshops, 2024, pp. 111–124.
[21] M. Sharif, S. Bhagavatula, L. Bauer, M. K. Reiter, Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 1528–1540.
[22] S. Komkov, A. Petiushko, AdvHat: Real-world adversarial attack on ArcFace face ID system, in: 2020 25th International Conference on Pattern Recognition (ICPR), 2021, pp. 819–826.
[23] M. Shen, H. Yu, L. Zhu, K. Xu, Q. Li, J. Hu, Effective and robust physical-world attacks on deep learning face recognition systems, IEEE Transactions on Information Forensics and Security 16 (2021) 4063–4077.
[24] S. Shan, W. Ding, J. Passananti, S. Wu, H. Zheng, B. Y. Zhao, Prompt-specific poisoning attacks on text-to-image generative models, 2024. arXiv:2310.13828.
[25] P. Samuelson, Generative AI meets copyright, Science 381 (2023) 158–161.
[26] C. Griffy-Brown, D. Lazarikos, M. Chun, Cybercrime business models: Developing an approach for effective security against better organized criminals, Journal of Applied Business and Economics 19 (2017).
[27] S. Morgan, Cybercrime to cost the world $9.5 trillion USD annually in 2024, 2023. URL: https://cybersecurityventures.com/cybercrime-to-cost-the-world-9-trillion-annually-in-2024/, accessed April 11, 2024.