=Paper= {{Paper |id=Vol-3762/481 |storemode=property |title=Artificial Intelligence and Anti-Corruption |pdfUrl=https://ceur-ws.org/Vol-3762/481.pdf |volume=Vol-3762 |authors=Fabrizio Sbicca |dblpUrl=https://dblp.org/rec/conf/ital-ia/Sbicca24 }} ==Artificial Intelligence and Anti-Corruption== https://ceur-ws.org/Vol-3762/481.pdf
Artificial Intelligence and Anti-Corruption

Fabrizio Sbicca 1,*

1 Autorità Nazionale Anticorruzione (ANAC), Via Marco Minghetti 10, 00187 Rome

The opinions expressed in this paper are the author's own and do not reflect the view of ANAC.

Abstract

The article presents recent developments undertaken by ANAC in the understanding of corruption and suggests possible avenues for further analysis of the phenomenon using machine learning techniques.

Keywords

corruption, public procurement, big data, machine learning

Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
Contact (*): f.sbicca@anticorruzione.it
ORCID: 0009-0000-9372-2325
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

Although corruption represents one of the main obstacles to economic, political, and social development, it is a latent phenomenon and, therefore, difficult to measure. Indeed, corruption can be compared to an iceberg of which only the tip is visible, while the submerged part is much larger than it appears. The cases of corruption that come to light, for example through court rulings, constitute the visible part, but they leave us ignorant of the size and characteristics of the phenomenon that remains largely hidden. Not surprisingly, there is an extreme shortage, internationally, of structured scientific data on corruption that go beyond the measurement of so-called "perception" or beyond ad hoc studies, which are certainly very interesting and rich in insights, but whose contents and results are difficult to generalize.

2. ANAC’s experience in measuring corruption

A significant step forward in the understanding of this phenomenon was made by the Italian Anti-Corruption Authority (ANAC), which in July 2022 presented to the public a section of its portal called "Measure Corruption" (https://www.anticorruzione.it/il-progetto). Seventy indicators capable of measuring the risk of corruption in the territory are made available to the community (https://www.anticorruzione.it/gli-indicatori).

These indicators can be considered warning bells that signal potentially anomalous situations. They provide a picture of the territorial contexts that are more or less exposed to corruptive phenomena, and in which to invest in terms of prevention and/or investigation. They can also direct the attention of civil society and increase civic participation. From this point of view, this system of indicators could represent a useful contribution to the country for the construction and implementation of further and more targeted tools for the prevention, monitoring, and control of corruption, with the ultimate aim of better managing the future use of public financial resources. The perspective pursued in developing the website has been to highlight the importance of strengthening collective awareness of the serious social consequences of corruption. Prevention and repression are in fact necessary but not sufficient: to fight the phenomenon more profoundly, we need an increase in social capital. For this reason, the dashboards on the website are “easy”; behind them
there are complex data, algorithms and IT structures, but ANAC's aim was to make them understandable to everyone and captivating, especially for young people, so as to prompt questions, reflection and awareness.

In particular, the so-called "context indicators" give an idea of the complex social and economic context of the territory in which a risk of corruption is more or less likely to manifest. The analysis initially took into consideration 18 indicators, grouped into four thematic domains (education, economy, crime, social capital). Another 25 indicators were then added. Together, these 43 simple indicators are useful for evaluating the conditions of the territorial context, and all relate to the main hypotheses identified in the literature regarding factors associated with corruption. The analysis of the external context aims to identify the cultural, economic, and social characteristics of the provincial territory in which the administrations operate, which can favor, or conversely hinder, the occurrence of corruptive phenomena.

Each thematic domain is summarized by a composite index, to simplify the reading of the complexity that arises from the many dimensions considered. The four thematic composite indicators are in turn combined into a further "composite of composites" index, which provides a highly informative synthetic measure of some characteristics of the entire phenomenon. Thus, the "context dashboard" makes available to the community a total of 48 indicators, of which 5 are composite.

The risk indicators for corruption in public procurement, on the other hand, provide information on the purchases of the administrations located in the province to which they refer. They are particularly important both because of the unique weight of corruption in the public procurement market and because of the institutional purposes of ANAC. The source of the information is the National Database of Public Contracts (BDNCP), a highly valuable asset that, for the quantity and detail of the data it contains, relating to about 70 million contracts, represents a unique experience at the European level. The availability of this database allows corruption risk indicators to be computed with an extreme degree of territorial, sectoral, and temporal detail.

Based on an increasingly important and substantial body of scientific studies, ANAC has identified 17 indicators that, in various ways, capture aspects pointing to potential corruptive phenomena in public procurement, thus signaling the risk of corruption in every Italian province.

Examples of corruption risk indicators in public procurement are the use of discretionary procedures [3] or tenders with very few bidders [4], but also delays and cost overruns [9]. The literature finds that low competition in tenders, associated with more discretion, is typically a signal of corruption risk [8]. Other examples of contract-level red flags for corruption can be found in [10, 11, 12].

The portal allows for the calculation of synthesis indicators according to different risk thresholds, obtained by condensing the information coming from all or part of the 17 indicators. For each of the selected indicators, it is possible to highlight the provinces whose value exceeds that of a given percentage of provinces with a less risky value. The threshold can be freely chosen from the 75th to the 99th percentile.

Finally, five indicators were calculated at the level of the single administration, in this case the 745 Italian municipalities with a population of 15,000 inhabitants or more. These indicators were calculated based on the statistical analysis of the relationships between variables potentially related to corruption and episodes that occurred at the level of the single administration.

3. Artificial Intelligence and Anti-Corruption

What are the potential future developments and opportunities opened up by technological innovation, which is evolving with unprecedented speed, particularly with regard to artificial intelligence? First, an opportunity arises from the increasing availability of information in large public databases of various kinds which, if properly used, allow for the extraction of potentially very useful indicators. The joint use of separate databases is very advantageous, based on the principle that the value of data tends to grow more than proportionally with the combination of different sources. In the Italian case, though, the databases are often owned by distinct public administrations, and their joint use is hindered by several factors, including concerns about privacy protection. The need to overcome such impediments is particularly urgent today, with the spread of tools and techniques for analyzing so-called "big data," which the Italian public administration generates in increasing measure. These data can unleash their potential to support a public debate that is anchored in the evidence of
facts and can help policymakers make more informed decisions.

Another important aspect is the digitalization of the procurement lifecycle, an important and difficult transformation that is occurring worldwide. In Italy, digitalization is expressly envisaged by the new Public Contract Code. First of all, digitalization could in itself constitute an effective measure for the prevention of corruption, as it is likely to bring a higher degree of transparency, traceability, participation, and control of activities, all potentially suited to ensuring compliance with the law [1, 2, 13, 18]. With the full implementation of the digitalization of the contract lifecycle, data should be "natively digital," which could not only improve the quality and completeness of information but also allow for the acquisition of additional data that the BDNCP previously did not capture or captured very poorly. The information bases held by ANAC could therefore play an increasingly important and central role in preventing and combating corruption and other phenomena (such as fraud, collusion, and conflicts of interest) that are strongly detrimental to the correct functioning of the market and to the effective and efficient allocation of resources in public procurement, including those financed with EU funds. Recent experiences of full digitalization of the public procurement process analyzed in the literature can be found in Ukraine [5, 6] and Georgia [21].

Moreover, both Regulation (EU) 2021/241 of 12 February 2021 establishing the Recovery and Resilience Facility and Regulation (EU) 2021/1060 of 24 June 2021 laying down common provisions for different European funds provide that Member States implement effective control mechanisms on procurement, based as much as possible on methodologies and tools for collecting and analyzing large volumes of information available in computerized databases. They emphasize the centrality of risk indicators as a fundamental tool for preventing and combating serious irregularities in this market, such as fraud, corruption, and conflicts of interest. Likewise, the “Notice on tools to fight collusion in public procurement and on guidance on how to apply the related exclusion ground” (2021/C 91/01 of 18 March 2021) emphasizes, with specific reference to collusion, the importance of indicators as a tool to combat phenomena that distort competition. It reaffirms the need for central authorities in Member States to collaborate increasingly and effectively in the analysis of procurement data, developing methodologies and tools that are simple and easy to apply to collect and analyze large volumes of information available in computerized databases.

The ever-greater availability of large data sets has also shifted attention to the potential for developing advanced algorithms, using big data analytics and artificial intelligence in addition to traditional statistical analyses [16]. Machine learning can help identify further and more targeted red flags concerning both the individual transaction and the purchasing activity of a given administration or of the set of administrations in a given territorial area. For instance, [17] studies a particular red flag for corruption, namely the degree of political connection of firms.

In this regard, AI anti-corruption tools can be defined as "data processing systems driven by tasks or problems designed to, with a degree of autonomy, identify, predict, summarize, and/or communicate actions related to the misuse of position, information and/or resources aimed at private gain at the expense of the collective good" [19].

By processing large volumes of data at current processing speeds, artificial intelligence can indeed contribute to uncovering patterns of corruption and identifying warning signs. Research on the potential of such tools for preventing and combating corruption is still in its initial phase [14], and so far there are not many concrete examples of application. Among those cited are: Brazil's anti-corruption tools based on artificial intelligence, which monitor public spending, for example for cartel practices; the Chinese "Zero Trust" system, which predicts the risk that public officials are involved in corrupt practices; the "SyRI" algorithm used by the Dutch authorities to identify fraud in the social security sector, which was however dismantled in 2020 because its results were often discriminatory and biased; and the Ukrainian "ProZorro" system, which detects violations from public procurement data and aims to prevent the misuse of public funds. Moreover, some authors have used data science techniques to construct networks of firms bidding in the same auctions in the Georgian public procurement market, in order to find possible networks of firms that collude to win public contracts [21]; other researchers use a neural network approach to detect corruption in the Spanish provinces [15], or methods from network science to analyze corruption risk at the EU level [20].

From this point of view, the indicators of the ANAC portal need to be “valid” in order to be used in the future in a targeted way, with a solid scientific basis, and also for preventive purposes. This validation can be obtained through techniques that go beyond the deductive reasoning that led to their
identification in the first place. Further developments along this line of research, which has already been applied to the above-mentioned municipal risk indicators present in the ANAC portal, could rely on a validation methodology based on the distinction between:

a. "relevant events," which are summarized by the risk indicators, for example those related to public procurement, calculated thanks to the BDNCP;

b. "phenomena of possible corruption," as indicated by other types of sufficiently structured and numerous data, among these: judicial convictions for corruption crimes or, more generally, for crimes against the public administration (PA); reports received by ANAC; news articles related to episodes of corruption; dissolution of municipal councils for mafia infiltration, etc.

Validating the risk indicators (which summarize the "relevant events") means evaluating their capacity to "predict" the "phenomena of possible corruption". The procedure that could be tested draws on two families of statistical techniques. On the one hand, there are the so-called traditional statistical models, both parametric (for example, regression models) and non-parametric. On the other hand, there are various types of machine learning techniques. In both cases, the analysis is carried out only on a subset of the available data, so that the predictive capacity of the estimated relationship can then be assessed "outside the sample" (on the part of the data not used). This makes it possible, among other things, to evaluate which indicators, among the alternatives considered, have the best predictive capacity. Finally, it is worth considering the potential of this approach if the programs used to construct the algorithms were made publicly available, so as to prevent such measures from being perceived as, or actually being, "black boxes". This is an important issue today, and it will gain even greater prominence in the future as the use of artificial intelligence techniques, which can be particularly opaque, spreads.

A first example of statistical validation has already been carried out within the project on measuring corruption. It concerns the 5 risk indicators at the municipal level mentioned earlier, which are in fact significantly associated with the occurrence of corruption episodes in a single administration. Unlike the 48 context indicators and the 17 procurement indicators, which were calculated at the aggregated territorial level of the province, in this case the unit of analysis is the single Municipality understood as an entity.

Based on the results achieved, it is possible to identify various lines of development of predictive indicators, also using machine learning techniques. The first is a cluster analysis of "infected" Municipalities (i.e., those characterized by at least one episode of corruption in the time period examined), with the aim of identifying subgroups of municipalities that present recurring organizational, governance, and managerial characteristics. The analyses conducted have made it possible to identify, among the medium-large Italian Municipalities, those in which episodes of corruption occurred in the five-year period 2015-2019. It might therefore be interesting to conduct a cluster analysis within this group, to identify the "similarity" between the Municipalities in which episodes of corruption were detected, and to classify them based on: 1) organizational variables; 2) governance variables; 3) risk indicators in public procurement; 4) accounting variables. This type of investigation could make it possible to identify, within the "infected" municipalities, subgroups characterized by high internal homogeneity with respect to some features, and hence some organizational, governance, and managerial characteristics that recur among the Italian Municipalities characterized by corruptive episodes.

A further extension could apply the analysis to other sectors of public administration (e.g., health units), in order to identify potential corruption risk indicators linked to organizational, managerial, and accounting variables. Indeed, it is known in the literature that Public Administrations constitute a varied and heterogeneous set of entities that differ profoundly in their institutional, organizational, managerial, and accounting arrangements, and that operate in significantly different normative and regulatory contexts. In other words, corruption risk indicators could vary from one sector of Public Administration to another, because of the specificities that characterize each of them.

Finally, a further interesting development could concern the use of the findings on corruption cases in municipalities to support the development of techniques, based on artificial intelligence, for predicting corruption risk. Municipalities are one of the areas of the PA where the need to strengthen the analysis of context variables affecting corruption risk is greatest. The magnitude of this risk is presumably destined to grow with the use of EU funds to finance the numerous projects approved in the various territorial areas. In the most recent economic literature, the analysis of corruption risk in Italian municipalities has been conducted by some
researchers through the application of Artificial Intelligence techniques, such as machine learning, also in order to build predictive models of corruption in Italian municipalities, using as predictors a series of socio-economic, demographic, geographic, and biophysical variables drawn from the sector literature [7]. From this point of view, the analyses conducted within the ANAC project have led to the development of a database of corruptive events for medium-large Italian Municipalities in the period 2015-2019, which has also made it possible to collect numerous organizational, governance, accounting, and public procurement risk variables. The variables available in the datasets prepared during the project could therefore be used to construct new algorithms that learn from this set of data for predictive purposes.

4. References

[1] P. Aarvik, Artificial intelligence – a promising anti-corruption tool in development settings? U4 Report Insights, Chr. Michelsen Institute (2019).
[2] I. Adam, M. Fazekas, Are emerging technologies helping in the fight against corruption? A review of the state of evidence, Information Economics and Policy 57 (2021).
[3] A. Abdou, A. Ágnes Czibik, B. Tóth, M. Fazekas, COVID-19 emergency public procurement in Romania: corruption risks and market behavior, Budapest, GTI-WP/2021:03, 2021.
[4] E. Auriol, S. Straub, T. Flochel, Public procurement and rent-seeking: the case of Paraguay, World Development 77 (2016) 395–407.
[5] B. Baránek, V. Titl, L. Musolff, Detection of collusive networks in e-procurement, 2021.
[6] S. Baumann, M. Klymak, Paying over the odds at the end of the fiscal year. Evidence from Ukraine (2022).
[7] G. de Blasio, A. D'Ignazio, M. Letta, Gotham city. Predicting ‘corrupted’ municipalities with machine learning, Technological Forecasting & Social Change 184 (2022).
[8] F. Decarolis, R. Fisman, P. Pinotti, S. Vannutelli, Rules, discretion, and corruption in procurement: evidence from Italian government contracting, NBER Working Paper 28209 (2020).
[9] F. Decarolis, C. Giorgiantonio, Corruption red flags in public procurement: new evidence from Italian calls for tenders, EPJ Data Science 11 (2022).
[10] M. Fazekas, I. J. Tóth, L. P. King, An objective corruption risk index using public procurement data, European Journal on Criminal Policy and Research 22 (2016) 369–397.
[11] M. Fazekas, L. Cingolani, B. Tóth, A comprehensive review of objective corruption proxies in public procurement: risky actors, transactions, and vehicles of rent extraction, Budapest, GTI-WP/2016:03, 2017.
[12] M. Fazekas, G. Kocsis, Uncovering high-level corruption: cross-national objective corruption risk indicators using public procurement data, British Journal of Political Science 50 No.1 (2020) 155–164.
[13] J. Ferwerda, I. Deleanu, B. Unger, Corruption in public procurement: finding the right indicators, European Journal on Criminal Policy and Research 23 (2017) 245–267.
[14] J. Li, W. H. Chen, Q. Xu, N. Shah, J.C. Kohler, T.K. Mackey, Detection of self-reported experiences with corruption on Twitter using unsupervised machine learning, Social Sciences & Humanities Open 2 (2020).
[15] F.J. Lopez-Iturriaga, I.P. Sanz, Predicting public corruption with neural networks: an analysis of Spanish provinces, Social Indicators Research 140 (2018) 975–998.
[16] N. Kobis, C. Starke, I. Rahwan, Artificial intelligence as an anti-corruption tool (AI-ACT) -- potentials and pitfalls for top-down and bottom-up approaches, arXiv 2102.11567, 2021.
[17] D. Mazrekaj, F. Schiltz, V. Titl, Identifying politically connected firms: a machine learning approach, OECD Global Anti-Corruption & Integrity Forum, March 20-21, 2019.
[18] F. Merenda, Legalità, algoritmi e corruzione: le tecniche di intelligenza artificiale potrebbero essere utilizzate nel e per il sistema di prevenzione della corruzione? Rivista italiana di informatica e diritto 4 No.2 (2022) 23–38.
[19] F. Odilla, Bots against corruption: exploring the benefits and limitations of AI-based anti-corruption technology, Crime, Law and Social Change 80 (2023) 353–396.
[20] J. Wachs, M. Fazekas, J. Kertesz, Corruption risk in contracting markets: a network science perspective, International Journal of Data Science and Analytics 12 (2021) 45–60.
[21] J. Wachs, J. Kertesz, A network approach to cartel detection in public auction markets, Scientific reports 9 No.1 (2021).
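
As an illustration of the percentile thresholds described in Section 2, the following Python sketch flags, for each indicator, the provinces whose value is higher than that of at least a chosen percentage of provinces, and then counts the flags per province as one possible synthesis indicator. The province codes, indicator names and values are invented for the example. The assumption that a higher value means higher risk, and the use of a simple flag count as the synthesis, are assumptions of this sketch and not ANAC's published method.

```python
from typing import Dict, List


def flag_provinces(values: Dict[str, float], percentile: int) -> List[str]:
    """Return the provinces whose value exceeds that of at least `percentile` % of provinces.

    Assumption of this sketch: a higher value means a riskier situation.
    """
    if not 75 <= percentile <= 99:
        raise ValueError("the threshold must lie between the 75th and the 99th percentile")
    n = len(values)
    flagged = []
    for province, value in values.items():
        # Number of provinces with a strictly less risky (lower) value.
        less_risky = sum(1 for other in values.values() if other < value)
        if 100 * less_risky >= percentile * n:
            flagged.append(province)
    return sorted(flagged)


def synthesis_indicator(indicators: Dict[str, Dict[str, float]], percentile: int) -> Dict[str, int]:
    """Count, for each province, how many indicators raise a flag (hypothetical synthesis)."""
    counts: Dict[str, int] = {}
    for values in indicators.values():
        for province in values:
            counts.setdefault(province, 0)
        for province in flag_provinces(values, percentile):
            counts[province] += 1
    return counts


# Invented data: 20 hypothetical provinces and three hypothetical indicators.
provinces = [f"P{i:02d}" for i in range(1, 21)]
indicators = {
    "single_bidder_share": {p: float(i) for i, p in enumerate(provinces, start=1)},
    "non_open_procedure_share": {p: float(21 - i) for i, p in enumerate(provinces, start=1)},
    "cost_overrun_share": {p: float(i) for i, p in enumerate(provinces, start=1)},
}

if __name__ == "__main__":
    print(flag_provinces(indicators["single_bidder_share"], 90))
    print(synthesis_indicator(indicators, 75))
```

Raising the percentile narrows the set of flagged provinces, which mirrors the freedom the portal gives users to choose a stricter or looser risk threshold.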




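
The out-of-sample validation outlined in Section 3 can be illustrated with a second Python sketch. Each candidate indicator is used to fit a simple rule on a training subset of municipalities, and the rule is then scored on the held-out part of the data, so that indicators can be ranked by predictive capacity. Everything here is an assumption made for illustration: the data are synthetic, the one-variable cut-off rule stands in for the regression or machine learning models mentioned in the text, and the stratified split and the accuracy score are choices of this sketch rather than the procedure used in the ANAC project.

```python
import random
from typing import Dict, List, Sequence, Tuple


def stratified_split(labels: Sequence[int], test_percent: int, seed: int) -> Tuple[List[int], List[int]]:
    """Split observation indices into training and held-out sets, class by class."""
    rng = random.Random(seed)
    train: List[int] = []
    test: List[int] = []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = len(idx) * test_percent // 100
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


def accuracy(x: Sequence[float], y: Sequence[int], cut: float) -> float:
    """Share of cases where the rule 'indicator >= cut' matches the observed episode."""
    hits = sum(1 for xi, yi in zip(x, y) if (xi >= cut) == (yi == 1))
    return hits / len(y)


def fit_cut(x: Sequence[float], y: Sequence[int]) -> float:
    """Choose the cut-off with the best in-sample accuracy (the 'estimated relationship')."""
    values = sorted(set(x))
    candidates = [values[0]] + [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(candidates, key=lambda cut: accuracy(x, y, cut))


def out_of_sample_accuracy(x: Sequence[float], y: Sequence[int],
                           test_percent: int = 30, seed: int = 0) -> float:
    """Fit on the training subset only, then score on the data not used for fitting."""
    train, test = stratified_split(y, test_percent, seed)
    cut = fit_cut([x[i] for i in train], [y[i] for i in train])
    return accuracy([x[i] for i in test], [y[i] for i in test], cut)


def rank_indicators(indicators: Dict[str, Sequence[float]], y: Sequence[int]) -> List[Tuple[str, float]]:
    """Rank candidate indicators by their out-of-sample accuracy, best first."""
    scores = [(name, out_of_sample_accuracy(x, y)) for name, x in indicators.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Synthetic data: 1 = at least one corruption episode recorded, 0 = none.
episodes = [1] * 20 + [0] * 20
indicators = {
    "informative": [60.0 + i for i in range(20)] + [10.0 + i for i in range(20)],
    "constant": [50.0] * 40,
}

if __name__ == "__main__":
    for name, score in rank_indicators(indicators, episodes):
        print(f"{name}: out-of-sample accuracy {score:.2f}")
```

Publishing code of this kind alongside the indicators is one way to address the "black box" concern raised in the text, since anyone can re-run the split and check how each indicator performs on data it was not fitted on.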