=Paper=
{{Paper
|id=Vol-3762/481
|storemode=property
|title=Artificial Intelligence and Anti-Corruption
|pdfUrl=https://ceur-ws.org/Vol-3762/481.pdf
|volume=Vol-3762
|authors=Fabrizio Sbicca
|dblpUrl=https://dblp.org/rec/conf/ital-ia/Sbicca24
}}
==Artificial Intelligence and Anti-Corruption==
Artificial Intelligence and Anti-Corruption
Fabrizio Sbicca1, *
1 Autorità Nazionale Anticorruzione (ANAC), Via Marco Minghetti 10, 00187 Rome
The opinions expressed in this paper are the author's own and do not reflect the view of ANAC.
Abstract
The article presents recent developments undertaken by ANAC in the
understanding of corruption and suggests possible avenues for further
analysis of the phenomenon using machine learning techniques.
Keywords
corruption, public procurement, big data, machine learning
Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
* f.sbicca@anticorruzione.it
ORCID: 0009-0000-9372-2325
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
1. Introduction
Although corruption represents one of the main obstacles to economic, political, and social development, it is a latent phenomenon and therefore difficult to measure. Indeed, corruption can be compared to an iceberg of which only the tip is visible, while the submerged part is much larger than it appears. The cases of corruption that come to light, for example through court rulings, constitute the visible part, but they leave us ignorant of the size and characteristics of the phenomenon, which remains largely hidden. Not surprisingly, there is an extreme shortage, internationally, of structured scientific data on corruption that go beyond the measurement of so-called "perception" or ad hoc studies, which are certainly interesting and rich in insights but whose contents and results are difficult to generalize.
2. ANAC’s experience in measuring corruption
A significant step forward in the understanding of this phenomenon was made by the Italian Anti-Corruption Authority (ANAC), which in July 2022 presented to the public a section of its portal called "Measure Corruption" (https://www.anticorruzione.it/il-progetto). Seventy indicators capable of measuring the risk of corruption in the territory are made available to the community (https://www.anticorruzione.it/gli-indicatori).
These indicators can be considered warning bells signaling potentially anomalous situations. They provide a picture of the territorial contexts that are more or less exposed to corruption and in which to invest in terms of prevention and/or investigation. They can also direct the attention of civil society and increase civic participation. From this point of view, this system of indicators could represent a useful contribution to the country for the construction and implementation of further and more targeted tools for the prevention, monitoring, and control of corruption, with the ultimate aim of better managing the future use of public financial resources. The perspective pursued in developing the website has been to highlight the importance of strengthening collective awareness of the serious social consequences of corruption. Prevention and repression are in fact necessary but not sufficient: to fight the phenomenon in a more profound way we need an increase in social capital. For this reason, the dashboards on the website are “easy”: behind them
there are complex data, algorithms and IT structures, but the result that ANAC tried to achieve is that they would be understandable to everyone and captivating, especially for young people, so that they can engage more easily in questions, reflection and awareness.
In particular, the so-called "context indicators" provide an idea of the complex social and economic context of the territory in which a risk of corruption is more or less likely to manifest. The analysis took into consideration 18 indicators, collected in four thematic domains (education, economy, crime, social capital). A further 25 indicators were then added. These indicators are useful for evaluating the conditions of the territorial context (for a total of 43 simple indicators), and all relate to the main hypotheses identified in the literature regarding factors associated with corruption. The analysis of the external context, in fact, aims to identify the cultural, economic, and social characteristics of the provincial territory in which the administrations operate, which can favor, or conversely hinder, the occurrence of corruption.
Each thematic domain is summarized by a composite index to simplify the reading of the complexity due to the many dimensions considered. The four thematic composite indicators are in turn combined into a further "composite of composites" index, which therefore provides a highly informative synthetic measure of some characteristics of the entire phenomenon. Thus, the "context dashboard" makes available to the community a total of 48 indicators, of which 5 are composite.
The risk indicators for corruption in public procurement, on the other hand, provide information related to the purchases of administrations located in the province to which they refer. They are particularly important both because of the unique weight of corruption in the public procurement market and because of the institutional purposes of ANAC. The source of the information is in fact the National Database of Public Contracts (BDNCP), an asset of great value that, for the quantity and detail of the data it contains, relating to about 70 million contracts, represents a unique experience at the European level. The availability of this database allows for the computation of corruption risk indicators with an extreme degree of territorial, sectoral, and temporal detail.
Based on an increasingly important and substantial body of scientific studies, ANAC has identified 17 indicators that, in various ways, capture aspects highlighting potential corruption in the context of public procurement, thus signaling the risk of corruption in every Italian province.
Examples of corruption risk indicators in public procurement are the use of discretionary procedures [3] or tenders with very few bidders [4], but also delays and cost overruns [9]. The literature finds that low competition in tenders associated with more discretion is typically a signal of corruption risk [8]. Other examples of contract-level red flags for corruption can be found in [10, 11, 12].
The portal allows for the calculation of synthesis indicators according to different risk thresholds, obtained by condensing the information coming from all or part of the 17 indicators. For each of the selected indicators, in fact, it is possible to highlight the provinces whose value is riskier than that of a given percentage of provinces. The threshold value can be freely chosen from the 75th to the 99th percentile.
Finally, five indicators were calculated at the level of the single administration, in this case the 745 Italian municipalities with a population equal to or greater than 15,000 inhabitants. These indicators were calculated based on the statistical analysis of the relationships between variables potentially related to corruption and episodes that occurred at the level of the single administration.
3. Artificial Intelligence and Anti-Corruption
What are the potential future developments and opportunities opened by technological innovation, which is evolving with unprecedented speed, particularly regarding artificial intelligence? First, an opportunity arises from the increasing availability of information in large public databases of various kinds which, if properly used, allow for the extraction of potentially very useful indicators. The joint use of separate databases is very advantageous, based on the principle that the value of data tends to grow more than proportionally with the combination of different sources. In the Italian case, though, several databases are often owned by distinct public administrations. Their joint use is hindered by several factors, including concerns about privacy protection. The need to overcome such impediments is particularly urgent today, with the spread of tools and techniques for analyzing so-called "big data," which the Italian public administration generates in increasing measure. Such data can unleash their potential to support a public debate that is anchored in the evidence of
facts and can help policymakers to take more informed decisions.
Another important aspect is certainly the digitalization of the procurement lifecycle, an important and difficult transformation process that is occurring worldwide. In Italy, digitalization has been expressly envisaged by the new Public Contract Code. First of all, digitalization could in itself constitute an effective measure for the prevention of corruption, as it is likely to bring a higher degree of transparency, traceability, participation, and control of activities, potentially suitable to ensure compliance with legality [1, 2, 13, 18]. With the full implementation of the digitalization of the contract lifecycle, data should be "natively digital," which could not only improve the quality and completeness of information but also allow for the acquisition of additional data previously not collected by the BDNCP or collected in a very deficient manner. The information bases held by ANAC could therefore have in the future a role of great importance and greater centrality also in preventing and combating corruption and other phenomena (such as fraud, collusion, conflict of interest) that are strongly detrimental to the correct functioning of the market and to the effective and efficient allocation of resources in public procurement, including those financed with EU funds. Recent experiences of full digitalization of the public procurement process analyzed in the literature can be found in Ukraine [5, 6] and Georgia [21].
On the other hand, both Regulation (EU) 2021/241 of 12 February 2021 establishing the Recovery and Resilience Facility and Regulation (EU) 2021/1060 of 24 June 2021 laying down common provisions for different European funds provide that Member States implement effective control mechanisms on procurement based as much as possible on methodologies and tools for collecting and analyzing large volumes of information available in computerized databases, emphasizing the centrality of risk indicators as a fundamental tool for preventing and combating serious irregularities in this market, such as fraud, corruption, and conflicts of interest. The “Notice on tools to fight collusion in public procurement and on guidance on how to apply the related exclusion ground” (2021/C 91/01 of 18 March 2021) also emphasizes, with specific reference to collusion, the importance of indicators as a tool to combat distortions of competition, reaffirming the need for central authorities in Member States to collaborate increasingly and effectively in the analysis of procurement data, developing methodologies and tools that are simple and easy to apply to collect and analyze large volumes of information available in computerized databases.
The ever-greater availability of large data sets has also increasingly shifted attention to the potential for developing advanced algorithms, using big data analytics and artificial intelligence in addition to traditional statistical analyses [16]. Machine learning can help identify further and more targeted red flags concerning both the individual transaction and the purchasing activity of a certain administration or of the set of administrations in a certain territorial area. For instance, [17] studies a particular red flag for corruption, namely the degree of political connection of firms.
In this regard, AI anti-corruption tools can be defined as "data processing systems driven by tasks or problems designed to, with a degree of autonomy, identify, predict, summarize, and/or communicate actions related to the misuse of position, information and/or resources aimed at private gain at the expense of the collective good" [19].
Thanks to the processing of large volumes of data at current processing speeds, artificial intelligence can indeed contribute to uncovering patterns of corruption and identifying warning signs. Research on the potential of such tools for preventing and combating corruption is still in its initial phase [14], and so far there are not many concrete examples of application. Among those cited are the case of Brazil (anti-corruption tools based on artificial intelligence to monitor public spending, for example for cartel practices); the Chinese "Zero Trust" system to predict the risk that public officials are involved in corrupt practices; the "SyRI" algorithm used by the Dutch authorities to identify fraud in the social security sector, which was however dismantled in 2020 due to often discriminatory and biased results; and the Ukrainian "ProZorro" system to detect violations from public procurement data and prevent the misuse of public funds. Moreover, some authors have used data science techniques to construct networks of firms bidding in the same auctions in the Georgian public procurement market, in order to find possible networks of firms that collude to win public contracts [21]; other researchers use a neural network approach to detect corruption in the Spanish provinces [15], or methods from network science to analyze corruption risk at the EU level [20].
From this point of view, the indicators of the ANAC portal need to be “valid” in order to be used in the future in a targeted way and with a solid scientific basis, also for preventive purposes. This validation can be obtained thanks to techniques that go beyond the deductive reasoning that led to their
identification in the first place. In this regard, any further development of this line of research, already practiced in the case of the above-mentioned municipal risk indicators present in the ANAC portal, could rely on a validation methodology based on the distinction between:
a. "relevant events," which are summarized by the risk indicators, for example those related to public procurement, calculated thanks to the BDNCP;
b. "phenomena of possible corruption," as indicated by other types of sufficiently structured and numerous data, among these: judicial convictions for corruption crimes or, more generally, for crimes against the public administration (PA); reports received by ANAC; news articles related to episodes of corruption; dissolution of municipal councils for mafia infiltration, etc.
Validating the risk indicators (which summarize the "relevant events") means evaluating their capacity to "predict" the "phenomena of possible corruption". The procedure that could be tested draws on two families of statistical techniques. On the one hand, there are the so-called traditional statistical models, both parametric (such as regression models) and non-parametric. On the other hand, there are various types of machine learning techniques. In both cases, the analysis is carried out only on a subset of the available data, so that the predictive capacity of the estimated relationship can then be assessed "outside the sample" (on the part of the data not used). This makes it possible, among other things, to evaluate which indicators, among the alternatives considered, have the best predictive capacity. Finally, the potential of this approach should be considered where the programs used for the construction of the algorithms are made available to the public, in order to prevent such measures from being perceived as (or actually being) "black boxes". This is an important issue today, and will gain even greater prominence in the future, as the use of artificial intelligence techniques that can be particularly opaque spreads.
A first example of statistical validation has already been carried out within the project on measuring corruption and concerns the 5 risk indicators at the municipal level mentioned earlier, which are in fact significantly associated with the occurrence of corruption episodes in a single administration. Unlike the 48 context indicators and the 17 procurement indicators, which were calculated at the aggregated territorial level of the province, in this case the unit of analysis is the single Municipality understood as an entity.
Based on the results achieved, it is possible to identify various possible lines of development of predictive indicators, also using machine learning techniques. First, a cluster analysis could be developed on "infected" Municipalities (i.e., those characterized by at least one episode of corruption in the time period examined), with the aim of identifying subgroups of municipalities that present recurring organizational, governance, and managerial characteristics. The analyses conducted have indeed made it possible to identify, among the medium-large Italian Municipalities, those in which episodes of corruption occurred in the five-year period 2015-2019. It might therefore be interesting, within this group, to conduct a cluster analysis to identify the "similarity" between the Municipalities in which episodes of corruption were detected, classifying them based on: 1) organizational variables; 2) governance variables; 3) risk indicators in public procurement; 4) accounting variables. This type of investigation could make it possible to identify, within the "infected" municipalities, subgroups characterized by high internal homogeneity with respect to some features, and hence some organizational, governance, and managerial characteristics that are recurrent among the Italian Municipalities characterized by episodes of corruption.
A further extension could concern the application of the analysis to other sectors of public administration (e.g., health units) in order to identify potential corruption risk indicators linked to organizational, managerial, and accounting variables. Indeed, it is known in the literature that Public Administrations constitute a varied and heterogeneous set of entities, profoundly different in terms of institutional, organizational, managerial, and accounting arrangements, that operate in significantly different normative and regulatory contexts. In other words, corruption risk indicators could vary from one sector of the Public Administration to another, due to the specificities that characterize each of them.
Finally, a further interesting development could concern the use of the findings on corruption cases in municipalities to support the development of techniques for predicting corruption risk based on artificial intelligence. Municipalities are one of the areas of the PA where there is the greatest need to strengthen the analysis of context variables affecting corruption risk. The magnitude of this risk is presumably destined to grow with the use of EU funds to finance the numerous projects approved in the various territorial areas. In the most recent economic literature, the analysis of corruption risk in Italian municipalities has been conducted by some
researchers through the application of artificial intelligence techniques, such as machine learning, also in order to build predictive models of corruption in Italian municipalities, using as predictors a series of socio-economic, demographic, geographic, and biophysical variables drawn from the sector literature [7]. From this point of view, the analyses conducted within the ANAC project have led to the development of a database of corruption events for medium-large Italian Municipalities in the period 2015-2019, which has also made it possible to collect numerous organizational, governance, accounting, and procurement risk variables. The variables available in the datasets prepared during the project could therefore be used for the construction of new algorithms that can learn from this set of data for predictive purposes.
4. References
[1] P. Aarvik, Artificial intelligence – a promising anti-corruption tool in development settings? U4 Report Insights, Chr. Michelsen Institute (2019).
[2] I. Adam, M. Fazekas, Are emerging technologies helping in the fight against corruption? A review of the state of evidence, Information Economics and Policy 57 (2021).
[3] A. Abdou, A. Ágnes Czibik, B. Tóth, M. Fazekas, COVID-19 emergency public procurement in Romania: corruption risks and market behavior, Budapest, GTI-WP/2021:03, 2021.
[4] E. Auriol, S. Straub, T. Flochel, Public procurement and rent-seeking: the case of Paraguay, World Development, 77 (2016) 395–407.
[5] B. Baránek, V. Titl, L. Musolff, Detection of collusive networks in e-procurement, 2021.
[6] S. Baumann, M. Klymak, Paying over the odds at the end of the fiscal year. Evidence from Ukraine (2022).
[7] G. de Blasio, A. D'Ignazio, M. Letta, Gotham city. Predicting ‘corrupted’ municipalities with machine learning, Technological Forecasting & Social Change, 184 (2022).
[8] F. Decarolis, R. Fisman, P. Pinotti, S. Vannutelli, Rules, discretion, and corruption in procurement: evidence from Italian government contracting, NBER Working Paper 28209 (2020).
[9] F. Decarolis, C. Giorgiantonio, Corruption red flags in public procurement: new evidence from Italian calls for tenders, EPJ Data Science 11 (2022).
[10] M. Fazekas, I. J. Tóth, L. P. King, An objective corruption risk index using public procurement data, European Journal of Criminal Policy and Research, 22 (2016) 369–397.
[11] M. Fazekas, L. Cingolani, B. Tóth, A comprehensive review of objective corruption proxies in public procurement: risky actors, transactions, and vehicles of rent extraction, Budapest, GTI-WP/2016:03, 2017.
[12] M. Fazekas, G. Kocsis, Uncovering high-level corruption: cross-national objective corruption risk indicators using public procurement data, British Journal of Political Science 50 No.1 (2020) 155-164.
[13] J. Ferwerda, I. Deleanu, B. Unger, Corruption in public procurement: finding the right indicators, European Journal on Criminal Policy and Research, 23 (2017) 245–267.
[14] J. Li, W. H. Chen, Q. Xu, N. Shah, J.C. Kohler, T.K. Mackey, Detection of self-reported experiences with corruption on Twitter using unsupervised machine learning, Social Sciences & Humanities Open, 2 (2020).
[15] F.J. Lopez-Iturriaga, I.P. Sanz, Predicting public corruption with neural networks: an analysis of Spanish provinces, Social Indicators Research 140 (2018) 975-998.
[16] N. Kobis, C. Starke, I. Rahwan, Artificial intelligence as an anti-corruption tool (AI-ACT) -- potentials and pitfalls for top-down and bottom-up approaches. arXiv, 2102.11567, 2021.
[17] D. Mazrekaj, F. Schiltz, V. Titl, Identifying politically connected firms: a machine learning approach, OECD Global Anti-Corruption & Integrity Forum, March 20-21, 2019.
[18] F. Merenda, Legalità, algoritmi e corruzione: le tecniche di intelligenza artificiale potrebbero essere utilizzate nel e per il sistema di prevenzione della corruzione? Rivista italiana di informatica e diritto, 4 No.2 (2022) 23-38.
[19] F. Odilla, Bots against corruption: exploring the benefits and limitations of AI‑based anti‑corruption technology, Crime, Law and Social Change 80 (2023) 353-396.
[20] J. Wachs, M. Fazekas, J. Kertesz, Corruption risk in contracting markets: a network
science perspective, International Journal of Data Science and Analytics 12 (2021) 45–60.
[21] J. Wachs, J. Kertesz, A network approach to cartel detection in public auction markets, Scientific Reports 9 No.1 (2021).
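The percentile thresholds used for the synthesis indicators in Section 2 can be illustrated with a minimal Python sketch. It is not a description of how the ANAC portal is implemented: it assumes that a higher indicator value means higher risk, the province names and values are invented, and the function name `flag_provinces` and the choice of the "inclusive" interpolation method are assumptions made only for illustration.

```python
from statistics import quantiles


def flag_provinces(values, percentile=75):
    """Return the provinces whose indicator value lies above the chosen percentile.

    `values` maps a province name to the value of one risk indicator.
    Assumption: a higher value means a riskier province.
    """
    if not 75 <= percentile <= 99:
        raise ValueError("percentile must be between 75 and 99")
    # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(values.values(), n=100, method="inclusive")
    threshold = cuts[percentile - 1]
    return sorted(name for name, v in values.items() if v > threshold)


# Hypothetical data: share (in percent) of tenders with a single bidder.
single_bid_share = {
    "P1": 5, "P2": 10, "P3": 15, "P4": 20,
    "P5": 25, "P6": 30, "P7": 60, "P8": 80,
}

print(flag_provinces(single_bid_share, 75))  # ['P7', 'P8']
print(flag_provinces(single_bid_share, 90))  # ['P8']
```

Raising the threshold from the 75th towards the 99th percentile shrinks the set of highlighted provinces, which is the trade-off the adjustable threshold exposes to the user.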
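The out-of-sample validation of risk indicators discussed in Section 3 can be sketched in a similarly hedged way. The example below is not the procedure used in the ANAC project: it assumes a simple one-variable cut-off rule in place of a regression or machine learning model, a fixed split into estimation and hold-out units, and invented data. All function and variable names are hypothetical.

```python
def fit_threshold(xs, ys):
    """Pick the cut-off on one indicator that best separates the estimation sample."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(xs)):
        acc = sum((x >= cut) == bool(y) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut


def out_of_sample_accuracy(xs, ys, n_train):
    """Estimate the rule on the first n_train units and score it on the rest."""
    cut = fit_threshold(xs[:n_train], ys[:n_train])
    held_out = list(zip(xs[n_train:], ys[n_train:]))
    return sum((x >= cut) == bool(y) for x, y in held_out) / len(held_out)


def rank_indicators(indicators, ys, n_train):
    """Order candidate indicators by their accuracy on the held-out units."""
    scores = {
        name: out_of_sample_accuracy(xs, ys, n_train)
        for name, xs in indicators.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented data for 8 municipalities: 1 = a corruption episode was recorded.
episodes = [0, 0, 1, 1, 0, 1, 0, 1]
candidates = {
    "indicator_a": [1, 2, 8, 9, 1, 9, 2, 8],
    "indicator_b": [5, 1, 2, 9, 9, 1, 8, 2],
}

print(rank_indicators(candidates, episodes, n_train=4))
# [('indicator_a', 1.0), ('indicator_b', 0.25)]
```

In a real application the split would normally be random or repeated (for example through cross-validation), and accuracy would be only one of several possible measures of predictive capacity.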