NextProcurement: Challenges in Public Procurement in Spain
María Navas-Loro1
1
    Ontology Engineering Group, Universidad Politécnica de Madrid, Spain


                                         Abstract
                                         Public procurement accounts for around 14% of the annual Gross Domestic Product of the European
                                         Union. Despite its importance, data processing in this domain currently faces several challenges.
                                         This paper identifies the main challenges detected in the first phase of the European NextProcurement
                                         project, which aims to create a platform to help harmonize and enrich public procurement data in the
                                         European Union, focusing on the specific case of Spain.

                                         Keywords
                                         Public Procurement, NextProcurement, Challenges, Textual documents, Natural Language Processing,
                                         Spanish Public Procurement


1. Introduction
Public authorities in the European Union spend around 14% of the annual Gross Domestic
Product (about 2 trillion euros) on the purchase of services, utilities and supplies.1 Free access
to this data facilitates accountability and transparency in Europe. Many governments
therefore publish it on their own national open data portals (for instance, the
Spanish portal PLACE/PLASCP2), and different platforms have been developed to
improve both efficiency and transparency in European public procurement by exploiting this
information3 [1].
   There are several steps in the process from the publication of a tender to its execution and
payment. These steps, in turn, involve various administrative documents, such as the bid
itself and the technical criteria to be met, as well as the evaluation criteria for the different
proposals. The automatic processing of this information would greatly simplify and improve
the transparency of the public administration.
   However, the processing of these documents entails certain challenges for today’s data
processing and natural language processing, as it requires some domain knowledge. This paper
breaks down some of the main challenges, extracted from document analysis and from domain

Joint Proceedings of ISWC2022 Workshops: the International Workshop on Artificial Intelligence Technologies for Legal
Documents (AI4LEGAL) and the International Workshop on Knowledge Graph Summarization (KGSum), October, 2022
Email: mnavas@fi.upm.es (M. Navas-Loro)
Website: https://marianavas.linkeddata.es (M. Navas-Loro)
ORCID: 0000-0003-1011-5023 (M. Navas-Loro)
© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073


1 https://ec.europa.eu/growth/single-market/public-procurement_en
2 https://contrataciondelestado.es
3 https://opentender.eu/es/about/about-opentender


experts’ experience in Spanish public procurement in the context of the NextProcurement
project4, which aims to develop an open, harmonized and enriched public procurement data platform.
For each challenge we consider several factors: the need for experts and/or annotators, related
work in the public administration domain where available, and the feasibility of a solution,
both in terms of the resources required and of its cost relative to its expected benefit.
   The rest of the paper is organized as follows. Section 2 introduces the public procurement
tendering process in Spain, explaining the different stages in it. Section 3 explains the documents
involved in the tendering process. Sections 4 and 5 present the main challenges identified,
organized according to the stage of the process (among those explained in Section 2) in which
they appear. Finally, Section 6 concludes by identifying the most promising tasks to
tackle in light of the analysis performed.


2. Public Procurement Process in Spain
Figure 1 shows the different stages through which a public tender passes in Spain. Each of these
phases is briefly explained below:

   1. Tender preparation: The public administration identifies a need (whether for services or
      material resources) and prepares a tender, which involves the drafting of documentation.
      This documentation includes a series of criteria, both technical and economic, that the
      bids must meet to be selected. It must also detail, as far as possible, how the bids will
      be evaluated. Possible improvements at this stage include suggesting wording to the public
      employee based on similar existing tenders, and recommending keywords and a classification
      within a taxonomy.
   2. Tender publication: Once the tender is ready, it is published on the various public platforms
      available. To facilitate access for potential bidders, it is important to include correct
      metadata and, as far as possible, to publish it as semantic information.
   3. Bid presentation: During a limited period of time, potential suppliers can send their offers
      to the public administration. The number of potential suppliers of different sizes, as well
      as their volume, will depend, among other factors, on the classification and accessibility
      of the tender in the previous stages. An accurately classified tender will reach its target
      bidders, while a wrongly classified one will be difficult to find.
   4. Bid evaluation: Once the deadline for submission of bids has expired, the public administration
      in charge evaluates the different bids based on the previously specified criteria.
      The ability to detect patterns of behaviour or even illegalities such as collusion5 among
      bidders would facilitate this process.
   5. Adjudication: The previous stage will lead to the award to the bidder with the best
      evaluation, who will have to adhere to the specified conditions.

    4 http://nextprocurement-project.com/
    5 According to the Cambridge Dictionary, collusion is an “agreement between people to act together secretly or illegally in order to deceive or cheat someone”.


   6. Execution: The awarded bidder will supply the materials or services within the stipulated
      execution period.
   7. Payment: Following completion of the service, the public administration proceeds to pay
      for the service.


Figure 1: Timeline of a public tender in Spain. Big circles represent steps performed by the public
administration, while the small ones are done by potential/awarded suppliers.


   We could additionally consider a later phase, in which administrations consult data from
past tenders for internal use.


3. Documentation
The documentation of a tender usually contains a main document (either a document per se or
metadata entered in a platform) that details the service to be provided, usually including a
title, a description and information such as its CPV classification.
CPV codes (Common Procurement Vocabulary codes)6 help classify public procurement processes
in the European Union across different languages. Each European public procurement
process must be classified with at least one CPV code, among the more than 9000 possible codes
in the CPV taxonomy.
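   Because the CPV taxonomy is hierarchical, the broader codes that contain a given code can be derived from its digits alone. The following is a minimal sketch, assuming the standard CPV layout (eight digits plus an optional check digit, where the first two digits identify the division, the first three the group, the first four the class and the first five the category). The function name and the example code are illustrative, and the sketch does not check that a derived code actually exists in the taxonomy.

```python
def cpv_ancestors(code: str) -> list[str]:
    """Return the chain of CPV codes from the division down to `code`."""
    digits = code.split("-")[0]  # drop the optional check digit suffix
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"not a CPV code: {code!r}")
    chain: list[str] = []
    # Keep the first 2, 3, ... 8 digits and pad with zeros; skip repeats,
    # which appear when the code is already a general one.
    for depth in range(2, 9):
        candidate = digits[:depth].ljust(8, "0")
        if not chain or chain[-1] != candidate:
            chain.append(candidate)
    return chain


example_chain = cpv_ancestors("33651100")
print(example_chain)
# ['33000000', '33600000', '33650000', '33651000', '33651100']
```

   Such a chain makes it possible to compare annotations made at different levels of granularity, for instance by checking whether one assigned code is an ancestor of another.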
   There can be additional documents attached to the tender. They are not compulsory, and
if present they are usually referenced through URLs that point to scanned documents, which must be
read using OCR techniques. The most important ones are called “Pliegos” in Spanish, and are
detailed below.


   6 https://simap.ted.europa.eu/web/simap/cpv


3.1. Pliegos
3.1.1. General Administrative Clause Specifications (Pliego Cláusulas Administrativas
       Generales, PCAG)
PCAG includes the rules and conditions that apply, in a general manner, to all the contracts of a
certain Public Administration. Regarding how bids will later be evaluated, we can
distinguish two different types of criteria:

    • Non-automatically evaluable criteria: Their quantification depends on a value judgment. Mathematical
      formulas are not applicable to them and they are therefore considered subjective
      criteria, although the public administration must make an effort to specify the aspects to
      be valued in each one of them.
    • Automatically evaluable criteria: Mathematical formulas are applied, so they are considered
      objective. They include the economic offer (to which a formula is always applied) and those technical
      criteria that can also be evaluated through formulas (e.g., the memory of a computer).
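As an illustration of an automatically evaluable criterion, the sketch below scores the economic offer with an inverse-proportional formula: the cheapest bid obtains the maximum number of points and every other bid obtains points in proportion to the ratio between the lowest price and its own price. This particular formula, the function name and the figures are assumptions made for the example; each pliego defines its own formula.

```python
def economic_scores(prices: dict[str, float], max_points: float) -> dict[str, float]:
    """Score each bid as max_points * lowest_price / bid_price (assumed formula)."""
    if not prices or any(price <= 0 for price in prices.values()):
        raise ValueError("prices must be a non-empty mapping of positive amounts")
    lowest = min(prices.values())
    return {bidder: max_points * lowest / price for bidder, price in prices.items()}


# Hypothetical bids (in euros) for a criterion worth 40 points.
scores = economic_scores({"A": 80_000.0, "B": 100_000.0, "C": 160_000.0}, max_points=40.0)
print(scores)  # bidder A gets 40 points, B gets 32 and C gets 20
```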

3.1.2. Particular Administrative Clause Specifications (Pliego Cláusulas
       Administrativas Particulares, PCAP)
PCAP includes the rules specific to the tender at hand, the award criteria and the legal conditions
of the contract. The elements it covers include:

    • The object, budget and cost, execution time, capacity and solvency required to bid, etc.
    • The awarding of the contract. The procedure, how and where the bids must be submitted,
      how the contracting company will be selected and how the contract will be formalized.
    • The execution of the contract, including labour, social and economic obligations the
      contractor company has to comply with.
    • The prerogatives of the Administration, jurisdiction and remedies. This part defines the
      privileges and jurisdiction that will apply in the event of disagreement between the
      parties, and how appeals should be handled.
    • The evaluation criteria for the offers, among other annexes.

3.1.3. Technical Prescription Clause Specifications (Pliego Cláusulas Prescripciones
       Técnicas, PPT)
This pliego includes the minimum technical conditions required in the project, supply, service,
work, etc. It details the characteristics of the supply, work or service
required by the Administration, as well as those aspects of improvement that will be evaluated
in the bids. Anything that is not stated in the specifications cannot be evaluated, except
when the criteria include an ‘other improvements’ item, under which aspects that have not been
specified in each section can be evaluated.


4. Challenges in Tender Preparation
4.1. Tender Drafting
The wording of tenders may present inaccuracies that are difficult to manage. Many
imprecise words, such as “solvency”, are frequently used without making clear what they
imply. Additionally, tenders asking for the same products or services are worded in many different
ways depending on the writer, and the same drafting work is repeated multiple
times, which is highly inefficient.
   Thus, it would be desirable to detect the most poorly written parts of existing tenders and
create a taxonomy of the most serious errors. For this purpose, words that tend to be imprecise, such
as “solvency”, could be detected. Similarly, a style recommender could be built on
the basis of well-written tenders. If, for example, a public administration wants to publish a tender
for drugs of a certain type, it would be desirable to locate similar tenders to use
as a reference, based on text similarity but also on other aspects (such as budget).
Finally, a metric could also be designed to measure the clarity and readability of tenders.
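   The retrieval of similar tenders can be sketched with a very simple text similarity measure. The example below, offered only as a baseline under stated assumptions, represents each text as a bag of words and ranks past tenders by cosine similarity to a draft. The tender identifiers and texts are invented for illustration, and other aspects mentioned above, such as budget, are not modelled.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(draft: str, past_tenders: dict[str, str], top_k: int = 3) -> list[tuple[str, float]]:
    """Rank past tenders by similarity to the draft, most similar first."""
    draft_vec = Counter(tokenize(draft))
    scored = [(tid, cosine(draft_vec, Counter(tokenize(text)))) for tid, text in past_tenders.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical past tenders.
past = {
    "T1": "supply of antibiotic drugs for the regional hospital",
    "T2": "maintenance of street lighting",
    "T3": "supply of office furniture",
}
ranking = most_similar("supply of antiviral drugs for hospital pharmacies", past)
print([tender_id for tender_id, _ in ranking])  # ['T1', 'T3', 'T2']
```

   In practice, stop words such as "of" would need to be removed or down-weighted (for instance with TF-IDF), since they inflate the similarity between unrelated tenders.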
   The main obstacle in dealing with these issues is the need for annotations and expert knowledge
to detect and classify the main errors. Natural Language Processing could help in these tasks
but would require annotated corpora to correctly capture the particularities of the domain.
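   Before any annotated corpus exists, a first approximation to the detection of imprecise wording is a watch list of terms compiled by experts. The sketch below returns the sentences in which such terms occur so that they can be reviewed. Only “solvency” comes from the discussion above; the second term, the sample text and the function name are assumptions for illustration.

```python
import re


def flag_imprecise_terms(text: str, watch_list: list[str]) -> list[tuple[str, str]]:
    """Return (term, sentence) pairs for every sentence containing a watch-list term."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        lowered = sentence.lower()
        for term in watch_list:
            # Whole-word match, case-insensitive.
            if re.search(rf"\b{re.escape(term.lower())}\b", lowered):
                hits.append((term, sentence))
    return hits


sample = "The bidder must prove adequate solvency. Delivery takes place within 30 days."
flags = flag_imprecise_terms(sample, ["solvency", "adequate"])
for term, sentence in flags:
    print(f"{term}: {sentence}")
```

   Such a list only points reviewers to candidate passages; deciding whether a term is actually imprecise in context still requires the expert knowledge discussed above.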

4.2. CPV Code Assignment
As previously mentioned, Common Procurement Vocabulary codes (CPVs) help classify public
procurement processes in the European Union. Thanks to CPVs, decision-makers can easily
explore contracting processes across Europe, and potential suppliers from different countries
may use them to detect procurement processes of interest, independently of their size or country
of origin.
   Each public procurement process must be classified with at least one CPV. However, manual
CPV classification presents three main challenges [2]. First, there are thousands of possible
codes (more than 9000), some of them with similar purposes, making it difficult for those
assigning or curating them to decide which codes better suit a specific process. This problem
is worsened with short descriptions, which are abundant among tenders. Second, different
administrations follow different classification guidelines. Third, since CPVs are organized in
a hierarchy, and thus annotated at different levels of granularity according to the annotator’s
or department’s criteria, some codes are often overgeneralized. These problems and proposed
solutions are discussed in more detail in the literature, since they have been previously
addressed both in academia [2, 3, 4, 5] and industry [6].
   The main problems reported regarding CPV classification are human error and the frequent use
of insufficiently specific CPV codes. A system that receives the description of a tender and recommends
the best fitting CPV codes could help with these issues. This kind of classification task requires
no additional human annotation, only already classified tenders, which can be found in different public
repositories (in the case of Spain, Hacienda7, PLACE, and also the European TED8). It is therefore an
   7 https://www.hacienda.gob.es/es-ES/GobiernoAbierto/Datos%20Abiertos/Paginas/LicitacionesContratante.aspx
   8 https://simap.ted.europa.eu/


affordable task from the point of view of the required specialized manpower, since no domain
annotators are needed. One problem we may encounter is the under-representation of
some CPV codes, but this can be mitigated by explicitly searching for tenders with these codes to
obtain a more balanced dataset, or even with data augmentation techniques.
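   To make the idea of a CPV recommender concrete, the following sketch suggests codes for a new description by looking at the most similar already classified tenders and letting them vote, weighted by similarity. It is a deliberately simple nearest-neighbour baseline and not the method of the cited works; the descriptions are invented and the labels "CPV-A" to "CPV-D" are placeholders rather than real CPV codes.

```python
import re
from collections import defaultdict


def tokens(text: str) -> set[str]:
    """Set of lowercase word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of tokens that two token sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_cpv(description: str, labelled: list[tuple[str, list[str]]],
                  k: int = 3, top_n: int = 2) -> list[str]:
    """Suggest up to top_n codes using similarity-weighted votes of the k nearest tenders."""
    query = tokens(description)
    neighbours = sorted(
        ((jaccard(query, tokens(text)), codes) for text, codes in labelled),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes: dict[str, float] = defaultdict(float)
    for similarity, codes in neighbours:
        for code in codes:
            votes[code] += similarity
    ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
    return [code for code, score in ranked[:top_n] if score > 0]


# Hypothetical classified tenders; a tender may carry several codes (multi-label).
labelled_tenders = [
    ("supply of antibiotic drugs for hospitals", ["CPV-A"]),
    ("purchase of drugs and medical consumables", ["CPV-A", "CPV-B"]),
    ("maintenance of street lighting", ["CPV-C"]),
    ("office furniture for the town hall", ["CPV-D"]),
]
suggested = recommend_cpv("supply of antiviral drugs", labelled_tenders)
print(suggested)  # ['CPV-A', 'CPV-B']
```

   The same voting scheme shows why under-represented codes are a problem: a code that rarely appears among the classified tenders can rarely be among the neighbours, and is therefore rarely suggested.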


5. Challenges in Bid Evaluation
5.1. Pattern Detection
Several of the indicators used at the European level to evaluate the performance of the different
states in terms of public procurement are related to the number of bidders that apply for a
tender. Figures 2 and 3 show different statistics per country that illustrate some of the problems
in current European public procurement.


Figure 2: Proportion of contracts awarded where there was just a single bidder (excluding framework
agreements). Figure extracted from https://single-market-scoreboard.ec.europa.eu/policy_areas/public-procurement_en


   The more companies that apply for a tender, the higher the quality of the final adjudication tends to be.
Additionally, when participation is low, it is difficult to avoid problems such as collusion
or the repeated awarding of contracts to the same companies without justified reasons. Analysing
competition would gradually improve the publication of tenders.
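   The single-bidder indicator of Figure 2 is straightforward to compute once the number of bids per tender is known. The sketch below assumes that framework agreements have already been filtered out and that tenders without bids are ignored; the function name and the figures are illustrative.

```python
def single_bidder_share(bids_per_tender: dict[str, int]) -> float:
    """Proportion of tenders with at least one bid that received exactly one bid."""
    with_bids = [n for n in bids_per_tender.values() if n > 0]
    if not with_bids:
        raise ValueError("no tenders with bids")
    return sum(1 for n in with_bids if n == 1) / len(with_bids)


# Hypothetical data: tender identifier -> number of bids received.
share = single_bidder_share({"T1": 1, "T2": 4, "T3": 1, "T4": 2})
print(share)  # 0.5
```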
   Possible solutions to these issues include clustering tenders by the number
and type of clauses they contain (e.g. social) to understand low bidding. Additionally, profiling
bidders (e.g. defining typologies of bidders and different levels of participation) would
allow an analysis of the relation between the type and number of bidders and the tenders (object,
clauses, etc.). This could also lead to a network of bidders and tenders that would facilitate collusion
detection.
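   A network of bidders can be sketched by counting how often two companies bid for the same tender. The example below is only an illustration under simple assumptions: the company names and tenders are invented, and a high co-bidding count is a signal to be examined by experts, not evidence of collusion in itself.

```python
from collections import Counter
from itertools import combinations


def co_bidding_counts(tenders: dict[str, set[str]]) -> Counter:
    """Count, for each pair of bidders, the tenders in which both took part."""
    pairs: Counter = Counter()
    for bidders in tenders.values():
        # Sorting makes (a, b) and (b, a) the same edge of the network.
        for a, b in combinations(sorted(bidders), 2):
            pairs[(a, b)] += 1
    return pairs


def frequent_pairs(tenders: dict[str, set[str]], min_count: int) -> list[tuple[tuple[str, str], int]]:
    """Pairs of bidders that coincide in at least min_count tenders."""
    return [(pair, n) for pair, n in co_bidding_counts(tenders).most_common() if n >= min_count]


# Hypothetical tenders and the companies that bid for each of them.
bids = {
    "T1": {"Alfa", "Beta"},
    "T2": {"Alfa", "Beta", "Gamma"},
    "T3": {"Alfa", "Beta"},
    "T4": {"Gamma", "Delta"},
}
recurrent = frequent_pairs(bids, min_count=2)
print(recurrent)  # [(('Alfa', 'Beta'), 3)]
```

   The resulting pair counts are the weighted edges of the bidder network, on which the typologies defined by experts could later be overlaid.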
   Regarding requirements and feasibility, basic Machine Learning techniques could be tested
against available data; no annotation is required, just expert knowledge for the design of the
typology.


Figure 3: Proportion of procurement procedures that were negotiated with a company without
any call for bids. Figure extracted from https://single-market-scoreboard.ec.europa.eu/policy_areas/public-procurement_en


5.2. Clauses and Evaluation
As previously reported, when drafting a tender it is common for many of the clauses to
repeat or resemble those of previous tenders. This similarity could also be used to automate
the evaluation of similar clauses. Developing a clause typology (e.g., social, environmental, or
innovation related) that organizes clauses and helps build a repository of them would significantly
streamline the entire process of drafting and evaluating tenders, and would also facilitate the creation of
evaluation models.
   Additionally, some tenders omit clauses that they could have included.
Sometimes, besides these clauses, we find “special execution conditions”, but it is not clear how
they affect bidding and execution: do they lead to fewer bidders? How are these special conditions
later evaluated?
   As mentioned, a possible solution would be to identify clauses and build a repository that relates
them to objects, topics and CPVs, so that they could be easily reused and statistics could be
retrieved. The same applies to the validation criteria related to clauses and special conditions.
With this information, we may be able to recommend clauses given the object of a new tender,
because similar tenders included them. This work would lead to the creation of a Knowledge
Graph that relates all this information to information extracted for other problems (e.g.,
bidders).
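   Once clauses have been identified and typed, a repository of this kind supports a very simple recommendation strategy: suggesting the clause types that appear most often in past tenders with the same classification. The sketch below assumes that such typed clauses are already available, which is precisely the part that requires expert knowledge; the repository contents and the placeholder labels "CPV-A" and "CPV-B" are invented for illustration.

```python
from collections import Counter


def recommend_clauses(cpv: str, repository: list[tuple[str, list[str]]], top_n: int = 2) -> list[str]:
    """Return the clause types most frequent among past tenders classified with `cpv`."""
    counts: Counter = Counter()
    for tender_cpv, clause_types in repository:
        if tender_cpv == cpv:
            counts.update(set(clause_types))  # count each type once per tender
    return [clause for clause, _ in counts.most_common(top_n)]


# Hypothetical repository: (classification of the tender, types of clauses it included).
clause_repository = [
    ("CPV-A", ["social", "environmental"]),
    ("CPV-A", ["social"]),
    ("CPV-A", ["social", "innovation", "environmental"]),
    ("CPV-B", ["innovation"]),
]
suggested_clauses = recommend_clauses("CPV-A", clause_repository)
print(suggested_clauses)  # ['social', 'environmental']
```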
   Nevertheless, despite the obvious usefulness of these solutions, the main issue when tackling
these tasks is the need for expert knowledge.

5.3. Questions over tenders
In order to know the impact of tenders, to avoid collusion and similar problems, and
to improve transparency, public administrations want to be able to query tender
platforms with questions such as “How many pills for X disease were bought last year?” or “How many
tenders was company Y awarded?”.
   Different Natural Language Processing tools, such as Named Entity Recognition and
Disambiguation, could be applied to tenders to derive triples. Once all the information is stored as triples,
a Question Answering system could be built to query the knowledge graph. Additionally, to
make the system easier to query, questions could be posed in natural language and translated into SPARQL queries.
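   The following sketch shows, in miniature, how a question such as “How many tenders was company Y awarded?” reduces to a pattern over triples. It uses a plain in-memory list instead of an RDF store, and the identifiers and predicate names (awardedTo, hasCPV) are hypothetical; in a real setting the same pattern would be expressed as a SPARQL query over the knowledge graph.

```python
from typing import Optional

Triple = tuple[str, str, str]


def match(triples: list[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> list[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s) and (p is None or t[1] == p) and (o is None or t[2] == o)
    ]


# Hypothetical triples, as they might be derived from tender documents.
graph = [
    ("tender:1", "awardedTo", "company:Y"),
    ("tender:2", "awardedTo", "company:Z"),
    ("tender:3", "awardedTo", "company:Y"),
    ("tender:1", "hasCPV", "CPV-A"),
]

# "How many tenders was company Y awarded?"
# Roughly: SELECT (COUNT(?t) AS ?n) WHERE { ?t :awardedTo :companyY }
awarded_to_y = match(graph, p="awardedTo", o="company:Y")
print(len(awarded_to_y))  # 2
```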
   This work would depend on solutions to the problems previously presented, such as processing
tenders, clauses, and bidders. Once these are tackled, different Question Answering approaches
could be tested, including ones built on search tools such as ElasticSearch9.


6. Conclusions
The list of challenges presented is not exhaustive, but summarizes the main problems detected
in the first phase of the NextProcurement project.
   As previously mentioned, not all of them are equally feasible, and some of them require
expert knowledge that is difficult to obtain. In addition, annotation tasks and the creation of
typologies are often cumbersome and time-consuming for people not familiar with them,
which significantly complicates their completion. Therefore, the first steps to be taken in the
context of the NextProcurement project will focus on the CPV code assignment (already started
[2]) and the identification of patterns, since the data required for them is already available. In
the meantime, work will be done on the possible expert annotation of clauses in tenders in
order to try to address the other challenges identified.


Acknowledgments
This work has been supported by NextProcurement European Action (grant agreement
INEA/CEF/ICT/A2020/2373713-Action 2020-ES-IA-0255).


References
[1] A. Soylu, O. Corcho, B. Elvesæter, C. Badenes-Olmedo, F. Yedro-Martínez, et al., Data
    quality barriers for transparency in public procurement, Information 13 (2022). URL:
    https://www.mdpi.com/2078-2489/13/2/99. doi:10.3390/info13020099.
[2] M. Navas-Loro, D. Garijo, O. Corcho, Multi-label text classification for public procurement
    in Spanish, Procesamiento del Lenguaje Natural 69 (2022) 73–82. URL:
    http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6429.
[3] O. Ahmia, Assisted strategic monitoring on call for tender databases using natural language
    processing, text mining and deep learning, Ph.D. thesis, Université de Bretagne Sud, 2020.
[4] S. Kayte, P. Schneider-Kamp, A mixed neural network and support vector machine model
    for tender creation in the European Union TED database, in: Proceedings of the 11th International
    Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge
    Management, INSTICC, SciTePress, 2019, pp. 139–145. doi:10.5220/0008362701390145.

   9 http://www.elastic.co/products/elasticsearch


[5] A. Suta, Multilabel text classification of public procurements using deep learning intent
    detection, Master’s thesis, KTH, Mathematical Statistics, 2019.
[6] Deloitte, Study on up-take of emerging technologies in public procurement, Technical
    Report, Deloitte, 2020.

