<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>SEUPD@CLEF Team RISE at LongEval: Improving Search by Crafting Titles and Matching URLs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Davide Furlan</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giulia Gibellato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Seyedeh Sara NaziriAlhashem</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emanuele Pase</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alberto Pasqualetto</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Filippo Tiberio</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicola Ferro</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Padua</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
<p>This report outlines the development of the RISE group's Information Retrieval (IR) system for the LongEval-WebRetrieval CLEF 2025 Lab. The objective was to design an efficient, scalable search engine capable of handling large-scale French collections with a focus on consistent performance. The proposed system incorporates a modular architecture, including a parser, an analyzer, an indexer and a searcher, together with query translation and expansion using the Gemini LLM, and a non-neural reranking component to enhance retrieval quality. Emphasis was put on optimizing indexing and searching speed through multi-threading, and on improving relevance by crafting a title for each document and by a URL-based document boosting grounded in the alignment between user queries and the document's URL. The evaluation followed a stepwise enhancement approach, beginning with a Lucene-based baseline.</p>
      </abstract>
      <kwd-group>
<kwd>Document Parsing</kwd>
        <kwd>Query Translation</kwd>
        <kwd>Query Expansion</kwd>
        <kwd>URL Manipulation</kwd>
        <kwd>CLEF 2025</kwd>
        <kwd>LongEval-WebRetrieval</kwd>
        <kwd>Information Retrieval</kwd>
        <kwd>Temporal Evolution</kwd>
        <kwd>Search Engine</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
<p>Enormous amounts of data are published online every day: to make them useful they must be stored, parsed,
indexed and retrieved efficiently.</p>
      <p>The best way to retrieve such data is through the implementation of search engines: a user asks
for information (i.e., issues a query) and the system returns the documents (webpages when talking
about websites) that best satisfy the initial request. Search engines play a fundamental role in
information retrieval across many different domains such as entertainment, healthcare, education, etc.</p>
      <p>
        Our team’s goal is to build an efficient search engine for the CLEF LongEval-WebRetrieval 2025
collection [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. In our work, we focus on computational speed and scalability while aiming for the best
results in terms of the Normalized Discounted Cumulated Gain (nDCG) metric.
      </p>
      <p>We will describe the implementation of multiple ideas with the aim of tuning their hyperparameters,
comparing them, and merging them into a final best-performing system. We then selected the 5
best-performing systems in terms of nDCG and analyzed their performance robustness
over time (the different snapshots of the dataset) and their statistical similarity.</p>
      <p>The paper is organized as follows: Section 2 links to LongEval’s previous years’ related works and
the foundational studies that influenced our methodology; Section 3 describes our approach and the
pipeline, showing the building blocks of our system; Section 4 describes the experimental
setup and links to the codebase; Section 5 discusses our results and the related statistical tests; finally,
Section 6 draws some conclusions and outlooks for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works</title>
      <p>At the beginning of our work, we spent a few days understanding which innovative or interesting
solutions our colleagues from previous years implemented. Below is some of the information we
gathered:
• JIHUMING team [2]: They highlighted the avoidance of Named Entity Recognition (NER) due to
efficiency issues.
• NEON team [3]: Avoidance of WordNet for query expansion: it did not yield good results because it
only works with English texts.
• QEVALS team [4]: They used OpenAI’s GPT 3.5 turbo for query expansion. From this idea, we
decided to use Google Gemini for query expansion.
• IRIS team [5]: From the IRIS team we took the idea of using an additional document field for the
URL, with the purpose of saving the web address of the document in case the query explicitly
asked for the URL. Additionally, from IRIS we took some ideas about stoplist sets and the use of the
French Light Stemmer.
• MOUSE team [6]: We used the MOUSE team’s project, last year’s best performing work
coming from UniPD, as a reference for the steps to follow and for which components could be the best
performing ones.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>Apache Lucene is the core framework that we used to build our search engine. It is a high-performance,
full-featured text search engine library written in Java. Lucene is widely adopted in the Information
Retrieval (IR) community. We also integrated Apache OpenNLP for Natural Language Processing
tasks, such as tokenization and Named Entity Recognition (NER), but we did not use it in our final system.</p>
      <sec id="sec-3-1">
        <title>3.1. Analyzer</title>
        <p>The key component of a search engine built with Apache Lucene is the analyzer. The analyzer processes
text data from both documents and queries before indexing and searching, breaking it down into tokens,
and applying various transformations or filters to improve the effectiveness of the search.</p>
        <p>It is implemented as RiseAnalyzer, a Java class that extends the Analyzer class from Lucene.</p>
        <p>Our implementation of the analyzer is dynamic, which means that it can be changed at runtime
without the need to recompile the code. This is achieved by using a JSON configuration file that
specifies the components of the analyzer. The configuration file contains a Tokenizer, and a list of
TokenFilters. Both standard (from Lucene) and OpenNLP analyzers are supported. The following is
the configuration file for our best performing analyzer:
Listing 1: Configuration file for the analyzer
{
  "tokenizer": {
    "kind": "core",
    "name": "org.apache.lucene.analysis.standard.StandardTokenizer"
  },
  "filters": [
    {
      "kind": "class",
      "name": "org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter"
    },
    {
      "kind": "class",
      "name": "org.apache.lucene.analysis.LowerCaseFilter"
    },
    {
      "kind": "elision",
      "name": "filters/FR_elision-articles.txt"
    },
    {
      "kind": "class",
      "name": "org.apache.lucene.analysis.fr.FrenchLightStemFilter"
    }
  ]
}</p>
        <p>This approach allows us to easily experiment with different analyzers and find the one that works
best for our data.</p>
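        <p>As an illustration, the following is a minimal sketch (not the actual RiseAnalyzer code) of how such a JSON-driven analyzer can be assembled with Jackson and reflection; the ConfigurableAnalyzer name is ours, and the special "elision" kind is omitted for brevity:</p>
        <p>import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;

public class ConfigurableAnalyzer extends Analyzer {
    private final String tokenizerClass;
    private final List&lt;String&gt; filterClasses = new ArrayList&lt;&gt;();

    public ConfigurableAnalyzer(File config) throws Exception {
        JsonNode root = new ObjectMapper().readTree(config);
        tokenizerClass = root.get("tokenizer").get("name").asText();
        for (JsonNode f : root.get("filters"))
            if ("class".equals(f.path("kind").asText()))   // "elision" etc. handled separately
                filterClasses.add(f.get("name").asText());
    }

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        try {
            Tokenizer tok = (Tokenizer) Class.forName(tokenizerClass)
                    .getDeclaredConstructor().newInstance();
            TokenStream stream = tok;
            for (String name : filterClasses) {
                // every standard TokenFilter exposes a (TokenStream) constructor
                Constructor&lt;?&gt; c = Class.forName(name).getConstructor(TokenStream.class);
                stream = (TokenStream) c.newInstance(stream);
            }
            return new TokenStreamComponents(tok, stream);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}</p>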
        <p>[Figure: Analyzer pipeline. Analyzer.initReader() performs text cleaning and normalization; Analyzer.createComponents() performs tokenization, stopword removal, and stemming/lemmatization.]</p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Text Cleaning and Normalization</title>
          <p>Document cleaning is a crucial preprocessing step in IR-based systems. Before tokenization, the pipeline
performs extensive normalization and cleaning on raw text, aiming to standardize information such
as dates and phone numbers to maximize matching with queries containing dates or phone numbers,
while also removing irrelevant symbols, characters, and few-digit numbers.</p>
          <p>A modular TextPreprocessor Java class has been developed, which performs several
transformations:
Date Normalization French and English natural language dates (e.g., 26 janvier 1984, January 26th,
1984) are converted into a unified dd_mm_yyyy format using regular expressions and
language-specific month mappings. Numeric date formats (e.g., 26 01 1984) are also normalized to the same
structure.</p>
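          <p>A hedged sketch of the date normalization step follows; the regular expression and the month map are illustrative reconstructions of what is described above, not the exact TextPreprocessor code:</p>
          <p>import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateNormalizer {
    // Accented and plain spellings both mapped to the month number.
    private static final Map&lt;String, String&gt; FR_MONTHS = Map.ofEntries(
        Map.entry("janvier", "01"), Map.entry("fevrier", "02"), Map.entry("février", "02"),
        Map.entry("mars", "03"), Map.entry("avril", "04"), Map.entry("mai", "05"),
        Map.entry("juin", "06"), Map.entry("juillet", "07"), Map.entry("aout", "08"),
        Map.entry("août", "08"), Map.entry("septembre", "09"), Map.entry("octobre", "10"),
        Map.entry("novembre", "11"), Map.entry("decembre", "12"), Map.entry("décembre", "12"));

    private static final Pattern FR_DATE = Pattern.compile(
        "\\b(\\d{1,2})\\s+(janvier|f[ée]vrier|mars|avril|mai|juin|juillet|ao[ûu]t"
        + "|septembre|octobre|novembre|d[ée]cembre)\\s+(\\d{4})\\b",
        Pattern.CASE_INSENSITIVE);

    /** Rewrites "26 janvier 1984" as "26_01_1984", leaving the rest untouched. */
    public static String normalize(String text) {
        Matcher m = FR_DATE.matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String day = String.format("%02d", Integer.parseInt(m.group(1)));
            String month = FR_MONTHS.get(m.group(2).toLowerCase());
            m.appendReplacement(out, day + "_" + month + "_" + m.group(3));
        }
        m.appendTail(out);
        return out.toString();
    }
}</p>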
          <p>Phone Number Normalization French phone numbers in various formats are standardized to a plain
digit format (e.g., 0612345678) to improve matching and reduce token ambiguity.
Short Number Removal Standalone numbers with 1 to 3 digits that are not part of dates or phone
numbers are removed to eliminate noise, which commonly arises from HTML or CSS code coming
from the scraping process.</p>
          <p>Character Cleanup The pipeline filters out emojis, hashtags, HTML and CSS code, invisible Unicode
characters, and styling directives, using a compiled regex pattern.</p>
          <p>The cleaning process is integrated into the RiseAnalyzer class by overriding the initReader()
method. This ensures that all text passed through the Lucene pipeline is cleaned consistently and
efficiently before tokenization.</p>
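          <p>The following minimal sketch shows this integration point; the TextPreprocessor stub stands in for the real cleaning logic, which is not reproduced here:</p>
          <p>import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class CleaningAnalyzer extends Analyzer {

    // Stand-in for the TextPreprocessor described above (hypothetical API).
    static final class TextPreprocessor {
        static String clean(String s) {
            return s.replaceAll("\\p{C}+", " ");   // e.g., strip invisible characters
        }
    }

    @Override
    protected Reader initReader(String fieldName, Reader reader) {
        // Drain the incoming reader, clean the text, and hand Lucene a fresh
        // Reader, so every field is normalized before tokenization.
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[8192];
            for (int n; (n = reader.read(buf)) != -1; ) sb.append(buf, 0, n);
            return new StringReader(TextPreprocessor.clean(sb.toString()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer source = new StandardTokenizer();
        TokenStream result = new LowerCaseFilter(source);
        return new TokenStreamComponents(source, result);
    }
}</p>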
          <p>One particular challenge was preserving semantically meaningful tokens while eliminating noise:
to address this, the cleaning module protects normalized entities (dates and phone numbers) during
filtering and restores them afterwards. This multi-step protection strategy minimizes false deletions.</p>
          <p>By adopting this robust cleaning pipeline, both the quality of the index and the consistency of
token-level features are improved, which benefits later stages of query matching.</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Tokenization</title>
          <p>Tokenization is the fundamental step of any Analyzer and its quality significantly affects the performance.
We have explored the following tokenizers with the goal of identifying the most suitable one for our
pipeline:
WhitespaceTokenizer (from Lucene) it is a simple tokenizer that splits text on whitespace
characters. While fast and lightweight, it lacks the sophistication to handle punctuation or
language-specific rules.</p>
          <p>LetterTokenizer (from Lucene) it splits tokens at non-letter characters. Although slightly more
refined than WhitespaceTokenizer, it still falls short in treating acronyms, contractions, and
numeric data appropriately.</p>
          <p>OpenNLPTokenizer (from OpenNLP) these models give more linguistically-aware tokenizers.</p>
          <p>They need to be configured with French (or English) sentence and token models. While these
tokenizers offer accurate segmentation and sentence-level analysis, they introduce significant
runtime overhead, and they focus on just one language.</p>
          <p>StandardTokenizer (from Lucene) this is a widely adopted tokenizer in the Lucene framework. It
provides robust handling of punctuation, alphanumeric strings, email addresses, acronyms and
other linguistic patterns using a grammar-based approach. It balances accuracy and performance
efectively without the need for external models.</p>
          <p>The latter tokenizer is the one we decided to adopt in our pipeline. This decision was guided not
only by our empirical observations and by the desire to keep the indexing step as lightweight as possible, but
also by findings in prior reports (described in Section 2). The key reasons behind this choice
include proven reliability, balanced trade-offs, and a degree of community consensus.</p>
        </sec>
        <sec id="sec-3-1-3">
          <title>3.1.3. Stemming and Lemmatization</title>
          <p>The stemmer constitutes the second major step in the offline phase of an IR pipeline. Its primary
role is to reduce words to their root forms, enabling the system to match various inflected or derived
versions of a word to a common base. This not only helps improve recall by supporting more flexible
term matching, but also reduces index size. By collapsing word variants into unified representations,
stemming introduces a level of semantic normalization that enhances search effectiveness. An example
of stemming is the reduction of the words change, changing, and changer to the root form chang. Several
stemmers like PorterStemmer and SnowballStemmer are available in the Lucene library.</p>
          <p>A similar tool is the lemmatizer, which also reduces words to their base forms, but does so by transforming
words into linguistically correct lemmas. For example, a lemmatizer would convert better to good and
went to go. In Lucene, lemmatizers are implemented through OpenNLP and they need a language model
to work.</p>
          <p>Lemmatization is more accurate than stemming, but it is also way more computationally expensive.
In the context of our project, where we are dealing with a large collection of documents, we opted for
stemming to ensure a faster indexing process.</p>
          <p>The chosen stemmer is the FrenchLightStemmer [7], which is a stemmer distributed with Lucene
and is specifically designed for the French language. The choice of the stemmer has also been directed
by previous works by Cazzador et al. [6] and Galli et al. [5].</p>
          <p>Stoplists Stopwords are very frequent words that do not help to discriminate between documents,
and are usually removed from the text prior to indexing.</p>
          <p>We tried implementing different stoplists in our indexing process. We tested both stoplists found on
the internet and custom stoplists crafted starting from most frequent words in the corpus.</p>
          <p>Since some documents are not in French and, even in French passages, English words may appear,
we also considered a stoplist for English words. We have been directed to this approach by the previous
work of the IRIS team [5].</p>
          <p>At first, we used downloaded stoplists to see how they would affect the indexing, but after taking a
closer look at the output tokens, we realized that some very common and irrelevant words were still
not being filtered out. To fix this, we decided to create some custom stoplists from the most frequent
words in the corpus, trying to get a cleaner and smaller index and improved search result quality.
Most frequent words were retrieved by accessing the index using the Apache Lucene Luke tool.
• FR_stoplist.txt: a French stoplist found on the internet. It contains 463 words.
• customFrenchStoplist.txt: a custom stoplist created by us by merging the most popular
stoplists. It contains 552 words.
• SMART.txt: the English stoplist of the SMART retrieval system [8]. It contains 571 words.
• 500-custom.txt: a custom stoplist built from the most frequent 500 words in the corpus.
• 250-custom.txt: a custom stoplist built from the most frequent 250 words in the corpus.
• 125-custom.txt: a custom stoplist built from the most frequent 125 words in the corpus.
• 50-custom.txt: a custom stoplist built from the most frequent 50 words in the corpus.</p>
          <p>As we will detail in section 3.4.1, the use of stoplists has a negative impact on the quality of the
results for this dataset.</p>
          <p>Another filter we have implemented, similar in spirit to stemming, is the ElisionFilter, a Lucene
filter that removes elided French articles and prepositions (those contracted with an apostrophe) from the tokens. In the French
language many words are formed by the contraction of a preposition or article with the following word: for example,
l’avion (the plane) will be tokenized as avion (plane). To apply such a filter we need to provide a list of the
articles to be removed, which can be found in the FR_elision-articles.txt file. We observed
this filter to be effective, so we included it in our baseline.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Indexer</title>
        <p>To enable efficient retrieval and structured storage of documents, we have developed an indexing
component implemented in the RiseIndexer class.</p>
        <p>This class is responsible for orchestrating the entire indexing workflow: reading raw documents
from the parsed corpus, analyzing their textual content through the analyzer, and writing the resulting
indexable fields.</p>
        <p>In our implementation, the indexing logic is designed to be modular, extensible and optimized for
performance through multi-threading. The indexer leverages concurrent workers to process and index
documents in parallel, significantly reducing the overall indexing time when handling large corpora.</p>
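        <p>A sketch of the pattern follows, relying on the fact that Lucene’s IndexWriter is thread-safe; the document-building logic is elided:</p>
        <p>import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;

public class ParallelIndexer {
    /** Submits one indexing task per document to a fixed-size worker pool. */
    public static void indexAll(IndexWriter writer, Iterable&lt;Document&gt; docs)
            throws InterruptedException {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (Document doc : docs) {
            pool.submit(() -> {
                try {
                    writer.addDocument(doc);   // IndexWriter handles concurrent calls
                } catch (Exception e) {
                    e.printStackTrace();       // a real implementation would log and recover
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);   // wait for all workers to finish
    }
}</p>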
        <sec id="sec-3-2-1">
          <title>3.2.1. Language Detector</title>
          <p>During the development of the search engine, one of the core challenges we have addressed was the
support for multilingual documents, specifically in French and English. To this end, we have initially
integrated a language detection component using an OpenNLP model. However, this turned out to be a
complex and critical aspect of the system design.</p>
          <p>Our idea was to automatically detect the language of each document at indexing time and then
apply the most appropriate language-specific analysis pipeline (for example a French analyzer for the
French documents and an English analyzer for the English ones). This logic has been encapsulated
inside a custom RiseAnalyzerWrapper class. The wrapper acts as an abstraction layer over multiple
analyzers, dynamically choosing the correct one for each document based on the detected language,
without requiring manual separation or prior annotation of the documents.</p>
          <p>Despite the theoretical value of this approach, we encountered several practical and architectural
challenges:
Thread safety and concurrency integrating dynamic analyzer switching into a multi-threaded
indexing pipeline introduced complexity and potential synchronization issues;
Performance concerns given the large number of documents, the overhead introduced by on-the-fly
language detection was substantial. Indexing time was observed to increase up to threefold,
which significantly impacted efficiency.</p>
          <p>To better understand the implications of our design choices, we have run experiments using the
language detector along with a set of counters (now commented out in the code), used to track the
distribution of languages across documents, identify inconsistencies in language detection for documents
sharing the same ID and verify detection accuracy relative to expectations. The results of this analysis
revealed that the majority of documents are written in French, with only a small portion in English or
other languages.</p>
          <p>Based on this observation, and given the performance penalties, we concluded that a fully multilingual
indexing pipeline was not worthwhile for this particular dataset.</p>
          <p>We ultimately decided to abandon automatic language detection, having also received confirmation from
the organizers that the task should be French-only, hence non-French documents should be considered
noise. Instead, we have opted for a single-language French pipeline. This allows us to maintain high
indexing performance and avoid potential inconsistencies in term processing.</p>
          <p>On the other hand, a language detection and translation component has been applied to the queries
(further details about query translation in subsection 3.3.1).</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Parser</title>
          <p>The documents in the provided collection are split into multiple files, in both JSON and TREC format,
and need to be parsed before they can be indexed. At this stage, we also perform an ad
hoc cleaning of the text to remove noise such as HTML tags, CSS code, emojis, and other irrelevant
characters.</p>
          <p>Since the documents were provided to us in two formats, we analyzed the reading
speed of both formats to determine which one would be more efficient, so as to reduce the
indexing time from the very first step. To do this, the ParserBenchmarker class has been created
to compare the reading speed of TREC and JSON documents, both with handcrafted logic (REGEXes)
and using specific libraries.</p>
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.3. Parsing Speed Testing</title>
          <p>Performance evaluations were performed to determine the optimal selection of the parser based on
execution speed. To ensure statistical robustness, the parsing times of the complete dataset were
recorded across 10 independent runs.</p>
          <p>Preliminary results demonstrated that parsers employing a manual implementation approach (using
REGEXes), which avoid external JSON/XML parsing libraries, offered significantly superior performance
in terms of parsing speed. Subsequently, a statistical comparison between parsing TREC and JSON
formatted files was carried out using a two-sided Wilcoxon signed rank test. The results indicated a
statistically significant speed advantage when parsing JSON files compared to TREC files (p =
0.001706).</p>
          <p>To further enhance parsing eficiency, a decision was made to develop a dedicated "one-shot" parser,
prioritizing processing speed over memory usage. This parser, also operating on JSON-formatted files,
fully loads the dataset into memory utilizing the com.fasterxml.jackson library and subsequently
returns the parsed objects iteratively via the invocation of the iterator’s next() method. Although
this implementation poses potential risks, such as triggering Out-of-Memory (OOM) errors under
insufficient system memory conditions, it provides a substantial speed increase beneficial for indexing
processes.</p>
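          <p>A hedged sketch of this "one-shot" approach follows; the RawDoc field names are assumptions about the collection’s JSON schema:</p>
          <p>import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

public class OneShotJsonParser implements Iterator&lt;OneShotJsonParser.RawDoc&gt; {

    // Field names here are assumptions, not the actual schema.
    @JsonIgnoreProperties(ignoreUnknown = true)
    public static class RawDoc {
        public String id;
        public String contents;
    }

    private final Iterator&lt;RawDoc&gt; it;

    public OneShotJsonParser(File jsonFile) throws IOException {
        // The whole file is deserialized at once: fast, but the entire
        // collection slice lives in memory (hence the OOM risk noted above).
        List&lt;RawDoc&gt; docs = new ObjectMapper()
                .readValue(jsonFile, new TypeReference&lt;List&lt;RawDoc&gt;&gt;() {});
        this.it = docs.iterator();
    }

    @Override public boolean hasNext() { return it.hasNext(); }
    @Override public RawDoc next() { return it.next(); }
}</p>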
          <p>A very important class is ParsedDocument, which serves to represent the parsed documents, saving:
• Id: The numeric identifier value of the document
• Title: Crafted from the first part of the document, as described in paragraph 3.2.3
• Text: The body of the document, containing the full text
And, if present:
• URL: The web address of the document
• Domain: The domain of the web address
• Keyword: Words present within the URL; used as query expansion to improve document search
Title extraction Examining the documents in the corpus, we found that frequently the first part of
the document contains the title of the webpage.</p>
          <p>The title is often a concise and informative summary of the content of the document, which makes it
a valuable piece of information for the search engine. Hence we decided to exploit this information and
extract the title from the document in order to generate a new field in the index called title used to
boost the score of the documents during the search phase (see Subsection 3.3.2 for more details).</p>
          <p>The algorithm we used is simple: it keeps the part of the document that is before a delimiter ("|", "-",
"?", "!", ".") and discards the rest. If the title is not found, the algorithm checks if in the beginning
of the document there is a substring all in uppercase letters, which is likely to be the title. If the title is
not found, the algorithm keeps the first 70 characters of the document as the title.</p>
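          <p>A compact re-implementation of this heuristic follows; the space-delimited delimiter matching and the exact uppercase-prefix check are simplifying assumptions on our side:</p>
          <p>import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleExtractor {
    // Delimiters surrounded by spaces, so dots inside URLs are not cut
    // (a simplifying assumption on our side).
    private static final Pattern DELIM = Pattern.compile("\\s[|\\-?!.]\\s");
    private static final Pattern UPPER = Pattern.compile("^[A-Z0-9ÀÂÇÉÈÊÎÔÙÛ.,'/ ]{10,}");

    public static String extract(String body) {
        String head = body.strip();
        String[] parts = DELIM.split(head, 2);
        if (parts.length == 2 &amp;&amp; !parts[0].isBlank()) return dedupe(parts[0]);
        Matcher m = UPPER.matcher(head);                    // uppercase prefix heuristic
        if (m.find()) return dedupe(m.group());
        return dedupe(head.substring(0, Math.min(70, head.length())));  // 70-char fallback
    }

    // Collapse "X X X" into "X" (the repetition-removal step in the figure).
    static String dedupe(String title) {
        String s = title.strip().replaceAll("\\s+", " ");
        for (int end = 1; end &lt;= s.length() / 2; end++) {
            if (s.charAt(end) != ' ') continue;             // cut only at word boundaries
            String unit = s.substring(0, end);
            String rebuilt = (unit + " ").repeat((s.length() + 1) / (end + 1));
            if (rebuilt.strip().equals(s)) return unit;
        }
        return s;
    }
}</p>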
          <p>[Figure: (a) Pipeline of the title extraction algorithm implemented in the TitleExtractor class: delimiter detection; if a delimiter is present, the text before it is kept; repetitions are then removed. (b) Example: the document beginning "WWW.SAURCLIENT.FR ESPACE CLIENT WWW.SAURCLIENT.FR ESPACE CLIENT WWW.SAURCLIENT.FR ESPACE CLIENT - Ceci est un exemple de contenu de document." is truncated after the delimiter and, after repetition removal, yields "WWW.SAURCLIENT.FR ESPACE CLIENT".]</p>
          <p>A final processing step is to remove duplicates from the title.</p>
          <p>The extraction of the title is done in the TitleExtractor class.</p>
          <p>URL extraction An important feature we developed is the URL extraction.</p>
          <p>A URL is primarily composed of a domain and a path. This structure is particularly relevant because
the domain may appear in the user’s query; identifying it allows us to prioritize documents from strongly
authoritative sources corresponding to the queried domain. Additionally, the path often contains terms
that either reflect an alternative version of the document’s title or provide a semantic description of
its content. As such, it enables us to boost query tokens that are especially relevant to the document,
similarly to how a title does.</p>
          <p>The URL extraction occurs simultaneously with the creation of the ParsedDocument: When the
document ID is available, the system looks in the release_2025/collection_db.db database for the entry
with the same ID. Then, it processes the URL by extracting the domain and the words present in the
address to save them in the respective fields mentioned above.</p>
          <p>Here is how URL extraction works in detail (a sketch follows the list):</p>
          <p>[Figure: Pipeline of the URL extraction: original URL, extracted domain, words in the path.]</p>
          <p>1. The first step occurs during the creation of the ParsedDocument, where the ID of the document
is used to seek the corresponding URL.
2. If a URL is found, it undergoes some manipulation to extract the domain and the alphanumeric
string after the "/" character following the domain itself.
3. This alphanumeric string is further processed to separate it into substrings; then numbers,
special characters, single letters, and file extensions or properties of the page (like ".pdf", ".html",
".php", ".aspx") are removed. We are then left with words longer than three
characters.
4. The final step is to save the URL and the domain in their respective fields, and to append the remaining words
to the title field of the ParsedDocument.</p>
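          <p>A sketch of step 3, keeping only letter runs longer than three characters and dropping known page extensions:</p>
          <p>import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class UrlKeywords {
    private static final List&lt;String&gt; EXTENSIONS = List.of("pdf", "html", "htm", "php", "aspx");

    /** Extracts keyword tokens from the path part of a URL (domain excluded). */
    public static List&lt;String&gt; fromPath(String url) {
        int slash = url.indexOf('/');
        String path = slash >= 0 ? url.substring(slash + 1) : "";
        return Arrays.stream(path.split("[^\\p{L}]+"))    // keep only runs of letters
                .filter(w -> w.length() > 3)              // the longer-than-three rule
                .filter(w -> !EXTENSIONS.contains(w.toLowerCase()))
                .collect(Collectors.toList());
    }
}
// fromPath("code.google.com/archive/p/stop-words/") returns [archive, stop, words]</p>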
          <p>We had a couple of options for where to save the words of the URL (i.e., the strings following the
domain name):
Storing them in the document body This option was dismissed because the terms would have been
treated as regular content words and likely diluted within the broader context of the document,
reducing their influence during retrieval.</p>
          <p>Storing them in the document title This alternative appeared more promising, as it allowed us to
assign greater weight to the terms, thereby increasing their impact on relevance scoring.</p>
          <p>In the end we did not use the domain field, since we observed that the URL alignment (described in
subsection 3.3.3) was already sufficient to boost the documents coming from the same domain.</p>
          <p>DB Access Parallelization Document id to URL mappings appear in the release_2025/collection_db.db
sqlite database present in the task’s dataset.</p>
          <p>Initially the ParsingURL class handled the reading of the whole database, but we noticed
that this was a huge bottleneck in corpus indexing since database queries were sequential, nullifying
the effort put into the parallelization of the indexing step.</p>
          <p>In order to overcome this obstacle, we created the DBReader class, which provides methods to access
the database in a fast and parallelizable way, exploiting a read-only approach: mainly by disabling the
journal usage, disabling the synchronous mode, and optimizing the database, which is accessed in
immutable, read-only mode.</p>
          <p>Used Pragmas:
PRAGMA mmap_size = 268435456
PRAGMA cache_size = -16000
PRAGMA page_size = 8192
PRAGMA read_uncommitted = 1
PRAGMA synchronous = OFF
PRAGMA journal_mode = OFF
PRAGMA analysis_limit = 1000
PRAGMA optimize</p>
          <p>A DBReader class instance, initialized with a database path, creates multiple Connections and
PreparedStatements, one for each thread, and stores them in ConcurrentHashMaps.</p>
          <p>This utility class has also been used for reading the topics after query translation (described in subsection
3.3.1) and query expansion (described in subsection 3.3.4) have been applied, parsing the topics and
assigning them one field each.</p>
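          <p>A minimal sketch of the per-thread connection scheme follows; the table and column names are assumptions, not the actual schema of collection_db.db:</p>
          <p>import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.concurrent.ConcurrentHashMap;

public class DBReader {
    private final String jdbcUrl;
    private final ConcurrentHashMap&lt;Long, PreparedStatement&gt; perThread = new ConcurrentHashMap&lt;&gt;();

    public DBReader(String dbPath) {
        this.jdbcUrl = "jdbc:sqlite:" + dbPath;
    }

    /** Returns the URL stored for a document id, using this thread's own connection. */
    public String urlForDoc(String docId) throws SQLException {
        PreparedStatement ps = perThread.computeIfAbsent(Thread.currentThread().getId(), id -> {
            try {
                Connection c = DriverManager.getConnection(jdbcUrl);
                // Read-only tuning, as in the pragma list above.
                try (var st = c.createStatement()) {
                    st.execute("PRAGMA journal_mode = OFF");
                    st.execute("PRAGMA synchronous = OFF");
                }
                return c.prepareStatement("SELECT url FROM documents WHERE id = ?");
            } catch (SQLException e) {
                throw new IllegalStateException(e);
            }
        });
        ps.setString(1, docId);
        try (ResultSet rs = ps.executeQuery()) {
            return rs.next() ? rs.getString(1) : null;
        }
    }
}</p>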
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Searcher</title>
        <p>The Searcher is the last major component of our information retrieval pipeline. It is responsible for
retrieving the relevant documents from the indexed set based on a given query. The main components
of the searcher are:
Query formulation (formulateQuery()): For each topic, a BooleanQuery is created, containing
the title and body fields with their respective boosts.</p>
        <p>Search execution: This step performs the actual search. RiseSearcher is implemented
exploiting parallel processing, considerably improving the performance of the system.</p>
        <p>The steps involved are:
• For each topic, the Searcher uses the query to retrieve the top documents, ranked by
relevance to the query.
• The scores of the retrieved documents are re-ranked.
• The scores are then normalized between 0 and 1 to ensure consistency across queries.
• The results are written to the output file.</p>
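        <p>A sketch of the query formulation step described above; the field names follow the index layout, while the boost value is one of the tuned hyperparameters:</p>
        <p>import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.BoostQuery;
import org.apache.lucene.search.Query;

public class QueryFormulator {
    /** Combines body and title clauses; the title clause carries the tuned boost. */
    public static Query formulate(String topic, Analyzer analyzer, float titleBoost)
            throws ParseException {
        Query body = new QueryParser("body", analyzer).parse(QueryParser.escape(topic));
        Query title = new QueryParser("title", analyzer).parse(QueryParser.escape(topic));
        return new BooleanQuery.Builder()
                .add(body, BooleanClause.Occur.SHOULD)
                .add(new BoostQuery(title, titleBoost), BooleanClause.Occur.SHOULD)
                .build();
    }
}</p>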
        <p>Other interesting features of the Searcher are:
• Title boost (subsection 3.3.2)
• URL alignment (subsection 3.3.3)
• Query expansion (subsection 3.3.4)</p>
        <sec id="sec-3-3-1">
          <title>3.3.1. French translation of the query</title>
          <p>Given that most documents in our dataset are in French, we have implemented a query translation
system that converts user queries into French when they are written in another language. This approach
helps ensure accurate and consistent retrieval results without requiring a multilingual pipeline.</p>
          <p>The translation logic has been implemented in a standalone Python script, query_translation.py;
this step is performed offline, since we exploited the fact that we have the full set of queries available in
advance inside the release_2025/queries_db.db sqlite database.</p>
          <p>The pipeline begins by receiving the original query and determining whether translation is necessary.
This decision is determined by a prior step: language detection.</p>
          <p>For this, we used the fast_langdetect Python library. If the detected language is already French
(our target language), the system simply bypasses translation and returns the original query. If instead
the query is in another language, the pipeline uses the translation component, which interfaces with
the Google Gemini 2.0 Flash API.</p>
          <p>The API is invoked through a custom HTTP POST request, constructed with a carefully tailored
French-language prompt that instructs the model to translate the query literally, without interpretation
or additional explanation.</p>
          <p>The prompt (in French):</p>
          <p>Vous êtes un moteur de traduction. Vous recevrez une chaîne de texte arbitraire dans une
langue source inconnue.
1. Traduisez le texte mot à mot, en préservant exactement tous les termes et la structure.
2. Traduisez toujours en français.
3. Ne renvoyez que la requête traduite, sans commentaire ni explication.</p>
          <p>Voici le texte à traduire: &lt;QUERY&gt;</p>
          <p>If translation is successful, the translated query is returned; otherwise, the original query is preserved
to ensure no interruption in processing.</p>
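          <p>The script itself is Python; for illustration only, here is a Java sketch of the same kind of HTTP call, assuming the public generateContent REST endpoint (response parsing is reduced to a comment):</p>
          <p>import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GeminiTranslator {
    private static final String ENDPOINT =
        "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=";

    /** Sends the translation prompt and returns the raw JSON response body. */
    public static String callGemini(String prompt, String apiKey) throws Exception {
        String body = """
            {"contents":[{"parts":[{"text": %s}]}]}
            """.formatted(toJsonString(prompt));
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(ENDPOINT + apiKey))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse&lt;String&gt; resp = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return resp.body();  // the text sits under candidates[0].content.parts[0].text
    }

    // Minimal JSON string escaping for the prompt payload.
    private static String toJsonString(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\"";
    }
}</p>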
          <p>This preprocessing pipeline ensures that user queries are consistently aligned with the main language
of the indexed documents, improving the accuracy and relevance of retrieval results without introducing
significant complexity into the core system.</p>
        </sec>
        <sec id="sec-3-3-2">
          <title>3.3.2. Title Boost</title>
          <p>Since the title field includes both parts of the URL and the extracted document title (both described in
paragraph 3.2.3), precious information for discriminating relevant documents,
we implemented a boost for the terms present in the title field. This boost is applied during the search
phase, where the title terms are given a higher weight compared to terms which appear only in the
body.</p>
        </sec>
        <sec id="sec-3-3-3">
          <title>3.3.3. URL Alignment</title>
          <p>It was decided to perform a first re-ranking of the documents using the URL of the documents: an
alignment between the URL and the query was chosen to assess the amount of overlap between the
two. Specifically, we decided to use a normalized global alignment parameterized to +1 for matches
and −1 for insertions/deletions. In practice, the version based on the dynamic programming technique
of the Needleman-Wunsch algorithm has been used: this algorithm calculates the alignment score and
normalizes it in the range [−1, +1]. The only exception is that if the query is an
exact substring of the URL, then the score is 1 regardless of the alignment.</p>
          <p>Example of our alignment:
Original url:   code.google.com/archive/p/stop-words/
Original query: code google.com stopwords
Aligned url:    code.-google.com/archive/p/stop-words/
Aligned query:  code- google.com-----------stop-words
Score:          ++++--++++++++++-----------++++-+++++</p>
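          <p>A hedged sketch of the scoring routine (character-level dynamic programming; the normalization by the longer length is our reading of "normalized global alignment"):</p>
          <p>public class UrlAlignment {
    /** Global alignment score between URL and query, normalized to [-1, +1]. */
    public static double score(String url, String query) {
        if (url.contains(query)) return 1.0;            // exact-substring shortcut
        int n = url.length(), m = query.length();
        int[][] dp = new int[n + 1][m + 1];
        for (int i = 0; i &lt;= n; i++) dp[i][0] = -i;     // leading gaps cost -1 each
        for (int j = 0; j &lt;= m; j++) dp[0][j] = -j;
        for (int i = 1; i &lt;= n; i++)
            for (int j = 1; j &lt;= m; j++) {
                int diag = dp[i - 1][j - 1] + (url.charAt(i - 1) == query.charAt(j - 1) ? 1 : -1);
                dp[i][j] = Math.max(diag, Math.max(dp[i - 1][j] - 1, dp[i][j - 1] - 1));
            }
        return dp[n][m] / (double) Math.max(n, m);      // divide by the longer length
    }
}</p>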
          <p>In order to also parameterize the boost factor range [l, u] during reranking, the alignment score
s ∈ [−1, +1] is interpolated by a normalized sigmoidal function centered on zero: the score is first compressed into
the range [0, 1], the normalized sigmoid is computed, and the result is then extended to the boost factor
range, as in the following (with steepness k = 10):</p>
          <p>x = (s + 1) / 2
σ(x) = 1 / (1 + e^(−k(x − 0.5))),  σ₀ = 1 / (1 + e^(k · 0.5)),  σ₁ = 1 / (1 + e^(−k · 0.5))
norm(x) = (σ(x) − σ₀) / (σ₁ − σ₀)
final(x) = l + norm(x) · (u − l)</p>
          <p>Essentially, this approach aims to leverage the semantic content of the URL in order to reward
documents with strong alignment and, possibly, slightly penalize those with poor matching.</p>
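          <p>A direct numeric transcription of the interpolation above (k = 10, boost range [lo, hi], matching [l, u]):</p>
          <p>public class BoostInterpolation {
    /** Maps an alignment score in [-1, +1] to a boost in [lo, hi]. */
    public static double boost(double score, double lo, double hi) {
        double k = 10.0;                                      // sigmoid steepness
        double x = (score + 1.0) / 2.0;                       // compress [-1, 1] into [0, 1]
        double s  = 1.0 / (1.0 + Math.exp(-k * (x - 0.5)));   // sigmoid centered at 0.5
        double s0 = 1.0 / (1.0 + Math.exp( k * 0.5));         // sigma at x = 0
        double s1 = 1.0 / (1.0 + Math.exp(-k * 0.5));         // sigma at x = 1
        double norm = (s - s0) / (s1 - s0);                   // rescale to [0, 1]
        return lo + norm * (hi - lo);                         // extend to [lo, hi]
    }
}</p>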
          <p>[Figure: Final normalized boost vs. score.]</p>
        </sec>
        <sec id="sec-3-3-4">
          <title>3.3.4. Query Expansion</title>
          <p>We expand the queries through Google Gemini 2.0 Flash: it permits us to get synonyms and
concepts related to the query under analysis. Our prompt gives some examples to the model to
gain the benefits of few-shot prompting compared to zero-shot prompting.</p>
          <p>Moreover, we asked Gemini to answer using structured output, so we can easily parse and use it
in our pipeline.</p>
          <p>The prompt (in French):</p>
          <p>Vous faites partie d’un système d’information qui traite les requêtes des utilisateurs. Vous
développez une requête donnée en requêtes de sens similaire.</p>
          <p>Développez la requête de recherche suivante en une liste de mots-clés et expressions associés.
Fournissez la sortie sous forme de tableau JSON de chaînes de caractères. N’incluez aucun
autre texte en dehors du tableau JSON.</p>
          <p>Exemples :
Requête : recettes de cuisine faciles ["recettes rapides", "cuisine simple", "idées repas faciles",
"préparer à manger rapidement", "recettes pour débutants", "meilleures recettes faciles à faire"]
Requête : apprendre le français en ligne ["cours de français en ligne", "sites pour apprendre le
français", "applications pour le français", "exercices de français en ligne", "apprentissage du
français à distance"]</p>
          <p>Requête : &lt;QUERY&gt;</p>
          <p>Of particular concern are some queries related to sexual content and violence, since they won’t be
query-expanded even with safety filters turned off. If for this or any other reason the model doesn’t
return any output, we won’t consider any expansion for the query.</p>
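          <p>A sketch of how the structured output can be consumed, with the fallback just described (no expansion on failure):</p>
          <p>import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

public class ExpansionParser {
    /** Parses the model's bare JSON array of strings (see the prompt above). */
    public static List&lt;String&gt; parse(String geminiResponse) {
        try {
            return new ObjectMapper()
                    .readValue(geminiResponse, new TypeReference&lt;List&lt;String&gt;&gt;() {});
        } catch (Exception e) {
            return List.of();   // model refused or malformed output: skip expansion
        }
    }
}</p>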
        </sec>
        <sec id="sec-3-3-5">
          <title>3.3.5. Reranking</title>
          <p>Re-ranking refers to the process of re-evaluating and adjusting the scores of documents retrieved by a
first-stage retriever (BM25) by computing the similarity between the query and candidate documents using
their semantic meaning, with the goal of improving retrieval quality.</p>
          <p>In our system, we explored multiple reranking strategies during development. Apart from the
manual URL alignment-based re-scoring method detailed in section 3.3.3, some bi-encoders and
cross-encoders have been tested to try achieving a higher nDCG score.</p>
          <p>Tested bi-encoder: paraphrase-multilingual-MiniLM-L12-v2.</p>
          <p>Tested cross-encoder: mixedbread-ai/mxbai-rerank-base-v2.</p>
          <p>Dense neural models are computationally expensive; for this reason they are used only for reranking
a few documents.</p>
          <p>During our exploration on the training set we tested the previously mentioned rerankers, but we
observed that the outcome was not satisfying, sometimes not even improving, while paying a huge cost
in latency even with high-performance hardware (an NVIDIA H200 GPU), considering on average 10000
queries per month.</p>
          <p>For this reason we decided to cut out this step from our pipeline.</p>
          <p>We need to point out that we obtained such results using qrels that were afterwards modified:
at the time of our tests the qrels had some relevance judgments that were clearly incorrect.
Given the higher focus on semantics of neural IR models with respect to BM25, such
inaccurate qrels may have contributed to the deterioration of performance we noticed.
Due to computational constraints, we were unable to re-run the reranking on the training
and test sets.</p>
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Parameters Optimization</title>
        <p>We implemented the previously cited features block by block into the pipeline and evaluated the
contribution of each individual block when concatenated with the preceding ones. For each block, an
optimization phase was carried out to identify the optimal parameters.</p>
        <sec id="sec-3-4-1">
          <title>3.4.1. Baseline</title>
          <p>Our initial baseline, which already yielded satisfactory results, was fixed after testing several
analyzer combinations with the components (tokenizer, stoplists, etc.) detailed in Section 3.</p>
          <p>As a next step, we implemented stopword removal in the pipeline, initially using publicly available
French and English stoplists containing approximately 500 words each. However, we observed a decrease
in the nDCG score compared to the baseline. To investigate further, we created custom stopword lists
composed of the top-k most frequent indexed terms. We found that performance was sensitive to the
size of the stoplists: reducing their size led to improved results, with nDCG values eventually returning
to baseline levels.</p>
          <p>Therefore, we decided to skip the stopword removal step and define as BASELINE the
"vanilla" Lucene pipeline, which employs an analyzer composed of the following components:
• org.apache.lucene.analysis.standard.StandardTokenizer
• org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter
• org.apache.lucene.analysis.LowerCaseFilter
• ElisionFilter with FR_elision-articles.txt
• org.apache.lucene.analysis.fr.FrenchLightStemFilter</p>
          <p>With this setup, we obtained the initial nDCG values in Table 1.</p>
        </sec>
      </sec>
      <sec id="sec-3-5">
        <title>3.5. Grid Optimization</title>
        <p>During our tests, as explained before, we implemented each "block" step by step, thus finding the
approximate hyperparameters/boosting factors that most enhanced the nDCG on the training set.</p>
        <p>Having these values, we performed a grid optimization in multiple coarse-to-fine steps to fine-tune
them and find the hyperparameters best suited to our dataset.</p>
        <p>Such optimization has been performed on only one month, since we observed that all systems tend to
have the same performance enhancement trend on all months: a system that is better than another in
one month is also better in the other months.</p>
        <p>First, we conducted the search experiments only with title boost, query expansion, and query
translation enabled, while disabling URL alignment (figure 8). These features were tested together
but separately from the others, as they interact with one another due to their shared reliance on the
document’s body field.</p>
        <p>Once we obtained the 5 best configurations out of the search, we determined the best hyperparameters
for URL alignment as a second step (figure 9).</p>
        <p>At this point, the five best parameter combinations were identified, and the following different
systems were also added to the comparison as references:
• The baseline: no optimization applied, i.e., all parameters set to zero
• The baseline with URL boost, without title-related boosts
• The baseline with title-related boosts, without URL boost</p>
        <p>In Table 2, the selected systems are presented, while Table 4 reports the results obtained. Only the 5
best systems have been sent for evaluation to LongEval.</p>
        <p>A legend of acronyms can be found in Table 3.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental Setup</title>
      <sec id="sec-4-1">
        <title>4.1. Used Collections</title>
        <p>For the experiments, we employed the training collection TrecEval25 provided by TrecEval. The
collection is organized into 9 separate folders, each containing JSON and TREC files. Each folder was
treated as an independent dataset: this is the longitudinal evaluation that gives the task its name.
Additionally, the provided database files (.db) were utilized to extract the corresponding URLs for each
document and to perform query translation and expansion.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Evaluation Measures</title>
        <p>Performance evaluation throughout this work was conducted using the nDCG metric:</p>
        <p>nDCG(k) = DCG(k) / iDCG(k)   (1)</p>
        <p>where the Discounted Cumulated Gain (DCG) is computed as:</p>
        <p>DCG(k) = Σ_{i=1}^{k} (2^{rel_i} − 1) / log₂(i + 1)   (2)</p>
        <p>with rel_i representing the graded relevance of the item at position i, and the Ideal Discounted Cumulated
Gain (iDCG) being the DCG(k) value obtained by sorting documents in decreasing order of relevance, thus
representing the optimal ranking.</p>
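        <p>For illustration, a direct transcription of equations (1)-(2); this is not the trec_eval code actually used for the evaluation:</p>
        <p>import java.util.Arrays;

public class Ndcg {
    /** DCG(k) over graded relevance values, equation (2). */
    static double dcg(int[] rel, int k) {
        double sum = 0.0;
        for (int i = 1; i &lt;= Math.min(k, rel.length); i++)
            sum += (Math.pow(2, rel[i - 1]) - 1) / (Math.log(i + 1) / Math.log(2));
        return sum;
    }

    /** nDCG(k), equation (1): DCG of the run over DCG of the ideal ordering. */
    public static double ndcg(int[] rel, int k) {
        int[] ideal = rel.clone();
        Arrays.sort(ideal);                       // ascending
        int[] desc = new int[ideal.length];       // reverse into decreasing relevance
        for (int i = 0; i &lt; ideal.length; i++) desc[i] = ideal[ideal.length - 1 - i];
        double idcg = dcg(desc, k);
        return idcg == 0.0 ? 0.0 : dcg(rel, k) / idcg;
    }
}
// ndcg(new int[]{2, 0, 1}, 3) is about 0.964</p>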
        <p>The metrics were extracted from the runs using the trec_eval tool (https://github.com/usnistgov/trec_eval)
with the -c flag, which includes in the evaluation not only the intersection between topics and
qrels, but all the queries in the relevance judgments.</p>
        <p>The qrels considered in this work are only those with at least one relevant document.</p>
        <p>We also considered the Relative nDCG Drop (RnD), the other evaluation measure considered by
LongEval, which is measured by computing the difference between snapshot test sets. This measure
supports the evaluation of the impact of the data changes on the systems’ results. An analysis of
performance trends over time through RnD is presented in Section 5.2.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Git Repository</title>
        <p>The entire project, including the source code, generated outputs, analysis, and additional documentation,
is available on Bitbucket at the following URL: https://bitbucket.org/upd-dei-stud-prj/seupd2425-rise/src/master/</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Hardware Used for Experiments</title>
        <p>The experiments were conducted on a Windows 10 PC configured as follows:
• CPU: Intel Core i5-10400F
• RAM: 32 GB DDR4 at 3600 MHz
• GPU: NVIDIA GTX 1660
• Storage: Dedicated Samsung 990 EVO NVMe SSD</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>Table 5 reports the performance of our systems on the test set.</p>
      <p>In the following section, we validate the significance of the improvements introduced in our pipeline
using a statistical approach. We focus on the nDCG metric across multiple queries and configurations.
Specifically, we apply Two-Way ANOVA followed by Tukey’s Honestly Significant Difference (HSD) test
with significance level α = 0.05.</p>
      <sec id="sec-5-1">
        <title>5.1. Hypothesis Testing</title>
        <p>The Two-Way ANOVA followed by Tukey’s HSD test revealed several clear and insightful trends in
the performance of our retrieval systems: the results confirm that the differences between system
configurations are statistically significant when comparing the baseline to systems enhanced with
query expansion and title boosting. The baseline configuration, 1_s_tb0.0_qe0.0_qt0.0_l0.0_u0.0,
consistently underperforms relative to all the other configurations. For example, its comparison with
4e_s_tb0.5_qe0.1_qt0.0_l0.7_u1.2 yields a mean difference of 0.10588 with a p-value &lt; 0.001, indicating a
highly significant improvement. Similar results were observed against nearly all enhanced systems, all
with p-values effectively at zero. This strongly supports the conclusion that adding query expansion
and title boosting contributes meaningfully to retrieval effectiveness.</p>
        <p>Moreover, even switching from l=0.0, u=0.0 to l=0.7, u=1.2 in the baseline system yields a statistically
significant improvement, indicating that, even without query expansion, these changes
can provide measurable gains in the baseline system.</p>
        <p>Instead, when comparing systems within the same "family" of configurations (i.e., all using the
same multipliers for title boosting and query expansion, like tb=0.5 and qe=0.1) we observe that
differences due to fine-grained changes are not statistically significant. For example, the system
4e_s_tb0.5_qe0.1_qt0.0_l0.7_u1.2 compared with the system 4a_s_tb0.5_qe0.1_qt0.1_l0.7_u1.2 yields
a mean difference of only 0.000041, with a p-value = 1.00, showing no evidence of a meaningful
performance gap. In general, all such within-family comparisons exhibit mean differences &lt; 0.001, with
p-values of 1.00, indicating complete statistical overlap.</p>
        <p>These considerations suggest that most of the performance gains stem from the presence or absence
of key features such as query expansion and title boosting. Figure 6 shows the ANOVA
results.</p>
      <p>The statistical evaluation was extended to the different months of the collection, and their behavior is
the same. We implemented the code to evaluate a 3-way ANOVA, but we did not have enough time to
fully execute it and report the results. A further comparison across different month splits is presented in the
next section.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Long vs Short term behavior</title>
        <p>To measure the systems’ temporal robustness we exploited the Relative nDCG Drop (RnD) measure
for each month, by computing the nDCG difference between months as described by the LongEval
evaluation guidelines.</p>
        <p>We can consider the training set part as short term behavior and the test set part as long term behavior.
The results are reported in Table 6 and Figure 7.</p>
        <p>From the plot in figure 7, it can be observed that the systems’ performance increases until June 2023,
after which it begins a rapid decline until the end of the observed period. With so few months of data,
it is difficult to assess the consistency of this trend or determine its underlying causes.</p>
        <p>We also claim that our systems’ performances are unlinked from time, since in the long term (test set
after 12 months) we see the highest peak in nDCG performance and also the greatest drop (test set
after 14 months), while in the very short term the system has stable performance. We hypothesize
that these changes in the performance of all systems together are primarily due to the quality of queries,
documents, or relevance judgments, or to updates of their creation system.</p>
        <p>An interesting observation is that more complex systems, the ones that perform the best in absolute
value, exhibit smaller performance gains and larger degradations compared to simpler systems, including
the baseline. This suggests that our fine-tuned models may be overfitting to the specific dataset.
</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions and Future Work</title>
      <p>We began our project by creating a basic working structure, referred to as the baseline, which
implemented a simple version of our pipeline: Parser–Analyzer–Indexer–Searcher. From this baseline, we
iteratively introduced and tested various enhancements to improve performance. These included
integrating custom stopwords, implementing title boosting, applying URL alignment boosting, expanding
queries using Google Gemini, and reranking results. Each component and configuration was evaluated
using the nDCG metric to determine its impact.</p>
      <p>The results are promising and are mainly pushed by the title-boosting factor and secondarily by
leveraging the URL information.</p>
      <p>Our system appears to be independent of temporal factors, as its performance does not exhibit a clear
stable, increasing, or decreasing trend over time; however, while obtaining best-in-class nDCG performance,
we notice an overfit on the task’s data.</p>
      <p>Future efforts could focus on refining query strategies and exploring additional optimization
techniques, such as:
• Further exploring the usage of the URL: the words extracted from the URL are not currently used, but
they may yield another performance increase comparable to the one already obtained by
query expansion.
• Trying to use n-grams: by incorporating n-grams, the system can capture contextual relationships
between consecutive words rather than just individual words. This may improve search accuracy,
especially when dealing with multiple terms.
• Using NER algorithms: they allow the system to identify and categorize entities within a text, such
as company names, locations, dates, phone numbers (and their alignment for numbers with
and without country code) and other important terms. By assigning a higher weight to these
entities, the retrieval system can rank results higher or lower based on the significance of
these keywords.
• Testing reranking more extensively, perhaps reducing the number of documents on which it is applied, or just
giving it more time.
• Examining the decline in performance following stopword removal by exploring potential causes
and verifying the code for errors, given that this is an unusual occurrence, although it has been
documented in the literature [9].
• Translating all documents into French, if they are not already: translating all documents into
French ensures that all documents are processed uniformly, allowing the system to apply
consistent tokenization and linguistic rules. This can lead to better matching of queries with documents,
but it can also introduce translation errors or nuances that could affect the retrieval quality.
However, as described in the relevant section, this approach would require significant computational
resources and time, especially for large datasets.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used GPT-4o and Writefull in order to improve the
writing style and to check grammar and spelling. After using these services, the authors reviewed and edited the
content as needed and take full responsibility for the publication’s content.</p>
    </sec>
    <sec id="sec-8">
      <title>References</title>
      <p>
[2] I. Atabek, H. Chen, J. Moncada Ramírez, N. Santini, G. Zago, Seupd@clef: Team JIHUMING on
enhancing search engine performance with character n-grams, query expansion, and named entity
recognition, in: Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023),
Thessaloniki, Greece, September 18th to 21st, 2023, volume 3497 of CEUR Workshop Proceedings,
CEUR-WS.org, 2023, pp. 2204–2221. URL: https://ceur-ws.org/Vol-3497/paper-185.pdf.
[3] S. Bortolin, G. Ceccon, G. Czaczkes, A. Pastore, P. Renna, G. Zerbo, Seupd@clef: Team NEON.
a memoryless approach to longitudinal evaluation, in: Working Notes of the Conference and
Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September 18th to 21st, 2023,
volume 3497 of CEUR Workshop Proceedings, CEUR-WS.org, 2023, pp. 2281–2305. URL: https:
//ceur-ws.org/Vol-3497/paper-189.pdf.
[4] S. Fincato, E. D’Alberton, Y. Qiu, L. Vaidas, L. Pallante, A. Jassal, Seupd@clef: Team QEVALS on
information retrieval adapted to the temporal evolution of web documents, in: Working Notes of
the Conference and Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September
18th to 21st, 2023, volume 3497 of CEUR Workshop Proceedings, CEUR-WS.org, 2023, pp. 2416–2431.</p>
      <p>URL: https://ceur-ws.org/Vol-3497/paper-194.pdf.
[5] F. Galli, M. Rigobello, M. Schibuola, R. Zuech, N. Ferro, Seupd@clef: Team IRIS on temporal
evolution of query expansion and rank fusion techniques applied to cross-encoder re-rankers,
in: Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), Grenoble,
France, 9-12 September, 2024, volume 3740 of CEUR Workshop Proceedings, CEUR-WS.org, 2024, pp.
2356–2383. URL: https://ceur-ws.org/Vol-3740/paper-218.pdf.
[6] L. Cazzador, F. L. De Faveri, F. Franceschini, L. Pamio, S. Piron, N. Ferro, Seupd@clef: Team
MOUSE on enhancing search engines effectiveness with large language models, in: Working
Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), Grenoble, France, 9-12
September, 2024, volume 3740 of CEUR Workshop Proceedings, CEUR-WS.org, 2024, pp. 2336–2355.</p>
      <p>URL: https://ceur-ws.org/Vol-3740/paper-217.pdf.
[7] J. Savoy, Light Stemming Approaches for the French, Portuguese, German and Hungarian Languages,
in: H. M. Haddad, K. M. Liebrock, R. Chbeir, M. J. Palakal, S. Ossowski, K. Yetongnon, R. L.
Wainwright, C. Nicolle (Eds.), Proc. 21st ACM Symposium on Applied Computing (SAC 2006), ACM
Press, New York, USA, 2006, pp. 1031–1035.
[8] G. Salton, M. E. Lesk, The SMART automatic document retrieval systems — an illustration,
Commun. ACM 8 (1965) 391–398. URL: https://doi.org/10.1145/364955.364990. doi:10.1145/364955.
364990.
[9] J. Savoy, A Stemming Procedure and Stopword List for General French Corpora, Journal of the</p>
      <p>American Society for Information Science (JASIS) 50 (1999) 944–952.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Cancellieri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>El-Ebshihy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Fink</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gonzalez-Saez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Iommi</surname>
          </string-name>
          , J. Keller, P. Knoth,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mulhem</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piroi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pride</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Schaer</surname>
          </string-name>
          ,
          <article-title>Overview of the CLEF 2025 LongEval Lab on Longitudinal Evaluation of Model Performance</article-title>
          , in: J.
          <string-name>
            <surname>Carrillo-de Albornoz</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Gonzalo</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Plaza</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>García Seco de Herrera</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Mothe</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Piroi</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Spina</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.), Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF 2025), 2025.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>