<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Quantum Annealing for Machine Learning: Applications in Feature Selection, Instance Selection, and Clustering</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Chloe Pomeroy</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aleksandar Pramov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karishma Thakrar</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lakshmi Yendapalli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Georgia Institute of Technology</institution>
          ,
          <addr-line>North Ave NW, Atlanta, GA 30332</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <abstract>
<p>This paper explores the applications of quantum annealing (QA) and classical simulated annealing (SA) to a suite of combinatorial optimization problems in machine learning, namely feature selection, instance selection, and clustering. We formulate each task as a Quadratic Unconstrained Binary Optimization (QUBO) problem and implement both quantum and classical solvers to compare their effectiveness. For feature selection, we propose several QUBO configurations that balance feature importance and redundancy, showing that QA produces solutions that are computationally more efficient. In instance selection, we propose novel heuristics for instance-level importance measures that extend existing methods. For clustering, we implement a classical-to-quantum pipeline, using classical clustering followed by QUBO-based medoid refinement, and demonstrate consistent improvements in cluster compactness and retrieval metrics. Our results suggest that QA can be a competitive and efficient tool for discrete machine learning optimization, even within the constraints of current quantum hardware.</p>
      </abstract>
      <kwd-group>
<kwd>Quantum annealing</kwd>
        <kwd>Simulated annealing</kwd>
        <kwd>QUBO formulation</kwd>
        <kwd>Feature selection</kwd>
        <kwd>Instance selection</kwd>
        <kwd>Clustering</kwd>
<kwd>D-Wave Quantum Annealer</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>As machine learning systems are applied to ever-larger datasets, the demands placed on core workflows
like feature selection, instance selection, and clustering have grown accordingly. These tasks often
involve complex, combinatorial decisions that are challenging to solve efficiently, especially as feature
spaces expand into the thousands and datasets span millions of instances. In many cases, classical
algorithms struggle to keep up, either becoming computationally prohibitive or falling back on heuristics
that don’t guarantee globally optimal solutions.</p>
      <p>In response to these challenges, there has been growing interest in leveraging quantum computing
paradigms, particularly quantum annealing (QA), for machine learning optimization tasks. By
formulating these as Quadratic Unconstrained Binary Optimization (QUBO) problems or Ising models, QA
can be applied to select optimal subsets of features or instances, or to identify meaningful clusters.
QA offers a fundamentally different mechanism for exploring solution spaces by exploiting quantum
tunneling, potentially enabling it to escape local minima more effectively than classical counterparts
like simulated annealing (SA). With commercial quantum annealers, such as those provided by D-Wave
Systems, now accessible to researchers, it is possible to empirically explore the strengths and limitations
of QA in practical machine learning contexts.</p>
      <p>
        The 2025 edition of the Quantum CLEF Competition investigates the feasibility of performing
traditional machine learning (ML) tasks by using quantum annealers and comparing their performance to
classical methods. It features three subtasks, each to be solved with algorithms run using both quantum
annealing (QA) and simulated annealing (SA). Task 1 (Feature Selection) involves selecting the
smallest set of features that preserves performance for learning-to-rank on benchmark web collections
(MQ2007, ISTELLA) and for an item-based k-NN recommender on a private music corpus with 100- and
400-dimensional item-content matrices. Task 2 (Instance Selection) targets cost-effective fine-tuning of
an LLM (Llama 3.1) for sentiment classification by reducing training instances from the Vader NYT
and Yelp Reviews datasets without degrading F1 score. Finally, Task 3 (Clustering) requires generating
centroid embeddings for the ANTIQUE question-answer corpus, evaluated using the Davies–Bouldin
index and query-time nDCG@10 to assess how clustering can accelerate downstream information
retrieval tasks [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ].
      </p>
      <p>In this work, we investigate the use of quantum annealing for the three aforementioned core machine
learning tasks. We formulate each task as a QUBO problem, suitable for execution on D-Wave’s
Advantage_System quantum annealer. To assess the comparative performance of QA, we also implement
classical simulated annealing (SA) using D-Wave’s classical solvers. While quantum annealing remains
in its early stages and current hardware imposes certain constraints (e.g., limited qubit connectivity,
noise, problem size), our findings show that QA produces competitive solutions and serves as a
promising component in hybrid ML pipelines. This work contributes to the growing body of research on the
practical viability of quantum optimization for real-world machine learning challenges. All code used
in this study is available at the respective GitHub repository for each of the three application areas
explored in this work: Feature Selection (https://github.com/dsgt-arc/qclef-2025-feature), Instance
Selection (https://github.com/dsgt-arc/qclef-2025-instance), and Clustering (https://github.com/dsgt-arc/
qclef-2025-clustering).</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Quantum annealing (QA) has emerged as a promising approach for solving combinatorial optimization
problems by leveraging quantum fluctuations to escape local minima [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Unlike gate-based quantum
computing, QA is designed to find low-energy solutions to problems expressed as Ising models or,
equivalently, as Quadratic Unconstrained Binary Optimization (QUBO) problems. This paradigm has
been realized in practical hardware via systems like the D-Wave Advantage, which uses thousands of
superconducting qubits [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ].
      </p>
      <p>
Formulating optimization problems as QUBOs is central to harnessing QA effectively. A
comprehensive mapping of classical NP-hard problems to QUBO and Ising forms, demonstrating the model's
flexibility across domains including graph theory, scheduling, and statistical inference, was shown in
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Subsequent research extends this work to machine learning, showing how QUBOs can directly
encode loss functions and regularization terms for training models [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        In the context of feature selection (Task 1) and related ML tasks, several studies have explored QA
methods using both filter and wrapper approaches. A QUBO framework that encodes feature importance
and redundancy was introduced by [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], influencing later work by [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] and [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], who adapted this for
recommender systems. Systematic studies of feature selection techniques, expanding on hybrid solver
architectures, were conducted by [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Repository-scale applications of QA have also emerged: [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]
performed quantum-annealing-based feature selection in a diverse set of classical supervised learning
tasks. Feature selection was adapted with QA for recommending content in sparse scenarios, addressing
real-world scalability in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. Their demonstration of combining relevance and redundancy within the
QUBO matrix for domain-specific datasets closely aligns with our methodology.
      </p>
      <p>
        In the context of instance selection (Task 2), [
        <xref ref-type="bibr" rid="ref12">12</xref>
] introduced the first quantum annealing approach to the
instance selection problem, proposing the first QUBO formulation for it: a straightforward
application of cosine similarity between document embeddings with a size constraint encoded into the
objective. While simple, this formulation laid the foundation for subsequent refinements.
In parallel, approaches like E2SC [
        <xref ref-type="bibr" rid="ref13">13</xref>
] and influence-function-based methods [14, 15, 16] offer algorithms
for instance selection in a classical computing paradigm.
      </p>
      <p>With regards to Clustering (Task 3), [17] introduced one of the earliest QUBO formulations for the
k-Medoids clustering problem, proposing a binary optimization objective that selects k representative
medoids from a dataset without requiring explicit cluster assignments. Their formulation directly
inspires our refinement stage, as we adopt their objective structure and constraint encoding to
enforce fixed-size cluster selection using quantum annealing. Unlike their purely theoretical framing,
however, we embed this QUBO formulation into a full pipeline that combines classical pre-clustering
with quantum refinement, tailored to work within real-world hardware limits. Building on this, [18]
applied QUBO-based k-Medoids clustering in a document retrieval context for QuantumCLEF 2024.
They implemented a hierarchical method that uses simulated annealing and classical clustering for
dimensionality reduction before quantum refinement. Their work demonstrates the promise of
combining classical preprocessing with quantum optimization for large-scale embeddings, a structure we
also adopt. However, our approach differs by systematically comparing multiple classical clustering
methods (e.g., k-Medoids, HDBSCAN, GMM) and integrating a principled formulation of the fixed-k
constraint using dimod.generators.combinations, enabling more consistent enforcement during
sampling.</p>
<p>QA-ST, a quantum annealing-based clustering algorithm that extends simulated annealing with a
quantum effect to explore multiple suboptimal solutions, was proposed by [19]. Their results show that
quantum annealing can outperform simulated annealing (SA) in exploring global optima across datasets
such as MNIST and Reuters. While their work focuses on probabilistic exploration within the clustering
assignment space, ours emphasizes post-clustering refinement, using quantum annealing to select
diverse, high-quality medoids from a pre-clustered pool under strict constraints, which is critical in
information retrieval contexts. A novel perspective is contributed by [20], leveraging all samples
returned from a quantum annealer to build calibrated posterior distributions over balanced k-means
clusterings. Their probabilistic approach enables uncertainty quantification and ambiguity detection.
In contrast, our work prioritizes determinism and fixed-k control, optimizing medoid selection to
support retrieval performance rather than exploring ensemble uncertainty. A hybrid clustering method
is introduced by [21], combining quantum-inspired optimization with classical updates to handle
imbalanced data. Their simulated bifurcation method offers fast discrete optimization with high-quality
results, yet focuses on cluster balance in traditional assignments. Our pipeline, by contrast, is structured
for downstream document retrieval and focuses on interpretability, medoid diversity, and robust fixed-k
constraints.</p>
      <p>In summary, prior research lays important groundwork for QUBO-based clustering and hybrid
quantum-classical approaches. Our contribution builds directly on these insights, but advances them
through (1) a principled, modular pipeline for real-world document clustering; (2) comparative evaluation
of multiple classical clustering strategies upstream of quantum refinement; and (3) robust enforcement
of exact medoid count using optimized QUBO constraint encodings. Together, these additions bridge the
gap between theoretical clustering formulations and practical, retrieval-oriented quantum applications.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <sec id="sec-3-1">
        <title>3.1. QUBO Formulation for Quantum Annealing</title>
        <p>
          Quantum annealing is a computational process that uses quantum mechanics to find the best solution
to complex optimization problems. It relies on the adiabatic theorem of quantum mechanics, which
states that a quantum system initially in the ground state of a known, simple Hamiltonian will remain
in the ground state if the system evolves slowly enough and the Hamiltonian is changed gradually [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ].
In QA, this principle is used to guide the system from an initial Hamiltonian with a known ground state
to a final Hamiltonian that encodes the objective function of an optimization problem. If the evolution
follows the conditions of the quantum adiabatic theorem, the system is expected to remain in its ground
state, thereby yielding the optimal solution.
        </p>
        <p>
          To apply quantum annealing to a problem, it must first be formulated as a Quadratic Unconstrained
Binary Optimization (QUBO) problem [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. A QUBO is defined as:
        </p>
<p>
          $$f(\mathbf{x}) = \mathbf{x}^\top Q\, \mathbf{x} \tag{1}$$
          $$f(\mathbf{x}) = \sum_{i} Q_{ii}\, x_i + \sum_{i} \sum_{j > i} Q_{ij}\, x_i x_j \tag{2}$$
where $\mathbf{x} \in \{0, 1\}^n$ is a binary vector encoding decisions (e.g., feature, instance, or medoid selection),
and $Q$ is an $n \times n$ matrix representing the cost or similarity structure among variables. $Q$ is the QUBO
matrix whose diagonal and off-diagonal entries encode the linear weights and pairwise interactions,
respectively. The entries of the QUBO matrix $Q$ can be interpreted in terms of their role in the objective
function. The diagonal terms $Q_{ii}$ represent the linear coefficients associated with individual binary
variables $x_i$, and they determine how much each variable contributes to the total cost when it is set to
1. The off-diagonal terms $Q_{ij}$ for $i \neq j$ capture the pairwise interactions between variables $x_i$ and $x_j$.
A negative off-diagonal entry encourages both variables to take the same value (e.g., both 1), while a
positive value penalizes such configurations, promoting diversity or mutual exclusion. This structure
allows QUBO to naturally encode constraints and preferences between variables, making it suitable
for representing complex optimization problems like feature redundancy minimization or balanced
clustering. The goal of quantum or classical annealing is to find the binary vector $\mathbf{x}$ that minimizes this
objective function $f(\mathbf{x})$. This formulation serves as the foundation across all tasks in our pipeline, with
task-specific adaptations encoded through the construction of $Q$. While QUBO problems are nominally
"unconstrained", we can add a penalty term to the QUBO formulation that allows the problem to have a
soft constraint.</p>
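        <p>To make the mapping concrete, the following minimal sketch (ours, not taken from the competition codebase) builds a three-variable QUBO with a soft size constraint using D-Wave's open-source dimod library; the importance and redundancy values are illustrative placeholders.</p>
        <preformat>
# Minimal QUBO sketch: reward important variables, penalize redundant pairs,
# and softly constrain the selection to exactly k variables.
import dimod

importance = {0: -0.8, 1: -0.5, 2: -0.3}              # diagonal: negative = rewarded when selected
redundancy = {(0, 1): 0.6, (0, 2): 0.1, (1, 2): 0.2}  # off-diagonal: positive = penalized pair
k, lam = 2, 2.0                                       # target size and penalty weight

bqm = dimod.BinaryQuadraticModel(importance, redundancy, 0.0, dimod.BINARY)

# Soft constraint lam * (sum_i x_i - k)^2, expanded into linear/quadratic terms.
n = len(importance)
for i in range(n):
    bqm.add_linear(i, lam * (1 - 2 * k))
    for j in range(i + 1, n):
        bqm.add_quadratic(i, j, 2 * lam)
bqm.offset += lam * k * k

best = dimod.ExactSolver().sample(bqm).first          # brute force is fine at this size
print(best.sample, best.energy)
</preformat>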
        <p>
          To solve QUBO problems via quantum annealing, we use the D-Wave Advantage_System4.1 quantum
processor. This device consists of 5,760 superconducting qubits laid out in a Pegasus P16 topology,
which offers enhanced connectivity and embedding flexibility compared to earlier architectures like
Chimera [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. The QUBO problems are submitted through D-Wave's Ocean SDK [22], which handles
the necessary problem embedding, chain construction, and solver parameter configuration. The access
to D-Wave’s quantum annealers was provided to us by the qCLEF organizers through a specialized
infrastructure. For comparison, we also evaluate simulated annealing (SA) using D-Wave’s classical
solver under similar settings. By running both solvers across the same QUBO formulations, we explore
the effectiveness, quality, and consistency of quantum annealing versus classical methods in solving
ML-driven optimization problems. The specific formulations for each task are discussed in the following
sections.
        </p>
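        <p>As a hedged illustration of the solver workflow (the competition provided QPU access through a dedicated qCLEF infrastructure, so the direct sampler call below is an assumption about a generically configured Ocean environment), the bqm from the sketch above can be submitted to both solvers:</p>
        <preformat>
# Solve the same BQM with classical simulated annealing and, assuming a
# configured D-Wave API token, with the quantum annealer.
import neal
from dwave.system import DWaveSampler, EmbeddingComposite

sa_result = neal.SimulatedAnnealingSampler().sample(bqm, num_reads=100)

qa_sampler = EmbeddingComposite(DWaveSampler())   # minor-embeds onto the Pegasus graph
qa_result = qa_sampler.sample(bqm, num_reads=100)

print("SA best energy:", sa_result.first.energy)
print("QA best energy:", qa_result.first.energy)
</preformat>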
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Task 1: Feature Selection</title>
        <p>
          Feature selection is a fundamental preprocessing step in many supervised learning pipelines. The goal
is to identify a subset of informative, non-redundant features that improve model generalization and
reduce overfitting. We formulate feature selection as a combinatorial optimization problem suitable for
quantum annealing by leveraging the framework proposed by [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Their approach encodes a balance of
feature importance and redundancy directly into a QUBO matrix, making it amenable to solvers like
D-Wave.
        </p>
<p>In our formulation, the QUBO matrix $Q$ is constructed such that:
• The diagonal entries $Q_{ii}$ represent importance scores of individual features.
• The off-diagonal entries $Q_{ij}$ encode redundancy between feature pairs.
• A penalty term is included to enforce sparsity and encourage the selection of exactly $k$ features.</p>
        <p>This is formulated as a quadratic penalty on the number of selected features, e.g., $\lambda \left( \sum_i x_i - k \right)^2$.
The penalty term also allows us to explicitly control the number of features selected, tuning $k$
based on performance.</p>
        <p>This formulation incentivizes selecting features that are individually relevant while penalizing
redundancy and constraining the number of selected features via the penalty term.</p>
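        <p>A small sketch of how such a feature-selection QUBO can be assembled (our illustration; variable names and scaling are not from the competition code) is shown below.</p>
        <preformat>
# Assemble the feature-selection QUBO: importance on the diagonal, redundancy
# off the diagonal, plus the quadratic size penalty lam * (sum_i x_i - k)^2.
import numpy as np

def build_feature_qubo(importance, redundancy, k, lam=1.0):
    """importance: (n,) scores; redundancy: (n, n) symmetric matrix; k: target size."""
    n = len(importance)
    Q = np.zeros((n, n))
    Q[np.diag_indices(n)] = -np.asarray(importance)   # minimization => reward relevance
    Q += np.triu(redundancy, k=1)                     # penalize redundant pairs
    Q[np.diag_indices(n)] += lam * (1 - 2 * k)        # linear part of the size penalty
    Q += np.triu(2 * lam * np.ones((n, n)), k=1)      # quadratic part of the size penalty
    return Q   # constant offset lam * k**2 omitted; it does not change the argmin
</preformat>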
      </sec>
      <sec id="sec-3-3">
        <title>Importance and Redundancy Measures</title>
<p>For the MQ2007 dataset, we evaluated multiple configurations of $Q$, combining the following measures:</p>
      </sec>
      <sec id="sec-3-4">
<title>Importance measures (used for $Q_{ii}$):</title>
<p>• Mutual Information (MI) between feature $f_i$ and target label $Y$ [24]:
          $$\mathrm{MI}(f_i; Y) = \sum_{x \in f_i} \sum_{y \in Y} p(x, y) \log \left( \frac{p(x, y)}{p(x)\, p(y)} \right)$$
• Permutation Feature Importance (PFI), defined as the change in model error after permuting
feature $f_i$ [25]:
          $$\mathrm{PFI}(f_i) = \mathbb{E}\left[ \mathrm{Error}_{\mathrm{perm}(i)} - \mathrm{Error}_{\mathrm{original}} \right]$$</p>
        <p>Redundancy measures (used for $Q_{ij}$, $i \neq j$):
• Conditional Mutual Information (CMI) between $f_i$ and $f_j$ given $Y$, estimated between pairs
of features conditioned on the target [24]:
          $$\mathrm{CMI}(f_i; f_j \mid Y) = \sum_{x, z, y} p(x, z, y) \log \left( \frac{p(x, z \mid y)}{p(x \mid y)\, p(z \mid y)} \right)$$
• Conditional Permutation Feature Importance (CPFI), which measures the importance of
features $f_i$ and $f_j$ when used together [26]:
          $$\mathrm{CPFI}(f_i, f_j) = \mathbb{E}\left[ \mathrm{Error}_{\mathrm{perm}(i, j)} - \mathrm{Error}_{\mathrm{original}} \right]$$</p>
<p>We experimented with several combinations (e.g., MI+CMI, PFI+CPFI) to populate $Q$, and selected
the combination yielding the best classification accuracy on validation data.</p>
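        <p>The sketch below shows one plausible way to estimate these importance scores with scikit-learn and XGBoost; the exact estimators, discretization, and hyperparameters in our runs may differ.</p>
        <preformat>
# Estimate diagonal importance scores: MI via scikit-learn, PFI via
# permutation_importance on a fitted XGBoost model.
from sklearn.feature_selection import mutual_info_classif
from sklearn.inspection import permutation_importance
from xgboost import XGBClassifier

def importance_scores(X, y, method="mi"):
    if method == "mi":
        return mutual_info_classif(X, y)               # MI(f_i; Y), one score per feature
    model = XGBClassifier().fit(X, y)
    result = permutation_importance(model, X, y, n_repeats=5)
    return result.importances_mean                     # mean error increase per feature
</preformat>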
      </sec>
      <sec id="sec-3-5">
        <title>Large-Scale Adaptation for the Istella Dataset</title>
        <p>For the Istella dataset, which contains significantly more features, we limited our analysis to the MI+CMI
combination due to computational constraints. Notably:
• Computing CMI for all feature pairs is computationally expensive. To scale this, we used Python’s
multiprocessing.Pool() to parallelize the computation, reducing runtime considerably.
• The resulting QUBO matrix was too large to be embedded directly onto the D-Wave Advantage
system. To address this, we used the LeapHybridSampler(), which combines classical and
quantum resources to solve large QUBOs that exceed qubit count or connectivity limitations.</p>
        <p>This hybrid strategy allowed us to evaluate the viability of QUBO-based feature selection even on
larger, real-world datasets.</p>
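        <p>A sketch of both scaling devices follows; cmi() stands in for whichever pairwise CMI estimator is used, and X, y, and n_features are assumed to be module-level data visible to the worker processes.</p>
        <preformat>
# Parallelize pairwise CMI over CPU cores, then hand oversized QUBOs to the
# classical-quantum hybrid solver.
from itertools import combinations
from multiprocessing import Pool

def cmi_pair(pair):
    i, j = pair
    return i, j, cmi(X[:, i], X[:, j], y)   # placeholder CMI(f_i; f_j | Y) estimator

pairs = list(combinations(range(n_features), 2))
with Pool() as pool:                         # one worker per core by default
    cmi_values = pool.map(cmi_pair, pairs)

from dwave.system import LeapHybridSampler
sampleset = LeapHybridSampler().sample(bqm)  # hybrid solve for QUBOs beyond QPU limits
</preformat>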
      </sec>
      <sec id="sec-3-6">
        <title>3.3. Task 2: Instance Selection</title>
        <p>
          The second application deals with instance selection, selecting a subset of instances (i.e. a coreset [16]) of
document embeddings with the goal of fine-tuning an LLM on that selected subset, in a subsequent step.
Here we only address the general instance selection challenge as the fine-tuning itself was outside of the
scope of the competition. As with all other QA problems, instance selection has to be transformed into
a QUBO problem first (possibly by incorporating the constraints into the target function), as described
in Section 3.1. To that end, we used the backbone of the bcos algorithm considered in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] which constructs the
diagonal and off-diagonal elements of the Q-matrix. Another aspect that we took from [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] relates to
handling of the problem size on the QPU: we batched the dataset in batches of size 80 and processed the
data per batch. All the Q matrix entries take that into account - they are calculated on a per-batch basis.
        </p>
<p>For the off-diagonal entries $Q_{ij}$ ($i \neq j$), the bcos algorithm considers two cases for each document
pair $(i, j)$: $-\cos(e_i, e_j)$ if $i$ and $j$ have different labels, and $\cos(e_i, e_j)$ if $i$ and $j$ have the same label.
In our work, we kept the off-diagonal entries following the same logic as bcos, and for the diagonal
terms $Q_{ii}$ we investigated the following extensions:</p>
        <p>svc-method: A penalty term is added in the following way. First, a simple (in-sample) support
vector classifier is run on all documents within a fold, with the document label as the target and the
embeddings as features. Subsequently, the distance to the fitted support-vector margin is extracted for
each instance. Denoting by $\hat{d}_i$ the (estimated) distance to the margin for instance $i$, each diagonal
entry is set to
          $$Q_{ii} = \frac{1}{1 + \hat{d}_i} - \frac{1}{2}.$$
Lower distances receive higher weight in the Q-matrix, as such instances are more important for the
classification. The entries are subsequently normalized before running the QA step. The hyperparameters
of the support vector classifier are not that important here (after some experimentation we settled on an
RBF kernel with a gamma parameter $\gamma = 1.0$ for all experiments), as the goal of the distance metric
is to establish a relative ranking between the instances.</p>
        <p>instance-deletion: Borrowing motivation from Cook's distance as a measure of the influence of a
sample point, we ran a simple iterative instance-deletion model (logistic regression), measuring the
decrease in performance when removing each data point [15, ch. 31] within a fold. Our goal was to
produce a simple heuristic that measures the direct impact of an instance on a classification problem,
which inspired the choice of logistic regression as a model that is very fast to compute. The entry for
each diagonal element of the Q matrix within a batch of size $n$ is then simply the value of the influence
measure for the effect on the model predictions:
          $$Q_{ii} = \frac{1}{n} \sum_{j=1}^{n} \left| \hat{y}_j - \hat{y}_j^{(-i)} \right| \tag{3}$$
which is akin to the numerator of Cook's distance, changing the functional value from squared distance
to absolute distance following [15, ch. 31]. More complex versions of such instance-influence measures
exist and would be subject to further studies, e.g. ([15], [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ])
For our submission, we also included tests with the vanilla bcos method. All methods feature an enforced
constraint such that the desired level of size reduction is achieved, as in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ].
        </p>
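        <p>The per-batch construction can be sketched as follows (our reading of the bcos off-diagonal logic combined with the svc-method diagonal; the margins argument would come from a fitted classifier's decision_function):</p>
        <preformat>
# Build one batch's Q matrix: signed cosine similarities off the diagonal,
# normalized margin-based importances on the diagonal.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def batch_qubo(emb, labels, margins):
    """emb: (b, d) embeddings; labels: (b,) class labels; margins: (b,) distances to margin."""
    b = emb.shape[0]
    sim = cosine_similarity(emb)
    sign = np.where(labels[:, None] == labels[None, :], 1.0, -1.0)
    Q = np.triu(sign * sim, k=1)                  # +cos for same label, -cos otherwise
    diag = 1.0 / (1.0 + np.abs(margins)) - 0.5    # svc-method diagonal entries
    Q[np.diag_indices(b)] = diag / np.abs(diag).max()   # normalize before annealing
    return Q
</preformat>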
      </sec>
      <sec id="sec-3-7">
        <title>3.4. Task 3: Clustering</title>
<p>This research focuses on a document clustering and retrieval pipeline that combines classical machine
learning techniques with quantum annealing to address the challenges of working with high-dimensional
embedding spaces. The core methodology follows a structured two-stage approach:
1. Reduce and summarize the data using classical clustering algorithms (e.g., k-Medoids, HDBSCAN, GMM) to generate candidate medoids.
2. Apply quantum annealing to refine medoid selection using a constrained QUBO formulation.</p>
        <p>The pipeline begins by loading high-dimensional document and query embeddings. To support
faster clustering and enable more efficient experimentation, dimensionality reduction using Uniform
Manifold Approximation and Projection (UMAP) was explored as an optional preprocessing step. UMAP
works by modeling local neighborhood relationships in the high-dimensional space as a graph and then
optimizing a low-dimensional representation that preserves both local and global structure. When used,
this reduction accelerated the initial clustering process and aided in visualizing the overall document
distribution, while reproducibility was ensured through consistent random seeds.</p>
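        <p>When enabled, this step amounts to a few lines with the umap-learn package (a sketch; document_embeddings stands for the loaded embedding matrix, and the number of components shown is illustrative):</p>
        <preformat>
# Optional dimensionality reduction before classical clustering; a fixed
# random_state keeps the projection reproducible across runs.
import umap

reducer = umap.UMAP(n_components=32, random_state=42)
reduced_embeddings = reducer.fit_transform(document_embeddings)
</preformat>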
        <p>In the first stage, a classical clustering algorithm, selected from k-Medoids, HDBSCAN (Hierarchical
Density-Based Spatial Clustering of Applications with Noise), GMM (Gaussian Mixture Model), or a
hybrid HDBSCAN-GMM approach, is applied to generate an overcomplete set of candidate medoids.
These are representative data points that summarize local structure in the embedding space and serve
as a compressed input to the quantum stage. This compression is necessary due to the limited scale of
current quantum annealing hardware, which cannot operate over the full embedding space.</p>
<p>Each clustering algorithm introduces different structural assumptions and was evaluated
independently to explore how these influence downstream refinement. K-Medoids was used for its emphasis on
compact, interpretable clusters, with automatic selection of $k$ via silhouette and Davies-Bouldin Index
optimization. HDBSCAN provided a density-based alternative, able to discover clusters of arbitrary
shape and automatically discard low-signal regions as noise. GMM framed clustering probabilistically,
producing soft memberships that captured overlapping semantic regions in the embedding
space. The hybrid HDBSCAN-GMM approach layered these strengths by first isolating dense cores
with HDBSCAN and then modeling their uncertainty with GMM. While only one algorithm is used in
any given run, this flexibility allowed the pipeline to examine how different clustering assumptions
affect the quality and diversity of medoid candidates.</p>
<p>The second stage builds on the general QUBO formulation described in Eq. (1), refining the candidate
medoids by solving a constrained optimization problem tailored to clustering. The specific formulation
we adopt is based on [17], which identifies representative medoids without explicitly clustering the
data. To compute pairwise dissimilarities between candidate medoids, we use Welsch's M-estimator,
which transforms squared Euclidean distances $d_{ij}$ into robust dissimilarity scores:
          $$\Delta_{ij} = 1 - \exp\left(-\tfrac{1}{2}\, d_{ij}\right) \tag{4}$$
This transformation, also known as the correntropy loss [27], emphasizes small distances while
suppressing the influence of outliers.</p>
<p>The weighted QUBO objective used for medoid refinement is given by:
          $$f(\mathbf{x}) = \mathbf{x}^\top \left( \gamma\, \mathbf{1}\mathbf{1}^\top - \alpha\, \Delta \right) \mathbf{x} + \mathbf{x}^\top \left( \beta\, \Delta \mathbf{1} - 2 \gamma k\, \mathbf{1} \right) \tag{5}$$</p>
        <p>Here, $\mathbf{x} \in \{0, 1\}^n$ indicates medoid selection, $\Delta$ is defined in Eq. (4), $\mathbf{1}$ is the all-ones vector, and $k$ is
the desired number of medoids. We set $\alpha = 1$ and $\beta = 1$ to normalize contributions
from the dispersion and centrality terms, and use $\gamma = 2$ to prioritize the fixed-$k$ constraint. This
formulation directly informs our quantum objective matrix Q and provides principled control over
medoid selection behavior.</p>
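        <p>The sketch below assembles Eqs. (4) and (5) into a dense QUBO matrix (our illustration; M holds the candidate medoid coordinates):</p>
        <preformat>
# Medoid-refinement QUBO: Welsch-transformed distances (Eq. 4) plugged into
# the weighted objective of Eq. (5) with alpha = beta = 1 and gamma = 2.
import numpy as np
from scipy.spatial.distance import cdist

def medoid_qubo(M, k, alpha=1.0, beta=1.0, gamma=2.0):
    d = cdist(M, M, metric="sqeuclidean")         # squared Euclidean distances
    delta = 1.0 - np.exp(-0.5 * d)                # Welsch / correntropy transform
    n = M.shape[0]
    Q = gamma * np.ones((n, n)) - alpha * delta   # quadratic term of Eq. (5)
    linear = beta * delta.sum(axis=1) - 2.0 * gamma * k
    Q[np.diag_indices(n)] += linear               # linear terms live on the diagonal
    return Q
</preformat>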
<p>The QUBO objective (Eq. 5) encodes both pairwise dissimilarities between medoids and a hard
constraint enforcing the selection of exactly $k$ clusters, expressed as $\sum_{i=1}^{n} x_i = k$. This exact constraint
is central to the pipeline's design, enabling fixed-$k$ clustering in settings where classical methods often
return variable or heuristically chosen cluster counts. We initially experimented with several ways to
enforce the fixed-$k$ constraint, including adding a quadratic penalty term $\lambda \left( \sum_i x_i - k \right)^2$, post-filtering
infeasible samples, and scaling penalty weights. While these worked moderately well with simulated
annealing, quantum annealing frequently failed to return exactly $k$ medoids, particularly at small $k$,
due to noise and the relatively weak enforcement of linear or diagonal penalties. The quadratic form,
while mathematically equivalent, induces pairwise correlations between all variables, creating a steep
energy valley that better resists hardware noise and fluctuations. Motivated by this, we shifted to a more
principled approach: we first constructed the clustering loss and then applied the fixed-size constraint
using dimod.generators.combinations, which implements the same quadratic constraint in a
way optimized for quantum hardware [28]. To enforce the fixed-$k$ constraint in practice, we scaled the
associated penalty using the maximum energy delta of the clustering term and found that doubling this
value consistently stabilized solutions across $k$ and solvers. All quantum and simulated annealing runs
used 100 reads per solve. This formulation (Eq. 5) proved to be the most robust across both simulated
and quantum annealing settings, offering clean separation between clustering structure and constraint
enforcement.</p>
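        <p>In code, this constraint encoding is compact (a sketch; the strength heuristic below uses the largest quadratic bias as a stand-in for the maximum energy delta of the clustering term):</p>
        <preformat>
# Combine the clustering loss with dimod's optimized fixed-k constraint.
import dimod

bqm = dimod.BinaryQuadraticModel(Q, dimod.BINARY)        # clustering loss, Eq. (5)
scale = max(abs(bias) for bias in bqm.quadratic.values())
constraint = dimod.generators.combinations(bqm.num_variables, k,
                                           strength=2.0 * scale)
bqm.update(constraint)                                   # exactly k variables set to 1
</preformat>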
        <p>Following refinement, all documents are reassigned to the nearest selected medoid using the
original, unreduced embedding space. This separation between reduced-space clustering and full-space
evaluation ensures that the final cluster assignments remain faithful to the original data distribution.
Cluster quality is measured using the Davies-Bouldin Index (DBI), a metric that balances intra-cluster
compactness and inter-cluster separation. To assess retrieval effectiveness, the pipeline matches query
embeddings to cluster centroids and ranks documents within each cluster by similarity. Retrieval metrics
such as nDCG@10 and relevant document coverage are computed to quantify how well the clusters
support downstream information access. Overall, this methodology combines the interpretability and
scalability of classical clustering with the constraint-enforcing capabilities of quantum optimization. By
decoupling the tasks of structural summarization and hard cluster selection, the pipeline makes
principled use of quantum resources where they are most effective, optimizing over a reduced, meaningful
subset of the data, while retaining the flexibility to experiment with different clustering assumptions
upstream.</p>
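        <p>The final reassignment and evaluation step can be sketched as follows (metric names as in scikit-learn; the retrieval scoring is omitted):</p>
        <preformat>
# Reassign every document to its nearest selected medoid in the original
# embedding space, then score compactness with the Davies-Bouldin Index.
from scipy.spatial.distance import cdist
from sklearn.metrics import davies_bouldin_score

def assign_and_score(embeddings, medoid_vectors):
    labels = cdist(embeddings, medoid_vectors).argmin(axis=1)
    return labels, davies_bouldin_score(embeddings, labels)
</preformat>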
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and Discussion</title>
      <sec id="sec-4-1">
        <title>4.1. Task 1: Feature Selection</title>
        <p>In this section, we reflect on the results of our experiments across both simulated annealing (SA) and
quantum annealing (QA) methods for feature selection. Our primary strategy was to evaluate various
combinations of importance and redundancy metrics and to tune the number of selected features,
denoted by $k$, to maximize performance on a held-out validation set. Based on this tuning, we selected
the best-performing configurations to submit to the qCLEF leaderboard.</p>
<p>For simulated annealing, we explored a range of $k$ values, from 5 to 40 (out of the total 46 features
for the MQ2007 data), and analyzed their corresponding performance using both local evaluation
(nDCG@10) and leaderboard scores (Table 1). Among the different configurations, those involving
mutual information and conditional mutual information (MI + CMI) showed strong performance on the
validation set. We hypothesize that this may be attributed to the model-agnostic, information-theoretic
nature of MI and CMI, which allows for more consistent estimation of feature relevance and redundancy.
However, this advantage appears less pronounced on the held-out test set, where configurations based
on permutation feature importance (PFI) also performed competitively, particularly when evaluated
using LightGBM (LGB) as the underlying model. Notably, LGB-based methods produced the highest
nDCG scores in our local experiments. Unfortunately, we were unable to submit LGB-based feature sets
to the shared qCLEF evaluation infrastructure due to compatibility issues. The LightGBM package relies
on system-level OpenMP support, specifically the libgomp.so.1 shared library. This library was
not provided in our restricted qCLEF development environment, leading to runtime import errors and
preventing the use of LightGBM. Thus, we were limited to using XGBoost (XGB) for official submissions.
This constraint may have impacted the final leaderboard performance of otherwise stronger feature
selection combinations.
Table 1 presents these results; each cell shows the validation nDCG@10 score, calculated on the
validation set. The baseline nDCG@10 score including all 46 features is 0.4473. ‡ marks configurations
submitted to the qCLEF leaderboard; values in parentheses are the official CLEF leaderboard scores
calculated on the held-out test set.</p>
        <p>MI: Mutual Information; PFI: Permutation Feature Importance; CMI: Conditional Mutual Information; CPFI:
Conditional Permutation Feature Importance. PFI and CPFI were computed using LightGBM (lgb) and
XGBoost (xgb) based importance scores. "–" indicates configurations that were not evaluated due to
resource constraints.</p>
        <p>For quantum annealing, despite our interest in conducting experiments for more configurations,
we were constrained by limitations in the quantum infrastructure. Specifically, the time and resource
availability for the D-Wave quantum annealer limited the breadth of our QA experiments. As a result,
we were only able to submit two QA-based runs, both derived from the same codebase and configuration
(Table 2). Interestingly, these two QA submissions resulted in different outcomes: one returned a feature
subset of size 13 with an nDCG of 0.4552, while the other selected 15 features and achieved an nDCG
of 0.4436. This divergence is notable because the code and QUBO formulation were identical in both
cases. We attribute this variance to the inherent randomness and probabilistic nature of the quantum
annealing process, where solution quality can fluctuate between runs due to quantum noise, minor
differences in embedding, or hardware-level stochasticity. Table 2 lists the two QA (MI-CMI) runs,
which selected 15 and 13 features, respectively. These configurations were submitted directly to the
CLEF leaderboard without local validation. The QA runs were executed using D-Wave's Advantage_system
with 5,760 qubits and the Pegasus topology.</p>
        <p>An interesting result is that the QA submission with just 13 features achieved the highest nDCG score
among all our submissions, and notably, it also had the fewest selected features among all leaderboard
entries. While the top leaderboard entry achieved an nDCG of 0.4580 using 21 features, our QA
submission reached a comparable score of 0.4552 with only 13 features. This makes it arguably the
most efficient feature subset in terms of predictive performance per feature used.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Simulated vs Quantum Annealing Results</title>
<p>[Figure 2: comparison of SA and QA; panel (b) shows annealing time (ms) for SA and QA.]</p>
<p>Furthermore, while SA and QA achieved comparable nDCG scores across the board, the computational
effort differed significantly (Figure 2). Our analysis shows that QA completed the optimization process
in approximately one-tenth the time required by SA for similar configurations (see Figure 2b). This
suggests that quantum annealing may offer a more efficient route to high-quality solutions, especially in
time-sensitive or resource-constrained environments. Overall, these findings underscore the potential of
quantum annealing not just as a novelty, but as a competitive alternative to classical metaheuristics like
simulated annealing for tasks like feature selection in machine learning pipelines.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.2. Task 2: Instance Selection</title>
        <p>
          The way to properly evaluate an instance selection routine in the context of a QA routine is based on
a tripod of criteria, as noted in [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]: size reduction, performance, and total inference time. Naturally,
there can be situations where no method dominates the others in all three aspects. We decided to focus
on the F1 score, as there was no guideline regarding the reduction size: all of our submissions targeted a
reduction of 25%, i.e., 75% of the instances were to be kept. Table 3 shows the competition results
achieved by our team. We managed to perform one quantum run, which we kept, using the bcos method;
it does not achieve an exact 25% reduction due to the inherent randomness of the quantum annealing
procedure, but all the simulated annealing runs are at exactly 25% reduction. As evident from the
standard deviation numbers (in brackets), while we nominally top the leaderboard for a fixed reduction
size, the differences to the baseline and to the other teams' submissions are not statistically significant
for the Yelp dataset. For the Vader dataset, all teams perform worse than the baseline, which remains
puzzling, as our own analysis indicates a much higher performance than indicated by the leaderboard.
        </p>
        <sec id="sec-4-3-1">
          <title>Name</title>
        </sec>
        <sec id="sec-4-3-2">
          <title>Yelp Dataset</title>
          <p>Yelp_SA_qclef_bcos_075
Yelp_QA_qclef_bcos
BASELINE_ALL
Yelp_SA_qclef_it_del_075
Yelp_SA_qclef_svc_075</p>
          <p>Vader Dataset
BASELINE_ALL
Vader_SA_qclef_combined_075
Vader_SA_qclef_it_del_075
Vader_SA_qclef_svc_075
Vader_QA_qclef_bcos
Vader_SA_qclef_bcos_075</p>
          <p>F1 Score</p>
          <p>
            One insight that can be inferred is that the datasets are simply too trivial for this task. [
            <xref ref-type="bibr" rid="ref12">12</xref>
            ] also
analyzed these datasets (albeit not in the context of LLMs, but of BERT models as the downstream task) and
recorded performance of the reduced dataset (by 25%) at the level of the full dataset's performance.
Another argument supporting this insight is illustrated by the analysis depicted in Figure 3.
There, we fit (on the test sets across the different folds provided by the organizers) a simple
logistic regression model as a substitute for the LLM fine-tuning step, which was not accessible to us as
competition participants. The average (across test folds) F1 score is shown for all methods, including a
simple random-sampling method, which randomly drops 25% of the observations within a training fold.
Scores remain fairly stable even for high levels of reduction (10% to 60%), as evidenced also by other teams'
leaderboard submissions with higher reduction levels. We also conducted experiments with fine-tuning
BERT models, and the results were comparable: at the 25% reduction level, there was no significant
difference between random sampling and all other methods.
          </p>
<p>Overall, the (simple) heuristics presented here do not exhibit significant differences across different
reduction levels, neither in the LLM evaluations on the leaderboard nor in the simple logistic
regression evaluations shown in Figure 3. Likely, this is due to the size of the dataset and the (low)
difficulty of the classification task. A more difficult benchmark dataset could help study these differences
in greater detail.</p>
          <p>Nonetheless, the SVC-method does show promise as the best-performing method at higher
reduction levels and could be a good starting point for further research.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.3. Task 3: Clustering</title>
<p>Table 4 notes: DBI: Davies-Bouldin Index (lower is better); nDCG: Normalized Discounted Cumulative
Gain (higher is better). Internal scores are computed on the training set; leaderboard scores reflect
performance on the held-out test set. Experiments 1, 2, and 12 reflect submitted results, with experiment 1
achieving the top score for the task. *Dimensionality-reduced centroids were included in the final
submission, leading to evaluation errors.</p>
        <p>We evaluated a range of clustering configurations to explore how different classical methods and
quantum refinement strategies impact retrieval effectiveness. Table 4 summarizes the results from
both submitted and exploratory experiments. Among the submitted runs, Experiment 1 (k-Medoids,
k = 10, no UMAP) achieved the highest performance, with a leaderboard nDCG of 0.58 and an internal
validation score of 0.48, outperforming all baselines. This strong result can be attributed to the simplicity
and structure-preserving nature of the two-step k-Medoids pipeline, which maintained the original
semantic geometry of the embedding space and yielded consistently strong retrieval performance.</p>
      </sec>
      <sec id="sec-4-5">
        <title>K-medoids Clustering Results (Experiment 1)</title>
<p>[Figure: Experiment 1 (k-Medoids): (a) initial clustering; (b) quantum-refined clustering.]</p>
        <p>Among all experiments, however, experiment 13 (GMM, k=10) achieved the highest nDCG (0.60) on
training data, followed closely by experiment 17 (HDBSCAN-GMM, k=25) with 0.54, and Experiment
7 (HDBSCAN, k=10, no UMAP) with 0.52 and the lowest DBI overall (3.19). Experiment 13’s strong
performance likely stemmed from GMM’s probabilistic flexibility at low k, which captured nuanced
topical overlap and yielded the best retrieval quality. Experiment 17 benefited from HDBSCAN’s
structure-aware initialization, followed by GMM fitting. This hybrid approach, especially at k=25, struck
a strong balance between granularity and semantic coherence. In contrast, experiment 7 benefited
from density-based clustering (HDBSCAN) applied directly to the high-dimensional space. At k=10, it
effectively discovered dense semantic regions, while the quantum refinement stage helped consolidate
them into meaningful, noise-filtered clusters.</p>
      </sec>
      <sec id="sec-4-6">
        <title>GMM Clustering Results (Experiment 13)</title>
<p>[Figure: Experiment 13 (GMM): (a) initial clustering; (b) quantum-refined clustering.]</p>
<p>The results reveal consistent and interpretable trends across different clustering configurations,
particularly with respect to the number of clusters (k), the use of dimensionality reduction, and the
behavior of classical clustering methods prior to quantum refinement. Across all methods, increasing
k generally led to improved Davies-Bouldin Index (DBI) scores, indicating tighter and more distinct
clusters. This was most evident in the k-Medoids experiments, where DBI steadily decreased from
7.48 at k=10 to 3.71 at k=50, reflecting improved intra-cluster compactness and inter-cluster separation.
However, while increasing k improved DBI, retrieval quality often peaked at lower k values using soft
or structure-aware clustering methods. Baseline retrieval scores followed a similar pattern, dropping
from 0.55 at k=10 to 0.47 at k=50, further emphasizing that expressiveness, not just compactness, plays
a key role in modeling topical overlap in retrieval settings.</p>
        <p>Dimensionality reduction using UMAP was explored as an optional preprocessing step to accelerate
clustering and support visualization. While UMAP occasionally led to lower DBI scores as seen in
experiments 7 and 8, its impact was not uniformly positive. In many cases, UMAP had little effect on DBI
or even slightly worsened it. Moreover, improvements in geometric compactness did not consistently
translate into better retrieval performance. In some configurations, especially at lower k, applying
UMAP prior to clustering led to lower nDCG values, suggesting that key structural cues for retrieval
may be lost in the projection to a reduced space. All GMM-based methods were run with UMAP applied
due to their computational cost in high-dimensional space; non-reduced probabilistic clustering was
excluded for tractability reasons, though it remains an area for future exploration.</p>
        <p>Experiments 2 and 12, which applied UMAP before clustering, mistakenly submitted dimensionally
reduced centroids to the leaderboard evaluation. Because retrieval metrics were computed using
full-dimensional query embeddings, this mismatch resulted in invalid similarity calculations and artificially
low leaderboard scores, especially for nDCG. These values should not be interpreted as indicators of
poor clustering quality; rather, they reflect a representation mismatch during evaluation.</p>
        <p>Together, these results validate the two-stage pipeline’s strategy of first generating an overcomplete
and structurally diverse set of medoid candidates through classical clustering and then refining them
using quantum-constrained optimization. The consistent improvements in DBI with higher k and
the generally reliable performance of classical methods set a strong foundation for the second-stage
quantum refinement, which enforces fixed-k constraints in a principled way.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Future Work</title>
      <sec id="sec-5-1">
        <title>5.1. Task 1: Feature Selection</title>
        <p>While our study focused on QUBO formulations built from combinations of mutual information,
conditional mutual information, and permutation-based importance scores, there remain several promising
directions for future exploration. We intend to extend our methodology to incorporate additional
feature importance techniques inspired by classical literature. In particular, methods such as Functional
ANOVA (fANOVA) [29] and Leave-One-Feature-Out (LOFO) importance [30] offer intuitive measures
of a feature’s marginal and conditional relevance within a model context. These could potentially be
adapted into the QUBO framework by mapping importance scores to diagonal entries and interactions
(e.g., joint relevance or redundancy) to off-diagonal terms. Another promising candidate is the Relief
family of algorithms [31], which estimate feature relevance based on how well feature values distinguish
between near instances of different classes. Since Relief naturally accounts for both relevance and
redundancy, it may be especially well-suited for QUBO-based optimization.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Task 2: Instance Selection</title>
<p>In the context of Task 2, it is important to note that all of the aforementioned computations were done
on a per-batch basis. Thus every batch has its most influential data points (under either the svc- or
instance-deletion-based method) calculated only in relation to the other data points in the batch. We
kept this design because the same batching logic applies to the off-diagonal terms. Two documents that
are highly similar to each other (as measured by cosine similarity), with either positive or negative
labels, can thus end up in two different batches, and their cosine similarity will never be taken
into account. While batching is necessary because the QPU cannot fit all documents at once, this poses a
challenge, as the "penalties" and "rewards" for hard instances are not applied at a
global level. This is much easier to address for the diagonal elements, however, as the influence
scores can be calculated only once and handled independently of the batching (e.g., an SVC can
be fit on all the training data and the distance from each instance to the support vector can be
calculated before the batching).</p>
<p>In addition, for the diagonal elements, more complex instance-influence-function methodology, as in
[14], [15], [16], can be applied either in combination with, or instead of, the simple heuristics presented
here.</p>
      </sec>
      <sec id="sec-5-3">
        <title>5.3. Task 3: Clustering</title>
        <p>In future work, we would like to evaluate how nDCG scores change when using the quantum processing
unit. While we were unable to submit our most promising quantum experiments due to hardware
and timing constraints, we remain curious how they would have scored under the competition’s
official retrieval metrics on test data. Further extensions could explore non-reduced probabilistic
clustering to assess GMM performance in the full embedding space. Additionally, incorporating probabilistic
or fuzzy refinement in the quantum stage may better capture semantic overlap in multi-topic
documents.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>In this work, we investigated the use of quantum annealing (QA) and simulated annealing (SA) to
solve key machine learning optimization tasks - feature selection, instance selection, and clustering
- by formulating them as QUBO problems. Across all tasks, we developed principled mappings that
leveraged both classical and quantum resources effectively.</p>
<p>In Task 1 (feature selection), we explored multiple QUBO formulations that combined different feature
importance and redundancy measures. Specifically, we tested combinations of MI, CMI, PFI, and CPFI to
construct the Q matrix. We evaluated these QUBOs using both simulated annealing and quantum
annealing, and found that quantum annealing achieved comparable effectiveness to simulated annealing
while requiring significantly less computational effort.</p>
<p>In Task 2 (instance selection), we extended the BCOS algorithm and introduced two new QUBO-based
scoring mechanisms derived from SVM margins and instance-deletion influence. Despite the lack of
statistically significant differences between the methods at a reduction level of 25%, these approaches
showed promising results even at increased levels of instance reduction and can serve as a basis for
further research on more difficult datasets.</p>
      <p>Task 3 (clustering) showcased the versatility of hybrid pipelines, combining classical clustering
algorithms with quantum-constrained refinement. While the best overall clustering performance was
achieved classically, our experiments confirmed that QUBO-based refinement enhances cluster diversity
and compactness, particularly in document retrieval tasks.</p>
      <p>Across tasks, we found that QA often matched or exceeded the performance of SA in less time,
highlighting its potential for more eficient combinatorial optimization. While access to quantum
annealers and scale remain ongoing challenges, our findings support the growing viability of quantum
annealing as a practical tool in real-world ML pipelines.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We thank the DS@GT CLEF team for providing valuable comments and suggestions. We would also like
to thank Ayah Zaheraldeen and Jiangqin Ma for their input and support throughout the project. This
research was supported in part through research cyberinfrastructure resources and services provided by
the Partnership for an Advanced Computing Environment (PACE) at the Georgia Institute of Technology,
Atlanta, Georgia, USA.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
<p>During the preparation of this work, the authors used OpenAI GPT-4o for grammar and spelling checks.
After using this tool, the authors reviewed and edited the content as needed and take full responsibility
for the publication’s content.</p>
    </sec>
    <sec id="sec-9">
      <title>References [14]–[31]</title>
      <p>
[14] P. W. Koh, P. Liang, Understanding black-box predictions via influence functions, in: International
conference on machine learning, PMLR, 2017, pp. 1885–1894.
[15] C. Molnar, Interpretable Machine Learning, 3 ed., 2025. URL: https://christophm.github.io/
interpretable-ml-book.
[16] A. S. Joaquin, B. Wang, Z. Liu, N. Asher, B. Lim, P. Muller, N. F. Chen, In2core: Leveraging influence
functions for coreset selection in instruction finetuning of large language models, arXiv preprint
arXiv:2408.03560 (2024).
[17] C. Bauckhage, N. Piatkowski, R. Sifa, D. Hecker, S. Wrobel, A qubo formulation of the k-medoids
problem., in: LWDA, 2019, pp. 54–63.
[18] W. Alvarez-Giron, J. Téllez-Torres, J. Tovar-Cortes, H. Gómez-Adorno, Team qiimas on task 2
clustering: Quantum annealing for k-medoids optimization, in: Working Notes of CLEF 2024
Conference and Labs of the Evaluation Forum, Grenoble, France, 2024. URL: https://bitbucket.org/
eval-labs/qc24-qiimas/src/main/, CEUR Workshop Proceedings, ISSN 1613-0073.
[19] K. Kurihara, S. Tanaka, S. Miyashita, Quantum annealing for clustering, in: Proceedings of the
Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI), AUAI Press, 2009, pp.
317–324.
[20] J.-N. Zaech, M. Danelljan, T. Birdal, L. Van Gool, Probabilistic sampling of balanced k-means using
adiabatic quantum computing, arXiv preprint arXiv:2310.12153 (2023). URL: https://arxiv.org/abs/
2310.12153.
[21] N. Matsumoto, Y. Hamakawa, K. Tatsumura, K. Kudo, Distance-based clustering using qubo
formulations, Scientific Reports 12 (2022) 2669. URL: https://doi.org/10.1038/s41598-022-06559-z.
doi:10.1038/s41598-022-06559-z.
[22] D.-W. S. Inc., Ocean software documentation, 2023. URL: https://docs.ocean.dwavesys.com/.
[23] T. Morstyn, Annealing-based quantum computing for combinatorial optimal power flow, IEEE
Transactions on Smart Grid PP (2022) 1–1. doi:10.1109/TSG.2022.3200590.
[24] T. M. Cover, J. A. Thomas, Elements of Information Theory, 2nd ed., Wiley-Interscience, 2006.
[25] L. Breiman, Random forests, Machine Learning 45 (2001) 5–32.
[26] D. Debeer, C. Strobl, Conditional permutation importance revisited, BMC Bioinformatics 21 (2020)
1–19.
[27] W. Liu, P. P. Pokharel, J. C. Principe, Correntropy: Properties and applications in non-gaussian
signal processing, IEEE Transactions on Signal Processing 55 (2007) 5286–5298. doi:10.1109/
TSP.2007.898255.
[28] J. Pasvolsky, D.-W. S. Inc., dimod.generators.combinations — constraint generator for
fixed selection, 2019. URL: https://github.com/dwavesystems/dimod/blob/main/dimod/generators/
constraints.py, accessed: 2025-06-12.
[29] F. Hutter, H. H. Hoos, K. Leyton-Brown, Eficient functional anova: Insights into high-dimensional
model performance, in: Proceedings of the 30th Conference on Uncertainty in Artificial Intelligence
(UAI), 2014.
[30] A. Abid, A. Kamel, J. Zou, Lofo importance: Leave one feature out based feature importance score,
https://github.com/aerdem4/lofo-importance, 2020.
[31] K. Kira, L. A. Rendell, The feature selection problem: Traditional methods and a new algorithm,
AAAI (1992) 129–134.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Dacrema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Gonçalves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <article-title>QuantumCLEF 2025: Overview of the second quantum computing challenge for information retrieval and recommender systems at CLEF</article-title>
          , in: G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          , D. Spina (Eds.),
          <source>Working Notes of CLEF 2025 - Conference and Labs of the Evaluation Forum, CEUR Workshop Proceedings</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Dacrema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Gonçalves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <article-title>Overview of QuantumCLEF 2025: The second quantum computing challenge for information retrieval and recommender systems at CLEF</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Carrillo-de-Albornoz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gonzalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Plaza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. G. S.</given-names>
            <surname>de Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mothe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Piroi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Spina</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Sixteenth International Conference of the CLEF Association (CLEF 2025)</source>
          , Lecture Notes in Computer Science,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Farhi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Goldstone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gutmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sipser</surname>
          </string-name>
          ,
          <article-title>Quantum computation by adiabatic evolution</article-title>
          ,
          <source>arXiv preprint quant-ph/0001106</source>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>K.</given-names>
            <surname>Boothby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bunyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Raymond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <article-title>Next-generation topology of D-Wave quantum processors</article-title>
          ,
          <source>arXiv preprint arXiv:2003.00133</source>
          (
          <year>2020</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L. P.</given-names>
            <surname>Yulianti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Surendro</surname>
          </string-name>
          ,
          <article-title>Implementation of quantum annealing: A systematic review</article-title>
          ,
          <source>IEEE Transactions on Emerging Topics in Computing</source>
          <volume>11</volume>
          (
          <year>2023</year>
          )
          <fpage>150</fpage>
          -
          <lpage>162</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Lucas</surname>
          </string-name>
          ,
          <article-title>Ising formulations of many NP problems</article-title>
          ,
          <source>Frontiers in Physics</source>
          <volume>2</volume>
          (
          <year>2014</year>
          )
          <fpage>5</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>P.</given-names>
            <surname>Date</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Arthur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Pusey-Nazzaro</surname>
          </string-name>
          ,
          <article-title>QUBO formulations for training machine learning models</article-title>
          ,
          <source>Quantum Computing Applications</source>
          <volume>1</volume>
          (
          <year>2023</year>
          )
          <fpage>100</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mücke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Heese</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Piatkowski</surname>
          </string-name>
          ,
          <article-title>Feature selection on quantum computers</article-title>
          ,
          <source>Quantum Machine Intelligence</source>
          <volume>5</volume>
          (
          <year>2023</year>
          )
          <fpage>11</fpage>
          . URL: https://doi.org/10.1007/s42484-023-00099-z. doi:10.1007/s42484-023-00099-z.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Pranjić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Mummaneni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Tutschku</surname>
          </string-name>
          ,
          <article-title>Quantum annealing based feature selection in machine learning</article-title>
          ,
          <source>Quantum Machine Learning</source>
          <volume>2</volume>
          (
          <year>2023</year>
          )
          <fpage>11</fpage>
          -
          <lpage>19</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>R.</given-names>
            <surname>Nembrini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Dacrema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cremonesi</surname>
          </string-name>
          ,
          <article-title>Feature selection for recommender systems with quantum computing</article-title>
          ,
          <source>Journal of Computing Frontiers</source>
          <volume>10</volume>
          (
          <year>2024</year>
          )
          <fpage>45</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>N.</given-names>
            <surname>Borle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Zecevic</surname>
          </string-name>
          , et al.,
          <article-title>Feature selection with quantum annealing for interpretable and robust machine learning</article-title>
          ,
          <source>Quantum Machine Intelligence</source>
          <volume>5</volume>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>15</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Pasin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Gonçalves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <article-title>A quantum annealing instance selection approach for efficient and effective transformer fine-tuning</article-title>
          ,
          <source>in: Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>205</fpage>
          -
          <lpage>214</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>W.</given-names>
            <surname>Cunha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>França</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Fonseca</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Rocha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Gonçalves</surname>
          </string-name>
          ,
          <article-title>An effective, efficient, and scalable confidence-based instance selection framework for transformer-based text classification</article-title>
          ,
          <source>in: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>665</fpage>
          -
          <lpage>674</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>