<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1109/QCE53715.2022.00023</article-id>
      <title-group>
        <article-title>A Hybrid Quantum-Classical Framework For Binary Classification In Online Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Corrado Loglisci</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivan Diliso</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Donato Malerba</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Bari Aldo Moro</institution>
          ,
          <addr-line>Bari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>5722</volume>
      <fpage>02</fpage>
      <lpage>05</lpage>
      <abstract>
        <p>Quantum machine learning recently gained prominence due to the promise of quantum computers in solving machine learning problems that are intractable on a classical computer. Nevertheless, several studies are emerging on problems which remain challenging for classical computing algorithms. One of these is classifying continuously incoming data instances in an online fashion, which is studied in this paper through a hybrid computational solution that combines classical and quantum techniques. Hybrid approaches represent one of the current ways that open quantum computation to practical applications. In this paper, we show how typical issues of online learning can be equally addressed with the properties of quantum mechanics, even to the point of offering better results. We aim at keeping the classification capabilities learned on previously processed data instances preserved as much as possible, while acquiring new knowledge on new data instances. To this end, we propose a class of quantum neural networks, variational quantum circuits, that are adapted over time by exploiting techniques of model update used in classical neural networks. Experiments are performed on real-world datasets with quantum simulators.</p>
      </abstract>
      <kwd-group>
        <kwd>Quantum machine learning</kwd>
        <kwd>Hybrid quantum-classical framework</kwd>
        <kwd>Online learning</kwd>
        <kwd>Variational quantum circuits</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Nowadays, Machine Learning models are employed in almost every possible task, including
medical diagnoses, fraud detection, and marketing. The widespread adoption of machine
learning in various fields can be attributed to the availability of relatively powerful computers in
recent times. The computational cost, especially for Deep Learning methods, can be extremely
high, necessitating training times of several hours, days, or even months on present-day
computers [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Furthermore, traditional computers are reaching their physical limits, which will
impede their progress in the coming years [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. As a result, many researchers are investigating
alternative computing platforms for training ML models. Among these, quantum computers
have emerged as an intriguing option. On the side of Quantum Computing (QC), the current
status sees the era of noisy intermediate scale quantum (NISQ) computers [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], which are devices
able to deal with small- to medium-size data problems. An approach that seems to bring practical
advantages is that of hybrid solutions [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], which combine classical and quantum methods
and allow the exploitation of quantum physics properties. This is the current standard way to design
quantum software while overcoming existing limitations of quantum devices, such as noise
and decoherence. Clearly, these are not yet technologies that will guarantee exceptional
speed-ups over classical computing at large data sizes, but they pave the way for the development of
near-future algorithms for data-intensive problems. Indeed, when the complexity concerns the
tractability of the problems rather than the scalability to data volumes, current quantum
routines already turn out to be useful, for instance, in cryptography [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>
        One category of data-intensive problems with which research in classical computing
is still struggling is that of learning from sequences of continuously incoming data.
Even accurate Deep Learning solutions find this data scenario challenging.
This is demonstrated by the recent research line addressing so-called catastrophic forgetting
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], which is the tendency of an artificial neural network to abruptly and drastically forget
previously learned information upon learning new information. In those cases, what matters is not
designing algorithms for massive computation, but keeping the quality of the models high over
unbounded sequences of data.
      </p>
      <p>
        Due to the intrinsically dynamic nature of sequential data, it is not straightforward to design effective
technologies to learn models that provide accurate predictions [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Indeed, the data
distribution can change, that is, the statistical properties of the data-generating process can
change. Inter-relationships between descriptive features can also evolve and come to involve
other features, and the concept (or class) underlying the data can span new value ranges.
This can threaten the predictive capabilities of the models: models that were accurate
before may no longer be accurate later. Therefore, the algorithms should guarantee robustness to
such changes and, to do that, the models should be updated by picking up new data characteristics. This
cannot be done by collecting data and training models only once, in a single shot; rather, the models
must be built progressively as new worthwhile data become available, in a sort of incremental mode.
Problems of this nature are investigated in online learning [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. There, models are trained at
the beginning and adapted to new incoming data during training sessions, while preserving
knowledge learned from previous training sessions, without relying on the availability of
the previously processed data and without the need to retrain from scratch upon the arrival of new
data. This way, models are kept updated and ready for prediction sessions.
      </p>
      <p>
        QC has the potential and the capabilities to address these issues and provide alternative
computational solutions. Indeed, classical data instances can be represented as quantum states
corresponding to points of a Hilbert space, which has more dimensions than the original space.
The features of classical data (for instance, a feature vector with three entries) can be blown up
into a representational space larger than the classical one (eight dimensions, resulting from
2^3, where three is the feature vector size) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. However, this kind of representation does not require
techniques for addressing data-sparsity problems, such as embedding functions, because the operators on
quantum states (that is, matrices) inherently handle such dimensionality. QC offers the possibility
to naturally account for inter-relationships between both data instances and features thanks to
the property of entanglement. It refers to the phenomenon in which two or more quantum
particles become so strongly correlated that their quantum states are no longer independent
of one another. Another property of QC is interference, which must not be confused with
entanglement. Interference refers to the phenomenon occurring when two or more quantum
states overlap and interact with each other, which can change probabilities so that
one of the quantum states gains likelihood of determining the final outcome, at the expense
of the other. It handles concurrent, equi-probable outcomes and has the ability
to bias models toward the desired outcome when the computation leads to wrong ones.
Operating within the realm of classical statistical theory, classical machine learning does not
offer these properties, unless models and algorithms try to explicitly accommodate them at
additional computational cost, as in the case of correlations and interactions within data
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>We leverage these properties to design a quantum-classical framework which builds a
classification model in a supervised setting and works in an online manner by acquiring continuously
incoming data instances. The framework continually adapts a classification model and keeps on
learning over time. More precisely, it trains and updates a classifier on (sub-)sequences of
incoming data instances (data blocks) marked as labelled, while using it to estimate the class value of
unlabelled data instances. To do that, we resort to a form of quantum model with learning
capabilities, the so-called variational circuit, a quantum algorithm characterized by
parameters that are varied and optimized as new data instances are acquired, in the same style
as the training of neural models.</p>
      <p>The framework has been tested on the binary classification task by using two real-world
datasets. It has also been compared against a classical online data learner. The experimental
results are encouraging and show a potential superiority in terms of accurate estimations
over different experimental configurations.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Basics on Quantum computing</title>
      <p>
        In QC, data correspond to quantum states, described in terms of qubits and represented by
vectors of complex numbers. Computations are represented by quantum gates and performed
through matrices. The quantum states of an n-qubit register are represented by 2^n-dimensional
complex vectors of the Hilbert space ℋ = (C^2)^⊗n, while the quantum states of single qubits
are vectors of the Hilbert space ℋ = C^2 and can be formulated as |ψ⟩ = α|0⟩ + β|1⟩ (in Dirac
notation), where |0⟩ and |1⟩ are the computational basis states of the space C^2 that correspond
to the vectors [1 0]^T (α is valued, β is zero) and [0 1]^T (β is valued, α is zero) respectively. The
coefficients α and β are complex numbers called amplitudes.
      </p>
      <p>A quantum state can be modified by gates, which are implemented by unitary matrices
operating within ℋ. For instance, the gates which work on single qubits are 2 × 2 matrices. In
the following, we have the matrix representations of four single-qubit gates, Hadamard (H),
Rotation Y-gate (R_y), Rotation Z-gate (R_z), Pauli Z-gate (Z), respectively, and the gate CNOT
(CX) working on two qubits:</p>
      <p>H = 1/√2 [1 1; 1 −1],  R_y(θ) = [cos(θ/2) −sin(θ/2); sin(θ/2) cos(θ/2)],  R_z(θ) = [e^(−iθ/2) 0; 0 e^(iθ/2)],  Z = [1 0; 0 −1],  CX = [1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0]</p>
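      <p>As a minimal pure-Python sketch (our own illustration, not code from the paper), these standard gate matrices can be written out and sanity-checked for unitarity:</p>

```python
import cmath
import math

def matmul(A, B):
    # Multiply two square complex matrices given as lists of rows.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(A):
    # Conjugate transpose of a square matrix.
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]                # Hadamard
Z = [[1, 0], [0, -1]]                # Pauli Z

def Ry(t):
    # Rotation around the y-axis by angle t.
    c, sn = math.cos(t / 2), math.sin(t / 2)
    return [[c, -sn], [sn, c]]

def Rz(t):
    # Rotation around the z-axis by angle t.
    return [[cmath.exp(-1j * t / 2), 0], [0, cmath.exp(1j * t / 2)]]

# CNOT on two qubits (control = first qubit).
CX = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]

def is_unitary(A, tol=1e-9):
    # A is unitary iff A-dagger times A equals the identity.
    n = len(A)
    P = matmul(dagger(A), A)
    return all(abs(P[i][j] - (1 if i == j else 0)) < tol for i in range(n) for j in range(n))

for gate in (H, Z, Ry(0.7), Rz(0.7), CX):
    assert is_unitary(gate)
```

      <p>Checking U†U = I is a quick test that any such matrix preserves quantum-state norms, which is what makes it a valid gate.</p>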
      <p>
        It should be noted that the gates R_y and R_z are parameterized on θ. This corresponds to
rotating the vector representing the quantum state, by an angle corresponding to θ, around the
y-axis and z-axis respectively, within the three-dimensional space of the Bloch sphere [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>Algorithms are circuits with n qubits and consist of quantum gates which operate on the
quantum state of the n qubits. By considering the qubits individually, the circuit consists of
a sequence of quantum gates that leads the quantum state of the qubit to evolve. Intuitively,
the matrix calculation of the first gate of the circuit is applied to the vector representation of
the qubit; then, the result becomes a (new) quantum state to which the second gate will be
applied. A concrete illustration is reported in Figure 1: the qubit |0⟩ at the top of the circuit
is manipulated first by the gate H (we denote this as H|0⟩) and then by the gate R_y (we denote this
as R_y H|0⟩). By considering an n-qubit register instead, the circuit consists of the application in
parallel of sequences of quantum gates on individual qubits. This corresponds to the application
of tensor products of gates on the n qubits, which leads the quantum state of the n-qubit register to
evolve. For instance, in Figure 1, the input two-qubit register |00⟩ (|0⟩ ⊗ |0⟩) is manipulated first
by the tensor product between gate H and gate Y (we denote this as (H ⊗ Y)|00⟩) and then by the
tensor product between gate R_y(θ0) and gate R_y(θ1) (we denote this as (R_y(θ0) ⊗ R_y(θ1))(H ⊗ Y)|00⟩).</p>
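      <p>To make the tensor-product evolution concrete, here is a small pure-Python sketch (matrix helpers and rotation angles are our own illustrative choices, not from the paper) that evolves a two-qubit register |00⟩ through a Hadamard/Pauli-Y layer followed by a rotation layer, in the style of the Figure 1 example:</p>

```python
import math

def kron(A, B):
    # Kronecker (tensor) product of two matrices given as lists of rows.
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matvec(A, v):
    # Matrix-vector product.
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]                # Hadamard
Y = [[0, -1j], [1j, 0]]              # Pauli Y

def Ry(t):
    # Rotation around the y-axis by angle t.
    c, sn = math.cos(t / 2), math.sin(t / 2)
    return [[c, -sn], [sn, c]]

ket00 = [1, 0, 0, 0]                            # |00> = |0> tensor |0>
state = matvec(kron(H, Y), ket00)               # first layer: (H tensor Y)|00>
state = matvec(kron(Ry(0.3), Ry(1.1)), state)   # rotation layer; angles are illustrative
norm = sum(abs(a) ** 2 for a in state)          # unitary evolution keeps the norm at 1
```

      <p>Because every layer is unitary, the squared amplitudes of the final state still sum to one, which is what allows them to be read as measurement probabilities.</p>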
      <p>
        At the end of a circuit, the final operation to perform on the qubits is the measurement
(the meter symbol in Figure 1), which, in this paper, serves to convert quantum states to
bitstrings and then to class labels. As a result of the measurement, each qubit collapses to one of
two possible basis states, while the amplitudes indicate the likelihood of collapsing into each of the
basis states. For instance, in the case of |ψ⟩ = α|0⟩ + β|1⟩, where the basis states are |0⟩ and
|1⟩, the amplitudes α and β indicate the square root of the probability that the qubit measures
as either |0⟩ or |1⟩: when the qubit collapses to |0⟩, it returns the quantum state [1 0]^T, while it
returns [0 1]^T if it collapses to |1⟩.
      </p>
      <p>
        A variational quantum circuit V(θ) is a quantum circuit with adjustable
real-valued parameters θ (like those of the gate R_y(θ)). This kind of quantum circuit is similar
to a neural network, and the gate parameters are optimized by classical optimization techniques,
such as gradient descent. Typically, a variational quantum circuit V(θ)
has a layered architecture, where the layers are composed of single-qubit gates and two-qubit
gates. The same structure of gates is used for all the layers, which are repeated to engage the
parameterized gates in a better learning process. The parameters of the gates of all the layers
are updated based on the difference between the estimated class label and the ground truth. For
further details the reader can refer to [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Quantum-classical framework for binary classification</title>
      <p>The overall framework (illustrated in Figure 2) combines classical computing techniques of
feature selection, data sampling, normalization and model optimization with quantum neural
models in the form of variational quantum circuits for the problem of binary classification. To
distinguish the former from the latter, in Figure 2, they are tagged as either [CC] or [QC]. In the
binary classification setting, we have data instances described by X ∪ {y}, where X contains the values
of the set of descriptive attributes/features, while y ∈ {−1, 1} denotes the class label.</p>
      <p>Feature selection. The component is used only at the beginning and selects the subset
of descriptive features which will be considered afterwards. It exploits a technique based on the
mutual information between the features and the class labels. This operation has been used to retain the most
relevant characteristics and alleviate the problem of choosing the dimension of the quantum
circuits in terms of qubits. In fact, the number of features of the input data determines the
number of qubits.</p>
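      <p>A minimal sketch of this kind of selection, assuming discrete (or discretized) feature values; the function names and the top-k ranking scheme are our own illustrative choices, not the paper's implementation:</p>

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    # Discrete mutual information I(X; Y) in nats between one feature
    # column and the class labels (equal-length lists of hashable values).
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def select_top_features(columns, labels, k):
    # Rank feature columns by mutual information with the labels; keep the
    # k best (returned as sorted column indices).
    ranked = sorted(range(len(columns)),
                    key=lambda i: mutual_information(columns[i], labels),
                    reverse=True)
    return sorted(ranked[:k])
```

      <p>With k features retained, the quantum circuits downstream need exactly k qubits, which is the dimensioning concern this component alleviates.</p>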
      <p>
        Normalization. The component is used to scale the values of the previously selected features
within the range [0, 1] by using the standard min-max function on the original ranges. It is
performed for each incoming data block, both those for training and those for prediction.
      </p>
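      <p>The min-max scaling used here is the standard one; a one-function sketch (the degenerate-range convention is our own assumption):</p>

```python
def min_max_normalize(column):
    # Scale a list of numeric feature values into [0, 1] using the
    # column's own min and max (standard min-max normalization).
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)  # degenerate range: our own convention
    return [(v - lo) / (hi - lo) for v in column]
```

      <p>Normalizing into [0, 1] matters here because each value is later used as a rotation-gate parameter, so it must live in a bounded angular range.</p>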
      <p>Data sampling. The component is used to select a subset of the labelled data instances
included within the previously processed data blocks. More precisely, when building the data
block i+1, the component takes a number of data instances equal to the sample size from
the data block i (see Figure 2), previously used for a training session. The samples
contain data instances of both class labels and, for each class label, the component takes
data instances with simple random sampling without replacement. The sample size is fixed.</p>
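      <p>A sketch of this per-class sampling without replacement; splitting the sample evenly between the two labels is our own assumption (the paper only states that both labels are represented):</p>

```python
import random

def replay_sample(block, sample_size, seed=None):
    # Stratified simple random sampling without replacement from the
    # previous data block. `block` is a list of (features, label) pairs
    # with label in {-1, +1}; half of the sample is drawn per class
    # (an assumed convention).
    rng = random.Random(seed)
    per_class = sample_size // 2
    out = []
    for label in (-1, 1):
        pool = [inst for inst in block if inst[1] == label]
        out.extend(rng.sample(pool, min(per_class, len(pool))))
    return out
```

      <p>These replayed instances pre-populate the next training block, which is how the framework later counters catastrophic forgetting.</p>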
      <p>Classification model. The component includes two quantum circuits and presents a number
of qubits determined by the number of features selected by the component Feature selection.
In the following, we report the technical details of the two circuits.</p>
      <p>• The first quantum circuit takes the classical data and represents them as quantum states
to be assigned to the qubits. Initially, all the qubits are set to |0⟩, which is the default
value of the quantum states at the beginning of any circuit. More precisely, this circuit
implements a feature mapping operation ℱ which encodes real-valued data instances X
into quantum states spanning d qubits:
|φ(X)⟩ = ℱ(X)|0⟩^⊗d (1)
ℱ(X)|0⟩^⊗d = R_y(X)^⊗d H^⊗d |0⟩^⊗d (2)
where |0⟩^⊗d denotes the register with d qubits at the state |0⟩ (|0⟩ ⊗ . . . ⊗ |0⟩) and
the parameter of each gate R_y is the normalized real value of the feature
(corresponding to the qubit on which R_y works). The term H^⊗d denotes the tensor
product H ⊗ . . . ⊗ H over d occurrences (that is, the number of selected features) of the
gate H (the same holds for R_y).</p>
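      <p>A pure-Python sketch of this feature mapping (Hadamards on all qubits followed by feature-parameterized R_y rotations); the helper names are ours, and using the normalized feature value directly as the rotation angle is an assumption consistent with the description above:</p>

```python
import math

def kron(A, B):
    # Kronecker (tensor) product of two matrices given as lists of rows.
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matvec(A, v):
    # Matrix-vector product.
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

def Ry(t):
    # Rotation around the y-axis by angle t.
    c, sn = math.cos(t / 2), math.sin(t / 2)
    return [[c, -sn], [sn, c]]

def feature_map(x):
    # Encode a normalized feature vector x (values in [0, 1]) into a
    # 2^d-dimensional state: apply H to every qubit of |0...0>, then an
    # Ry rotation per qubit parameterized by that qubit's feature value.
    d = len(x)
    state = [0.0] * (2 ** d)
    state[0] = 1.0                      # |0> tensor ... tensor |0>
    U = H
    for _ in range(d - 1):
        U = kron(U, H)
    state = matvec(U, state)            # H^(tensor d)
    R = Ry(x[0])
    for xi in x[1:]:
        R = kron(R, Ry(xi))
    return matvec(R, state)             # Ry(x)^(tensor d)

psi = feature_map([0.2, 0.9])
```

      <p>Distinct inputs give distinct rotation layers, so each normalized X receives its own encoding before the variational circuit acts on it.</p>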
      <p>
        This ensures each possible input X has a unique qubit encoding before being passed to the
next gates. Clearly, to acquire all the data instances of the data block, the gates of the
feature mapping are not replicated along the circuit; instead, the input data are enqueued
to the circuit as they arrive. We are aware that more sophisticated feature maps could be
used [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], but on modern quantum machines, it is also important to use a feature map
with a limited number of gates, since each additional gate could introduce noise into the
quantum states.
• The second circuit is variational and manipulates the quantum states returned by the first
circuit. It implements a quantum neural network composed of layers of entangled rotation
gates. Generally, entangled rotation gates are matrix operations which combine the gates
Hadamard, CNOT and Rotation under the quantum physics effect of entanglement
[
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. The second circuit, together with the first, completes the structure of gates which builds the
classifier:
|ψ(X, θ)⟩ = V(θ)|φ(X)⟩ (3)
where V is the variational circuit and θ denotes the parameters of the parameterized gates
that are being optimized.
      </p>
      <p>In this work, V has been implemented as repeated layers of entangled rotation gates combining
R_y(θ) and CX gates (4), where each occurrence of the two-qubit gate CX takes one pair of qubits (over the
d-qubit register) composed of the consecutive qubits indexed as i and i + 1.
Finally, we perform measurements on the qubits and the measured state is recorded.
Indeed, we record a collection of measured states because, in quantum machine learning
[13], when training a classifier, the variational quantum circuit is run many times with
the same input X and parameters θ. So, we can estimate the expectation value of the circuit
on X and θ, over multiple runs, with the following:</p>
      <p>
        ℰ(X, θ) = ⟨ψ(X, θ)| Z^⊗d |ψ(X, θ)⟩ (5)
where Z^⊗d is the tensor product of the single-qubit gate Z over d occurrences. As
anticipated in Section 2, the notation |·⟩ refers to a column vector of ℋ, while the
notation ⟨·| refers to the row vector calculated as the conjugate transpose of |·⟩.
The gate Z^⊗d has the interesting property that if the measured quantum state has odd
parity, it returns −1 (as eigenvalue), while, if the measured quantum state has even parity,
it returns 1. This implies that the expectation value of the circuit will always be within the
interval [−1, 1]. We can use this property to relate the expectation value to the probability
that a data instance X is assigned to a class label y, that is:
P(y|X) = (ℰ(X, θ) + 1) / 2 (6)
The probability P(y|X) is exploited in the optimization process concerning the parameters
θ. In particular, the optimizer iteratively updates the circuit parameters by minimizing a
cost function, which accounts for the negative log-likelihood of the probabilities P(y|X)
computed on the current labelled data blocks, that is:
      </p>
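      <p>Estimating the parity expectation from repeated measurements can be sketched as follows; the counts dictionary mimics the shot counts a simulator would return, with illustrative numbers of our own:</p>

```python
def expectation_from_counts(counts):
    # Estimate <Z tensor ... tensor Z> from measurement counts: each
    # bitstring contributes +1 if it has even parity (even number of 1s)
    # and -1 if it has odd parity, weighted by its relative frequency.
    shots = sum(counts.values())
    return sum((1 if bits.count("1") % 2 == 0 else -1) * c
               for bits, c in counts.items()) / shots

# Illustrative shot counts over 1024 runs of a two-qubit circuit.
counts = {"00": 700, "11": 200, "01": 100, "10": 24}
e = expectation_from_counts(counts)
```

      <p>Since every summand is ±1 weighted by a frequency, the estimate always lands in [−1, 1], which is what makes the shift to a probability via (ℰ + 1)/2 well defined.</p>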
      <p>C(θ) = −(1/N) ∑_{i=1}^{N} log(P(y_i|X_i)) (7)
where N is the number of data instances of the data block.</p>
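      <p>The expectation-to-probability mapping and the negative log-likelihood cost can be sketched in a few lines; we assume each expectation value is already signed toward the true class label of its instance:</p>

```python
import math

def prob_from_expectation(e):
    # Map an expectation value in [-1, 1] to a probability in [0, 1].
    return (e + 1) / 2

def nll_cost(expectations):
    # Negative log-likelihood over a data block: each entry is the circuit
    # expectation for one labelled instance, signed toward its true class,
    # so values near +1 mean confident correct predictions.
    probs = [prob_from_expectation(e) for e in expectations]
    return -sum(math.log(p) for p in probs) / len(probs)
```

      <p>The cost shrinks as the expectations approach +1 on correctly labelled instances, which is exactly what the classical optimizer pushes the parameters θ toward.</p>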
      <p>The cost function is minimized by a classical computing optimizer based on gradient
descent. The derivative concerns the expectation value ℰ(·) with respect to the current
values of θ and is computed by means of the parameter shift rule [14]:
∂ℰ/∂θ_i = (ℰ(θ_i + s) − ℰ(θ_i − s)) / 2 (8)
The gradient value is the difference between the two output values of the circuit: the first
value is the output of the circuit with the parameter θ_i increased by a value s, and the
second value is the output with the parameter θ_i decreased by s. Intuitively, the gradient is determined
by running the circuit on the same input with two different automatically-computed
configurations of the parameters.</p>
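      <p>The parameter-shift rule can be checked on a toy stand-in for the circuit expectation: for a single qubit prepared by R_y(θ) and measured with Z, the expectation is cos(θ), whose exact derivative the rule recovers with shift s = π/2 (the function names are our own):</p>

```python
import math

def expectation(theta):
    # Toy stand-in for the circuit expectation E(theta): a single qubit
    # prepared by Ry(theta) and measured with Z gives E = cos(theta).
    return math.cos(theta)

def parameter_shift_grad(f, theta, s=math.pi / 2):
    # Parameter-shift rule: the exact gradient from just two circuit runs,
    # one with theta shifted up by s and one shifted down by s.
    return (f(theta + s) - f(theta - s)) / 2

g = parameter_shift_grad(expectation, 0.4)   # exact: -sin(0.4)
```

      <p>Unlike finite differences, the shift s here is large and fixed, and for rotation gates the result is the exact analytic gradient, not an approximation.</p>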
      <p>Methodology of the framework. Learning classification models on continuously incoming
data can be faced with time-window models [15]. Time-window models allow us to handle
data instances by equally-sized blocks on which we train, update and apply the predictive
capabilities of the classifier. Thus, we distinguish training sessions from prediction sessions.
During a training session, the component Classification model is activated, which implies
the execution of the feature mapping ℱ on the data instances of the current data block and
the optimization process of the parameters θ of the variational circuit V. During a prediction
session, the component Classification model is activated only to estimate the class labels on the
current data block by using the classifier as updated up to that point.</p>
      <p>To keep the classifier updated, we have to deal with the catastrophic forgetting effect arising
when updating neural networks. In the literature, three alternatives are mainly suggested: replay
methods, regularization-based methods, and parameter isolation methods [16]. Shortly, replay methods
store a limited set of data instances which are replayed while learning in a new training session.
Regularization-based methods introduce a regularization term in the loss function, consolidating
at each new training session the previously learned knowledge. Parameter isolation methods
introduce new hyper-parameters for each new training session. Considering that replay
methods represent the least demanding solution and leave the number of
hyper-parameters of the neural network unchanged, we lean toward this approach when updating the classifier.</p>
      <p>The framework operates in three steps, namely initialization, update, prediction. Training
sessions are performed at the initialization and update steps. In the initialization step, the
classifier is trained from scratch on the first data block (Figure 2). The component Feature
selection is used only at the initialization step. The other steps work on the features selected there.
Then, the framework prepares the update step and prediction step by collecting labelled data
instances and unlabelled data instances into two different data blocks respectively.
Actually, the labelled data block is already partially populated with the data instances provided by
the component Data sampling, while those necessary to reach the predefined size are taken from the
incoming data instances, as they arrive. As explained above, this is done to mitigate the effect
of catastrophic forgetting. As soon as one of the two data blocks has been filled (the number of
collected data instances is equal to the predefined size), the respective step is performed.
Supposing the data block of labelled data instances is completed first, the update step
will be performed; otherwise, it will be the turn of the prediction step working on the unlabelled block.
Clearly, the unlabelled data block contains data instances as they arrive.</p>
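      <p>The three-step methodology can be sketched as a simple event loop over the incoming stream; `train_fn`, `predict_fn` and `replay_fn` are our own placeholder names for the framework's components, and the loop is a deliberate simplification:</p>

```python
def online_loop(stream, block_size, train_fn, predict_fn, replay_fn):
    # Sketch of the block-based methodology: labelled instances fill a
    # training block, unlabelled ones a prediction block; whichever block
    # reaches block_size first triggers its session.
    labelled, unlabelled = [], []
    for features, label in stream:
        if label is not None:
            labelled.append((features, label))
            if len(labelled) >= block_size:
                train_fn(labelled)               # training session
                labelled = replay_fn(labelled)   # pre-populate next block (replay)
        else:
            unlabelled.append(features)
            if len(unlabelled) >= block_size:
                predict_fn(unlabelled)           # prediction session
                unlabelled = []

# Toy run: 10 labelled instances, block size 4, replay keeps 1 instance.
sessions = []
online_loop([([i], 1) for i in range(10)], 4,
            train_fn=lambda b: sessions.append(len(b)),
            predict_fn=lambda b: None,
            replay_fn=lambda b: b[:1])
```

      <p>Note how the replayed instance lets the second and third training blocks fill with fewer new arrivals, which is how old knowledge keeps re-entering the training sessions.</p>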
      <p>The succession of training sessions and prediction sessions is not predefined, coherently with
the realistic assumption according to which the distribution of labelled and unlabelled data
instances is not previously established and therefore not all the data instances are labelled.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Experiments on real-world datasets</title>
      <p>We implemented the proposed framework in IBM Qiskit [17] and ran experiments by using
simulators on two real-world datasets, more precisely Ozone level detection 1 (having 2536 data
instances, 73 features) and Spambase (having 4600 data instances, 57 features) 2. Data blocks
have been partitioned so as to have a portion of 75% of the dataset as labelled data instances
(training sessions) and the remaining 25% as unlabelled data instances (prediction sessions and
testing sets of the evaluation). The classical computing components described in Section 3 are
those available in the toolkit Scikit-learn [18]. The number of runs of the classification model
to estimate the expectation values is 1024, while the number of iterations (epochs) to optimize
the parameters is 20. The number of layers of the variational quantum circuit is 3. The size of
the sample (component Data sampling) is 20% of the data-block size.</p>
      <p>The experiments have been performed to emphasize the impact of the technical configuration
of the framework on the accuracy, namely the number of qubits (corresponding to the features
selected) and the size of the data blocks (number of data instances in each training/prediction
session). In Table 1, we report the accuracy of the proposed framework (HYQOL) compared to
i) a classical computing solution (CC, originally designed for data stream learning) [19] and ii) a
baseline of the framework that works on the whole dataset. The values illustrated have
been computed as the average over the data blocks of testing. As we can see, except for
two trials, HYQOL is never worse than CC, even when the number of qubits is the highest
(i.e., 10). Also, we note that the configurations of HYQOL with the smallest number of qubits (i.e., 2) are
better than those with the largest (i.e., 10), without, however, a particular discrepancy between
the two endpoints. The size of the data blocks does not seem to be determinant for the accuracies, but
it is evident that online learning can be beneficial for quantum-based classifiers compared to
the version that works on the whole dataset.
1 https://archive.ics.uci.edu/ml/datasets/ozone+level+detection
2 https://archive.ics.uci.edu/ml/datasets/Spambase</p>
    </sec>
    <sec id="sec-5">
      <title>5. Related work</title>
      <p>Hybrid quantum-classical machine learning methods have recently garnered significant
attention due to their potential to leverage the strengths of both classical and quantum computing in
order to solve computationally intractable problems in a more efficient manner. These
methods aim to harness the power of QC to perform certain tasks while still relying on classical
computing for other tasks, such as data preprocessing or postprocessing of results.</p>
      <p>In [20], the authors proposed a hybrid quantum-classical convolutional neural network
model for prediction on X-Ray images. The quantum part of the model consists of encoding,
random quantum circuit, and decoding phases. The hybrid model delivers high accuracy and
outperforms previous studies (classical machine learning approaches) in sensitivity and
F1-measure. A hybrid quantum CNN model was also introduced in [21], where the authors
adopted a federated learning approach to protect models and avoid privacy attacks.
Their experimental results show that the models with additional quantum convolution have
slightly better accuracy than the baseline classical models.</p>
      <p>A hybrid quantum-classical model of Long short-term memory (LSTM), a kind of recurrent
neural network (RNN), is proposed in [22]. The authors performed a study comparing their
proposed hybrid model’s capability and performance with its classical counterpart. They found
that the hybrid model converges faster and reaches a better accuracy than its classical
counterpart. However, their simulations assumed conditions of absence of noise and decoherence. A
hybrid quantum-classical approach was designed for generative adversarial learning in [23],
in order to develop an anomaly/fraudulent transaction detection. The performances of their
model were on par with the classical counterpart in terms of the F1 score.</p>
      <p>One of the very few works which investigates catastrophic forgetting with quantum
algorithms has been recently published [24]. It focuses on incremental learning, which,
contrary to online learning, considers training sessions on different classification tasks. Inspired
by the replay methods, the authors propose to constrain the model update by projecting the gradient
direction onto the region outlined by previous task gradients. This is done also by storing a
fraction of the training data of previous tasks on which the gradient descent is computed. A
drawback is the necessity of computing gradients of previous tasks at each training iteration.</p>
      <p>Overall, the recent works in the field of hybrid quantum-classical machine learning methods
demonstrate the potential for these approaches to outperform classical machine learning
algorithms on certain tasks, particularly those that involve large or complex datasets. As quantum
computing technology continues to improve, it is likely that we will see further advancements
in this field and the development of new, more powerful hybrid quantum-classical algorithms.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>In this paper, we investigated the viability of quantum machine learning solutions in the
realistic scenario of changing statistical properties of the data, which often implies
variability in the performance of the model. We conjecture this can be a machine learning
problem in which quantum solutions can lead innovation. On simulated quantum machines,
the hybrid quantum-classical proposal offers encouraging results, in terms of accuracy, often
better than a classical computing solution working on data streams and a hybrid solution working
in batch mode (no online learning). In our opinion, three take-home messages can be identified
from this paper. The first one is methodological, in that online learning opens the way to practical
applications able to combine quantum computing and classical computing techniques, which
is likely the only way to concretely use current quantum technologies. The second one is
experimental, in that it provides arguments for the fact that stable quantum devices could even
do better in terms of performance and quality of the results, when used in predictive tasks.
The third one tells us that, although the high-performance computation and tractability of hard
problems are promises of quantum computing which, with the current devices, are often
not kept, the research on lifelong computation can be a field in which quantum computing
can already bring interesting results.</p>
      <p>Corrado Loglisci acknowledges the financial support of the project "PNRR MUR project
PE0000023-NQSTI" for this research.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] N. C. Thompson, K. Greenewald, K. Lee, G. F. Manso, The computational limits of deep learning, MIT Initiative on the Digital Economy Research Brief 4 (2020).</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] F. Peper, The end of Moore's law: Opportunities for natural computing?, New Gener. Comput. 35 (2017) 253-269. doi:10.1007/s00354-017-0020-4.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] J. Preskill, Quantum Computing in the NISQ era and beyond, Quantum 2 (2018) 79. URL: https://doi.org/10.22331/q-2018-08-06-79. doi:10.22331/q-2018-08-06-79.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] A. Callison, N. Chancellor, Hybrid quantum-classical algorithms in the noisy intermediate-scale quantum era and beyond, Phys. Rev. A 106 (2022) 010101. URL: https://link.aps.org/doi/10.1103/PhysRevA.106.010101. doi:10.1103/PhysRevA.106.010101.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] F. Bova, A. Goldfarb, R. G. Melko, Commercial applications of quantum computing, EPJ Quantum Technol. 8 (2021) 2. URL: https://doi.org/10.1140/epjqt/s40507-021-00091-1. doi:10.1140/epjqt/s40507-021-00091-1.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] J. Peng, B. Tang, H. Jiang, Z. Li, Y. Lei, T. Lin, H. Li, Overcoming long-term catastrophic forgetting through adversarial neural pruning and synaptic consolidation, IEEE Trans. Neural Networks Learn. Syst. 33 (2022) 4243-4256. URL: https://doi.org/10.1109/TNNLS.2021.3056201. doi:10.1109/TNNLS.2021.3056201.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] C. Loglisci, M. Ceci, D. Malerba, Discovering evolution chains in dynamic networks, in: A. Appice, M. Ceci, C. Loglisci, G. Manco, E. Masciari, Z. W. Ras (Eds.), New Frontiers in Mining Complex Patterns - First International Workshop, NFMCP 2012, Held in Conjunction with ECML/PKDD 2012, Bristol, UK, September 24, 2012, Revised Selected Papers, volume 7765 of Lecture Notes in Computer Science, Springer, 2012, pp. 185-199. URL: https://doi.org/10.1007/978-3-642-37382-4_13. doi:10.1007/978-3-642-37382-4_13.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] S. C. H. Hoi, D. Sahoo, J. Lu, P. Zhao, Online learning: A comprehensive survey, Neurocomputing 459 (2021) 249-289. URL: https://doi.org/10.1016/j.neucom.2021.04.112. doi:10.1016/j.neucom.2021.04.112.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] T. Hur, L. Kim, D. K. Park, Quantum convolutional neural network for classical data classification, Quantum Mach. Intell. 4 (2022). URL: https://doi.org/10.1007/s42484-021-00061-x. doi:10.1007/s42484-021-00061-x.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] C. Loglisci, Using interactions and dynamics for mining groups of moving objects from trajectory data, Int. J. Geogr. Inf. Sci. 32 (2018) 1436-1468. URL: https://doi.org/10.1080/13658816.2017.1416473. doi:10.1080/13658816.2017.1416473.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] M. A. Nielsen, I. L. Chuang, Quantum Computation and Quantum Information, Cambridge University Press, 2000.</mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>[12] H. Yano, Y. Suzuki, R. Raymond, N. Yamamoto, Efficient discrete feature encoding for variational quantum classifier, in: 2020 IEEE International Conference on Quantum Computing and Engineering (QCE), 2020, pp. 11-21. doi:10.1109/QCE49297.2020.00012.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>