<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Feature Deletion and Case Discovery in Case-Base Maintenance</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Brian Schack</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indiana University Bloomington, 700 N Woodlawn Ave Ofc 3061T</institution>
          ,
          <addr-line>Bloomington IN, 47408-3901</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Case-based reasoning solves a problem by adapting the solution to a similar problem already solved. Motivated in part by the swamping utility problem, case-base maintenance strategies support a compact, competent case base by deleting, modifying, or discovering cases. This research summary briefly presents four case-base maintenance strategies developed by the author: flexible feature deletion, adaptation-guided feature deletion, expansion-contraction compression, and predictive case discovery. Flexible feature deletion deletes components of cases instead of whole cases. Adaptation-guided feature deletion prioritizes components for deletion according to their recoverability via adaptation knowledge. Expansion-contraction compression, in addition to deleting cases, also adds cases in unexplored regions of the problem space. And predictive case discovery anticipates and acquires cases expected to be useful for solving future problems.</p>
      </abstract>
      <kwd-group>
        <kwd>case-based reasoning</kwd>
        <kwd>artificial intelligence</kwd>
        <kwd>swamping utility problem</kwd>
        <kwd>case-base maintenance</kwd>
        <kwd>flexible feature deletion</kwd>
        <kwd>adaptation-guided feature deletion</kwd>
        <kwd>expansion-contraction compression</kwd>
        <kwd>predictive case discovery</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>A case base can contain cases from training data, knowledge engineering by human experts, or
the retention phase of the case-based reasoning cycle. Each case could potentially, through
adaptation, solve future problems. If other cases could not solve these problems or could only
solve them with greater adaptation cost or inferior solution quality, then that case contributes to
overall problem-solving competence. On the other hand, each retained case makes the case base
larger. A larger case base requires more storage, more time to search through, more bandwidth
to transmit over a network, and more expert attention to manually review.</p>
      <p>
        The swamping utility problem describes this trade-off between the competence, quality, and
speed contribution of a case versus its storage, retrieval, and bandwidth cost [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Technological
progress has shifted the priority from storage cost to retrieval speed, but the swamping utility
problem remains. Legacy systems, embedded systems, and unreliable networks worsen the
problem by constraining resources. Big data and streaming data worsen the problem by
increasing resource usage. Case-base maintenance strategies attempt to mitigate the utility problem by
judiciously choosing the most valuable cases to retain and the least valuable cases to delete in
order to maintain a compact and competent case base [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Their effectiveness depends on their
suitability to a particular dataset [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This research summary briefly presents four case-base
maintenance strategies developed by the author that go beyond the deletion of cases by deleting
features within cases or discovering new cases. (Due to space constraints, for more information
about the strategies and for figures showing the experimental results, please see [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ].)
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Flexible Feature Deletion</title>
      <p>Maintenance strategies normally make two assumptions: (1) that all cases have a uniform
storage cost and (2) that they must retain or delete whole cases. For the first assumption, the
storage costs of cases can vary when the cases contain varying amounts of data at varying
levels of detail. Furthermore, the storage cost of both the problem and the solution can vary
independently because a simple problem may require a complex solution and vice versa. For the
second assumption, a maintenance strategy could delete an entire case, but it could also delete
a single feature across all cases or a single feature from a single case. Each of these alternatives
tends to degrade problem-solving competence, but not necessarily to the same extent.</p>
      <p>
        Domains that diverge from these two assumptions call for a different maintenance strategy:
Flexible feature deletion subdivides variable-size cases for deletion of their components [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. For
example, cases based on medical imagery may have various resolutions and a large number of
features of which only some are relevant to the diagnosis. For suitable data sets, compared to
per-case strategies, flexible feature deletion can reduce the size of a case base with less reduction
in the number of cases.
      </p>
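      <p>The finest granularity described above, deleting a single feature from a single case, can be illustrated with a minimal sketch. It is not the implementation evaluated in [4]: the case identifiers, feature names, and storage costs are hypothetical, and the ordering metric (largest feature first) is an assumed stand-in for whatever knowledge-light metric a real system would use.</p>
      <preformat><![CDATA[
```python
from typing import Dict

# A case base maps a case id to its features; each feature maps to a storage cost.
# All names and costs below are hypothetical.
CaseBase = Dict[str, Dict[str, int]]


def total_size(case_base: CaseBase) -> int:
    """Sum the storage cost of every feature in every case."""
    return sum(sum(features.values()) for features in case_base.values())


def delete_features_to_budget(case_base: CaseBase, budget: int) -> CaseBase:
    """Greedily delete single features from single cases until the budget is met.

    Assumption: the ordering metric is simply "largest feature first".
    """
    # Copy so that the caller's case base is left untouched.
    compressed = {cid: dict(features) for cid, features in case_base.items()}
    while total_size(compressed) > budget:
        candidates = [
            (cost, cid, name)
            for cid, features in compressed.items()
            for name, cost in features.items()
        ]
        if not candidates:
            break  # nothing left to delete
        cost, cid, name = max(candidates)
        del compressed[cid][name]
    return compressed


case_base = {
    "scan_a": {"image": 50, "age": 1, "diagnosis": 2},
    "scan_b": {"image": 30, "age": 1, "diagnosis": 2},
}
compressed = delete_features_to_budget(case_base, budget=40)
```
]]></preformat>
      <p>In this toy run the budget is met by dropping one large feature while both cases remain in the case base, which is the behavior that distinguishes per-feature deletion from per-case deletion.</p>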
    </sec>
    <sec id="sec-3">
      <title>3. Adaptation-Guided Feature Deletion</title>
      <p>
        Building on flexible feature deletion, instead of ordering features according to a knowledge-light
metric, adaptation-guided feature deletion integrates additional knowledge from the solution
transformation container about the recoverability of features [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Similar to how reachability
measures the ability of adaptation knowledge applied to other cases to restore the solution
to a case considered for deletion, recoverability measures the ability of adaptation knowledge
applied to other features to restore a feature considered for deletion. For example, if a cooking
system has an adaptation rule for adding toppings to a pizza, then it can remove those toppings
from the ingredients for the pizza recipe.
      </p>
      <p>A solution with recovered features may exactly match the original uncompressed solution,
or it may solve the same problem in a different way. Compression to smaller sizes can increase
the time required for recovery and decrease the quality of the recovered solution until
adaptation knowledge can no longer recover any solution at all. Therefore, in order to preserve
problem-solving competence, adaptation-guided feature deletion deletes features in order from
most recoverable to least. Evaluation in a path finding domain showed superior retention of
competence compared to flexible feature deletion.</p>
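      <p>The ordering from most recoverable to least can be sketched as follows. This is an illustration under stated assumptions rather than the method of [5]: recoverability is reduced to a binary test (whether a rule restores the deleted feature exactly), and the rule table, the pizza styles, and the feature names are hypothetical.</p>
      <preformat><![CDATA[
```python
from typing import Callable, Dict, List, Optional

Case = Dict[str, object]
# A hypothetical adaptation rule tries to rebuild one feature from the others.
Rule = Callable[[Case], Optional[object]]


def recoverability(case: Case, feature: str, rules: Dict[str, Rule]) -> float:
    """Return 1.0 if a rule restores the deleted feature exactly, else 0.0."""
    rule = rules.get(feature)
    if rule is None:
        return 0.0
    remaining = {k: v for k, v in case.items() if k != feature}
    return 1.0 if rule(remaining) == case[feature] else 0.0


def deletion_order(case: Case, rules: Dict[str, Rule]) -> List[str]:
    """Order features from most recoverable to least (ties broken by name)."""
    return sorted(case, key=lambda f: (-recoverability(case, f, rules), f))


# Assumed adaptation knowledge: the toppings follow from the pizza style.
TOPPINGS = {"margherita": ["basil"], "hawaiian": ["ham", "pineapple"]}
rules = {"toppings": lambda c: TOPPINGS.get(c.get("style"))}

pizza = {"style": "hawaiian", "toppings": ["ham", "pineapple"], "bake_minutes": 12}
order = deletion_order(pizza, rules)
```
]]></preformat>
      <p>A graded score, for example one that also accounts for recovery time or for the quality of the recovered solution, could replace the binary test without changing the ordering logic.</p>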
      <p>
        Alternatively, instead of deleting features, maintenance could also replace them with a
smaller substitution or abstraction. Occasionally, this reorganization makes case contents more
accessible to adaptation rules of limited power. For example, consider a maintenance strategy
that extracts a component shared by multiple cases into a separate case leaving behind a marker
in the cases from which the component was removed. This decreases the size of the case base by
avoiding duplication. But it also makes the component available for reasoning as a standalone
case independent of the cases from which it came. Even though case-base compression normally
degrades competence, compression under these circumstances, termed creative destruction, can
improve competence instead [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
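      <p>The extraction of a shared component can be sketched as below. The representation is an assumption made for illustration only: a case is a list of components, a component is a tuple of strings, and the marker left behind is a one-element tuple naming the new standalone case. None of these choices is taken from [5].</p>
      <preformat><![CDATA[
```python
from collections import Counter
from typing import Dict, List, Tuple

CaseBase = Dict[str, List[Tuple[str, ...]]]  # case id to list of components


def stored_elements(case_base: CaseBase) -> int:
    """Size proxy: total number of elements across all stored components."""
    return sum(len(component) for parts in case_base.values() for component in parts)


def extract_shared_components(case_base: CaseBase) -> CaseBase:
    """Move components used by two or more cases into standalone cases."""
    counts = Counter(
        component for parts in case_base.values() for component in set(parts)
    )
    shared = sorted(c for c, n in counts.items() if n >= 2)
    ids = {component: f"shared_{i}" for i, component in enumerate(shared)}
    result: CaseBase = {}
    for cid, parts in case_base.items():
        # Replace each shared component with a one-element marker tuple.
        result[cid] = [(f"ref:{ids[p]}",) if p in ids else p for p in parts]
    for component, new_id in ids.items():
        result[new_id] = [component]  # now available as a case in its own right
    return result


routes = {
    "route_1": [("x",), ("a", "b", "c")],
    "route_2": [("a", "b", "c"), ("y",)],
}
reorganized = extract_shared_components(routes)
size_before = stored_elements(routes)
size_after = stored_elements(reorganized)
```
]]></preformat>
      <p>The shared segment is stored once instead of twice, and it becomes retrievable on its own, which corresponds to the two effects described above.</p>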
    </sec>
    <sec id="sec-4">
      <title>4. Expansion-Contraction Compression</title>
      <p>
        By the representativeness assumption, maintenance strategies predict that future problems will
follow a similar distribution to the current case base [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], and this works reasonably well for
mature case bases in stable domains. But this assumption may apply less accurately during
early case base growth, to dynamically changing domains, or in cross-domain transfer learning.
For example, a recommender system for a travel agency needs to change with the seasons of
the year.
      </p>
      <p>
        In these situations, case-base maintenance strategies optimizing for assumed
representativeness may instead cause overfitting. Overfitting means that a statistical model or a machine
learning algorithm makes predictions based on peculiarities in the training data not reflected in
the testing data thereby improving performance on the training data and sacrificing performance
on the testing data [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. The overfitting problem has received significant attention in the context
of artificial neural networks [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Among several mitigations, neural networks may employ data
augmentation which modifies training data in order to supplement it with additional instances
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. For example, an image can be cropped without obscuring its subject.
      </p>
      <p>
        Case-based reasoning does not normally apply data augmentation, but the solution
transformation container provides a natural source for such adaptations. Expansion-contraction
compression aims to improve competence preservation in case-base compression by combining
exploration and exploitation [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The algorithm generates ghost cases by eagerly adapting
existing cases and adds the ghost cases to the case base. For example, if a real estate appraisal system
has a case for a house without a pool and a rule to adjust the price for putting in a pool, then it
can construct a ghost case for a house with a pool. Then the condensed nearest neighbor
algorithm selects cases for retention based on their competence contribution. Expansion-contraction
compression has the potential to improve competence preservation by adding ghost cases that
cover areas of the problem space not covered by the original case base. These ghost cases can
provide diversity and increase the range of cases available for compression.
      </p>
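      <p>The two phases can be sketched in a few lines. This is a simplified illustration, not the algorithm evaluated in [6]: prices are assumed to be discretized into bands so that a classification-style condensed nearest neighbor procedure applies, distance is assumed to be Euclidean, and the add_pool rule and the house data are hypothetical.</p>
      <preformat><![CDATA[
```python
import math
from typing import List, Tuple

Case = Tuple[Tuple[float, ...], str]  # (problem features, solution label)


def nearest(store: List[Case], problem: Tuple[float, ...]) -> Case:
    """Return the stored case whose problem is closest (Euclidean distance)."""
    return min(store, key=lambda c: math.dist(c[0], problem))


def condensed_nearest_neighbor(cases: List[Case]) -> List[Case]:
    """Hart-style condensation: add a case only if the store mislabels it."""
    store = [cases[0]]
    changed = True
    while changed:
        changed = False
        for case in cases:
            if case not in store and nearest(store, case[0])[1] != case[1]:
                store.append(case)
                changed = True
    return store


def expand_then_contract(cases, rules):
    """Expansion: add ghost cases by eager adaptation. Contraction: condense."""
    ghosts = [rule(case) for case in cases for rule in rules]
    expanded = cases + [g for g in ghosts if g is not None and g not in cases]
    return condensed_nearest_neighbor(expanded)


def add_pool(case: Case):
    """Hypothetical rule: putting in a pool moves a house to the "high" band."""
    (area, has_pool), band = case
    if has_pool == 0.0:
        return ((area, 1.0), "high")
    return None


# Features are (area, has_pool); the original case base has no house with a pool.
houses = [((1.0, 0.0), "low"), ((1.2, 0.0), "low")]
retained = expand_then_contract(houses, [add_pool])
```
]]></preformat>
      <p>In this toy example the retained set contains one original case and one ghost case, so a query for a house with a pool is answered from a region that the original case base did not cover.</p>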
      <p>Experimental results show that expansion-contraction compression can outperform
condensed nearest neighbor in terms of quality and competence preservation for some datasets,
especially when the training case base has gaps making it unrepresentative of the testing
problems. The length of the adaptation path also influences competence retention, with longer
paths associated with greater retention. The sparsity of the initial case base affects the
competence trend, with a steeper decrease in competence for expansion-contraction compression
in the early phases of case base growth. The evaluation suggests that expansion-contraction
compression has potential benefits for compression of unrepresentative case bases.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Predictive Case Discovery</title>
      <p>
        Case-based reasoning depends on three types of regularity: problem-solution regularity,
problem-distribution regularity, and concept regularity. Problem-solution regularity says that
similar problems will have similar solutions, and this is necessary for the adaptation of a solution
to a similar problem to be useful for solving the given problem. Problem-distribution regularity
says that future problems will resemble past problems, and this is necessary for stored cases
to remain useful over time. Concept regularity says that learned concepts remain valid, and
this is necessary for machine learning generally. Leake and Wilson provide a formalization of
problem-solution regularity and problem-distribution regularity [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>The many effective applications of case-based reasoning show that suitable task domains
and thoughtful system design provide practical approximations of these regularities. But they
are not guaranteed! Changes in the environment, user preferences, technology, or the problem
space can degrade regularity and cause practical shortcomings, especially for long-running
systems. Lack of concept regularity, such as a working solution that becomes obsolete, leads
to concept drift. Lack of problem-solution regularity, such as a divergence between similarity
measures and adaptation rules, leads to problem-solution drift. And lack of problem-distribution
regularity, such as heretofore unforeseen problems, leads to problem-distribution drift.</p>
      <p>
        Much machine learning research has explored concept drift [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], but problem-distribution
drift has received limited attention in case-based reasoning. In particular, problem-distribution
drift impacts case-base maintenance because it breaks the representativeness assumption,
which says that the case base is a representative sample of the target problem space [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. How
can problem-distribution drift be mitigated? One answer is to discover cases to add to the case
base. Case discovery can be seen as similar in spirit to oversampling in SMOTE [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and data
augmentation in neural networks [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and adaptation knowledge provides a natural source of
cohesion-preserving transformations (à la ghost cases in expansion-contraction compression
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]).
      </p>
      <p>
        Predictive case discovery attempts to mitigate problem-distribution drift by anticipating and
acquiring cases expected to be useful for solving future problems. Building on prior work in drift
detection [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], it identifies “hot spots” in the problem space to target for case discovery. When
drift is detected (or at each time step, if the objective is to grow the case base), it divides the
case base into clusters using the k-means algorithm [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] and a distance metric. Then predictive
case discovery randomly chooses a cluster and finds the case at the center of that cluster. It
discovers a variation on the centroid case by eagerly applying an adaptation rule or altering the
value of a single feature. For example, when navigating an autonomous vehicle, a variation on
a route could change one of the waypoints.
      </p>
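      <p>The discovery step can be sketched as follows, with drift detection left out. The sketch makes several assumptions that are not drawn from the original work: problems are numeric tuples, the distance metric is Euclidean, k-means is a plain Lloyd iteration written inline, and a variation changes one randomly chosen feature by a fixed hypothetical step.</p>
      <preformat><![CDATA[
```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_means(points: Sequence[Point], k: int, rng: random.Random, iters: int = 20):
    """Plain Lloyd's algorithm; returns the non-empty clusters as lists of points."""
    centers = rng.sample(list(points), k)
    clusters: List[List[Point]] = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as its cluster mean (keep the old center if empty).
        centers = [
            tuple(sum(axis) / len(c) for axis in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return [c for c in clusters if c]


def discover_case(points, k, rng, step=0.5):
    """Pick a random cluster, take its most central case, perturb one feature."""
    cluster = rng.choice(k_means(points, k, rng))
    mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
    central = min(cluster, key=lambda p: math.dist(p, mean))
    index = rng.randrange(len(central))
    varied = list(central)
    varied[index] += rng.choice([-step, step])
    return central, tuple(varied)


rng = random.Random(0)  # seeded only to make the sketch repeatable
problems = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
central, new_case = discover_case(problems, k=2, rng=rng)
```
]]></preformat>
      <p>In a system with adaptation knowledge, the single-feature perturbation would be replaced by eagerly applying an adaptation rule to the central case, so that the discovered case carries an adapted solution as well as a new problem.</p>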
      <p>Evaluation on four scenarios (no drift, abrupt drift, cyclical drift, and drift from obsolescence)
demonstrated that predictive case discovery outperformed baselines. But as the effectiveness of the case discovery
strategy depends on characteristics of the drift itself, there is no universal strategy. An important
next step is to investigate other strategies for drift detection and case discovery and how to
select the ideal strategy for a specific task domain.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The introduction to this research summary explained the swamping utility problem and how it
motivates case-base maintenance. Middle sections briefly presented four case-base maintenance
strategies that go beyond the deletion of cases by deleting features within cases or discovering
new cases: flexible feature deletion, adaptation-guided feature deletion, expansion-contraction
compression, and predictive case discovery. Evaluation of these case-base maintenance
strategies, compared to appropriate baselines on suitable data sets, generally showed improvements
in adaptation cost, competence, or solution quality. Future work will explore how to select the
most appropriate strategy or combination of strategies for a particular scenario, either using
domain knowledge to guide maintenance policy or using cross-validation strategies that do not
make assumptions about the case distribution.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>B.</given-names>
            <surname>Smyth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cunningham</surname>
          </string-name>
          ,
          <article-title>The utility problem analysed: A case-based reasoning perspective</article-title>
          ,
          <source>in: Advances in Case-Based Reasoning</source>
          , Springer,
          <year>1996</year>
          , pp.
          <fpage>392</fpage>
          -
          <lpage>399</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J. M.</given-names>
            <surname>Juarez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Craw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Lopez-Delgado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Campos</surname>
          </string-name>
          ,
          <article-title>Maintenance of case bases: Current algorithms after fifty years</article-title>
          ,
          <source>in: International Joint Conference on Artificial Intelligence</source>
          , volume
          <volume>27</volume>
          ,
          <source>International Joint Conferences on Artificial Intelligence Organization</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>5457</fpage>
          -
          <lpage>5463</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D. W.</given-names>
            <surname>Aha</surname>
          </string-name>
          ,
          <article-title>Generalizing from case studies: A case study</article-title>
          ,
          <source>in: Machine Learning Proceedings, Elsevier</source>
          ,
          <year>1992</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>10</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1016/B978-1-55860-247-2.50006-1</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Leake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schack</surname>
          </string-name>
          ,
          <article-title>Flexible feature deletion: compacting case bases by selectively compressing case contents</article-title>
          ,
          <source>in: Case-Based Reasoning Research and Development</source>
          , Springer,
          <year>2015</year>
          , pp.
          <fpage>212</fpage>
          -
          <lpage>227</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-319-24586-7_15</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>D.</given-names>
            <surname>Leake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schack</surname>
          </string-name>
          ,
          <article-title>Adaptation-guided feature deletion: Testing recoverability to guide case compression</article-title>
          ,
          <source>in: Case-Based Reasoning Research and Development</source>
          , volume
          <volume>9969</volume>
          , Springer,
          <year>2016</year>
          , pp.
          <fpage>234</fpage>
          -
          <lpage>248</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-319-47096-2_16</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.</given-names>
            <surname>Leake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Schack</surname>
          </string-name>
          ,
          <article-title>Exploration vs. exploitation in case-base maintenance: Leveraging competence-based deletion with ghost cases</article-title>
          ,
          <source>in: Case-Based Reasoning Research and Development</source>
          , volume
          <volume>11156</volume>
          , Springer,
          <year>2018</year>
          , pp.
          <fpage>202</fpage>
          -
          <lpage>218</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1007/978-3-030-01081-2_14</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Smyth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>McKenna</surname>
          </string-name>
          ,
          <article-title>Competence models and the maintenance problem</article-title>
          ,
          <source>Computational Intelligence</source>
          <volume>17</volume>
          (
          <year>2001</year>
          )
          <fpage>235</fpage>
          -
          <lpage>249</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>T. G.</given-names>
            <surname>Dietterich</surname>
          </string-name>
          ,
          <article-title>Overfitting and undercomputing in machine learning</article-title>
          ,
          <source>ACM Computing Surveys</source>
          <volume>27</volume>
          (
          <year>1995</year>
          )
          <fpage>326</fpage>
          -
          <lpage>327</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lawrence</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. L.</given-names>
            <surname>Giles</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. C.</given-names>
            <surname>Tsoi</surname>
          </string-name>
          ,
          <article-title>Lessons in neural network training: Overfitting may be harder than expected</article-title>
          ,
          <source>in: Proceedings of the 14th National Conference on Artificial Intelligence</source>
          ,
          <year>1997</year>
          , pp.
          <fpage>540</fpage>
          -
          <lpage>545</lpage>
          . URL: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.38.6468&amp;rep=rep1&amp;type=pdf.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S. C.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stamatescu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. D.</given-names>
            <surname>McDonnell</surname>
          </string-name>
          ,
          <article-title>Understanding data augmentation for classification: when to warp?</article-title>
          ,
          <source>in: 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . URL: https://arxiv.org/pdf/1609.08764.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
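          <string-name>
            <given-names>D. B.</given-names>
            <surname>Leake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Wilson</surname>
          </string-name>
          ,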
          <article-title>When experience is wrong: Examining CBR for changing tasks and environments</article-title>
          , in: ICCBR, volume
          <volume>1650</volume>
          , Springer,
          <year>1999</year>
          , pp.
          <fpage>218</fpage>
          -
          <lpage>232</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Learning under concept drift: A review</article-title>
          ,
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>31</volume>
          (
          <year>2018</year>
          )
          <fpage>2346</fpage>
          -
          <lpage>2363</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Fernández</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Garcia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Herrera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. V.</given-names>
            <surname>Chawla</surname>
          </string-name>
          ,
          <article-title>SMOTE for learning from imbalanced data: progress and challenges, marking the 15-year anniversary</article-title>
          ,
          <source>Journal of Artificial Intelligence Research</source>
          <volume>61</volume>
          (
          <year>2018</year>
          )
          <fpage>863</fpage>
          -
          <lpage>905</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B. K.</given-names>
            <surname>Iwana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Uchida</surname>
          </string-name>
          ,
          <article-title>An empirical survey of data augmentation for time series classification with neural networks</article-title>
          ,
          <source>PLOS ONE</source>
          <volume>16</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Gama</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Medas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Castillo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rodrigues</surname>
          </string-name>
          ,
          <article-title>Learning with drift detection</article-title>
          ,
          <source>in: Advances in Artificial Intelligence-SBIA 2004: 17th Brazilian Symposium on Artificial Intelligence</source>
          , Sao Luis, Maranhao, Brazil, September 29-October 1, 2004. Proceedings 17, Springer,
          <year>2004</year>
          , pp.
          <fpage>286</fpage>
          -
          <lpage>295</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahmed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Seraj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M. S.</given-names>
            <surname>Islam</surname>
          </string-name>
          ,
          <article-title>The k-means algorithm: A comprehensive survey and performance evaluation</article-title>
          ,
          <source>Electronics</source>
          <volume>9</volume>
          (
          <year>2020</year>
          )
          <fpage>1295</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>