<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>CKD-TREE: AN IMPROVED KD-TREE CONSTRUCTION ALGORITHM</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Y Narasimhulu</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ashok Suthar</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Raghunadh Pasunuri</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>V China Venkaiah</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CSE, Malla Reddy Engineering College(Autonomous)</institution>
          ,
          <addr-line>Maisammaguda(H), Gundlapochampally Village, Medchal Mandal, Medchal-Malkajgiri District, Telangana State, 500100</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>SCIS, University of Hyderabad</institution>
          ,
          <addr-line>Prof. CR Rao Road, Gachibowli, Hyderabad, 500046</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <fpage>211</fpage>
      <lpage>218</lpage>
      <abstract>
<p>Data structures such as the VP-Tree, R-Tree and KD-Tree build an index of all the available data in the offline phase and use that indexed tree to answer nearest neighbor queries or to classify an input query. We use a lightweight coreset algorithm to reduce the amount of data used to build the tree index, resulting in a faster index building time. We improve on existing nearest neighbor based classification techniques and compare our classification method against the widely accepted, state-of-the-art data structures VP-Tree, R-Tree and KD-Tree. In terms of speed, the proposed method outperforms the compared data structures as the size of the data increases.</p>
      </abstract>
      <kwd-group>
<kwd>KD Tree</kwd>
        <kwd>Coresets</kwd>
        <kwd>Nearest Neighbor</kwd>
        <kwd>Classification</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
<title>1. Introduction</title>
      <p>The k-Nearest Neighbor (kNN) problem refers to the problem of finding the k points or samples in the data which are closest to a query point. The Nearest Neighbor algorithm finds its use in several machine learning areas, such as classification and regression, and is also the most time-consuming part of these applications. In different use cases, such as recommendation systems, computer vision and robotics, fast response times are critical, and using brute force approaches such as linear search is not feasible. Hence there are several approaches to solve these Nearest Neighbor problems, based on hashing, graphs or space-partitioning trees. Space-partitioning methods are generally more efficient due to having fewer tunable parameters.</p>
      <p>One such algorithm is the KD-Tree. It is a space partitioning algorithm which divides space recursively using a hyper-plane based on a splitting rule. It reduces the search space by almost half at every iteration. Another space partitioning algorithm is the Vantage Point Tree (VP-Tree) [1], which divides the data in a metric space by selecting a position in the space called a vantage point and partitioning the data into two parts: the first part contains the data that are closer to the vantage point, and the other part the data that are not. The division process continues until only small sets remain. Finally a tree is constructed such that neighbors in the tree are also neighbors in the real space. The R-Tree [2] is another data structure that is most commonly used to store spatial objects such as locations of gas stations, restaurants, outlines of agricultural lands, etc.</p>
      <p>In this paper we consider kNN for classification, where the nearest neighbors of a query point in the dataset are used to classify the query point. Nearest neighbor is in essence a lazy learning algorithm, i.e. it memorizes the whole training dataset to provide the nearest neighbors of an incoming query point. Consequently, though such algorithms provide very efficient solutions to the nearest neighbor problem, they might run into problems, because the data to be processed becomes too large given the high magnitudes of data available today. In critical systems where time is of the essence, losing even a few seconds while processing all that data might cause issues. The author in [3] uses an SVM to tackle a similar problem by reducing the size of the data on which the Nearest Neighbor algorithm runs. We use coresets for a similar effect, but on very large datasets.</p>
<p>The concept of coresets follows a data summarization approach. Coresets are small subsets of the original data. They are used to scale clustering problems in massive data sets. Models trained on coresets provide competitive results against a model trained on the full original dataset. Hence they can be very useful in speeding up said models while still keeping theoretical guarantees up to a level. Coresets are often used in clustering algorithms to improve their speed even further. To achieve this, one first constructs a coreset, usually in linear time, and then uses an algorithm that works on the coreset to solve the clustering problem. As the coreset size is very small compared to the actual data size, this can provide a significant speed-up in the said algorithms.</p>
<p>We use a state-of-the-art lightweight coreset construction algorithm to reduce the time taken to solve the Nearest Neighbor problem with the KD-Tree space partitioning algorithm. We then use the result of the Nearest Neighbor query to classify the input query point based on the nearest neighbor points found.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
<p>As mentioned earlier, there are a number of approaches to solve a nearest neighbor problem, based on hashing, graphs, space-partitioning, etc. We focus mainly on the KD-Tree, a widely accepted, widely used and fast space partitioning technique for Nearest Neighbor or Classification problems, rather than on the VP-Tree and R-Tree.</p>
      <sec id="sec-2-1">
        <title>2.1. KD-Tree (K Dimensional-tree)</title>
        <p>The KD-Tree is a space-partitioning algorithm. Its main workflow can be divided into two parts, called the offline phase and the online phase. It builds the index (tree) in the first phase, called the offline phase, so that it can answer queries later on in the other phase, called the online phase. During the offline phase it subdivides space into rectangular cells through the recursive use of some splitting rule. In the standard split, the splitting dimension is chosen based on the maximum spread along a dimension in the data, whereas in the midpoint split the splitting hyper-plane bisects the longest side of the cell and passes through the center of the cell. The most widely accepted KD-Tree implementation in use today is based on the paper by Songrit Maneewongvatana and David M. Mount [4].</p>
        <p>They consider a variant of the midpoint splitting rule, called sliding-midpoint, to overcome the problem of having a data set in which points might be clustered together. An example of the sliding-midpoint rule is presented in Figure 1, and Figure 2 depicts examples of other splitting rules on a common data set.</p>
        <p>In this approach the data is first split at the midpoint, by considering a hyper-plane which passes through the center of the cell, dividing the cell's longest side into two parts. Then they check if data points exist on both sides of the splitting plane. If so, the splitting plane is not moved. However, if one side of the splitting plane is empty, i.e. without any point, then they slide the splitting plane towards the data points until it encounters at least one of the points. That point then becomes a child or leaf cell containing a single point, while the algorithm continues to work recursively on the remaining points.</p>
        <p>Once the tree is built completely and a query comes in, we trace the query point down along the tree, comparing its dimension value with the cut dimension value at each level of the tree. Continuing this process leads to a leaf node. The leaf node which the query point reaches in the end is called its nearest neighbor. A query point can have more than one nearest neighbor.</p>
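        <p>To make the sliding-midpoint rule above concrete, the following is a minimal NumPy sketch of a single split, reconstructed from the description given here rather than taken from [4]; the function name, return convention, and the handling of ties are our own, and degenerate cells (all points identical) are not handled.</p>
        <preformat>
import numpy as np

def sliding_midpoint_split(points):
    """One split of the sliding-midpoint rule: bisect the cell's longest
    side, then slide the plane onto the data if one side would be empty."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    axis = int(np.argmax(hi - lo))          # cut the longest side of the cell
    split = 0.5 * (lo[axis] + hi[axis])     # midpoint hyper-plane
    coords = points[:, axis]
    left = coords &lt;= split
    if left.all():                          # right side empty: slide up to the data
        split = coords.max()
        left = coords &lt; split             # the max point becomes the other child
    elif not left.any():                    # left side empty: slide down to the data
        split = coords.min()
        left = coords &lt;= split            # the min point becomes a leaf child
    return axis, split, left
        </preformat>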
      </sec>
      <sec id="sec-2-2">
        <title>2.2. kNN Classification Based on KD-Tree</title>
        <p>The authors of [5] talk about how the KD-tree is a special storage structure for data and how it represents the training data efficiently. They propose a kNN-KD-tree classification algorithm utilizing the advantages offered by the kNN and KD-tree algorithms. They apply the proposed method on eleven datasets and show that their kNN-KD-tree algorithm reduces the time complexity while significantly improving the search performance.</p>
        <p>Another nearest neighbor search algorithm, which uses random projection and voting, is discussed in [6].</p>
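        <p>As a minimal illustration of this kNN-on-KD-tree scheme (not the exact code of [5]), the sketch below builds a SciPy KD-Tree index [9] in the offline phase and classifies a query by majority vote in the online phase; the function name and parameters are our own.</p>
        <preformat>
import numpy as np
from collections import Counter
from scipy.spatial import KDTree   # SciPy's KD-Tree implementation [9]

def knn_classify(train_X, train_y, query_point, k=10):
    """Offline phase: build the index; online phase: query the k nearest
    neighbors and return the majority class among them."""
    index = KDTree(train_X)                 # offline: build the tree index once
    _, idx = index.query(query_point, k=k)  # online: indices of k nearest neighbors
    labels = np.asarray(train_y)[np.atleast_1d(idx)]
    return Counter(labels).most_common(1)[0][0]  # majority vote
        </preformat>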
      <sec id="sec-2-1">
        <title>2.3. SVM Based reduced NN</title>
      </sec>
      <sec id="sec-2-2">
        <title>Classification</title>
        <p>that the proposed algorithm outperformed other
existing practices.</p>
        <p>The authors in [3] propose a novel SVM based instance
selection method. Here Support Vector Machine (SVM) 3. Construction of Coreset
tion accuracy with a small number of training instances, where it traces down to one of the leaf nodes in the tree
is used to form an instance space for instances of a
particular pattern classification problem. A
wrapperbased classification performance validation technique
is used to find the best hyperparameters which
identify the support vector. Then they identify and select
informative support vectors (instances) lying on the
margin hyper-plane in the instance space. Thus
deriving a reduced training set for reduced nearest neighbor
classification. This reduced training set is then used to
classify new instances.</p>
        <sec id="sec-2-2-1">
          <title>They demonstrate the performance of the proposed</title>
          <p>instance selection method on some datasets. Although
their method maintained or even increased
classificaall the datasets chosen are very small in nature. The
highest number of instances in a dataset being 699 in</p>
        </sec>
        <sec id="sec-2-2-2">
          <title>Breastw dataset. We focus more on datasets with very large number of instances.</title>
        </sec>
      </sec>
      <sec id="sec-2-3">
        <title>2.4. Lightweight Coresets</title>
        <sec id="sec-2-3-1">
          <title>Coresets are compact representations of original data</title>
          <p>sets. They provide provably competitive results with
models trained on the full data set. As such, coresets
are successfully used to scale up diferent clustering
models to very large data sets.</p>
          <p>There are several existing Coreset building
methodologies[7]. O. Bachem, and M. Lucic[8]
proposed a notion of lightweight coresets that allowed for
both multiplicative and additive errors. They provided
an algorithm to construct lightweight coresets for
kmeans clustering and soft and hard Bregman
clustering. Their algorithm is substantially faster than the
allel in the sense that the data can be divided between
several threads for parallel computation. Also as the
name suggests, it results in smaller coresets.</p>
        </sec>
        <sec id="sec-2-3-2">
          <title>Lightweight coreset algorithm uses variance as a</title>
          <p>means to create a probability distribution for the points
in data set. Points farther from the mean have higher
probability when compared to points which are closer
to the mean. A small subset of points can then be
sampled based on the derived probability distribution.</p>
        </sec>
        <sec id="sec-2-3-3">
          <title>As points are sampled based on the variance in data,</title>
          <p>it helps keep classification or clustering properties of
datasets. They have showed that their approach
naturally generalizes to statistical  -means clustering. They
also performed extensive experiments, demonstrating
other existing coreset construction algorithms. It’s par- from the actual dataset. This algorithm takes as
in</p>
        </sec>
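        <p>In symbols, the distribution sketched above, as given in [8] and reconstructed here from the prose, assigns each point x of the dataset X the sampling probability</p>
        <disp-formula>
          <tex-math>q(x) = \frac{1}{2|X|} + \frac{1}{2} \cdot \frac{d(x, \mu)^2}{\sum_{x' \in X} d(x', \mu)^2}</tex-math>
        </disp-formula>
        <p>where μ is the mean of X and d is the Euclidean distance: the uniform term keeps every point reachable, while the distance term favors points far from the mean.</p>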
      </sec>
    </sec>
    <sec id="sec-3">
<title>3. Construction of Coreset KD-Tree (CKD-Tree)</title>
      <sec id="sec-3-1">
        <title>Though KD-Tree for classification is a pretty fast algo</title>
        <p>rithm in itself, it may not be so for very large datasets.</p>
      </sec>
      <sec id="sec-3-2">
        <title>To improve on the already fast KD-Tree classification</title>
        <p>algorithm, and to create an even faster version of
KD</p>
      </sec>
      <sec id="sec-3-3">
        <title>Tree we use similar approach as in the case of cluster</title>
        <p>ing algorithms, i.e. make use of Coresets. We first use
a Coreset algorithm to create a representative set of
points from the original data set. This representative
set is then fed to the KD-Tree algorithm to build a tree
index (ofline phase) based on the representative set.</p>
      </sec>
      <sec id="sec-3-4">
        <title>When a query point arrives, we feed it into the tree, index. At this point any suitable search method can be used to find nearest neighbors to the query point in the leaf node.</title>
      </sec>
      <sec id="sec-3-5">
        <title>Algorithm 1 Lightweight Coreset Construction</title>
        <p>Require: Set of data points X, coreset size m
1:  ←− mean of X
2: for  ∈</p>
        <p>do
3:
5: 
4: end for
 ( ) ←− 12 | 1 | + 21 ∑ ′∈  (, )2</p>
        <p>(, )2
each point  has weight
probability  ( )
1
. ( )</p>
      </sec>
      <sec id="sec-3-6">
        <title>6: return lightweight coreset C</title>
        <p>←− sample m weighted points from</p>
        <p>where
and is sampled with</p>
        <p>We use Algorithm 1, Lightweight Coreset
Construction [8] (LWCS) to create the set of representative points
put a dataset</p>
        <p>and the coreset size  , i.e. the
number of representative points in the coreset. It creates
a probabilty distribution based on a point’s distance
from the mean, w.r.t. the total of all such distances.</p>
        <p>Distance metric used here is euclidian distance. Once
every point has a probability assigned to it, we sample

points with weight
.
1
( )</p>
        <p>and probabiltiy  ( ).</p>
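        <p>The following is a minimal NumPy sketch of Algorithm 1, assuming the dataset fits in memory as an (n × d) array; the function name, the random generator argument and the returned weight array are our own conventions.</p>
        <preformat>
import numpy as np

def lightweight_coreset(X, m, rng=None):
    """Lightweight Coreset Construction (Algorithm 1, after [8])."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)
    mu = X.mean(axis=0)                               # line 1: mean of X
    sq = ((X - mu) ** 2).sum(axis=1)                  # squared Euclidean d(x, mu)^2
    q = 0.5 / n + 0.5 * sq / sq.sum()                 # lines 2-4: probability q(x)
    picks = rng.choice(n, size=m, replace=True, p=q)  # line 5: sample with prob. q(x)
    weights = 1.0 / (m * q[picks])                    # weight 1/(m * q(x)) per sample
    return X[picks], weights                          # line 6: lightweight coreset C
        </preformat>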
      </sec>
      <sec id="sec-3-6">
        <title>Algorithm 2 CKD-Tree Algorithm For Classification</title>
        <p>Algorithm 2, the CKD-Tree Algorithm for Classification, uses Algorithm 1, Lightweight Coreset Construction (LWCS), to process the original large dataset X and get a compact version of it. This coreset is then used to build the index at line 2 of the algorithm. To build the tree index we use the sliding-midpoint [4] technique. The index can then be queried with a query point; a query requires you to specify k, i.e. the number of nearest neighbors required, along with the point to query with, i.e. queryPoint. The flow diagram of the entire work is presented in Figure 3.</p>
        <p>In our specific use case, we use the nearest neighbors to classify the query point into a class. This can be done easily based on the majority class among the nearest neighbors returned.</p>
        <p>Require: large dataset X, coreset size m</p>
        <p>1: repData ← LightweightCoreset(X, m)</p>
        <p>2: index ← BuildKDTree(repData)</p>
        <p>3: dist, idx ← index.query(queryPoint, k)</p>
        <p>4: for i ∈ idx do</p>
        <p>5: add the point at index i of repData to the Nearest Neighbor Points</p>
        <p>6: end for</p>
        <p>7: queryPointClass ← Majority class of the Nearest Neighbor Points</p>
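        <p>Putting Algorithm 1 and Algorithm 2 together, the sketch below (function and variable names are ours) classifies a query point by building a SciPy KD-Tree [9] on a lightweight coreset; SciPy's tree stands in here for the sliding-midpoint builder of [4], and the labels are carried alongside the sampled points so that the majority vote of line 7 can be taken.</p>
        <preformat>
import numpy as np
from collections import Counter
from scipy.spatial import KDTree

def ckd_tree_classify(X, y, query_point, m=2000, k=10, rng=None):
    """CKD-Tree classification (Algorithm 2): coreset, index, query, vote."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(X)
    mu = X.mean(axis=0)
    sq = ((X - mu) ** 2).sum(axis=1)
    q = 0.5 / n + 0.5 * sq / sq.sum()
    picks = rng.choice(n, size=m, replace=True, p=q)  # line 1: repData via LWCS(X, m)
    rep_X, rep_y = X[picks], np.asarray(y)[picks]     # keep the coreset's labels
    index = KDTree(rep_X)                             # line 2: offline phase, build index
    _, idx = index.query(query_point, k=k)            # line 3: online k-NN query
    neighbors = rep_y[np.atleast_1d(idx)]             # lines 4-6: nearest neighbor points
    return Counter(neighbors).most_common(1)[0][0]    # line 7: majority class
        </preformat>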
      </sec>
    </sec>
    <sec id="sec-3a">
      <title>4. Experiments and Results</title>
      <p>We implement the CKD-Tree using the above methodology and compare it against the KD-Tree [4][9] to see the performance difference it can provide, and at what cost. All of the datasets [<xref ref-type="bibr" rid="ref10">10</xref>] in Table 1 have two target classes. While the datasets bio_train and MiniBooNE Particle are both very large, HTRU2 and spambase are relatively very small. This helps in showing the relative performance of the CKD-Tree algorithm on different types of datasets. The dataset default of credit card clients is a more balanced dataset in terms of sample size and dimensionality.</p>
      <p>We kept 1000 samples from each dataset as a test set for testing the models. These samples are used to check the accuracy of the predictions made by the algorithms. While testing the VP-Tree and R-Tree, we considered test sample sizes of 10, 50, 100, 200 and 500, and later calculated the average times over them. While building the tree index for the KD-Tree, the leaf size, i.e. the number of points in each leaf node, was kept the same throughout; the index is then queried with k, i.e. the required number of nearest neighbors. We measure the performance based on three factors: the accuracy of the results, the average time (in seconds) taken in building the tree index, and the average time (in seconds) taken in answering the query. Each of these factors is compared and tabulated separately for all the data structures that were used and also for the proposed work.</p>
      <p>Table 2 shows the results of CKD-Tree for k = 10. We use 3 different coreset sizes, m = 1000, m = 2000 and m = 5000, and find the average of all of them (Avg. 1). For the spambase dataset the coreset of size m = 5000 is not generated, as the data size itself is only 4601. Consider the bio_train dataset: in the table, the average value of 'Indexing time' is calculated by adding the Indexing times of m = 1000, m = 2000 and m = 5000 and finally dividing by 3. The same is applied for the Querying time and Accuracy of Avg. 1.</p>
      <p>Table 3 shows the results of CKD-Tree for k = 50. We use the same 3 coreset sizes m = 1000, m = 2000 and m = 5000 and find the average of all of them (Avg. 2), computed in the same way as Avg. 1.</p>
      <p>Table 4 presents the average of Avg. 1 and Avg. 2. This average is called the 'Overall Avg'. The Indexing time of the 'Overall Avg' is obtained by averaging the 'Indexing time' of Avg. 1 and Avg. 2. The same is applied for the 'Querying time' and 'Accuracy'.</p>
      <p>Table 5, given below, is the final comparison table. It presents the comparison among the R-Tree, VP-Tree, KD-Tree and the proposed work. It is observed from Table 5 that the proposed work outperforms all the data structures in Querying time. Considering the Indexing time, the proposed work also performed better than the VP-Tree and R-Tree. The accuracy of the proposed work is approximately close to that of the other data structures.</p>
      <p>Figures 4 and 5, given below, show the comparison of Indexing time and Querying time respectively among the R-Tree, VP-Tree, KD-Tree and the proposed work.</p>
      <p>Among the data structures used for comparison, the KD-Tree is considered to be the best, so we concentrated mostly on the KD-Tree. In Table 6 we present the indexing time comparison of the KD-Tree and the proposed work. As the size of the data increases, the performance of the proposed work improves, and at a point it even starts performing better than the KD-Tree. So, for large datasets the proposed work is preferable.</p>
      <p>Table 5: Comparison among R-Tree, VP-Tree, KD-Tree and Proposed Work, reporting Indexing time, Querying time and Accuracy for the spambase, bio_train, HTRU2, default of credit card clients and MiniBooNE datasets.</p>
    </sec>
    <sec id="sec-4">
      <title>5. Conclusion</title>
      <p>The above tables show that, for at least one of the tested settings, each dataset showed competitive or in some cases better accuracy (default of credit card clients and HTRU2) when used with coresets. In the case of larger datasets such as bio_train and MiniBooNE, the coreset size is very small compared to the original dataset size, but the coresets still manage to provide almost the same accuracy as the original dataset. Also, a KD-Tree built on the coresets of these datasets sees a significant speed boost in the offline (indexing) and online (query) phases.</p>
      <sec id="sec-4-1">
        <title>We can also notice that as the dataset size starts to</title>
        <p>decrease, the gap in indexing and query speeds starts
to become smaller and smaller. For smaller datasets
(spambase and HTRU2), this might even lead to higher
query or indexing times.</p>
      </sec>
      <sec id="sec-4-2">
        <title>This shows that CKD-Tree is a very eficient algorithm for nearest neighbor based classification in large datasets.</title>
        <p>Accuracy
R-Tree
0.025816591</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>6. Future Work</title>
<p>The CKD-Tree gives good results on the datasets experimented with. The probability distribution used here is based on the variance of the data. Consequently, this approach might not perform well on noisy or locality sensitive data. In such cases, a better probability distribution approach for data summarization could be based on the locality of the data. Locality Sensitive Hashing (LSH) functions could provide a more accurate form of data summarization. For image data, a vector quantization based approach might help in data summarization.</p>
      <sec id="sec-5-1">
        <title>In the online phase, implementation of a more ro</title>
        <p>bust and faster search technique could also be fruitful.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Randomizing the algorithm or having multiple tree indexes to increase the accuracy is another option.</title>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>International Conference on Big Data</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <source>doi:10</source>
          .1109/BigData.
          <year>2016</year>
          .
          <volume>7840682</volume>
          . [1]
          <string-name>
            <surname>Yianilos</surname>
          </string-name>
          ,
          <article-title>Data structures</article-title>
          and algorithms for [7]
          <string-name>
            <given-names>O. B.</given-names>
            <surname>Mario Lucic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Krause</surname>
          </string-name>
          , Strong coresets for
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>crete algorithms</surname>
          </string-name>
          (
          <year>1993</year>
          ). doi:
          <volume>10</volume>
          .1145/313559. (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          313789. [8]
          <string-name>
            <surname>A. K. O. Bachem</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Lucic</surname>
          </string-name>
          ,
          <string-name>
            <surname>Scalable</surname>
            k-means clus[2]
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Guttman</surname>
          </string-name>
          , R-trees:
          <article-title>A dynamic index structure tering via lightweight coresets</article-title>
          ., ACM SIGKDD
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <article-title>for spatial searching</article-title>
          ,
          <source>Proceedings of the 1984 International Conference on Knowledge Discov-</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <source>ACM SIGMOD international conference on Man- ery and Data Mining (KDD)</source>
          (
          <year>2018</year>
          ). doi:10.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <source>agement of data - SIGMOD '84</source>
          (
          <year>1984</year>
          ). doi:
          <volume>10</volume>
          . 1145/3219819.3219973.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <volume>1145</volume>
          /602264.602266. [9] scipy.org, Spatial kdtree class,
          <year>2020</year>
          . URL: [3]
          <string-name>
            <surname>C.-C. Huang</surname>
            , H.-
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>Chang</surname>
          </string-name>
          ,
          <article-title>A novel svm-based https://docs</article-title>
          .scipy.org/doc/scipy-0.
          <fpage>14</fpage>
          .0/
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <article-title>reduced nn classification method</article-title>
          ., 11th Interna- reference/generated/scipy.spatial.KDTree.html.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <source>tional Conference on Computational Intelligence</source>
          [10] c. ics.uci.edu, Datasets used,
          <year>2007</year>
          . URL:
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          and
          <string-name>
            <surname>Security</surname>
          </string-name>
          (
          <year>2015</year>
          ). doi:
          <volume>10</volume>
          .1109/CIS.
          <year>2015</year>
          . https://archive.ics.uci.edu/ml/datasets.php,
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          23. osmot.cs.cornell.edu./kddcup/datasets.html. [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Maneewongvatana</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Mount</surname>
          </string-name>
          , It's okay
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <article-title>to be skinny, if your friends are fat</article-title>
          .,
          <source>4th An-</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>try</surname>
          </string-name>
          (
          <year>1999</year>
          ).
          <source>doi:10.1.1.39</source>
          .8380. [5]
          <string-name>
            <given-names>C. X. H. Z.</given-names>
            <surname>Wenfeng</surname>
          </string-name>
          <string-name>
            <surname>Hou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Daiwei</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <surname>tization (</surname>
            <given-names>IICSPI</given-names>
          </string-name>
          ) (
          <year>2018</year>
          ). doi:
          <volume>10</volume>
          .1109/IICSPI.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <year>2018</year>
          .
          <volume>8690508</volume>
          . [6]
          <string-name>
            <given-names>S. T. E. J. R. T. L. W. J. C. V.</given-names>
            <surname>Hyvönen</surname>
          </string-name>
          , T. Pitkänen,
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>