<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
<journal-title>IUI Workshops'19, March 20, 2019, Los Angeles, USA</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Explainable Structuring and Discovery of Relevant Cases for Exploration of High-Dimensional Data</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Joris Falip</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frédéric Blanchard</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michel Herbin</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>CReSTIC, Reims</institution>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>20</volume>
      <issue>2019</issue>
      <abstract>
        <p>Data described by numerous features create a challenge for domain experts as it is difficult to manipulate, explore and visualize them. With the increased number of features, a phenomenon called "curse of dimensionality" arises: sparsity increases and distance metrics are less relevant as most elements of the dataset become equidistant. The result is a loss of efficiency for traditional machine learning algorithms. Moreover, many state-of-the-art approaches act as black-boxes from a user point of view and are unable to provide explanations for their results. We propose an instance-based method to structure datasets around important elements called exemplars. The similarity measure used by our approach is less sensitive to high-dimensional spaces, and provides both explainable and interpretable results: important properties for decision-making tools such as recommender systems. The described algorithm relies on exemplar theory to provide a data exploration tool suited to the reasoning used by experts of various fields. We apply our method to synthetic as well as real-world datasets and compare the results to recommendations made using a nearest neighbor approach.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>CCS CONCEPTS</title>
      <p>• Information systems → Decision support systems; •
Human-centered computing → Interactive systems and tools; •
Computing methodologies → Knowledge representation and reasoning.</p>
    </sec>
    <sec id="sec-2">
      <title>INTRODUCTION</title>
      <p>This work addresses the task of structuring a dataset to facilitate
exploration and visualization. The proposed idea is to create pairwise
connections between similar elements in order to provide a
similarity graph usable for exploration and recommendation. Our goal is a
solution well suited to domain experts, one that takes into account and
even leverages the analogical reasoning most experts use: they
often compare new cases or problems to previously encountered ones
and base their decisions on past experiences. This way, the tool and the
answers it provides feel more intuitive to end-users. We also
emphasize compatibility with high-dimensional data: experts often
rely on datasets described by hundreds or thousands of features,
meaning that any solution must remain efficient even with a high
number of dimensions. Finally, explainability and interpretability are
critical in many fields including health and medicine, insurance,
finance or security: a recommendation system is even more useful
when end-users understand the process behind suggested decisions.
It allows them to adapt and adjust the algorithm to their needs but,
most of all, it increases confidence in the results.</p>
      <p>IUI Workshops'19, March 20, 2019, Los Angeles, USA.
© 2019 Copyright for the individual papers by the papers' authors. Copying
permitted for private and academic purposes. This volume is published and copyrighted
by its editors.</p>
      <p>To reach these objectives, we created an unsupervised
instance-based algorithm that enriches data and structures it, linking each
element to an exemplar exhibiting typical characteristics. The
resulting graph can be used by a recommender system, building a
solution to explore data by following edges, each one representing
strong similarities on specific features. When exploring the dataset,
going from an instance to its exemplar means going from a
specific element to a more generic and archetypal one. Our approach
processes each dimension individually then aggregates the results,
making this method suitable for very high-dimensional analysis.
Moreover, it identifies the features contributing to similarities between
elements, making it straightforward to explain the structuring
choices. This strategy can be adapted to multiple contexts as it does
not require any a priori knowledge.</p>
      <p>In the following section, we detail the problems encountered when
manipulating high-dimensional datasets and elaborate on the
requirements of a successful exploration tool that would allow
visualization of similarities between data points and could give insights on
hidden patterns. The next section describes the proposed algorithm,
the resulting structure and how the input parameter influences
it. We then detail applications to various datasets, illustrating the
benefits of this method. Finally, we discuss the proposed approach,
review current work in progress and possibilities for future work.
</p>
    </sec>
    <sec id="sec-3">
      <title>BACKGROUND</title>
      <p>
        While high-dimensional databases are frequently encountered with
many real-world data, experts lack appropriate and simple tools to
make the best use of these datasets and information. Many
knowledge discovery tools provide relevant results but lack
interpretability or explainability: essential features when making any critical
decision [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Also, when exploring and visualizing data, most algorithms rely
on prototypes [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] generated with averages from similar elements.
These prototype-based methods can be effective in some cases but,
when manipulating data, experts of a specific field tend to
understand a new element (or instance) by comparing it to the most
similar instance they are familiar with [
        <xref ref-type="bibr" rid="ref16 ref8">8, 16</xref>
        ]. Experts naturally
favor this approach when confronted with problems within their
field of expertise, and it provides better results [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. This reasoning
is known as analogical reasoning and is formalized by the
exemplar theory [
        <xref ref-type="bibr" rid="ref11 ref12">11, 12</xref>
        ]. For example, a physician will use his work
experience to diagnose a new patient by linking it to the most
similar element in a set of typical patients he encountered in the past.
In analogical reasoning, typical instances of a dataset are called
exemplars, each one subsuming a set of similar elements. These
exemplars form a subset that can describe the whole dataset but
remains small enough to manipulate and explore easily.
When analyzing data described by a high number of features, it
becomes harder to use the distance between instances as a measure
of similarity. This is because working with data described by a
high number of features comes with two inherent problems: data
become sparser as density decreases, and distances between
elements concentrate, so neighbors are harder to distinguish [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. These
phenomena, often referred to as "curse of dimensionality" [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], make
many analysis and recommendation algorithms unreliable [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]
because they use distances between instances to compute similarities.
A standard solution to this problem is to use feature projection
methods to reduce dimensionality, but this is not suitable if we
wish to retain the interpretability of the process. On the other hand,
feature selection reduces the genericity of the approach and requires
time and a priori knowledge to select and filter attributes.
High dimensionality also gives elements new properties, like
the hubness phenomenon [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], which can help to improve and
develop data mining techniques. Hubness is the fact that increased
dimensionality correlates with some instances appearing among the
nearest neighbors of the majority of the other elements, turning these
instances into so-called hubs. Hubness reduction methods, when
coupled with traditional approaches, do not improve their results
[
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Fortunately, this phenomenon can be used to adapt algorithms
to high-dimensional data [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
    </sec>
    <sec id="sec-4">
      <title>STRUCTURING AROUND EXEMPLARS</title>
      <p>
        Our proposed approach finds the most typical and "interesting"
neighbor of each data point, without relying on a distance metric
in the entire space [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. We process each dimension separately
before aggregating the results: our algorithm is less affected by high
dimensionality, and interpretability is improved as we can quantify
the importance of each dimension regarding associations. We also
chose to use ranks instead of distances, to avoid excluding outlier
elements.
      </p>
      <p>Structuring a dataset by linking each element to its exemplar results
in a graph where each connected component is a group of elements
similar enough to be subsumed by a few selected exemplars. This
graph allows for exploration of the dataset, with edges representing
strong similarity between individuals.</p>
      <p>
        From a practical point of view, the proposed method is currently
implemented as a recommender system used by medical experts.
This allows them to visualize and understand diabetes-related data
by exploring an association graph [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] where each vertex is a
patient and patients in the same connected component had a similar
evolution of the disease.
      </p>
      <p>
        Our algorithm is based on the Degree of Representativeness [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] (DoR).
The DoR measures the typicality of an element and its ability to be
an exemplar for other elements. Algorithm 1 illustrates the
selection of an exemplar for each element according to the DoR, while
Algorithm 2 details the computation of the DoR. This new algorithm
addresses the problem of high dimensionality: our previous work
used distances in the full description space when computing the
DoR, making it unreliable on datasets described by a high number
of features.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Algorithm</title>
      <p>Let Ω be a set of N objects in a multidimensional space. These N
objects, or elements, are characterized by D qualitative or
quantitative features.</p>
      <p>For each of the D dimensions, we compute the distance matrix.
Each object then ranks every other object according to their
similarity on this dimension: low distance translates to a good ranking,
with the nearest element being ranked 1 and the farthest ranked
N − 1. This step is repeated on each dimension to transform the D
distance matrices into D rank matrices.</p>
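      <p>As a minimal sketch of this per-dimension ranking step (our own helper, assuming numeric features and absolute difference as the one-dimensional distance):</p>
      <preformat>
```python
import numpy as np

def rank_matrix(values):
    """Rank matrix for one dimension: ranks[x, y] is the rank of y
    relative to x (nearest element ranked 1, farthest N - 1, self 0)."""
    dist = np.abs(values[:, None] - values[None, :])  # distance matrix on this dimension
    order = np.argsort(dist, axis=1)                  # neighbors of each x, nearest first
    n = len(values)
    ranks = np.empty((n, n), dtype=int)
    for x in range(n):
        ranks[x, order[x]] = np.arange(n)
    return ranks
```
      </preformat>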
      <p>Let us transform the ranks into scores. Let x be an object of Ω: for
each dimension d, x assigns a relative score Score_x^d to every other
object y of Ω. Score_x^d, relative to x, can be any arbitrary function,
but in this paper it will be defined by:</p>
      <p>Score_x^d(y) = max(1 + T − Rank_x^d(y), 0)    (1)
where Rank_x^d(y) is the rank of object y relative to x on
dimension d, so each element assigns a non-zero score only to its T
nearest neighbors. For each element y, we can compute the sum of
all scores it received on a specific dimension:</p>
      <p>Score^d(y) = Σ_{x ∈ Ω} Score_x^d(y)    (2)</p>
      <p>Let us define k as a numeric parameter allowing control over
the total number of exemplars. To find the exemplar of a given
object x, we introduce the DoR_x(y) of another element y as the
sum of the scores Score^d(y) over every dimension d where y is in
the k nearest neighbors of x:</p>
      <p>DoR_x(y) = Σ_{d ∈ D′} Score^d(y)    (3)</p>
      <p>where D′ is the set of dimensions on which y is in the k-nearest
neighbors of x. The element chosen as an exemplar for object x is
the element with the highest DoR_x.</p>
      <p>[Figures 1 and 2: scatter plots of the Iris dataset (Sepal Length vs.
Sepal Width; species setosa, versicolor, virginica) showing the
structure obtained for two neighborhood sizes.]</p>
      <sec id="sec-5-1">
        <title>Algorithm 1: Structuring algorithm</title>
        <p>Data: N elements defined on D dimensions, neighborhood size K
Result: list of N exemplars
foreach dimension d in D do
    Compute the dissimilarity matrix of the elements;
    Transform the dissimilarities into ranks;
    Convert the ranks into scores with the arbitrary scoring function;
    foreach element e in N do
        Score^d(e) ←− Σ_{x ∈ N} Score_x^d(e);
    end
end
foreach element e in N do
    exemplar(e) ←− argmax_{x ∈ N} DoR(x, e, K);
end
Result ←− list of exemplars determined above;</p>
      </sec>
      <sec id="sec-5-2">
        <title>Algorithm 2: DoR computing algorithm</title>
        <p>Data: elements x and e defined on D dimensions, neighborhood size K
Result: Degree of Representativeness of element x according to element e
DoR ←− 0;
foreach dimension d in D do
    Knn_d(e) ←− K-nearest neighbors of e on dimension d;
    if x ∈ Knn_d(e) then
        DoR ←− DoR + Score^d(x);
    end
end
Result ←− DoR;</p>
      </sec>
    </sec>
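      <p>As a concrete illustration, Algorithms 1 and 2 can be sketched as follows. This is our own minimal NumPy version, not the released implementation: it assumes numeric features, absolute difference as the per-dimension distance, and the scoring function of Formula (1); the name structure_dataset is ours.</p>
      <preformat>
```python
import numpy as np

def structure_dataset(X, T=10, k=None):
    """Link each element of X to its exemplar (sketch of Algorithms 1 and 2).

    X: (N, D) array of N elements described by D numeric features.
    T: threshold of Formula (1); only the T nearest neighbors on a
       dimension receive a non-zero score.
    k: neighborhood size of Formula (3) (defaults to 30% of N).
    Returns exemplar, where exemplar[i] is the index chosen for element i.
    """
    N, D = X.shape
    k = k if k is not None else max(1, int(0.3 * N))
    scores = np.zeros((D, N))                 # Score^d(y), Formula (2)
    in_knn = np.zeros((D, N, N), dtype=bool)  # in_knn[d, x, y]: y among the k nearest of x on d

    for d in range(D):
        dist = np.abs(X[:, d][:, None] - X[:, d][None, :])  # per-dimension distances
        order = np.argsort(dist, axis=1)
        ranks = np.empty((N, N), dtype=int)
        for x in range(N):
            ranks[x, order[x]] = np.arange(N)  # rank 1 = nearest, N - 1 = farthest (0 = self)
        score_xd = np.maximum(1 + T - ranks, 0)  # Formula (1)
        np.fill_diagonal(score_xd, 0)            # an element gives itself no score
        scores[d] = score_xd.sum(axis=0)         # Formula (2): sum over all scorers x
        in_knn[d] = ranks <= k

    exemplar = np.zeros(N, dtype=int)
    for x in range(N):
        # Formula (3): sum Score^d(y) over the dimensions where y is a k-NN of x
        dor = np.where(in_knn[:, x, :], scores, 0).sum(axis=0)
        dor[x] = -1                              # x cannot be its own exemplar
        exemplar[x] = int(np.argmax(dor))
    return exemplar
```
      </preformat>
      <p>Following each exemplar link then yields the similarity graph discussed below.</p>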
    <sec id="sec-6">
      <title>Neighborhood size k and the resulting structure</title>
      <p>When establishing pairwise connections between elements,
parameter k determines the total number of exemplars used to summarize
the dataset. A low value for the neighborhood size (k about 20%
of the number of elements) gives numerous exemplars, each one
closely matching the few instances they subsume. The high number
of connected components in the resulting graph limits exploration
of related instances by following edges but this configuration will
only group very similar elements. On the other hand, selecting a
higher value (about 30% to 50% of the dataset’s size) gives fewer
exemplars that each subsumes a significant subset of the population.
These exemplars provide a meaningful but condensed
representation of the data and exploration of the resulting structure is easy:
with few connected components in the graph, following edges
allows the discovery of many similar elements. Figure 6, discussed in
the results section, illustrates the number of connected components
obtained for every possible value of k on a real-world dataset and
thus the different granularities obtainable in the final graph.
Figures 1 and 2 illustrate these differences: the former uses a
neighborhood size of 30 and is composed of 44 exemplars, with an average
of 2 elements subsumed by each exemplar, while the latter is
obtained with a parameter set to 50, resulting in 25 exemplars for an
average of 5 elements represented by each exemplar. Either one of
these two graphs makes for an excellent recommendation system
where, for each element, a similar but more generic and typical one
can be suggested. It is also useful as a visualization tool, guiding
users as they explore the dataset. Features playing the most
prominent role in each association can even be overlaid on the edges to
inform on the nature of the similarities guiding the exploration
process.</p>
      <p>Given the graph resulting from the structuring process, another
option is to study its evolution for every value of the parameter
k. For a set of N elements, this means 1 ≤ k ≤ N . From this, we
can also establish other measures including the total number of
iterations a given element is chosen as an exemplar, or the average
number of elements subsumed by each exemplar. These additional
insights put more perspective on the usefulness of each instance as
an exemplar and its ability to subsume other elements.
</p>
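      <p>Counting the connected components of the graph induced by a given exemplar assignment can be sketched with a small union-find; connected_components is our own helper and treats each element→exemplar link as an undirected edge:</p>
      <preformat>
```python
def connected_components(exemplar):
    """Number of connected components of the undirected graph where
    each element i is linked to exemplar[i]."""
    parent = list(range(len(exemplar)))

    def find(i):
        # Find the root of i with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, e in enumerate(exemplar):
        ri, re = find(i), find(e)
        if ri != re:
            parent[ri] = re   # merge the two components
    return len({find(i) for i in range(len(exemplar))})
```
      </preformat>
      <p>Running it on the assignment returned for each value of k gives the component counts of Figure 6.</p>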
    </sec>
    <sec id="sec-7">
      <title>EXPERIMENTS</title>
      <p>To evaluate the efficiency of the described approach, we analyze
graphs created by our structuring algorithm and compare them
to graphs generated using the nearest neighbor algorithm on
high-dimensional datasets. We compare both methods using key
indicators related to connected components in the graphs, to study the
usefulness of each structure as a recommendation tool in a data
exploration scenario. Table 1 summarizes the studied datasets along
with their respective number of elements and dimensions.</p>
      <p>normal is a synthetic dataset composed of 300 elements described
on 1000 dimensions. 80% of their dimensions are sampled from a
normal distribution centered on 0 with a standard deviation of 1.
The remaining 20% dimensions are noise sampled from a uniform
distribution between 0 and 1. This very high-dimensional example
outlines the previously mentioned "curse of dimensionality": with
a thousand dimensions, it is a sparse dataset and distances between
elements become hard to distinguish.</p>
      <p>
        Residential Buildings [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] (abbreviated "residential") and
Communities and Crime [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] (abbreviated "communities") both are datasets
made available through the UCI machine learning repository [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ].
They are described with more than a hundred features, some of
them being potential goals for prediction algorithms. residential
contains physical, financial and economic data related to 372
construction projects of residential apartments in Tehran, the capital
and largest city of Iran. communities combines data related to law
enforcement as well as crime and its socio-economic factors. It
aggregates information about 2215 cities from the United States,
collected between 1990 and 1995.
      </p>
      <p>Before any analysis, we removed prediction goals and irrelevant
features (such as elements IDs) from both datasets to guarantee
that two similar elements will be close to one another in the full
description space. Dimensions containing missing values were also
pruned in the preprocessing step, representing a total of 2 and 46
features removed from the residential and communities datasets
respectively, the latter containing numerous features that are
prediction goals. This was done to avoid noise that would be detrimental
mostly to the nearest neighbor method. The number of dimensions
noted in Table 1 accounts for the remaining dimensions.</p>
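      <p>A minimal sketch of this preprocessing step, assuming the datasets are loaded as pandas DataFrames; the column names passed as drop_cols (prediction goals, IDs) are the caller's:</p>
      <preformat>
```python
import pandas as pd

def preprocess(df, drop_cols):
    """Remove prediction goals / ID columns, then prune any dimension
    that still contains missing values."""
    df = df.drop(columns=[c for c in drop_cols if c in df.columns])
    return df.dropna(axis=1)
```
      </preformat>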
      <p>To create the similarity graph with our structuring around
exemplars, we set the parameter T from Formula (1) to 10, meaning
the scoring function will give a positive score to the ten nearest
neighbors of each instance. As a reminder, this function is chosen
arbitrarily as a generic scoring function but can be adapted for a
specific dataset, given some a priori knowledge. The parameter we
can fine-tune is the neighborhood size used in the last step of our
algorithm when choosing an exemplar for each element. However,
because this parameter allows for a trade-off between edges
representing very strong similarities and graphs exhibiting structure
and properties suitable for exploration (low number of connected
components with high diameter), we will not seek to optimize this
parameter to avoid skewing the comparison. As suggested in the
previous section, we define k as 30% of the number of elements in
the dataset so we can expect the right balance between meaningful
associations and optimal graph structure.</p>
      <p>
        This graph is compared to another graph generated by linking each
element of the dataset to its nearest neighbor using the Minkowski
distance of order 0.75. While using an order lower than 1 violates
the triangular inequality, it provides significantly better results in
high-dimensional spaces than Euclidean or even Manhattan
distances [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This way we can be sure that elements linked together
by the nearest neighbor method will be as close as possible in the
full dimensional space.
      </p>
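      <p>The baseline can be sketched as follows (our own helpers; minkowski and nearest_neighbor_links are not part of the released code):</p>
      <preformat>
```python
import numpy as np

def minkowski(u, v, p=0.75):
    """Fractional Minkowski dissimilarity. With p < 1 the triangle
    inequality no longer holds, but values concentrate less in
    high-dimensional spaces."""
    return float(np.sum(np.abs(u - v) ** p) ** (1.0 / p))

def nearest_neighbor_links(X, p=0.75):
    """Link each element to its nearest neighbor under this dissimilarity."""
    n = len(X)
    links = []
    for i in range(n):
        d = [minkowski(X[i], X[j], p) if j != i else float("inf") for j in range(n)]
        links.append(int(np.argmin(d)))
    return links
```
      </preformat>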
      <p>We select three criteria to evaluate the suitability of each graph
as an exploration tool. The first two are the number of connected
components and their size: they inform us on whether, starting
from an element, following edges representing similarities will
allow visualization of a broad selection of related instances or if
exploration will quickly stop due to a lack of similar elements.
We also study the average diameter of the components for each
graph: a lower diameter indicates that most elements chose the same
exemplar, as in Figure 3.a. A greater diameter, similar to the structure
illustrated in Figure 3.b, gives the possibility to progressively explore
from a specific element to the most typical one.</p>
    </sec>
    <sec id="sec-8">
      <title>RESULTS</title>
      <p>Results of our experiments are shown in Table 2. For each method
and dataset we display the number of connected components
obtained, compute their mean size and their mean diameter. We can
see that, even for very high-dimensional data, our approach results
in better structuring of the instances where elements are linked to
another more archetypal exemplar within each component. This is
outlined by the smaller number of connected components and their
overall larger diameter. This trend can be seen for every dataset we
studied: reducing the number of components by 18% to 39% and
demonstrating promising results on the two real-world applications.
Figure 4 shows the structure obtained with the nearest neighbor
approach on the residential dataset. In this case, data are weakly
structured, making it hard for a user to gain any insight or to use this
graph for exploration. Figure 5 illustrates the same dataset where
instances are structured around exemplars. The structure makes
more sense if used to visualize similarities: connected components
contain more elements and have a larger diameter. Navigating data
is easier and more instances are available as exemplars so users
can choose the level of genericity they require for an exemplar, by
following multiple edges to an exemplar subsuming more nodes.
Such an exemplar will be farther from the initial instance, but more
archetypal and may better suit analogical reasoning by exhibiting
typical characteristics.</p>
      <p>With Figure 6 we illustrate the number of connected components
in the graph created with various values for the parameter k. This
figure demonstrates how k can be used as a granularity factor to
choose on a spectrum ranging from a few components subsumed by very
archetypal exemplars to numerous subsets composed of closely
matching elements. The maximum number of components for any
value of k is equal to the number of components created by the
nearest neighbor approach, meaning that even for a very low value of k
the proposed structure is no less suitable for exploration than the
nearest neighbor one.</p>
      <p>To provide insights on how the data is structured, we detail
edges from a small connected component shown in Figure 7. This
component was extracted from the residential dataset, with the
same parameters previously used for Figure 5: neighborhood size k
of 124 and the threshold T from Formula 1 set to 10. We can describe
the following:</p>
      <p>• Vertices 1 and 2 both chose vertex 3 as their exemplar
because 3 is in the k-nearest neighbors of those elements on
89 dimensions, so they each are similar to 3 regarding most
features of the dataset. The dimensions linking 1 to 3 are the
same as those linking 2 to 3.</p>
      <p>• Elements 1 and 2 are similar to each other on 96 features.
However, they have not chosen each other as exemplars
because they both gave instance 3 a cumulated score of
more than 4,600 while they evaluated each other at only
around 2,000.</p>
      <p>• 3 is similar to 9 on 96 features, attributing it a total score
of 5,000. As 3 only received a score of 4,600, 9's higher score
confirms that it is indeed more typical. 1 and 2 were similar
to 3 on 89 features, of which 88 are also among the 96
similar features between 3 and 9.</p>
    </sec>
    <sec id="sec-9">
      <title>CONCLUSION</title>
      <p>To summarize this work, we proposed an instance-based
structuring method that computes similarities between complex elements.
It creates pairwise connections between related instances with
exemplars subsuming similar elements. This structure allows for
intuitive exploration of a dataset and is relevant when creating
a recommendation system. We validated our approach by
studying a simulated dataset and data from the real estate market and law
enforcement. We compared our results to a nearest neighbor
approach suited for high-dimensionality where proximities in the full
dimensional space were computed using a Minkowski distance of
order 0.75 to avoid concentration of the distances. On these
high-dimensional examples we demonstrated better performance that is
less subject to the "curse of dimensionality" usually encountered
with this type of data. We also experimented with the use of
parameter k as a granularity factor giving users a straightforward
way to influence similarities and the structuring process.
We are currently testing this algorithm’s results when applied to
medical data, to help endocrinologists to better understand diabetes
and diagnose patients more accurately. Future work includes
further development of our currently deployed prototype to easily
gather user feedback on each association. This feedback could be
used to create an automatic weighting of the features, tailoring the
recommendation to each user’s preferences.</p>
    </sec>
    <sec id="sec-10">
      <title>Source code</title>
      <p>The source code implementing the proposed algorithm and used
to create the figures and results in this paper is available on the
following public GitHub repository: https://github.com/Elesday/
ESIDA-IUI19.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Charu C</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Alexander</given-names>
            <surname>Hinneburg</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Daniel A</given-names>
            <surname>Keim</surname>
          </string-name>
          .
          <year>2001</year>
          .
          <article-title>On the Surprising Behavior of Distance Metrics in High Dimensional Space</article-title>
          .
          <source>In Database Theory - ICDT 2001</source>
          . Springer Berlin Heidelberg, Berlin, Heidelberg,
          <fpage>420</fpage>
          -
          <lpage>434</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Frédéric</given-names>
            <surname>Blanchard</surname>
          </string-name>
          , Amine Aït-Younes, and
          <string-name>
            <given-names>Michel</given-names>
            <surname>Herbin</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Linking Data According to Their Degree of Representativeness (DoR)</article-title>
          .
          <source>EAI Endorsed Transactions on Industrial Networks and Intelligent Systems</source>
          <volume>2</volume>
          ,
          <issue>4</issue>
          (
          <year>June 2015</year>
          ),
          e2
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>Dheeru</given-names>
            <surname>Dua</surname>
          </string-name>
          and Efi Karra Taniskidou.
          <year>2017</year>
          .
          <article-title>UCI Machine Learning Repository</article-title>
          . http://archive.ics.uci.edu/ml
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Pedro</given-names>
            <surname>Domingos</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>A few useful things to know about machine learning</article-title>
          .
          <source>Commun. ACM</source>
          <volume>55</volume>
          ,
          <issue>10</issue>
          (
          <year>2012</year>
          ),
          <fpage>78</fpage>
          -
          <lpage>87</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>David L</given-names>
            <surname>Donoho</surname>
          </string-name>
          et al.
          <year>2000</year>
          .
          <article-title>High-dimensional data analysis: The curses and blessings of dimensionality</article-title>
          .
          <source>AMS Math Challenges Lecture</source>
          <volume>1</volume>
          (
          <year>2000</year>
          ),
          <fpage>32</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Joris</given-names>
            <surname>Falip</surname>
          </string-name>
          , Amine Aït Younes, Frédéric Blanchard, Brigitte Delemer, Alpha Diallo, and
          <string-name>
            <given-names>Michel</given-names>
            <surname>Herbin</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Visual instance-based recommendation system for medical data mining</article-title>
          .
          <source>In KES</source>
          .
          <fpage>1747</fpage>
          -
          <lpage>1754</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Roman</given-names>
            <surname>Feldbauer</surname>
          </string-name>
          and
          <string-name>
            <given-names>Arthur</given-names>
            <surname>Flexer</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>A comprehensive empirical comparison of hubness reduction in high-dimensional spaces</article-title>
          .
          <source>Knowledge and Information Systems</source>
          (May
          <year>2018</year>
          ),
          <fpage>1</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Gary</given-names>
            <surname>Klein</surname>
          </string-name>
          , Roberta Calderwood, and
          <string-name>
            <given-names>Anne</given-names>
            <surname>Clinton-Cirocco</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Rapid Decision Making on the Fire Ground: The Original Study Plus a Postscript</article-title>
          .
          <source>Journal of Cognitive Engineering and Decision Making</source>
          <volume>4</volume>
          ,
          <issue>3</issue>
          (Sept.
          <year>2010</year>
          ),
          <fpage>186</fpage>
          -
          <lpage>209</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Zachary C</given-names>
            <surname>Lipton</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>The mythos of model interpretability</article-title>
          .
          <source>arXiv preprint arXiv:1606.03490</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>Michael L</given-names>
            <surname>Mack</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Alison R</given-names>
            <surname>Preston</surname>
          </string-name>
          , and Bradley C Love.
          <year>2013</year>
          .
          <article-title>Decoding the Brain's Algorithm for Categorization from Its Neural Implementation</article-title>
          .
          <source>Current Biology</source>
          <volume>23</volume>
          ,
          <issue>20</issue>
          (Oct.
          <year>2013</year>
          ),
          <fpage>2023</fpage>
          -
          <lpage>2027</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Douglas L</given-names>
            <surname>Medin</surname>
          </string-name>
          and
          <string-name>
            <given-names>Marguerite M</given-names>
            <surname>Schaffer</surname>
          </string-name>
          .
          <year>1978</year>
          .
          <article-title>Context theory of classification learning</article-title>
          .
          <source>Psychological Review</source>
          <volume>85</volume>
          ,
          <issue>3</issue>
          (
          <year>1978</year>
          ),
          <fpage>207</fpage>
          -
          <lpage>238</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Robert M</given-names>
            <surname>Nosofsky</surname>
          </string-name>
          .
          <year>1986</year>
          .
          <article-title>Attention, similarity, and the identification-categorization relationship</article-title>
          .
          <source>Journal of Experimental Psychology: General</source>
          <volume>115</volume>
          ,
          <issue>1</issue>
          (
          <year>1986</year>
          ),
          <fpage>39</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Mohammad Hossein</given-names>
            <surname>Rafiei</surname>
          </string-name>
          and
          <string-name>
            <given-names>Hojjat</given-names>
            <surname>Adeli</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>A novel machine learning model for estimation of sale prices of real estate units</article-title>
          .
          <source>Journal of Construction Engineering and Management</source>
          <volume>142</volume>
          ,
          <issue>2</issue>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>Michael</given-names>
            <surname>Redmond</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Communities and crime data set</article-title>
          .
          <source>UCI Machine Learning Repository</source>
          (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>Eleanor H</given-names>
            <surname>Rosch</surname>
          </string-name>
          .
          <year>1973</year>
          .
          <article-title>Natural categories</article-title>
          .
          <source>Cognitive Psychology</source>
          <volume>4</volume>
          ,
          <issue>3</issue>
          (
          <year>1973</year>
          ),
          <fpage>328</fpage>
          -
          <lpage>350</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>Jeffrey N</given-names>
            <surname>Rouder</surname>
          </string-name>
          and
          <string-name>
            <given-names>Roger</given-names>
            <surname>Ratcliff</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Comparing Exemplar- and Rule-Based Theories of Categorization</article-title>
          .
          <source>Current Directions in Psychological Science</source>
          <volume>15</volume>
          ,
          <issue>1</issue>
          (
          <year>2006</year>
          ),
          <fpage>9</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>Nenad</given-names>
            <surname>Tomasev</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dunja</given-names>
            <surname>Mladenic</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Nearest Neighbor Voting in High-Dimensional Data: Learning from Past Occurrences</article-title>
          .
          <source>In International Conference on Data Mining Workshops</source>
          . IEEE,
          <fpage>1215</fpage>
          -
          <lpage>1218</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>Nenad</given-names>
            <surname>Tomasev</surname>
          </string-name>
          , Milos Radovanovic, Dunja Mladenic, and
          <string-name>
            <given-names>Mirjana</given-names>
            <surname>Ivanovic</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>The Role of Hubness in Clustering High-Dimensional Data</article-title>
          .
          <source>IEEE Transactions on Knowledge and Data Engineering</source>
          <volume>26</volume>
          ,
          <issue>3</issue>
          (
          <year>2014</year>
          ),
          <fpage>739</fpage>
          -
          <lpage>751</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Yingzhen</given-names>
            <surname>Yang</surname>
          </string-name>
          , Xinqi Chu, Feng Liang, and Thomas S Huang.
          <year>2012</year>
          .
          <article-title>Pairwise Exemplar Clustering</article-title>
          .
          <source>AAAI</source>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>