<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Statistical concept of interpretable artificial intelligence</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Dmytriy Klyushin</string-name>
          <email>dmytroklyushin@knu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Taras Shevchenko National University of Kyiv</institution>
          ,
          <addr-line>Volodymyrska Str., 60, 01601, Kyiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <fpage>20</fpage>
      <lpage>21</lpage>
      <abstract>
        <p>The paper describes the concept of interpretable artificial intelligence and machine learning based on the assessment of statistical homogeneity of classification objects described by features that are random variables. Objects are considered homogeneous if the random values of their features have identical distributions. Mathematical theories of machine learning use two postulates: 1) the feature space is a vector space, i.e. a classified object can be represented as a vector of numbers (the vector space postulate), and 2) objects belonging to the same class form a compact set with a relatively simple boundary, and the distance between them is less than the distance to objects belonging to another class (the compactness postulate). However, in many practically essential situations, for example, in biomedical research, objects are associated not with feature vectors but with samples of measured random variables. Therefore, in such cases we must propose alternatives to the vector space and compactness postulates. The paper describes the components of the suggested theory (measure of homogeneity, prediction set, and statistical depth). Machine learning algorithms that comply with the proposed statistical postulates can then be classified as interpretable.</p>
      </abstract>
      <kwd-group>
        <kwd>interpretability</kwd>
        <kwd>explainability</kwd>
        <kwd>statistical homogeneity</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Computer scientists are intensively researching the concepts of explainable and interpretable
artificial intelligence. Although the black box model provides high classification accuracy, it no
longer fully satisfies researchers or users regarding its interpretability [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The issues of trust in
machine-human systems come to the fore. This fact is especially evident in medical applications
with extremely high error costs. The problem of trust in the conclusions of artificial intelligence is
closely related to understanding the logical mechanism. Four requirements are imposed on
explainable and interpretable artificial intelligence: it should inspire trust, demonstrate logical
functioning, have the property of generalization, and be able to discover new data [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. In other
words, interpretable artificial intelligence should not raise doubts about the correctness of its
algorithms, should be able to identify logical cause-and-effect relationships between the original
data and the final result, generalize them to new data, and generate new knowledge. Interestingly,
in recent works, authors have begun to consider explainable and interpretable artificial intelligence
as different entities, although previously they
were considered interchangeable concepts.
      </p>
      <p>Interpretability is treated as the development and application of an understandable model, and
explainability now means understanding the relationship between input data and the result. A
typical example of an interpretable model is a decision tree, in which each step of logical inference
is understandable, and, for example, a convolutional neural network is an example of an
explainable but not interpretable model, since we know that the neural network minimizes the loss
function, but do not know what features it generates and includes in the model.</p>
      <p>
        The concepts of explainability and interpretability have a pronounced psychological and
subjective nature [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. For example, a mathematician considers a linear regression model
interpretable because he knows how it is structured and how it works. Still, a physician does not
know this information and, as a result, this model will remain a black box for him. This fact creates
some difficulties in perceiving the results of such systems on the part of physicians, who, for
natural reasons, mistrust the conclusions of such models. To eliminate such mistrust, proposing a
concept of explainable and interpretable artificial intelligence that would be a priori
understandable to both specialists and non-specialists in computer science is necessary. Such a
concept should appeal not to mathematical competence, but to common sense, which all thinking
beings have. We believe the statistical idea of machine learning in artificial intelligence satisfies
this requirement best.
      </p>
      <p>
        Artificial intelligence models have found wide application in medicine [4-7]. Explainable
artificial intelligence models that analyze medical images have proven particularly useful [8-10].
However, these models' utility results from a compromise between explainability and
interpretability since physicians do not understand how the model is constructed and limit
themselves to explaining the input and output data (explainability). In contrast, interpretability
remains the prerogative of mathematicians [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        The content of the explainability concept is disclosed in the work [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], in which the authors
reduced it to three points: 1) explainability of input data; 2) explainability of output data; 3)
explainability of the algorithm. Based on the above, we must recognize that this structure requires
clarification, since the third point appeals to the user's competence. In the work [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], the authors
classified the known models by the degree of their explainability, arguing that linear regression,
logistic regression, decision trees, kNN method, rule-based inference algorithms, generalized
additive models, and Bayesian models are self-evident. The authors consider the random forest
method, SVM, and various neural networks less explainable. In this case, there is a typical
aberration of the professional point of view. After all, the complexity of explaining each of these
models depends on the degree of professionalism of the mathematician, and for a
nonmathematician, all of them are equally incomprehensible. In our view, this deficiency can be
addressed by shifting the emphasis from explainability to interpretability, which is sometimes
called model transparency.
      </p>
      <p>
        In this direction, it is worth highlighting the works of Cynthia Rudin [
        <xref ref-type="bibr" rid="ref15 ref16">15, 16</xref>
        ], devoted to
studying the interpretability of machine learning. Rudin considers explainability and
interpretability to be different properties of machine learning and suggests not to explain the work
of black boxes, but to develop transparent, interpretable models. This approach is correct, but
Rudin also does not go beyond traditional models, ignoring the subjectivity of interpretability
assessments if they depend on the degree of competence of a specialist. This fact is especially
evident in medical applications. For example, how can a mathematician explain the input data if he
has no idea what condensed chromatin is in Feulgen-stained buccal epithelial nuclei? In turn, a
doctor cannot say anything about a nonparametric criterion for assessing the homogeneity of
samples containing measurements of the level of condensed chromatin in healthy people and
patients with breast cancer. They do not have a common point of view. We propose such a point of
view as the concept of results typicality expressed by the elliptical statistical depth. The
explanation of typicality does not require mathematical knowledge, but it is based on common
sense: the greater the statistical depth of a result, the more typical it is. For example, the greater the
statistical depth of a patient's features, the more likely the patient is sick. This does not mean that
she is more seriously ill. It means a higher probability (but not the subjective confidence of the
doctor) that the patient is sick.
      </p>
      <p>
        An excellent analysis of the psychological foundations of explainability and interpretability was
given by David Broniatowski [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Based on the analysis of the literature on experimental
psychology in the field of interpretation of numerical data, the author proves that the concepts of
interpretability and explainability reflect different requirements for machine learning algorithms.
From the author's point of view, interpretability is the ability to understand to what extent the
output of a machine learning model corresponds to its intended purpose, as well as to the goals and
preferences of its users. In turn, explainability means the ability to accurately understand the
mechanism of obtaining the result in order to improve the algorithm. The author analyzes the
psychological aspects of decision-making, in particular, distinguishing between users who prefer to
make decisions based on detailed explanations and users who want to receive meaningful
interpretations of the model's output. This aspect is clearly manifested in diagnostic systems used
by doctors and patients. The doctor must be confident in the diagnosis, since he is legally
responsible for it, so explainability and interpretability are equally important for him, and it is
important for the patient to know the level of reliability of the diagnosis in order to make a
decision on further treatment, so interpretability is more important for him than explainability.
      </p>
      <p>It is obvious that machine learning systems should have both explainability and interpretability.
The only question is in what proportions. Currently, more attention is paid to explainability, and
relatively little attention has been paid to the interpretability of machine learning models. As
research in the field of experimental psychology of numerical stimuli shows, people understand the
concept of interpretability as connecting the output of a model with its inference
engine. According to Broniatowski, it is necessary to study to what extent this issue can
be automated, since this problem is still poorly understood. Summarizing his analysis,
Broniatowski argues that interpretable models should take into account the context of the user's
knowledge and present the results in a simple form, justifying their reliability.</p>
      <p>According to Rudin, many machine learning models are too complex for humans to understand.
For this reason, their explanation usually boils down to a theoretical description of the inference
mechanism rather than a description of its actual implementation. Focusing on the explainability of
machine learning models and ignoring issues related to their interpretability hinders the
widespread use of machine learning models. Cynthia Rudin makes several points about
interpretability: 1) accurate models do not have to be complex; 2) explanations of machine learning
methods often do not match the computations of the original model; 3) explanations are often
meaningless or unclear; 4) unexplainable systems should not be used in high-risk situations; 5)
unexplainable systems complicate the human decision-making process.</p>
      <p>
        As noted above, reliable explanations are essential for machine learning models involving
high-risk decisions (particularly in medical applications). It is natural to use algorithms that reveal the
decision-making mechanism for such models. At the same time, the commercial interests of
corporations put the protection of decision-making mechanisms from copying to the forefront,
preventing their explanation and interpretation. There is even a separate line of research devoted
to finding a compromise between explainability and interpretability, on the one hand, and
preserving commercial secrets, on the other [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ].
      </p>
      <p>Of course, many statistical methods are already widely used in machine learning (logistic
regression, Bayesian methods, and many others), but each has limitations in its explainability and
interpretability. For example, logistic regression allows you to find the probability of a particular
event only if the probability distribution of this event obeys the Bernoulli distribution with a
specific parameter. Bayesian classification methods are relative; they allow you to compare
estimates of the probability of an event, but do not estimate these probabilities themselves. As a
result, their explainability and interpretability are pretty weak.</p>
      <p>We propose 1) new postulates of statistical machine learning; 2) a new method for assessing the
homogeneity of objects; 3) a new method for assessing the typicality of an object based on its
statistical depth in the prediction set; 4) a new method for ranking random points in a
multidimensional space; 5) a new concept of interpretability of machine learning algorithms;
6) a dimensionality reduction method. We consider them step by step.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Homogeneity measure</title>
      <p>Machine learning is based on two key principles, which we have previously alluded to: first, objects
should be represented as feature vectors within a feature vector space; second, feature vectors
representing items within the same class are closer together in the feature space than those from
different classes. The first principle reflects the tendency of machine learning practitioners to
utilize algebra, geometry, and optimization techniques. This allows us to frame machine learning
challenges as optimization problems, explicitly focusing on minimizing or maximizing a function
under certain constraints. The second principle suggests a relatively simple function can separate
these vector sets. Techniques such as Fisher's linear discriminant, support vector machines, and the
nearest neighbor approach are notable examples built on these foundations.</p>
      <p>However, despite the success of these methods, it is essential to recognize that the vector space
and compactness postulates do not apply universally. In many medical and biological contexts, a
patient is not represented by a single feature vector (an ordered set of numerical characteristics)
but rather by a random sample, an unordered collection of measurements (e.g., nuclear area, optical
density). For instance, when analyzing samples from a patient, which may consist of dozens of
cells, the patient is represented as a cloud of points rather than a single point in vector space.
While averaging these sample values can simplify the process and allow for the application of the
established postulates, it also results in a loss of significant information regarding the distribution
of the measured parameters.</p>
      <p>We propose alternative statistical hypotheses: 1) sample parameter values can represent objects,
and 2) parameters of objects within the same class exhibit similar distributions, while those from
different classes show distinct distributions. This approach allows us to tackle the challenge of
assessing similarity between objects by verifying whether two or more samples are homogeneous.
The method we suggest for determining similarity is outlined below. It possesses statistical
universality, meaning it performs consistently across samples with varying means and identical
standard deviations, as well as samples with the same means but differing standard deviations,
unlike traditional methods such as the Kolmogorov-Smirnov and Mann-Whitney-Wilcoxon tests.</p>
      <p>
        Consider a sample of size n composed of continuous random variables drawn from an
exchangeable distribution. According to Hill's assumption, the probability that a random value
from the same distribution falls between the i-th and j-th order statistics of the sample is given by
(j-i)/(n+1) [
        <xref ref-type="bibr" rid="ref18 ref19">18, 19</xref>
        ]. Notably, the only factors influencing this probability are the sample size and the
order numbers of the statistics.
      </p>
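      <p>Hill's assumption is distribution-free and easy to check numerically. The following sketch (our illustration, not code from the paper) estimates the probability by Monte Carlo simulation and compares it with (j − i)/(n + 1):</p>

```python
import numpy as np

# Monte Carlo check of Hill's assumption A(n): for a sample of n
# exchangeable continuous random variables, a new draw from the same
# distribution falls between the i-th and j-th order statistics with
# probability (j - i) / (n + 1), regardless of the distribution.
rng = np.random.default_rng(0)
n, i, j, trials = 9, 2, 7, 20000

hits = 0
for _ in range(trials):
    x = np.sort(rng.normal(size=n))   # order statistics of the sample
    u = rng.normal()                  # a new draw from the same distribution
    if (u > x[i - 1]) and (x[j - 1] > u):
        hits += 1

estimate = hits / trials
expected = (j - i) / (n + 1)          # 5/10 = 0.5 for n = 9, i = 2, j = 7
```

The estimate stays close to (j − i)/(n + 1) whatever continuous distribution is used in place of the normal, which is the key point of Hill's result.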
      <p>This insight enables us to test the hypothesis of homogeneity between two samples. To do this,
we first arrange the elements of the first sample in ascending order to obtain its order statistics.
Next, we calculate the relative frequency of occurrences from the second sample that fall between
the i-th and j-th order statistics of the first sample.</p>
      <p>Using these relative frequencies, we can construct a confidence interval (for example, the
Wilson interval) for the binomial proportion in the generalized Bernoulli framework we are
examining. We then assess whether this confidence interval covers the value (j − i)/(n + 1).</p>
      <p>
        To quantify this, we compute the so-called p-statistics by measuring the relative frequency of
the event in question. Finally, we establish a confidence interval for the p-statistics based on a
predetermined significance level. If this confidence interval does not include 1 − β, where β is the
given confidence level, we reject the null hypothesis of homogeneity [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ].
      </p>
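      <p>The whole scheme can be sketched in a few lines (our illustration; the function names and the choice z = 3 for the Wilson interval are assumptions, not the paper's code):</p>

```python
import numpy as np
from math import sqrt

def wilson_interval(h, n, z=3.0):
    # Wilson confidence interval for a binomial proportion h observed in n trials
    denom = 1.0 + z * z / n
    center = (h + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(h * (1.0 - h) / n + z * z / (4 * n * n))
    return center - half, center + half

def p_statistic(x, y, z=3.0):
    """Proportion of pairs (i, j) of order statistics of x for which the
    Wilson interval of the frequency of y-values inside (x_(i), x_(j))
    covers Hill's probability (j - i)/(n + 1)."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.asarray(y, dtype=float)
    n, m = len(x), len(y)
    hits, total = 0, 0
    for i in range(n):
        for j in range(i + 1, n):
            h = np.mean((y > x[i]) & (x[j] > y))  # relative frequency
            lo, hi = wilson_interval(h, m, z)
            p = (j - i) / (n + 1)
            if (p >= lo) and (hi >= p):
                hits += 1
            total += 1
    return hits / total

rng = np.random.default_rng(1)
p_same = p_statistic(rng.normal(0, 1, 60), rng.normal(0, 1, 60))
p_diff = p_statistic(rng.normal(0, 1, 60), rng.normal(3, 1, 60))
```

Samples drawn from the same distribution yield a p-statistic close to one, while clearly heterogeneous samples yield a much smaller value, which is the basis for rejecting the homogeneity hypothesis.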
    </sec>
    <sec id="sec-3">
      <title>3. Statistical depth</title>
      <p>
        For a comprehensive overview of the various concepts related to statistical depth, refer to [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. The
primary objective of these concepts is to establish an ordering of multidimensional random
variables.
      </p>
      <p>
        Consider a distribution D. A depth function d is defined to order points from this distribution in
a manner that monotonically decreases from the center outward. The depth of a point x is
represented as d(x) [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. The center of a distribution can be defined in various ways, such as the
median, centroid, or geometric center. A depth function must satisfy the following properties [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]:
1. Affine Invariance: The depth function should be independent of the coordinate system
used and should remain unchanged under affine transformations.
2. Maximum at the Center: The depth function attains its maximum value at the center of
the distribution, which is the point of greatest depth.
3. Monotonicity: The depth function must decrease monotonically from the deepest point to
the least deep points.
4. Limit Property: As the distance from a point x to the center of the distribution approaches
infinity, the depth must approach zero.
      </p>
      <p>
        When we lack specific information about the distribution D but have a sample containing n
points from it, we denote this sample as S. Below are examples of different depth functions.
1. Tukey Depth [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ]: To understand Tukey depth, we first need to define a center of a
sample as a point such that every hyperplane passing through it divides the sample into
two nearly equal subsets. When this point is part of the sample, it corresponds to the
sample's median. The Tukey depth of a sample element x is defined as the minimum
number of sample elements that lie on one side of a random hyperplane passing through x.
2. Convex Hulls Peeling [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ]: The convex hull of a set of points is the smallest polygon that
encompasses all the given points. Convex hull peeling is a method that involves
sequentially identifying and removing enclosed convex hulls. All vertices of the same
convex hull share the same statistical depth.
3. Oja Depth [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ]: The Oja depth of a sample element x is calculated as the average volume of
the simplex formed by d random sample points and the point x.
4. Simplex Depth [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]: The simplex depth of a sample element x is defined as the number of
simplexes formed by a random sample of points that include x.
5. Zonoid Depth [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ]: The zonoid depth of a sample element x is the number
d(x | x1, ..., xn) = sup{α : x ∈ Dα(x1, ..., xn)},
where Dα(x1, ..., xn) = { Σ λi xi : Σ λi = 1, 0 ≤ λi ≤ 1/(nα) } (the sums run over i = 1, ..., n).
      </p>
      <p>
        6. Mahalanobis Depth [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]: The Mahalanobis depth is based on the Mahalanobis
distance. It is defined by the formula MHD_F(x) = (1 + d²(x, E(F)))⁻¹, where
d²(x, y) = (x − y)ᵀ Σ_F⁻¹ (x − y), E(F) is the distribution expectation, and Σ_F is the
covariance matrix.
      </p>
      <p>
        7. Elliptical Statistical Depth [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]: Elliptical statistical depth is a function that maps points
of a sample to increasing ranks using the confidence Petunin ellipsoids [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ]. These ellipsoids are concentric and cover the sample, so we obtain a sequence of ellipsoids
E1 ⊂ E2 ⊂ ... ⊂ En. Every sample point lies on the surface of exactly one ellipsoid, and the
probability that a random point from F lies in En is (n − 1)/(n + 1). Thus, the elliptical
statistical depth is a monotone function that attains its maximum at the deepest point and
decreases from the center outward.
      </p>
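      <p>The Mahalanobis depth formula above is easy to illustrate (a minimal sketch assuming a known expectation and covariance matrix):</p>

```python
import numpy as np

def mahalanobis_depth(x, mean, cov):
    # MHD_F(x) = (1 + d^2(x, E(F)))^(-1), where d^2 is the squared
    # Mahalanobis distance with covariance matrix Sigma_F.
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    d2 = float(diff @ np.linalg.inv(cov) @ diff)
    return 1.0 / (1.0 + d2)

mean = np.array([0.0, 0.0])
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
d_center = mahalanobis_depth([0.0, 0.0], mean, cov)  # maximal depth 1 at the center
d_near = mahalanobis_depth([0.5, 0.5], mean, cov)
d_far = mahalanobis_depth([5.0, 5.0], mean, cov)     # depth decays toward 0
```

The depth attains its maximum of 1 at the center and decreases toward zero with distance, in line with the maximality, monotonicity and limit properties listed above.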
      <p>
        Depth-ordered regions [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] are sets of points where the statistical depth is greater than or
equal to a given value: Dα(F) = {x ∈ Rᵈ : D_F(x) ≥ α}, where D_F(x) is the statistical depth of
the point x obeying F. Depth-ordered regions are affine equivariant, nested, monotone,
compact, and subadditive. Obviously, the Petunin ellipsoids are depth-ordered regions.
      </p>
    </sec>
    <sec id="sec-4">
      <title>4. Petunin ellipsoids</title>
      <p>Consider a set of random points X = {x1, ..., xn}, xi ∈ Rᵈ. For simplicity and easy visualization, we
shall describe the case d = 2 (Petunin ellipses).</p>
      <p>Find the convex hull of X = (x1, y1), ..., (xn, yn) and a diameter of this convex hull with ends
(xk, yk) and (xl, yl). Connect these points by a segment L. Find the points (xr, yr) and (xq, yq) that are
most distant from L. Find segments L1 and L2 passing through (xr, yr) and (xq, yq) parallel to L.
Find segments L3 and L4 passing through (xk, yk) and (xl, yl) orthogonal to L. Segments L1,
L2, L3 and L4 are the sides of a rectangle Π. Let us denote its short side by a and its long side by b.</p>
      <p>Translate, rotate and shrink Π with a coefficient α = a/b to obtain a square with a center
(x0, y0). The random points X = (x1, y1), ..., (xn, yn) are mapped to points (x1′, y1′), (x2′, y2′), ...,
(xn′, yn′). Find the distances r1, r2, ..., rn between (x0, y0) and these points, and compute
R = max(r1, r2, ..., rn). Consider a circle C with the center (x0, y0) and radius R containing
(x1′, y1′), (x2′, y2′), ..., (xn′, yn′). Perform the inverse transformations of C. As a result, we obtain an
ellipse E containing the points X = (x1, y1), ..., (xn, yn).</p>
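      <p>A minimal sketch of this construction (our illustration; it finds the diameter by brute force and maps the bounding rectangle to the unit square, which is equivalent, up to a uniform similarity, to shrinking the long side with the coefficient a/b):</p>

```python
import numpy as np

def petunin_radii(pts):
    """Map the points to the unit square as in the Petunin construction and
    return the distances r_i of the mapped points from the square's center.
    The largest r_i is the radius R of the circle whose preimage is the
    Petunin ellipse; sorting the r_i ranks the points by elliptical depth."""
    pts = np.asarray(pts, dtype=float)
    # 1. Diameter of the point set: the most distant pair (brute-force sketch).
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    k, l = np.unravel_index(np.argmax(d2), d2.shape)
    # 2. Rotate so that the diameter lies along the x-axis.
    v = pts[l] - pts[k]
    ang = np.arctan2(v[1], v[0])
    c, s = np.cos(ang), np.sin(ang)
    rot = np.array([[c, s], [-s, c]])
    q = (pts - pts[k]) @ rot.T
    # 3. Bounding rectangle in the rotated frame, mapped to the unit square
    #    (assumes the points are not all collinear).
    lo, hi = q.min(axis=0), q.max(axis=0)
    w = (q - lo) / (hi - lo)
    # 4. Distances from the square's center (0.5, 0.5).
    return np.sqrt(((w - 0.5) ** 2).sum(axis=1))

rng = np.random.default_rng(0)
sample = rng.normal(size=(30, 2))
r = petunin_radii(sample)
R = r.max()                  # circle radius; its preimage is the ellipse E
deepest = int(np.argmin(r))  # index of the deepest (most typical) point
```

The point with the largest distance lies on the Petunin ellipse itself, and the ordering of the distances gives the ranking by elliptical statistical depth used below.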
      <p>We can generalize this algorithm to construct a Petunin ellipsoid. Construct the convex hull of
X = {x1, ..., xn}, xi ∈ Rᵈ. Find the ends xk and xl of a diameter of the convex hull. Align the
diameter along the axis Ox1. Project the points onto the orthogonal complement of Ox1 and
repeat the construction for the projections. In this way we obtain an axis-aligned parallelepiped
of minimum volume in the d-dimensional space containing the transformed input points. Shrink
this parallelepiped to a hypercube. Find the center x0 of the hypercube and the distances
r1, r2, ..., rn from x0 to the images of x1, ..., xn. Compute R = max(r1, r2, ..., rn). Construct a
hypersphere with the center x0 and radius R. Make the inverse transformations. The result of
these operations is the Petunin ellipsoid covering X = {x1, ..., xn}. The probability that a random
point from the same distribution falls inside the ellipsoid is equal to (n − 1)/(n + 1).</p>
      <p>We also note an essential property of Petunin ellipsoids: their concentricity. This property
allows for automatic and unambiguous ranking of multidimensional points. Ellipses in this case are
chosen for convenient visualization.</p>
      <p>Therefore, we can find most and least probable points of a sample. The deepest point has the
highest statistical depth.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Similarity space and Petunin ellipses</title>
      <p>Duin and Pekalska [31-34] and others proposed the concept of relational
discriminant analysis. They suggested replacing the feature vector of an object with an estimate of
its proximity to some training set using a metric or a measure of proximity between random
samples. This approach is well suited to solving problems often encountered in biomedical
research. Suppose a researcher studies the parameters of a set of cells. In this case, one obtains
samples of real numbers, not an ordered vector. In such cases, a metric is not applicable and the
only useful tool is the homogeneity measure.</p>
      <p>The homogeneity measure described above can be used for dimensionality reduction and
feature selection. To do this, we calculate the measure of homogeneity between the samples from
G1 and G2 for every feature, considering the matrix of features Uk = (uij(k)) of the k-th object
from G1 and the matrix Vl = (vij(l)) of the l-th object from G2, where N is the number of features
and m is the number of measured values of each feature.</p>
      <p>Denote the i-th columns, corresponding to the i-th feature of Uk and Vl, as
Ui(k) = (u1i(k), u2i(k), ..., umi(k))ᵀ and Vi(l) = (v1i(l), v2i(l), ..., vmi(l))ᵀ. Then, compute the
p-statistics for the samples Ui(k) and Vi(l) and find the vector of p-statistics for Uk and Vl with
respect to every feature:</p>
      <p>ρkl(1) = ρ(U1(k), V1(l)), ρkl(2) = ρ(U2(k), V2(l)), ..., ρkl(N) = ρ(UN(k), VN(l)).</p>
      <p>Then, compute the average p-statistics</p>
      <p>ρ̄k(1) = (1/n) Σ{t=1..n} ρkt(1), ρ̄k(2) = (1/n) Σ{t=1..n} ρkt(2), ..., ρ̄k(N) = (1/n) Σ{t=1..n} ρkt(N)</p>
      <p>between Uk and the n objects from G2 with respect to each feature. The same scheme allows
estimating the proximity of Uk to the other objects from G1.</p>
      <p>Pairing the p-statistics, we form a proximity vector space corresponding to the i-th and j-th features:
(ρ̄t(i), ρ̄t(j)) and (ρ̄s(i), ρ̄s(j)), i, j = 1, 2, ..., N; t, s = 1, 2, ..., n. Thus, we have two sets of points,
consisting of the average interclass homogeneity measures and the average intraclass homogeneity
measures, in the proximity space rather than the feature space. This allows using any method of
classification developed for metric spaces, but in a proximity space of lower dimension. The average
intraclass homogeneity measure allows estimating the intrinsic diversity of objects in a population,
and the average interclass homogeneity measure allows estimating the feature significance.</p>
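      <p>The averaging scheme can be sketched as follows. Note that ρ below is a hypothetical stand-in (one minus the Kolmogorov-Smirnov distance), used only to make the example self-contained; it is not the p-statistic of Section 2:</p>

```python
import numpy as np

def rho(x, y):
    # Hypothetical stand-in for the homogeneity measure: one minus the
    # Kolmogorov-Smirnov distance between the empirical CDFs. It is NOT
    # the paper's p-statistic; it only illustrates the averaging scheme.
    grid = np.sort(np.concatenate([x, y]))
    fx = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    fy = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return 1.0 - np.abs(fx - fy).max()

def proximity_vector(U, others):
    """Average homogeneity of an object U (an m x N matrix: m measured
    values of each of N features) to the objects of a class, per feature."""
    N = U.shape[1]
    return np.array([np.mean([rho(U[:, i], V[:, i]) for V in others])
                     for i in range(N)])

rng = np.random.default_rng(2)
G2 = [rng.normal(0, 1, size=(40, 3)) for _ in range(5)]  # five objects, 3 features
U_same = rng.normal(0, 1, size=(40, 3))                  # homogeneous with G2
U_far = rng.normal(4, 1, size=(40, 3))                   # heterogeneous
v_same = proximity_vector(U_same, G2)
v_far = proximity_vector(U_far, G2)
```

Each object is thus represented by a short vector of average homogeneity values, one per feature, in the proximity space rather than the original feature space.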
    </sec>
    <sec id="sec-6">
      <title>6. Uncertainty and Petunin ellipses</title>
      <p>When using Petunin ellipses to classify an object, uncertainty may arise: the point corresponding
to the object may not fall into any ellipses or into their intersection. In turn, the intersection may
also be such that one ellipse completely covers the other. In this case, you can use the remarkable
property of Petunin ellipses, namely, their concentricity. Since at the penultimate stage of
constructing the Petunin ellipse, we obtain concentric circles, each passing through exactly one point, we can
automatically rank the points by statistical depth, simply by calculating the circle number relative
to the center of gravity of the points. Next, we alternately include the point under study in one or
another set of training samples and find its statistical depth in each of them. By comparing these
statistical depths, we assign the point to the set with greater statistical depth.</p>
      <p>Knowing the number of the ellipse on which the test point lies, we can even estimate the
probability with which it belongs to the class. We can decide with a given significance level by
constructing a confidence interval for this probability. This fact allows for a significant increase in
classification sensitivity by eliminating uncertainty.</p>
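      <p>The decision rule can be sketched as follows. For brevity, this illustration ranks points by Mahalanobis depth rather than by the number of the Petunin ellipse (an assumption; the paper's construction uses the concentric ellipsoids), but the comparison scheme is the same: evaluate the depth of the point in each class and assign it to the class where it is deeper:</p>

```python
import numpy as np

def depth_in_class(x, sample):
    # Mahalanobis depth of x with respect to the sample, used here as a
    # hedged stand-in for the elliptical depth of the Petunin construction.
    mean = sample.mean(axis=0)
    cov = np.cov(sample, rowvar=False)
    diff = np.asarray(x, dtype=float) - mean
    d2 = float(diff @ np.linalg.inv(cov) @ diff)
    return 1.0 / (1.0 + d2)

def classify_by_depth(x, class_samples):
    # Assign x to the class in which it is the most typical (deepest) point.
    depths = [depth_in_class(x, s) for s in class_samples]
    return int(np.argmax(depths)), depths

rng = np.random.default_rng(3)
class_a = rng.normal(0.0, 1.0, size=(100, 2))
class_b = rng.normal(5.0, 1.0, size=(100, 2))
label, depths = classify_by_depth([0.2, -0.1], [class_a, class_b])
```

The comparison of depths resolves the ambiguous cases described above: a point lying in the intersection of the ellipses, or outside all of them, is still assigned to the class where it is more typical.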
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion</title>
      <p>Based on the homogeneity of random variables, the paper proposes to consider that objects belong to
the same class if the random values of their features have the same distribution. The paper
describes the constituent parts of the proposed theory (homogeneity measure, prediction set as
Petunin ellipse, statistical depth based on Petunin ellipsoids, and dimensionality reduction
using the similarity space). Based on the statistical postulates of machine learning, it is shown that
the proposed machine learning algorithms can be classified as interpretable and explainable. We
proposed and justified new postulates of statistical machine learning; a new method for assessing
the homogeneity of objects; a new method for assessing the typicality of an object based on its
statistical depth in the prediction set; a new method for ranking random points in a
multidimensional space; a new concept of interpretability of machine learning algorithms; and
a dimensionality reduction method.</p>
    </sec>
    <sec id="sec-8">
      <title>8. Declaration on Generative AI</title>
      <p>The author has not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          ,
          <article-title>Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead</article-title>
          .
          <source>Nature Machine Intelligence</source>
          <volume>1</volume>
          (
          <year>2019</year>
          )
          <fpage>206</fpage>
          <lpage>215</lpage>
          . doi:10.1038/s42256-019-0048-x.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lipton</surname>
          </string-name>
          ,
          <article-title>The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery</article-title>
          .
          <source>ACM Queue</source>
          <volume>16</volume>
          (
          <issue>3</issue>
          ) (
          <year>2018</year>
          ):
          <fpage>31</fpage>
          <lpage>57</lpage>
          . doi:
          <volume>10</volume>
          .1145/3236386.3241340.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Broniatowski</surname>
          </string-name>
          ,
          <source>Psychological Foundations of Explainability and Interpretability in Artificial Intelligence</source>
          ,
          <source>NISTIR</source>
          (
          <year>2021</year>
          )
          <article-title>8367</article-title>
          . doi:
          <volume>10</volume>
          .6028/NIST.IR.
          <volume>8367</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>R.-K. Sheu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Pardeshi</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          <article-title>Survey on Medical Explainable AI (XAI): Recent Progress, Explainability Approach, Human Interaction and Scoring System</article-title>
          ,
          <source>Sensors</source>
          <volume>22</volume>
          (
          <year>2022</year>
          )
          <article-title>8068</article-title>
          . doi:
          <volume>10</volume>
          .3390/s22208068.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lund</surname>
          </string-name>
          ,
          <source>Applications of Explainable Artificial Intelligence in Diagnosis and Surgery, Diagnostics</source>
          <volume>12</volume>
          (
          <issue>2</issue>
          ) (
          <year>2022</year>
          )
          <article-title>237</article-title>
          . doi:
          <volume>10</volume>
          .3390/diagnostics12020237.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Amann</surname>
          </string-name>
          et al.,
          <article-title>To explain or not to explain? Artificial intelligence explainability in clinical decision support systems</article-title>
          ,
          <source>PLOS Digital Health</source>
          <volume>1</volume>
          (
          <issue>2</issue>
          ) (
          <year>2022</year>
          ) e0000016, doi:10.1371/journal.pdig.
          <volume>0000016</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>W.</given-names>
            <surname>Bi</surname>
          </string-name>
          et al.,
          <article-title>Artificial intelligence in cancer imaging: Clinical challenges and applications</article-title>
          .
          <source>CA: A Cancer Journal for Clinicians</source>
          <volume>69</volume>
          (
          <issue>2</issue>
          ) (
          <year>2019</year>
          )
          <fpage>127</fpage>
          157. doi:
          <volume>10</volume>
          .3322/caac.21552.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          et al.,
          <article-title>Artificial intelligence for assisting cancer diagnosis and treatment in the era of precision medicine</article-title>
          ,
          <source>Cancer Communications</source>
          <volume>41</volume>
          (
          <issue>11</issue>
          ) (
          <year>2021</year>
          )
          <fpage>1100</fpage>
          1115. doi:
          <volume>10</volume>
          .1002/cac2.
          <fpage>12215</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Borys</surname>
          </string-name>
          et al.,
          <string-name>
            <surname>Explainable</surname>
            <given-names>AI</given-names>
          </string-name>
          <article-title>in medical imaging: An overview for clinical practitioners Saliency-based XAI approaches</article-title>
          ,
          <source>European Journal of Radiology</source>
          <volume>162</volume>
          (
          <year>2023</year>
          )
          <article-title>110787</article-title>
          . doi:
          <volume>10</volume>
          .1016/j.ejrad.
          <year>2023</year>
          .
          <volume>110787</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A.</given-names>
            <surname>Chaddad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Peng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bouridane</surname>
          </string-name>
          ,
          <source>Survey of Explainable AI Techniques in Healthcare, Sensors</source>
          <volume>23</volume>
          (
          <issue>2</issue>
          ) (
          <year>2023</year>
          )
          <article-title>634</article-title>
          . doi:
          <volume>10</volume>
          .3390/s23020634.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Nazir</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dickson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Akram</surname>
          </string-name>
          ,
          <article-title>Survey of explainable artificial intelligence techniques for biomedical imaging with deep neural networks</article-title>
          .
          <source>Computers in Biology and Medicine</source>
          <volume>156</volume>
          (
          <year>2023</year>
          )
          <fpage>106668</fpage>
          . . doi:
          <volume>10</volume>
          .1016/j.compbiomed.
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Folke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Shafto</surname>
          </string-name>
          ,
          <article-title>A psychological theory of explainability</article-title>
          .
          <source>In: Proceedings of the 39th International Conference on Machine Learning</source>
          , Baltimore, Maryland, USA, PMLR
          <volume>162</volume>
          .
          <article-title>(</article-title>
          <year>2022</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>P.</given-names>
            <surname>Love</surname>
          </string-name>
          et al.,
          <source>Explainable Artificial Intelligence</source>
          (XAI): Precepts, Methods, and Opportunities for Research in Construction, arXiv:
          <fpage>2211</fpage>
          .
          <string-name>
            <surname>06579v2</surname>
          </string-name>
          (
          <year>2022</year>
          ) doi:10.48550/arXiv.2211.06579.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Arrieta</surname>
          </string-name>
          et al.,
          <string-name>
            <surname>Explainable Artificial</surname>
          </string-name>
          <article-title>Intelligence (XAI): Concepts, taxonomies, opportunities, and challenges toward responsible AI</article-title>
          ,
          <source>Information Fusion</source>
          <volume>58</volume>
          (
          <year>2022</year>
          ) 82 115, doi:10.1016/j.inffus.
          <year>2019</year>
          .
          <volume>12</volume>
          .012.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>K.</given-names>
            <surname>Sokol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Flach</surname>
          </string-name>
          , Explainability Fact Sheets:
          <article-title>A Framework for Systematic Assessment of Explainable Approaches</article-title>
          , arXiv:
          <year>1912</year>
          .
          <year>05100v</year>
          . (
          <year>2019</year>
          ) doi:10.1145/3351095.3372870.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          et al.,
          <article-title>Interpretable machine learning: Fundamental principles and 10 grand challenges</article-title>
          .
          <source>Statistical Surveys</source>
          <volume>16</volume>
          (
          <year>2022</year>
          )
          <article-title>1 85</article-title>
          . doi:
          <volume>10</volume>
          .1214/21-
          <fpage>SS133</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>C.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          ,
          <article-title>Models That Are Interpretable But Not Transparent</article-title>
          . arXiv:
          <volume>2502</volume>
          .19502 (
          <year>2025</year>
          ). doi:
          <volume>10</volume>
          .48550/arXiv.2502.19502.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>B.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <source>Journal of American Statistical Association</source>
          <volume>63</volume>
          (
          <year>1968</year>
          )
          <fpage>677</fpage>
          <lpage>691</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <surname>A</surname>
          </string-name>
          <article-title>(n) or Bayesian nonparametric predictive inference (with discussion)</article-title>
          . In: D. V.
          <string-name>
            <surname>Lindley</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          <string-name>
            <surname>Bernardo</surname>
          </string-name>
          ,
          <string-name>
            <surname>M. H. DeGroot</surname>
          </string-name>
          , &amp;
          <string-name>
            <surname>A. F. M. Smith</surname>
          </string-name>
          (Eds.), Bayesian statistics (
          <year>1988</year>
          , Vol.
          <volume>3</volume>
          , pp.
          <fpage>211</fpage>
          <lpage>241</lpage>
          ). Oxford: Oxford University Press.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D.</given-names>
            <surname>Klyushin</surname>
          </string-name>
          , Yu. Petunin,
          <article-title>A Nonparametric Test for the Equivalence of Populations Based on a Measure of Proximity of Samples</article-title>
          .
          <source>Ukrainian Mathematical Journal</source>
          <volume>55</volume>
          (
          <issue>2</issue>
          ) (
          <year>2003</year>
          ) 181
          <fpage>198</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>K.</given-names>
            <surname>Mosler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mozharovskyi</surname>
          </string-name>
          ,
          <article-title>Choosing among notions of multivariate depth statistics</article-title>
          .
          <source>Statistical Science</source>
          <volume>37</volume>
          (
          <issue>3</issue>
          ) (
          <year>2022</year>
          )
          <fpage>348</fpage>
          368. doi:
          <volume>10</volume>
          .1214/
          <fpage>21</fpage>
          -
          <lpage>sts827</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Serfling</surname>
          </string-name>
          ,
          <article-title>General notions of statistical depth function</article-title>
          ,
          <source>Annals of Statistics</source>
          <volume>28</volume>
          (
          <year>2000</year>
          )
          <fpage>461</fpage>
          482. doi:
          <volume>10</volume>
          .1214/aos/1016218226.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tukey</surname>
          </string-name>
          ,
          <article-title>Mathematics and the picturing of data</article-title>
          .
          <source>In: Proceedings of the International Congress of Mathematician</source>
          , Montreal, Canada,
          <year>1975</year>
          , pp.
          <fpage>523</fpage>
          <lpage>531</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>V.</given-names>
            <surname>Barnett</surname>
          </string-name>
          ,
          <article-title>The ordering of multivariate data</article-title>
          ,
          <source>Journal of the Royal Statistical Society</source>
          , Series A (
          <year>General</year>
          )
          <volume>139</volume>
          (
          <issue>3</issue>
          ) (
          <year>1976</year>
          ) 318
          <fpage>355</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>H.</given-names>
            <surname>Oja</surname>
          </string-name>
          , Descriptive statistics for multivariate distributions,
          <source>Statistics and Probability Letters</source>
          <volume>1</volume>
          (
          <year>1983</year>
          )
          <fpage>327</fpage>
          332. doi:
          <volume>10</volume>
          .1016/
          <fpage>0167</fpage>
          -
          <lpage>7152</lpage>
          (
          <issue>83</issue>
          )
          <fpage>90054</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <article-title>On a notion of data depth based on random simplices</article-title>
          ,
          <source>Annals of Statistics 18: 405</source>
          <volume>414</volume>
          (
          <year>1990</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>G.</given-names>
            <surname>Koshevoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mosler</surname>
          </string-name>
          ,
          <article-title>Zonoid trimming for multivariate distributions</article-title>
          .
          <source>Annals of Statistics</source>
          <volume>25</volume>
          (
          <year>1997</year>
          )
          <year>1998</year>
          2017. doi:
          <volume>10</volume>
          .1214/aos/1069362382.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lyashko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Klyushin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Alexeyenko</surname>
          </string-name>
          ,
          <article-title>Mulrivariate ranking using elliptical peeling</article-title>
          .
          <source>Cybernetic and Systems Analysis</source>
          <volume>49</volume>
          (
          <issue>4</issue>
          ):
          <fpage>511</fpage>
          <lpage>516</lpage>
          . doi:
          <volume>10</volume>
          .1007/s10559-013-9536-x (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <surname>Yu. Petunin</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          <string-name>
            <surname>Rublev</surname>
          </string-name>
          .
          <article-title>Pattern recognition using quadratic discriminant functions</article-title>
          .
          <source>Numerical and Applied Mathematics</source>
          <volume>80</volume>
          (
          <year>1996</year>
          ) 89
          <fpage>104</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <surname>I. Cascos</surname>
          </string-name>
          ,
          <article-title>Depth function as based of a number of observation of a random vector</article-title>
          .
          <source>Working Paper 07-29, Statistic and Econometric Series</source>
          <volume>2</volume>
          (
          <year>2007</year>
          ) 1
          <fpage>28</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>R.P.W.</given-names>
            <surname>Duin</surname>
          </string-name>
          , D. de Ridder,
          <string-name>
            <given-names>D.N.J.</given-names>
            <surname>Tax</surname>
          </string-name>
          ,
          <article-title>Experiments with a featureless approach to pattern recognition</article-title>
          ,
          <source>Pattern Recognit Lett</source>
          <volume>18</volume>
          (
          <year>1997</year>
          )
          <fpage>1159</fpage>
          1166. doi:
          <volume>10</volume>
          .1016/S0167-
          <volume>8655</volume>
          (
          <issue>97</issue>
          )
          <fpage>00138</fpage>
          -
          <lpage>4</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>R.P.W.</given-names>
            <surname>Duin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pekalska</surname>
          </string-name>
          , D. de Ridder,
          <article-title>Relational discriminant analysis</article-title>
          ,
          <source>Pattern Recognition Letters</source>
          <volume>20</volume>
          (
          <year>1999</year>
          )
          <fpage>1175</fpage>
          1181. doi:
          <volume>10</volume>
          .1016/S0167-
          <volume>8655</volume>
          (
          <issue>99</issue>
          )
          <fpage>00085</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>E.</given-names>
            <surname>Pekalska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.P.W.</given-names>
            <surname>Duin</surname>
          </string-name>
          ,
          <article-title>On combining dissimilarity representations</article-title>
          , in: J.
          <string-name>
            <surname>Kittler</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <string-name>
            <surname>Roli</surname>
          </string-name>
          (Eds.),
          <source>Multiple Classifier Systems, LNCS</source>
          , vol.
          <source>2096</source>
          , Springer-Verlag,
          <year>2001</year>
          , pp.
          <fpage>359</fpage>
          <lpage>368</lpage>
          . doi:
          <volume>10</volume>
          .1007/3-540-48219-9_
          <fpage>36</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>E.</given-names>
            <surname>Pekalska</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.P.W.</given-names>
            <surname>Duin</surname>
          </string-name>
          ,
          <article-title>The Dissimilarity Representation for Pattern Recognition, Foundations</article-title>
          and Applications, World Scientific, Singapore,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>