<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <article-meta>
      <title-group>
        <article-title>Multifactorial Constrained Prior Data Spaces for Intelligent Decision Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Serge Dolgikh</string-name>
          <email>sdolgikh@kai.edu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oksana Mulesa</string-name>
          <email>oksana.mulesa@unipo.sk</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Volodymyr Sabadosh</string-name>
          <email>vsabadosh@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Aviation University</institution>
          ,
          <addr-line>Lubomyra Huzara 1, Kyiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Presov</institution>
          ,
          <addr-line>Presov</addr-line>
          ,
          <country country="SK">Slovakia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Uzhhorod National University</institution>
          ,
          <addr-line>Universytetska St 14, Uzhhorod</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the field of artificial intelligent decision systems, the challenge of learning to construct effective decisions in problems, scenarios and environments characterized by significant uncertainty is commonly encountered. An extensive body of research has been devoted to the development of learning processes and methods that can operate within the constraints of uncertainty, often described by a shortage of prior information about the problem distribution, while being able to produce effective decisions and improve their quality in the process. Due to the nature of the constraints in this type of problem, the necessary core capacity of such methods is the ability to extract maximum information from the problem data, including in raw form, and to utilize it effectively for the construction of correct, i.e., empirically successful, decisions. In this work, we propose and demonstrate an intelligent process of analysis and construction of the conceptual structure of problem data in the “constrained prior” context, which requires effective learning with minimal prior data, based on the determination of a structure of probabilistic, “fuzzy” prototype classes/regions. The process and application of iterative learning, starting with minimal sets of problem data, is demonstrated with a model dataset of images of basic geometric shapes. The proposed approach demonstrated an effective ability to learn the conceptual structure of problem data with minimal samplings and to improve the quality of learning and associated decisions over learning iterations.</p>
      </abstract>
      <kwd-group>
        <kwd>Intelligent decision systems</kwd>
        <kwd>concept learning</kwd>
        <kwd>prototype learning</kwd>
        <kwd>clustering</kwd>
        <kwd>fuzzy clustering</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Intelligent decision systems, which find applications in many functions and domains of modern society and technology, depend on correct interpretation of inputs, expressed in certain measurable parameters that describe the data space of the problem. It is logical that similar inputs or observations should induce similar decisions: this logic allows a decision that has been verified as correct and effective for one input sample to be applied to the class of inputs that are essentially similar to it, allowing for an effective (the decisions produced by the system are consistent) and efficient (completely new decisions do not need to be constructed for every new input) process and models of constructing decisions.</p>
      <p>However, the problem of how the relationship of essential similarity between inputs in general data spaces can be determined appears to be not so trivial. A particular challenge is presented by cases and scenarios where knowledge or information about the distribution of data points in the problem space is not available a priori, that is, to a system in the training regime before it can be set in operation, as is the case with conventional methods of supervised classification [1]. In such cases, which can be designated as “learning with constrained prior”, intelligent systems must possess the ability to bring out, determine or calculate the relationship of similarity directly from samplings of data in the problem space and without massive prior information about the characteristics of its distribution [2]. Developing approaches to deal with this type of problem is the subject of this work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Prior work</title>
      <p>The subject of this study lies at the conjunction of several actively researched directions and fields in data science and the theory and practice of intelligent systems. The range of problems we approach here can be defined as learning with constrained prior information, the “learning with constrained prior” problem, that is, data with limited and/or insufficient prior knowledge about the distribution of the problem to employ conventional methods of machine intelligence. In this regard, methods of self-supervised, unsupervised and generative learning and dimensionality reduction have proven effective in their ability to identify and determine characteristic structures of types or patterns of similarity with a wide array of data of realistic complex types.</p>
      <p>To reduce the dimensionality, improve the interpretability of data and, as a result, reduce the computational complexity of subsequent methods of analysis, a wide range of methods, both linear and non-linear, has been researched, including PCA, SNE [3, 4], prototype learning, including with generative neural models [5, 6], dimensionality reduction and manifold learning [7, 8] and others. These methods compress high-dimensional data while preserving its essential information content, not least, in the context of this study, the relationship of similarity in the spaces (embeddings) of informative factors/features. It was shown that these methods can be used in combination with fuzzy approaches such as fuzzy C-means [9].</p>
      <p>This observation brings us to the field of fuzzy sets that have been applied extensively to problems
in multifactorial data spaces. Papers [10, 11] demonstrated successful applications of fuzzy pattern
recognition and fuzzy modeling to evaluate, compare, select, prioritize, and/or organize alternative
decision options. Such approaches have been shown to simplify decision-making processes and
reduce their complexity.</p>
      <p>Fuzzy models are effectively used in intelligent decision-making systems. Such integration makes it possible to work effectively with multidimensional data and to ensure the accuracy of decisions in complex conditions of uncertainty.</p>
      <p>For example, [12] presented a novel neuro-fuzzy diagnostic system based on a non-iterative ANN
and an original fuzzy information model. In [13] a novel effective decision-making method aimed to
assist in clinical practice based on integrated fuzzy information models and data mining was
proposed.</p>
      <p>An interesting approach demonstrated recently in [14] is a combination of prototype models and
fuzzy methods such as fuzzy C-means in the analysis of conceptual data structures. The method
demonstrated the potential for accurate classification while managing data uncertainty. A related
study [15] presented a multi-objective optimization method that showed potential in confident
determination of the optimal prototype structures via simultaneous minimization of the training
error on historical training data and minimization of the intra-cluster variance.</p>
      <p>In this work we attempted to advance and refine methods of determination of the prototype structure, including fuzzy prototypes, for the case and essential limitations of the constrained prior problem by incorporating optimization and iterative improvement of conceptual models of problem data.</p>
      <p>In approaching the problem of learning with constrained prior for optimal decisions produced by intelligent decision systems with an arbitrary type of problem data space, we first examine the process of construction of fuzzy-prototype models with arbitrary samplings of the problem data, attempting to avoid essential assumptions and/or constraints such as pre-known content of prototype classes. Next, observing that in the constrained setting, general data of the problem, not necessarily associated with verified outcomes, can be accumulated in the active process of decision making, we examine quality characteristics of fuzzy prototype models, such as precision/resolution relative to the size of the data samplings. Formulating these methods and processes allows us to test the hypothesis that using a combination of iterative accumulation of problem data and construction of fuzzy-prototype models with more representative and detailed samplings produces more precise models of the problem data distribution, leading to improved quality of decisions based on such models.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methods</title>
      <sec id="sec-3-1">
        <title>Problem formulation</title>
        <p>In this work we will deal with data that is obtained as a sampling of an unknown distribution D, W = { P, F }, where P = { p }: the individual points of observation that describe the domain; F = { f }: the observable factors recorded in the sampling. Accordingly, each data point p ∈ P is described by a set of observable factors F(p) = { f(p) }.</p>
        <p>In the task of interpretation of the data D for making decisions, the critical challenge is to establish an association between an observation x ∈ P and a class of similar observations K(x) that can be associated with a correct decision M(K, a), where a: the parameters of the decision function.</p>
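        <p>As an illustration only (not part of the original study), the formulation above can be sketched in code; the names Problem, classify and decisions, and the toy one-factor data space, are hypothetical stand-ins for the class map K(x) and the decision function M(K, a).</p>

```python
# Hypothetical sketch: observations x described by observable factors,
# grouped into similarity classes K(x), each class carrying a decision M(K, a).
from dataclasses import dataclass
from typing import Callable, Dict, List

Factors = List[float]  # F(x): the observable factors of one observation

@dataclass
class Problem:
    classify: Callable[[Factors], str]              # K(x): class of similar observations
    decisions: Dict[str, Callable[[Factors], str]]  # M(K, a): per-class decision

    def decide(self, x: Factors) -> str:
        return self.decisions[self.classify(x)](x)

# Toy example: one observable factor, two similarity classes.
problem = Problem(
    classify=lambda x: "small" if x[0] < 0.5 else "large",
    decisions={"small": lambda x: "accept", "large": lambda x: "reject"},
)
print(problem.decide([0.2]))  # accept
print(problem.decide([0.9]))  # reject
```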
        <p>An additional challenge we will be discussing in this work relates to the cases/scenarios where the information about this association, the factorization relationship R:</p>
        <p>K(x) = R(F(x)), (1)
is limited or absent at the outset of the study. In such cases, one cannot rely on a known function or logical sequence to connect an observed instance of the phenomenon of interest to the correct decision. This range of problems/scenarios will be referred to as the “learning with constrained prior” problem.</p>
        <p>
          In approaching this problem, one is faced with the challenge of determination of classes of similarity in complex data described by a large set of observable factors F without sufficient prior information about the factorization relationship (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ), often inferred from sets of known associations (p, K(p)) known as annotated or labeled data.
        </p>
      </sec>
      <sec id="sec-3-2">
        <title>Fuzzy prototype analysis</title>
        <p>
          Prototype analysis is a well-known approach in problems and scenarios where prior information about a given type of data or distribution D is limited or not available, confounding or precluding resolution of the factorization relationship (
          <xref ref-type="bibr" rid="ref1">1</xref>
          ) from known (observation, class) associations [5, 16].
        </p>
        <p>These methods work particularly well with data that can be described or expressed by a large
number of observable factors with a possibility of a strong redundancy in the observation points
described by them (multiple homogeneous observable factors).</p>
        <p>Following the well-researched process in data science of unsupervised analysis and determination of conceptual structure, which commonly involves a strong reduction of dimensionality, a structure of natural prototypes or concepts can be derived from a general representative sampling of the distribution D: P(D) = { pk }, which can be interpreted as the basic framework of the essential types of similarity pk that approximate the distribution:</p>
        <p>x ∈ D → ∃ k, pk: t(x) ∈ pk, (2)
where t(x) is the image of the observation x in the informative prototype space (the embedding), commonly of reduced dimensionality [5, 6]. In many specific models and methods of unsupervised embedding, the inverse association from the prototype to the observable space exists as well: x(t) = G(t), t ∈ pk, where G(t): the generative transformation [17].</p>
        <p>
          In this work, we propose an extension of the prototype analysis based on the observation that in most practical cases and with many methods, the association between an observation and its prototype class (
          <xref ref-type="bibr" rid="ref2">2</xref>
          ) is not categorical but rather probabilistic, i.e., the association of an observation x ∈ D to the prototype class pk is described by the prototype probability distribution ρ(x, pk):
        </p>
        <p>ρ(x, pk) = W( E(x) ∈ pk ), (3)

Σk ρ(x, pk) = 1,

where E(x): the embedding transformation D → E; W: the probability of a point in the prototype space belonging to a specific prototype class.</p>
        <p>
          The relationship in (
          <xref ref-type="bibr" rid="ref3">3</xref>
          ) defines a probabilistic or “fuzzy” association between the observations in the observable data space D and the prototype classes P resolved with methods of prototype analysis.
        </p>
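        <p>As a self-contained illustration of the fuzzy association (3) (a sketch under assumptions, not the authors' implementation), the membership probabilities ρ(x, pk) can be derived from distances to prototype centers in the embedding space with a fuzzy C-means style weighting; the centers and the exponent m are illustrative.</p>

```python
# Fuzzy memberships rho(x, p_k) from distances to prototype centers;
# the memberships over all prototype classes sum to 1, as in (3).
import math

def fuzzy_membership(x, centers, m=2.0):
    """Return rho(x, p_k) for each prototype center; memberships sum to 1."""
    d = [math.dist(x, c) for c in centers]
    if any(di == 0.0 for di in d):                 # x coincides with a center
        return [1.0 if di == 0.0 else 0.0 for di in d]
    inv = [di ** (-2.0 / (m - 1.0)) for di in d]   # standard FCM-style weights
    s = sum(inv)
    return [w / s for w in inv]

centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]     # three illustrative prototypes
rho = fuzzy_membership((0.2, 0.1), centers)
print([round(r, 3) for r in rho])
print(round(sum(rho), 6))  # 1.0
```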
        <p>The advantage of the methods just described in application to the learning with constrained prior
problem comes from the observation that whereas prior knowledge of the problem can be limited or
constrained, it may not necessarily be the case for the general raw data in the problem data space.</p>
        <p>Moreover, this data can be accumulated in the process of the interactions of the learning system
with the problem data space, with a possibility of iterative, progressive learning from the experience,
based on the results of the earlier observations and learning iterations.</p>
        <p>Then, with the data accumulated through this process, methods of analysis of its informative structure can be applied, including, as discussed earlier, neural generative learning, prototype learning, deep dimensionality reduction with preservation of the information content and many others. This approach can offer essential insights into the conceptual composition of the distribution of the problem that can be used for determination of the prototype structure and subsequent applications in intelligent decision systems.</p>
      </sec>
      <sec id="sec-3-3">
        <title>Integration with decision systems</title>
        <p>
          Application of the fuzzy prototype analysis in intelligent decision systems operating in the context of constrained prior problems can be proposed straightforwardly via the construction of decisions based on the natural, intrinsic similarity of the observations, as:
        </p>
        <p>E(x), E(y) ∈ pk → D(x) ≅ D(y),
where D(x), D(y): the decisions produced (constructed) for the observations x, y; E(x), E(y): their images in the informative embedding space of the problem.</p>
        <p>In other words, once the structure of fuzzy prototypes in the problem data D has been determined via application of the fuzzy prototype analysis as described in the preceding sections, observations that belong in the same prototype class with high confidence can be associated with similar decisions. Fuzzy prototype models can as well provide some informative insights for observations with less confident association to prototype classes, as will be discussed further in the results section.</p>
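        <p>A minimal sketch of this decision rule, with a hypothetical one-dimensional embedding E(x) and two illustrative prototype intervals (all names are stand-ins, not the study's implementation):</p>

```python
# Observations whose embeddings fall in the same prototype class
# reuse the same decision, per the rule E(x), E(y) in p_k -> D(x) ~ D(y).
def embed(x):
    return sum(x) / len(x)            # E(x): toy embedding to one dimension

def prototype(t):
    return "p1" if t < 0.5 else "p2"  # prototype class of an embedded point

decisions = {"p1": "decision A", "p2": "decision B"}

x, y = [0.1, 0.3], [0.2, 0.2]         # essentially similar observations
assert prototype(embed(x)) == prototype(embed(y))
print(decisions[prototype(embed(x))])  # decision A (reused for both)
```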
      </sec>
      <sec id="sec-3-4">
        <title>A demonstration of fuzzy prototype analysis</title>
        <p>In this work we illustrate the methods and workflow of the fuzzy prototype analysis for intelligent
decision systems with a dataset that models a case of a learning with constrained prior problem.</p>
      </sec>
      <sec id="sec-3-5">
        <title>3.4.1. Model dataset</title>
        <p>For an illustration of the fuzzy prototype method, we will consider here an example of an observable distribution described by a large number of numerical parameters of the same type, such as images. We will use the dataset of images of basic geometric shapes that was described in [18].</p>
        <p>The images in the dataset were of three basic types: circles, triangles and backgrounds, of variable
size and grayscale contrast. The resolution of the images was 64 × 64 pixels i.e. each data point
corresponding to a single observation was expressed in 4,096 numerical factors in the range [0, 1].</p>
      </sec>
      <sec id="sec-3-6">
        <title>3.4.2. Informative dimensionality reduction</title>
        <p>An informative embedding of the model dataset was obtained by applying a generative neural network model with the architecture of a convolutional encoder, as described in [18]. This class of neural architectures, being of the self-supervised learning type, does not require annotated datasets for training [17]. It is trained by reducing the error of reproduction (generation) of the samples in a set of observable samples that can represent a sampling of the distribution of the problem.</p>
        <p>Thus, it is essential to note that informative embedding spaces constructed by application of
generative models to the problem dataset do not depend on any prior information about the
distribution and fully satisfy the constraints of the problem. Examples of distributions of data
samples in the embedding spaces of trained generative models are shown in Figure 1.</p>
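        <p>The embeddings in this study were produced by a convolutional generative (encoder-type) neural model; as a lightweight, self-contained stand-in for illustration only, the sketch below uses PCA via SVD to project synthetic high-dimensional samples into a two-dimensional embedding space. The data and dimensions below are synthetic, not the dataset of [18].</p>

```python
# PCA as an illustrative stand-in for the informative embedding step:
# high-dimensional "image" vectors projected to a 2-D embedding space.
import numpy as np

rng = np.random.default_rng(0)

def pca_embed(X, dim=2):
    """Project the rows of X onto their top `dim` principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions; keep the leading `dim`.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:dim].T

# Two synthetic "shape classes": clusters around different mean patterns.
a = rng.normal(0.2, 0.05, size=(30, 64))
b = rng.normal(0.8, 0.05, size=(30, 64))
E = pca_embed(np.vstack([a, b]))
print(E.shape)  # (60, 2)
```

In the embedding, the two synthetic classes separate along the first principal component, mirroring the class-wise separation visible in Figure 1.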
        <p>Another point that is essential for the analysis and discussion here is that successful learning could be achieved with relatively small samplings of the distribution, in the model example, as small as dozens or even single samples per class. Granted, this observation needs to be taken with caution and may not be readily extendable to significantly more complex problem data. Still, it shows that meaningful initial learning of problem data of significant complexity as described earlier can be achieved with limited samplings and, moreover, as noted earlier, does not require annotations with known types or classes, or any other form of prior knowledge about the problem distribution.</p>
        <p>Finally, it is worth noting that the method of construction of informative low-dimensional embeddings of the problem data used here is not unique, and a wide selection of methods of self-supervised, unsupervised learning and dimensionality reduction has been studied and applied successfully. A more detailed discussion of the types of the methods, their differences, etc., would fall beyond the scope of this work.</p>
      </sec>
      <sec id="sec-3-7">
        <title>3.4.3. Fuzzy-prototype structure of problem data</title>
        <p>In the example that we use in this work, groups of samples of the unknown distribution of the
problem in the high dimensional space of observable factors characterized by essential similarity are
modeled by the types of the geometric shape in the dataset of images.</p>
        <p>This type of model corresponds to problems and scenarios where an unknown distribution of
the problem is described by a large number of the numerical factors with approximately equal
significance in the effect of interest in the distribution (homogeneous multifactorial data).</p>
        <p>It was shown [18] that in some such cases, the prototype/conceptual structure of the data P(D)
discussed in the preceding sections can be determined or resolved with sufficient confidence by
application of methods of unsupervised ensemble learning and clustering that do not depend on
significant or, in fact, any prior knowledge about the unknown distribution.</p>
        <p>As a result of application of such methods, the distribution of data points that correspond to a
certain sampling of the original data S in the informative embedding space E(S) can be represented
by the fuzzy-prototype structure Pf(D):</p>
        <p>Pf(D): { P(D), ρ(x, pk) }, (4)
where P(D) = P(E(S)): the sequence of the prototypes derived from the distribution E(S) in the informative embedding space; ρ: the prototype probability distribution.</p>
        <p>
          Again, one can observe that the fuzzy-prototype structure of the problem data (
          <xref ref-type="bibr" rid="ref4">4</xref>
          ) effectively approximates the observable distribution by providing a probabilistic model of the distribution of an arbitrary representative of the problem data D between the conceptual prototypes.
        </p>
      </sec>
      <sec id="sec-3-8">
        <title>3.4.4. Iterative learning</title>
        <p>One can observe that the process of the resolution or construction of the fuzzy-prototype structure of the problem data described in the preceding sections was, in effect, static with respect to the basic sampling of the problem data S(D) that was used for the derivation of the structure/model of the fuzzy prototypes.</p>
        <p>As we noted earlier, an essential advantage of the methods of unsupervised learning with
constrained prior problems is the potential to accumulate general, non-annotated data in the course
of the learning process. Such more extensive and detailed samplings can in their turn provide
additional, more detailed information about the conceptual structure of the distribution. Then,
repeating the described process iteratively with a sequence of extended samplings of data S1, S2, …, Sk
in the problem space, it can be possible to produce more precise, “sharper” fuzzy prototype models
with diminishing uncertainty in the probability distribution of the prototype classes. This iterative
process is illustrated in Figure 2.</p>
        <p>Thus, one can expect that the iterations of the fuzzy-prototype models Pf(D)k obtained with more
descriptive samplings Sk would produce more precise models of the prototype probability
distribution, reducing the uncertainty in the distribution of the observations between the prototype
classes. In the next section we attempt to verify this hypothesis with the model dataset of images.</p>
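        <p>The expected sharpening over iterations can be illustrated numerically (a synthetic sketch, not the paper's experiment): with samplings of growing size, the standard error of a prototype-center estimate shrinks, i.e., the model of the prototype distribution becomes more precise.</p>

```python
# Growing samplings S_1 subset S_2 subset S_3 of a synthetic 1-D "prototype
# class": the estimated center stabilizes and its standard error shrinks.
import random
import statistics

random.seed(1)
population = [random.gauss(0.5, 0.1) for _ in range(1000)]  # unknown distribution

errors = []
for size in (30, 100, 300):                # iterations S_k of growing size
    sample = population[:size]
    center = statistics.mean(sample)       # prototype-center estimate
    stderr = statistics.stdev(sample) / size ** 0.5
    errors.append(stderr)
    print(size, round(center, 3), round(stderr, 4))
```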
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>In this section we illustrate the method of construction of fuzzy-prototype models of problem data in constrained prior problems with the dataset of images as described in Section 3.4.1. To examine the dependence of the precision or resolution of the fuzzy-prototype model on the size of sampling, we used samples of three different sizes: S-90, having 30 images per type (i.e., geometric shape); S-150, 50 images per type; and S-300, with 100 images per type.</p>
      <sec id="sec-4-1">
        <title>Samplings and construction of fuzzy-prototype models</title>
        <p>Iterative learning of fuzzy-prototype models of problem data in the constrained prior setting can face
another challenge in the early stages of the process: that of instability of learning with small data.
This limitation applies only to some problems, where both known and general data are constrained,
whereas in other cases, samplings of general data that is not associated with prior knowledge would
not be limited or constrained. Still, in this work we chose to address the case where general data is
limited along with the annotated one, and the processes of learning of prototype structure and
collection of general samplings proceed alongside each other. For this reason, the initial, starting
sample was chosen to be of a rather small size, about two dozen instances per conceptual class, that
is in our case, the type of geometric shape.</p>
        <p>One can note before proceeding to further analysis that the minimum threshold of the size of
samplings that is necessary for initial learning of the prototype structure in the data is not an obvious
choice; it depends on several factors such as conceptual complexity of data, characteristics of the
variation in the informative embedding factors of the prototype classes and others. Addressing this
question in full detail would be a challenging problem of its own that merits another study. With the
data used in this work it was found by trial that the chosen size of the initial sampling was sufficient
for the purposes of the study.</p>
        <p>To construct fuzzy-prototype models with samplings of the problem data modeled by the dataset of images, the process described in [18] was used. To address the challenge of stability in learning with small data for smaller-size samplings [19, 20], an ensemble [21, 22] of generative neural models was used. To outline it briefly, after construction of informative embeddings with neural models of self-supervised learning, clustering in the embedding space was applied to identify characteristic regions/clusters of samples that were associated with the concept/prototype classes. In this work we used clustering by a visual observation method, but the demonstrated approach can be extended straightforwardly to use known methods of unsupervised clustering such as DbScan, MeanShift and others [23-25].</p>
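        <p>Where an automated alternative to the visual clustering step is preferred, a simple unsupervised method can be substituted; the sketch below uses a minimal k-means for illustration (DbScan or MeanShift would be drop-in alternatives) on synthetic two-dimensional embeddings of three hypothetical prototype regions.</p>

```python
# Minimal k-means clustering of a synthetic 2-D embedding space,
# standing in for the clustering step that identifies prototype regions.
import numpy as np

rng = np.random.default_rng(42)

def kmeans(X, k, iters=50):
    """Minimal k-means: init from data points, mean update, empty-cluster guard."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels, centers

# Synthetic "embedding" of three well-separated prototype regions.
X = np.vstack([rng.normal(m, 0.1, size=(40, 2)) for m in [(0, 0), (2, 0), (0, 2)]])
labels, centers = kmeans(X, 3)
print(np.bincount(labels, minlength=3))  # per-cluster sample counts
```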
      </sec>
      <sec id="sec-4-2">
        <title>Construction and resolution/precision of fuzzy-prototype models</title>
        <p>Based on the structure of characteristic types/concepts/prototypes associated with clusters in the informative embedding space of the problem data, one can derive the fuzzy-prototype model of the data Pf(Sk) (in the iteration of the sampling Sk used to calculate the structure) via the process illustrated in Figure 3. Many specific implementations of the process are possible, with an arbitrary number of clusters and without limitations on the dimensionality of the informative embedding space or the methods of constructing it.</p>
        <p>At a given iteration characterized by a general sampling Sk of the problem distribution D, the precision or resolution of the fuzzy prototype model Pf(Sk) can be characterized by the confidence factor tc, indicating the minimal confidence threshold for an association of an observation x to a prototype class Pk, and the confusion matrix M(P(x), Ptrue):</p>
        <p>
          ρ(x, pk) ≥ tc → P(x) = pk, (
          <xref ref-type="bibr" rid="ref5">5</xref>
          )
where P(x): the prototype class associated with an observation x by the fuzzy-prototype model at the confidence threshold tc; Ptrue: the true, known class associated with the observation.
        </p>
        <p>An example of the confusion matrix produced by a fuzzy prototype model for the model data of geometric images is shown below (from [9]).</p>
        <p>Confusion Matrix, S-150 Fuzzy Prototype Model, tc = 0.95
Shape, Cluster | Circle | Triangle | Background
Cluster 0 | 1.0 | 0.25 | 0.
Cluster 1 | 0. | 0.75 | 0.15</p>
        <sec id="sec-4-2-2">
          <p>The process of construction of fuzzy-prototype models was applied with three samplings of the model problem data of progressively larger size, as described earlier in this section: S-90, S-150, and S-300.</p>
          <p>As can be seen in Table 2, where the precision characteristics of fuzzy-prototype models
constructed with the samplings at two confidence thresholds, tc = 0.8, 0.9 are given, fuzzy-prototype
models obtained with iteratively extended samplings via the process shown in Figure 2 showed
progressive improvement in the precision/resolution of the prototype classes.</p>
          <p>The precision/resolution used to measure the performance of the models was calculated from the confusion matrix M of the model as a pair (tuple) (a, c) of:
1. The accuracy, a, measured as the sum of the diagonal (correct) predictions of the prototype class, divided by the number of classes: a = (1/n) Σi Mii.
2. The confusion, c, measured as the sum of the non-diagonal (incorrect) predictions of the prototype class, divided by the number of combinations of classes: c = (1/(n(n − 1))) Σi≠j Mij.
Table 2. Precision/resolution in Iterative Learning with Model Data, Accuracy/Confusion Metrics
Sampling | Size (per type) | Precision (a/c), tc = 0.8 | Precision (a/c), tc = 0.9
S-90 | 30 | 0.811/0.082 | …
S-150 | 50 | 0.867/0.067 | …
S-300 | 100 | … | …</p>
        </sec>
        <sec id="sec-4-2-3">
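          <p>The pair (a, c) can be computed directly from a confusion matrix, as a sketch of the two formulas above; the 3×3 matrix below is synthetic, chosen only to illustrate the calculation.</p>

```python
# Accuracy/confusion pair: a = (1/n) * sum_i M_ii,
# c = (1/(n(n-1))) * sum_{i != j} M_ij, from a confusion matrix M.
def precision_pair(M):
    n = len(M)
    a = sum(M[i][i] for i in range(n)) / n
    c = sum(M[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return a, c

M = [
    [0.9, 0.1, 0.0],   # rows: predicted prototype class
    [0.0, 0.8, 0.2],   # columns: true class
    [0.1, 0.0, 0.9],
]
a, c = precision_pair(M)
print(round(a, 3), round(c, 3))  # 0.867 0.067
```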
          <p>It can be noted in conclusion of this section that the stability of the structure and content of clusters in the embedding space is a key dependency and a requirement for confident determination of the fuzzy prototype structure. The use of an ensemble of generative neural models assured that the cluster/prototype structure resolved by the described method reflected characteristic patterns in the model problem data.</p>
          <p>The results of this experiment show that the precision of association of observations to prototype classes determined via the process described in this work improves steadily with the accumulation of new data, even without a dependency on the known association between the observations and their true, externally known class. This knowledge was used in this work only for verification of the performance of the models.</p>
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>Application of fuzzy-prototype models in intelligent decision systems in the constrained prior context</title>
        <p>The method of construction of fuzzy-prototype models developed in this study has shown the potential of progressive improvement in the precision of association of the data points in the problem data space with characteristic types, concepts or prototypes. Importantly, the fuzzy-prototype structure can be initially resolved with smaller samplings, and the method allows data to be accumulated in the process of learning/operation with progressively improving precision, resulting in higher effectiveness of the decisions associated with prototypes. A word of caution that needs to be said here is that the data used to demonstrate the method was relatively simple with respect to its conceptual content (i.e., the number of distinct types), and more complex types/spaces of problem data may require samplings of larger size for confident determination of the prototype structure.</p>
        <p>Another interesting potential of the proposed method concerns observations (i.e., data points
in the problem data space) that produce less confident, “confused” associations to prototype classes.
In such cases, if the decision space and process allow it, a constructed decision that is a mixture of
the “pure” decisions associated with the prototype classes can be more effective than any of the pure
decisions. Consider an example with a prototype structure of three classes p1 – p3 and an
observation xm with memberships μ(xm, p1) = 0.4, μ(xm, p2) = 0.3, μ(xm, p3) = 0.3. Then the
decision created as a combination of the pure decisions d1 – d3 associated with the prototype
classes, for example as:</p>
        <p>d(xm) = Σk μ(xm, pk) dk,</p>
        <p>can, if the decision space/process allows such mixing, be more effective than any of the pure
decisions di(pi) associated with the prototype classes.</p>
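        <p>The mixed decision above can be sketched as follows, assuming the pure decisions can be represented as numeric vectors that the decision space permits combining linearly; the decision vectors d1 – d3 below are hypothetical, chosen only to illustrate the weighting:</p>

```python
def mixed_decision(memberships, pure_decisions):
    """Combine pure prototype decisions d_k weighted by fuzzy memberships:
    d(x) = sum_k mu(x, p_k) * d_k."""
    size = len(pure_decisions[0])
    return [sum(mu * d[i] for mu, d in zip(memberships, pure_decisions))
            for i in range(size)]

# Memberships from the example in the text: mu = (0.4, 0.3, 0.3)
mu = [0.4, 0.3, 0.3]
d = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hypothetical pure decision vectors
print(mixed_decision(mu, d))  # close to [0.55, 0.45]
```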
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions</title>
      <p>In this work, we approached the problem of learning with a constrained prior for intelligent
decision making by combining results from several thriving fields of data science and research in
intelligent systems: unsupervised and/or self-supervised learning, generative learning,
prototype/concept learning, and unsupervised clustering, including density and fuzzy clustering,
with iterative aggregation of samplings of the problem data space. The aim was to outline a
direction in which intelligent systems can begin learning complex data spaces with minimal data,
improving the quality of their decisions in the process of learning. As outlined in the review section,
these results connect with, and are supported by, reported results in these fields. Several concluding
comments need to be added to this summary of the findings of this study.</p>
      <p>First, while one can expect the outlined approach to be sufficiently general to accommodate a
broad range of realistic complex data, the specific characteristics and implementations of the
method suitable for different types of data/problem spaces can vary. These include the methods of
producing informative embeddings, the characteristics of the embeddings (including
dimensionality), the methods of clustering/fuzzy conceptualization, the minimal sampling size, the
number and pace of learning iterations, and other essential characteristics.</p>
      <p>Another essential point, mostly left out of the scope of our discussion, is verification of the
quality and effectiveness of the learning process and of the decisions produced by the learning
system. A verification loop can be integrated into the learning process: batches of empirically
verified samples with known outcomes can be collected in each learning iteration and used to
evaluate the quality and progress of learning. This essential function of the system will be examined
in a future work.</p>
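      <p>Such a verification loop can be sketched as follows; the classify and fetch_verified_batch callables, and the toy data in the usage example, are placeholders for the actual learning system, not part of the method examined in this work:</p>

```python
def verification_loop(iterations, classify, fetch_verified_batch):
    """Track the quality of learning across iterations.

    classify(x)             -- the system's current association of x to a class
    fetch_verified_batch(i) -- [(x, known_outcome), ...] verified empirically
    Returns the per-iteration precision on the verified batches.
    """
    history = []
    for i in range(iterations):
        batch = fetch_verified_batch(i)
        correct = sum(1 for x, y in batch if classify(x) == y)
        history.append(correct / len(batch))
    return history

# Toy stand-ins: classify numbers by sign against known labels
batch = [(2, 1), (-1, 0), (3, 1), (-2, 0)]
print(verification_loop(3, lambda x: int(x > 0), lambda i: batch))  # [1.0, 1.0, 1.0]
```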
      <p>A close and seemingly natural connection was noted between the process of learning the
conceptual structure in general problem data and the ability of intelligent systems to construct
effective decisions and responses based on the relationship of essential similarity in the input data
space. In that way, a decision learned and verified once can be applied to a whole class of inputs in
the problem data space, improving the generality, effectiveness and efficiency of the learning
process. An interesting direction of research that can be pursued in another work is the potential to
construct fuzzy or “hybrid” decisions for observations with less certain association to
concept/prototype classes, as discussed briefly in Section 3.4.</p>
      <p>The problem of learning with constrained prior data described and examined here, in which
sufficiently large volumes of empirically verified knowledge about the problem distribution, as
required by conventional learning approaches, have not been accumulated, emerges on a regular
basis in modern science and technology. We expect that the methods proposed and examined in this
work will be instrumental, and will find application, in the theory and practice of developing
effective and efficient intelligent decision systems capable of working with strongly constrained
problems and environments.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>The authors have not employed any Generative AI tools in the preparation of this work.</p>
    </sec>
    <sec id="sec-7">
      <title>References</title>
      <p>[9] Y. Shen, T. Chen, Z. Xiao, B. Liu, and Y. Chen, High-dimensional data clustering with fuzzy C-means: Problem, reason, and solution, in: Lecture Notes in Computer Science, vol. 12861, I. Rojas, G. Joya, and A. Català, Eds. Cham: Springer, 2021, pp. 89–100.</p>
      <p>[10] N. A. Setiawan, P. A. Venkatachalam, and A. F. M. Hani, Diagnosis of Coronary Artery Disease using Artificial Intelligence based decision support system, 2020, arXiv:2007.02854.</p>
      <p>[11] G. Marín Díaz, R. Gómez Medina, and J. A. Aijón Jiménez, Integrating fuzzy C-means clustering and Explainable AI for robust galaxy classification, Mathematics, vol. 12, no. 18, p. 2797, 2024. doi:10.3390/math12182797.</p>
      <p>[12] I. Izonin, R. Tkachenko, I. Dronuyk et al., Predictive modeling based on small data in clinical medicine: RBF-based additive input-doubling method, Mathematical Biosciences and Engineering, 18 (3) (2021) 2599–2613.</p>
      <p>[13] C. Ylenia, D. L. Chiara, I. Giovanni, R. Lucia, A Clinical Decision Support System based on fuzzy rules and classification algorithms for monitoring the physiological parameters of type-2 diabetic patients, Mathematical Biosciences and Engineering, 18 (3) (2021) 2687–2708. URL: https://doi.org/10.3934/mbe.2021135.</p>
      <p>[14] W. Mao and K. Xu, Enhancement of the classification performance of fuzzy C-means through uncertainty reduction with cloud model interpolation, Mathematics, vol. 12, no. 7, p. 975, 2024.</p>
      <p>[15] X. Gu, M. Li, L. Shen, G. Tang, Q. Ni et al., Multiobjective evolutionary optimization for prototype-based fuzzy classifiers, IEEE Transactions on Fuzzy Systems, 31 (5) (2023) 1703–1715.</p>
      <p>[16] L. Makarova and M. Tatarenko, Comparative Analysis of Methods and Models for Estimating Size and Effort in Designing Mobile Applications, Computer Systems and Information Technologies, no. 3, pp. 26–33, 2024. doi:10.31891/csit-2024-3-4.</p>
      <p>[17] M. Welling, D. Kingma, An introduction to variational autoencoders, Foundations and Trends in Machine Learning, 12 (4) (2019) 307–392.</p>
      <p>[18] S. Dolgikh, Modeling of small data with unsupervised generative ensemble learning, in: Proceedings of the 5th International Conference on Informatics and Data-Driven Medicine (IDDM-2022), Lyon, France, 2022, CEUR-WS.org, vol. 3302, pp. 35–45.</p>
      <p>[19] L. Gao, D. Wang, L. Zhuang, X. Sun, M. Huang, and A. Plaza, BS3LNet: A new blind-spot self-supervised learning network for hyperspectral anomaly detection, IEEE Trans. Geosci. Remote Sens., vol. 61, art. no. 5504218, pp. 1–18, 2023. doi:10.1109/TGRS.2023.3246565.</p>
      <p>[20] R. N. D'souza, P. Y. Huang, F. C. Yeh, Structural analysis and optimization of Convolutional Neural Networks with a small sample size, Scientific Reports 10 (2020) 834.</p>
      <p>[21] A. Wu, W. Ge, and W.-S. Zheng, Rewarded Semi-Supervised Re-Identification on Identities Rarely Crossing Camera Views, IEEE Trans. Pattern Anal. Mach. Intell., vol. 45, no. 12, pp. 15512–15529, Dec. 2023. doi:10.1109/TPAMI.2023.3292936.</p>
      <p>[22] S. Kumar, P. Kaur, A. Gosain, A comprehensive survey on ensemble methods, in: Proceedings of the 2022 IEEE 7th International Conference for Convergence in Technology (I2CT), Mumbai, India, 2022, pp. 1–7.</p>
      <p>[23] J. Yan and X. Wang, Unsupervised and semi-supervised learning: The next frontier in machine learning for plant systems biology, Plant J., vol. 111, no. 2, pp. 301–314, 2022. doi:10.1111/tpj.15905.</p>
      <p>[24] P. Bhattacharjee, P. Mitra, A survey of density based clustering algorithms, Frontiers of Computer Science 15 (2021) 151308.</p>
      <p>[25] D. R. Hunter, Unsupervised clustering using nonparametric finite mixture models, WIREs Computational Statistics 16 (1) (2024) e1632.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>J. E. Van Engelen and H. H.</given-names>
            <surname>Hoos</surname>
          </string-name>
          ,
          <article-title>A survey on semi-supervised learning</article-title>
          ,
          <source>Machine Learning</source>
          , vol.
          <volume>109</volume>
          , no.
          <issue>2</issue>
          , pp.
          <fpage>373</fpage>
          -
          <lpage>440</lpage>
          ,
          <year>2020</year>
          . doi:10.1007/s10994-019-05855-6.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>L.</given-names>
            <surname>Von Rueden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mayer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Beck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Georgiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Giesselbach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Heese</surname>
          </string-name>
          , et al.,
          <article-title>Informed machine learning - A taxonomy and survey of integrating prior knowledge into learning systems</article-title>
          ,
          <source>IEEE Trans. Knowl. Data Eng.</source>
          ,
          <year>2021</year>
          . doi:10.1109/TKDE.2021.3079836.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rabasovic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Pavlovic</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Sevic</surname>
          </string-name>
          ,
          <article-title>Analysis of laser ablation spectral data using dimensionality reduction techniques: PCA, t-SNE and UMAP</article-title>
          ,
          <source>Contrib. Astron. Obs. Skalnaté Pleso</source>
          , vol.
          <volume>53</volume>
          ,
          <year>2023</year>
          . doi:10.31577/caosp.2023.53.3.51.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kracker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Garcke</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Schumacher</surname>
          </string-name>
          ,
          <article-title>Automatic analysis of crash simulations with dimensionality reduction algorithms such as PCA and t-SNE</article-title>
          ,
          <source>in Proc. 16th Int. LS-DYNA Forum</source>
          ,
          <year>2020</year>
          . URL: https://lsdyna.ansys.com/wp-content/uploads/attachments/t1-1-a-automotive011.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>T.</given-names>
            <surname>Isaiev</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Kysil</surname>
          </string-name>
          ,
          <article-title>Method of Creating Custom Dataset to Train Convolutional Neural Network</article-title>
          ,
          <source>Computer Systems and Information Technologies</source>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>37</fpage>
          -
          <lpage>44</lpage>
          , Dec.
          <year>2024</year>
          . doi:10.31891/csit-2024-4-5.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>S.</given-names>
            <surname>Dolgikh</surname>
          </string-name>
          ,
          <article-title>Topology of conceptual representations in unsupervised generative models</article-title>
          ,
          <source>in Proc. 26th Int. Conf. Information Society and University Studies (IVUS-2021)</source>
          , Kaunas, Lithuania, CEUR-WS.org, vol.
          <volume>2915</volume>
          , pp.
          <fpage>150</fpage>
          -
          <lpage>157</lpage>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Meilă</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Manifold learning: What, how, and why</article-title>
          ,
          <source>Annu. Rev. Stat. Appl.</source>
          , vol.
          <volume>11</volume>
          , pp.
          <fpage>263</fpage>
          -
          <lpage>290</lpage>
          ,
          <year>2024</year>
          . doi:10.1146/annurev-statistics-040522-115238.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shaposhnik</surname>
          </string-name>
          ,
          <article-title>Understanding how dimension reduction tools work: An empirical approach to deciphering t-SNE, UMAP, TriMAP, and PaCMAP for data visualization</article-title>
          ,
          <source>J. Mach. Learn. Res.</source>
          , vol.
          <volume>22</volume>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>73</lpage>
          ,
          <year>2021</year>
          . URL: https://doi.org/10.48550/arXiv.2012.04456.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>