<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>International Journal of Image, Graphics and Signal Processing</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.5815/ijigsp.2019.04.05</article-id>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Bratislava University of Economics and Management</institution>
          ,
          <addr-line>Furdekova str. 16, Bratislava</addr-line>
          ,
          <country>Slovak Republic</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Technical University “Kharkiv Polytechnic Institute”</institution>
          ,
          <addr-line>Kyrpychova str. 2, Kharkiv, 61002</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <contrib contrib-type="author">
          <name>
            <surname>Cherednichenko</surname>
            <given-names>Olga</given-names>
          </name>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <volume>69</volume>
      <issue>4</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Assessing the edibility of food based on consumer perception remains an underexplored yet practically significant challenge in food safety. This paper presents a novel framework for evaluating food suitability using natural language descriptions of sensory experiences, such as odor, appearance, and texture. By extracting structured features from unstructured, subjective input, our system leverages a comparator-based identification approach to infer missing attributes and assess overall edibility. The model aligns incomplete descriptions with prototypical instances from labeled data, enabling robust classification even under uncertainty. We demonstrate that this method can support nuanced, human-like judgments and serve as a foundation for intelligent decision-support tools in consumer and public health contexts. The proposed framework opens avenues for integrating qualitative perception with structured inference in critical application domains.</p>
      </abstract>
      <kwd-group>
        <kwd>natural language processing</kwd>
        <kwd>food safety</kwd>
        <kwd>comparator-based identification</kwd>
        <kwd>feature modeling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Ensuring the safety and suitability of food products for consumption is a persistent and globally
relevant challenge. While traditional methods of food safety assessment rely on laboratory analysis,
expiration labeling, and visual inspection, in everyday settings, consumers often base their decisions
on informal sensory evaluations - descriptions of smell, texture, taste, and appearance. These
evaluations are typically articulated in natural language and are inherently subjective, imprecise,
and often incomplete. Yet, they represent a rich source of information that, if properly structured
and interpreted, could inform intelligent systems capable of estimating food edibility.</p>
      <p>This paper introduces a conceptual and technical framework for assessing food edibility based on
free-form user descriptions. The approach rests on modeling these descriptions as partial
observations of an underlying, structured feature space - comprising attributes such as odor type,
surface texture, discoloration, moisture level, and taste anomalies. By employing techniques from
natural language processing, key indicators are extracted and normalized into a set of interpretable
features. To overcome the limitations of incomplete data, we propose a comparator-based
identification method, which allows for the inference of missing attributes by aligning observed
feature subsets with prototypical examples of known edibility status.</p>
      <p>This method situates food products within a latent comparative space, where similarities to
known spoiled or safe instances provide a probabilistic basis for prediction. Rather than relying
solely on absolute rules or fully observed inputs, the system can generalize from experience and
provide nuanced judgments even in uncertain or borderline cases. The proposed framework not only
addresses a practical consumer need but also contributes a novel perspective to the modeling of
qualitative, perception-based descriptions in safety-critical domains.</p>
      <p>The goal of this research is to introduce and evaluate a Comparator-Based Identification
framework that infers food edibility by analyzing free-text sensory descriptions, and to demonstrate its
utility as an interpretable decision-support tool in food safety.</p>
      <p>This research addresses the following key questions:</p>
      <p>How can logical rules for recognizing the edibility of food be formalized as predicate
structures based on observable characteristics?</p>
      <p>How effective is the comparator model at classifying food according to its sensory
characteristics, compared to traditional ML models?</p>
      <p>How interpretable are comparator solutions?</p>
      <p>By answering these questions, we aim to develop a Comparator-Based Identification Framework
that includes a mapping from subjective natural language inputs (odor, texture, color, etc.) to structured
feature representations; a comparator mechanism for assessing similarity; and a
demonstration of the framework's utility via case studies.</p>
      <p>Determining food edibility from external features is a typical binary
classification task. Traditional machine learning methods solve it by training a classifier, whereas the
comparator identification method [1] formulates the recognition process as
identification by comparison. The idea is that a new or unknown food item is identified not by direct
determination of its species, but by comparison with already known edible and dangerous
samples. The method of comparator identification is based solely on the analysis of physically
observable features of an object and the identification of patterns in the form of logical conditions.</p>
      <p>In this paper, we consider mushrooms as an example of food to identify their edibility. We
formalize the mushroom features as sensory descriptions, build a model based on pairwise
comparisons of mushrooms, describe the comparator structure, present a logical scheme for
identifying edibility without directly specifying the mushroom species, and show how the decisions
of the comparator system relate to the binary classification task.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Review of related works</title>
      <p>Recent advances in product classification increasingly leverage human perception and textual
descriptions by integrating multi-modal learning, personalized recommendation systems, and
graph-based representations. Multi-modal approaches, as demonstrated by combining textual [2] and visual
features [3] through neural networks and fusion techniques, show superior performance by
compensating for limitations within individual data modalities. Meanwhile, personalized
recommendation systems employing models like BERT and nearest neighbor algorithms address
individual preferences in e-commerce environments, enhancing user satisfaction while tackling
scalability and cold-start challenges [4]. Simultaneously, text-attributed graphs and frameworks like
P2TAG enable few-shot classification by fusing raw textual and structural graph information,
significantly boosting classification accuracy [5]. Thus, these developments highlight the shift
toward more adaptive, perception-driven classification models that capture nuanced human
understanding of product categories.</p>
      <p>Research combining food safety and natural language understanding is limited, with most
NLP-for-food work focusing on structured inputs like labels and recipes to predict nutrition or categorize
products [6, 7]. These approaches show high performance but assume clean, standardized data, not
subjective or incomplete sensory descriptions. While food safety analytics use NLP for recall tracking
or risk modeling [8], they rarely treat user-reported sensory input as core data for inference.</p>
      <p>A closer parallel is mushroom edibility classification using structured features like odor and color
[9], but these models rely on complete attribute sets and lack mechanisms for recovering missing
data or aligning partial input with prototypes. Some works propose a comparator-based approach
inspired by prototype/metric learning [10, 11, 12], comparative reasoning in NLP [13], and Bayesian
Case Models [14], which support inference from partial, subjective text and generate interpretable,
probabilistic edibility assessments.</p>
      <p>Despite advances in large-scale language models (LLMs), their performance on Named Entity
Recognition (NER) still lags behind supervised methods due to the intrinsic mismatch—NER is a
sequence labeling task, while LLMs are optimized for generation. GPT-NER addresses this by
reframing NER as a generation task using special entity-marking tokens, and incorporates a
self-verification strategy to counter hallucinations common in LLMs [15]. Notably, GPT-NER performs
comparably to supervised models across five benchmarks and excels in few-shot settings,
highlighting its potential for real-world, low-resource applications. In parallel, NER plays a key role
in processing domain-specific information such as aeronautical intelligence, where challenges
include semantic ambiguity, data-sharing opacity, and lack of standardization. A recent survey
explores how NER can support this domain, highlighting the roles of aviation-specific ontologies,
knowledge systems, and thematic databases while identifying future research directions [16].</p>
      <p>While many studies in NER focus on model architectures and training strategies, comprehensive
evaluation across genres and entity types remains underexplored. One study conducts extensive
testing on varied and adversarial test sets to assess the robustness of three state-of-the-art models,
proposing improved reporting practices to better reflect real-world performance [17]. Another
growing research area is nested NER, which addresses cases where entities overlap or are embedded
within each other—issues that standard flat NER models often ignore. A review categorizes nested
NER models (e.g., rule-based, hypergraph-based) and examines challenges such as error propagation
and entity dependency, offering guidance for both researchers and practitioners [18]. To support
multilingual NER development, the Universal NER (UNER) project presents gold-standard datasets
across 12 languages with consistent annotations, facilitating cross-lingual research and providing
publicly available baselines and tools [19].</p>
      <p>While recent advancements in multi-modal learning, personalized recommendation systems, and
graph-based methods have significantly improved product classification by incorporating human
perception and textual descriptions, several challenges remain. First, accurately classifying products
from subjective, incomplete, or noisy descriptions continues to be a critical issue, especially in
domains like food safety where user sensory narratives are underutilized for inference. Second,
Named Entity Recognition (NER), despite being a mature NLP task, still faces limitations in handling
domain-specific data, nested structures, and cross-genre robustness, particularly when adapting
large language models originally designed for generation tasks. Third, identifying significant
indicators within product descriptions—especially from partial or sensory-based language—requires
new methods that can infer missing attributes and align unstructured input with meaningful,
interpretable prototypes. Addressing these issues will be key to developing robust, user-aware
systems capable of understanding and classifying products in complex, real-world scenarios.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and Methods</title>
      <sec id="sec-3-1">
        <title>3.1. Food edibility feature extraction from natural language description</title>
        <p>In real-world applications of food identification – particularly in unstructured settings such as
foraging or home inspection – users often provide descriptions of food items in natural language,
rather than structured categorical forms. To integrate such inputs into a comparator-based
identification framework, it is necessary to extract structured features from textual descriptions. In
our case, these features represent perceptual and contextual properties relevant to food
safety – such as color, shape, odor, texture, bruising, and presence of specific anatomical structures
(e.g., gills or rings in mushrooms).</p>
        <p>We approach this task as a rule-based information extraction problem, mapping linguistic cues
to categorical variables x_i ∈ D_i, where each D_i is a finite set of permissible values for the
i-th feature. For instance, the sentence “The cap is flat and smooth with a brownish tint” yields the
feature assignments x_cap-shape = flat, x_cap-surface = smooth, and x_cap-color = brown. Since user
input may omit features, use synonyms, or express ambiguity, we adopt a tolerant matching
procedure that:
recognizes synonymous terms and paraphrases using a manually constructed mapping dictionary;
allows partial filling of the feature vector x = (x_1, x_2, …, x_n);
defers decision-making in case of insufficient information.</p>
        <p>Each natural-language description is thus transformed into a partial categorical vector in the
comparator feature space, suitable for downstream metric-based comparison and classification. This
allows flexible integration of free-form descriptions into a symbolic decision pipeline without
requiring full supervision or structured data entry.</p>
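<p>For illustration only, the tolerant matching procedure described above can be sketched in Python. The synonym dictionary, feature names, and cue phrases below are assumptions made for the example, not the article's actual lexicon:</p>

```python
# Minimal sketch of rule-based feature extraction from free-text descriptions.
# Unmentioned features stay absent, so the resulting feature vector is partial
# and downstream decision-making can be deferred when information is missing.
SYNONYMS = {
    "cap_shape":   {"flat": ["flat", "plane"], "convex": ["convex", "domed"]},
    "cap_surface": {"smooth": ["smooth", "silky"], "scaly": ["scaly", "rough"]},
    "cap_color":   {"brown": ["brown", "brownish"], "white": ["white", "whitish"]},
    "odor":        {"none": ["no smell", "odorless"], "pungent": ["pungent", "acrid"]},
}

def extract_features(description: str) -> dict:
    """Map a natural-language description to a partial categorical feature vector."""
    text = description.lower()
    features = {}
    for feature, values in SYNONYMS.items():
        for value, cues in values.items():
            if any(cue in text for cue in cues):
                features[feature] = value
                break
    return features

print(extract_features("The cap is flat and smooth with a brownish tint"))
# {'cap_shape': 'flat', 'cap_surface': 'smooth', 'cap_color': 'brown'}
```

<p>A real system would replace the toy dictionary with the manually constructed mapping mentioned above and add paraphrase handling; the sketch only shows the shape of the pipeline.</p>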
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Comparator-based identification method</title>
        <p>Comparator-based identification is a symbolic classification framework grounded in the notion of
perceptual similarity between objects. Rather than relying on numerical features or statistical
models, this method compares an unknown object’s description with previously known instances
using interpretable, feature-level match predicates. Let the perceptual description of a food item be
represented as a vector x = (x_1, x_2, …, x_n), where each x_i ∈ D_i is a categorical value of the i-th
attribute (e.g., shape, color, texture, odor). The space D = D_1 × D_2 × … × D_n forms a discrete feature
space.</p>
        <p>For any pair of objects x and y we define elementary comparators k_i(x, y): binary
predicates indicating whether the two objects agree on the i-th feature. The comparator similarity
between two objects is defined as the number of agreeing features:</p>
        <p>K(x, y) = ∑_{i=1}^{n} k_i(x, y). (1)</p>
        <p>Classification is based on comparing an unknown object x to labeled reference objects from
known classes (e.g., edible or inedible).</p>
        <p>To improve both interpretability and decision reliability, we adopt a method for identifying
significant (core) features – a subset of attributes that are most informative for distinguishing
between classes. We base this step on the structural significance criterion proposed in [1], which
defines a feature i as significant if it contributes to class separation within the comparator
framework.</p>
        <p>Let E and P denote the sets of known safe and unsafe food items, respectively. Formally, feature
i is considered structurally essential if:</p>
        <p>∃ x ∈ E, y ∈ P such that k_j(x, y) = 1 ∀ j ≠ i, but k_i(x, y) = 0. (2)</p>
        <p>That is, there exist objects from opposite classes that differ only in the i-th feature, making it
decisive for classification in at least one instance.</p>
        <p>The core feature set S ⊆ {1, …, n} is defined as the minimal set for which classification accuracy
remains unchanged when only features from S are used for comparison:</p>
        <p>∀ x ∈ X, f_S(x) = f(x), (3)</p>
        <p>where f_S(x) is the comparator decision rule restricted to features in S. This reduction allows
building simpler, explainable classifiers focused on perceptually relevant attributes.</p>
        <p>By using such comparator-based principles and isolating core features, our method enables
symbolic, transparent decision-making in safety-critical applications, such as identifying food
edibility from natural language descriptions.</p>
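<p>As a minimal sketch (assuming objects are tuples of categorical values, with toy data rather than the article's dataset), the structural significance criterion above amounts to searching for opposite-class pairs that agree on every feature but one:</p>

```python
def structurally_essential(E, P):
    """Return indices of features that are decisive for at least one pair of
    opposite-class objects: the pair agrees on all features except that one."""
    n = len(E[0])
    essential = set()
    for e in E:
        for p in P:
            diffs = [i for i in range(n) if e[i] != p[i]]
            if len(diffs) == 1:      # the pair differs in exactly one feature
                essential.add(diffs[0])
    return essential

E = [("flat", "smooth", "none")]      # known safe items (toy data)
P = [("flat", "smooth", "pungent")]   # known unsafe items (toy data)
print(structurally_essential(E, P))   # {2}: odor alone separates this pair
```

<p>Computing the minimal core set S would additionally require checking that classification decisions are unchanged when restricted to the candidate features; the sketch covers only the pairwise criterion.</p>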
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Comparator model of mushroom edibility</title>
        <p>The UCI Mushroom Dataset [20] is a well-established benchmark for classification tasks involving
categorical features. It contains 8,124 labeled observations of mushrooms belonging to the Agaricus
and Lepiota families. Each observation is described by 22 nominal attributes that capture observable
features, such as cap shape and color, gill attachment, bruising, odor, ring type, and habitat (see
Figure 1). The target variable indicates whether the mushroom is edible or poisonous. Because all
features are categorical and the classes are balanced, this dataset is used to evaluate models that
handle symbolic data, feature selection strategies, and interpretable decision rules [21].</p>
        <p>We have chosen this dataset for three main reasons. First, the categorical nature of the attributes
aligns well with the logic-based framework of comparator identification, in which each feature is
treated as a finite-valued variable and encoded via logical indicators. Second, the availability of
ground truth class labels allows us to rigorously evaluate the performance of classification rules
derived from comparator principles. Third, the dataset provides a natural context for demonstrating
core feature extraction and decision strategies.</p>
        <p>We will describe each mushroom by a set of observable features – a sensory description. In the
UCI Mushroom dataset, each specimen is assigned such features as cap shape and color, cap surface,
presence of spots ("bruises"), odor, characteristics of the gills (their attachment, spacing, size, color),
stem shape and structure (presence of a ring, its number and type, thickness/shape of the stem, color
of the stem above and below the ring), color of the spore print, growing environment and population,
etc. Formally, each mushroom x is matched with a vector of features:</p>
        <p>x = (x_1, x_2, …, x_n), (4)</p>
        <p>where x_i is the value of the i-th feature. For each feature, a finite set D_i of allowed values is
defined (e.g., cap color x_3 ∈ D_3 = {red, brown, white, …}). Thus, the space of mushroom
descriptions can be represented as the Cartesian product D = D_1 × D_2 × … × D_n. This space is a
vector feature space in terms of comparator identification [1]. The vector x contains all available
information about the mushroom obtained through the observer's "sensors": sight (color, shape),
smell (odor), touch (surface texture), and others. It is essential that edibility is not a direct
sensory attribute – it cannot be observed directly. It must be established indirectly, by comparing the
perceptual attributes of an unknown mushroom with those of known edible or poisonous
mushrooms.</p>
        <p>Each component of the sensory description can be interpreted as the result of a measurement or
perception: e.g., x_cap-color is the color of the cap as registered by sight; x_odor is the categorical value of
the odor (almond, unpleasant, absent, etc.) as perceived by the sense of smell; x_ring-type is the
presence/type of ring on the stalk as determined visually or by touch, etc. Thus, the sensory
description provides a necessary and sufficient set of inputs for mushroom identification using the
comparator.</p>
        <p>In the comparator identification method, the key role is played by the operation of comparing
two objects by their features. Consider two mushrooms with descriptions x = (x_1, x_2, …, x_n) and
y = (y_1, y_2, …, y_n). For each feature i, let us define an elementary comparator as the feature
matching predicate:</p>
        <p>k_i(x, y) = 1 if x_i = y_i, and 0 if x_i ≠ y_i, (5)</p>
        <p>where x_i and y_i are the values of the i-th feature in objects x and y. The predicate k_i(x, y)
indicates whether the objects are comparable in terms of the i-th feature, i.e., whether the observed
feature is the same for them. For example, k_cap-color(x, y) = 1 if two mushrooms have the same cap
color; k_odor(x, y) = 1 if their smell belongs to the same category; k_ring(x, y) = 1 if either both have
a ring of the same type or both do not, etc. In the special case, if k_i(x, y) = 1
∀ i ∈ {1, …, n}, the two mushrooms have identical sensory descriptions (they match on all features).</p>
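<p>A minimal sketch of the elementary comparator and the resulting match count, assuming descriptions are simple tuples of categorical values (the example values are illustrative, not drawn from the dataset):</p>

```python
def k(i, x, y):
    """Elementary comparator: 1 if objects x and y agree on feature i, else 0."""
    return 1 if x[i] == y[i] else 0

def K(x, y):
    """Comparator similarity: the number of features on which x and y match."""
    return sum(k(i, x, y) for i in range(len(x)))

a = ("convex", "smooth", "brown", "none")
b = ("convex", "scaly", "brown", "none")
print(K(a, b))  # 3: the descriptions agree on shape, color, and odor
```
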
        <p>Not all features are equally informative for determining edibility, so comparisons emphasize those
traits that correlate with the edible/poisonous class. Some features may be indistinguishable for the
purpose of identifying edibility. For example, the veil-type attribute in the dataset takes the same
value for all mushrooms, so a comparison based on this attribute provides no useful
information (it is always the same and does not influence the decision). At the same time, the odor
and spore-color attributes are extremely important: it is known that certain odor values are found only
in poisonous mushrooms. Thus, in a comparator analysis of features, we can divide them into:
diagnostic attributes, which are critical and whose difference directly indicates class (e.g., the
presence of an acrid/chemical odor virtually guarantees poisonousness);
minor attributes, on which edible and poisonous mushrooms may overlap, and whose
coincidence/difference affects the inference only in combination with other attributes;
neutral attributes, which have little influence on the inference of edibility.</p>
        <p>Formalizing a statement of the form "mushroom x looks like an edible mushroom", we can
introduce a similarity measure based on a set of pairwise comparisons. One approach is to count the
number of matching features between x and some known edible mushroom e. Let us denote by
S(x, e) the number of matches:</p>
        <p>S(x, e) = ∑_{i=1}^{n} [x_i = e_i], (6)</p>
        <p>where [·] is the truth indicator of the matching condition. Then S(x, e) = n means that the
descriptions of x and e completely match. A mushroom x can be called "similar" to e if S(x, e) is
large, i.e., the objects coincide in most of the key features. In the limiting case we can introduce a
threshold m: consider x similar to e if S(x, e) ≥ m. In other words, we introduce a binary similarity
predicate Sim_m(x, e) – "x is similar to e in at least m features". This predicate is a composition of
individual comparisons by attributes: Sim_m(x, e) is true if enough individual k_i(x, e) for important i
are true. For example, the statement "this mushroom is similar to the edible species Agaricus" can be
interpreted as: this mushroom has the same cap shape, gill color, lack of odor, and presence of a
ring as some reference edible champignon, i.e., the corresponding k_i for these characteristic features
are satisfied.</p>
        <p>It should be noted that the target "edibility" itself is not part of the sensory description and is not
directly involved in the comparison - it is the one we want to define. Therefore, a comparator
conclusion about edibility can only be made indirectly, through comparison of the other attributes
with already studied mushrooms whose edibility is known.</p>
        <p>The space of descriptions D can be endowed with the structure of a metric space for quantifying
the similarity of mushrooms. One natural variant of the metric is the Hamming metric [23, 24],
defined as the number of differing features:</p>
        <p>d(x, y) = ∑_{i=1}^{n} (1 − k_i(x, y)) = n − K(x, y). (7)</p>
        <p>Such a metric d(x, y) is 0 if the mushrooms x and y have identical descriptions, and increases
by 1 for each feature in which they differ. Proximity (similarity) can be defined through the metric: the
smaller d(x, y) is, the more "similar" the mushrooms are. By introducing a threshold t ≥ 0, we can
define the binary relation:</p>
        <p>R_t ⊆ D × D: (x, y) ∈ R_t ⟺ d(x, y) ≤ t. (8)</p>
        <p>Given t = 0, the relation expresses exactly the identity of the descriptions. For t &gt; 0 the
relation becomes a relation of t-similarity: the mushrooms x and y differ in no more than t features.
It is clear that R_0 is an equivalence relation on the set of objects (partitioning the space D into classes
of identical descriptions), while R_t may not be transitive for t &gt; 0, but defines neighborhoods
(clusters) of similar objects.</p>
        <p>In the language of predicates, we can define the corresponding similarity predicates:</p>
        <p>S_t(x, y) = 1 ⟺ d(x, y) ≤ t. (9)</p>
        <p>In particular, if t = 0:</p>
        <p>S_0(x, y) ⟺ ∧_{i=1}^{n} k_i(x, y), (10)</p>
        <p>i.e., complete matching of descriptions. Similarity predicates allow us to formalize statements of the
form "object x belongs to the same class as object y". In the classical theory of comparator
identification, this corresponds to the notion of an equivalence predicate of object identity.</p>
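<p>As a small illustration (with made-up two-feature descriptions), the Hamming metric and the t-similarity relation can be sketched as follows; note how the relation is reflexive and symmetric but, for t &gt; 0, not transitive:</p>

```python
def d(x, y):
    """Hamming metric: the number of features on which x and y differ."""
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

def R(x, y, t):
    """t-similarity relation: x and y differ in at most t features."""
    return d(x, y) <= t

a, b, c = ("convex", "none"), ("convex", "almond"), ("flat", "almond")
# a~b and b~c hold at t=1, yet a~c fails: the relation is not transitive.
print(R(a, b, 1), R(b, c, 1), R(a, c, 1))  # True True False
```
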
        <p>In other words, within the framework of our problem we can say that edible mushrooms form
one equivalence class (by the relation "having the same edibility status", defined through the
similarity of key properties), and poisonous mushrooms form another. The identification task boils
down to determining which of these two equivalence classes the unknown mushroom belongs to.</p>
        <p>The mechanism of decision making in the comparator structure relies on a set of predicates of
similarity with standard samples (references). Suppose we have a set of known edible mushrooms
E = (e_1, e_2, …, e_m) and poisonous mushrooms P = (p_1, p_2, …, p_k) (these samples can be
considered as a training sample or expert knowledge). The class of the unknown mushroom x is decided
based on analyzing d(x, e) and d(x, p) – the distances to the known benchmarks of both classes. Formally,
two distance functions can be introduced:</p>
        <p>ρ_E(x) = min_{e ∈ E} d(x, e), ρ_P(x) = min_{p ∈ P} d(x, p), (11)</p>
        <p>i.e., the distance from x to the nearest edible and nearest poisonous specimen, respectively. The
classification rule is then given as:</p>
        <p>class(x) = edible if ρ_E(x) &lt; ρ_P(x); poisonous if ρ_P(x) &lt; ρ_E(x); otherwise
– “undefined”.</p>
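<p>The nearest-benchmark rule above can be sketched in a few lines (the reference tuples are toy data, not specimens from the dataset):</p>

```python
def hamming(x, y):
    """Number of features on which two descriptions differ."""
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

def classify(x, E, P):
    """Nearest-benchmark rule: the class of the closest reference wins;
    ties are left 'undefined' so that additional features can be requested."""
    d_edible = min(hamming(x, e) for e in E)
    d_poison = min(hamming(x, p) for p in P)
    if d_edible < d_poison:
        return "edible"
    if d_poison < d_edible:
        return "poisonous"
    return "undefined"

E = [("convex", "smooth", "none")]      # toy edible references
P = [("convex", "scaly", "pungent")]    # toy poisonous references
print(classify(("flat", "smooth", "none"), E, P))  # edible
```
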
        <p>In the case of equal distances, either a finer criterion can be applied or the decision is postponed
pending additional features. Thus, the decision is made in favor of the class whose benchmarks
the mushroom x is closest to in the feature space. This mechanism is a formalization of the
principle "an unknown object belongs to the same class as the most similar known object". The
comparator identification method in effect assigns objects to classes based on their similarity to
representatives of these classes, dividing the set of objects into equivalence classes automatically.</p>
        <p>It is important to emphasize that when t = 0 (the requirement of complete coincidence of
descriptions), the rule reduces to an exact match with a reference: an unknown mushroom is
classified as edible if at least one known edible mushroom with an identical set of features is found
(otherwise, a poisonous match is checked for). However, in real conditions, new combinations of features
that have not been encountered before are possible. Then we have to rely on t &gt; 0, i.e., we have to allow
partial coincidence. The metric-based decision mechanism naturally takes partial matches into
account: even if there is no exact analog in memory, the mushroom will be assigned to the class
whose sample is most similar (minimum distance). This approach is robust to feature variation and
noise in the data, as it does not require a perfect match but uses a proximity measure.</p>
        <p>The above mechanism can also be described as a logical scheme based on comparison predicates.
The logical model of identification represents the solution as inference from sets of conditions.
In the simplest case, for t = 0, we can write the logical expression for the mushroom x belonging to
the edible class as a disjunction of conjunctions reflecting the match with each edible reference:</p>
        <p>F_edible(x) = ∨_{e ∈ E} ∧_{i=1}^{n} k_i(x, e), (13)</p>
        <p>where each k_i(x, e) means "mushroom x matches in feature i with known edible mushroom e":</p>
        <p>k_i(x, e) = 1 if x_i = e_i, and 0 if x_i ≠ e_i. (14)</p>
        <p>Similarly, we can define the formula F_poisonous(x) in terms of known poisonous mushrooms. If
F_edible(x) = 1, the mushroom is identified as edible; if F_poisonous(x) = 1, it is identified as
poisonous. In the case when neither formula evaluates to 1 (i.e., there is no complete match with
any of the benchmarks), a fuzzy or stepwise logical solution is used. For example, it is possible to
check conditions in descending order of their diagnostic significance:</p>
        <p>Step 1: Check for signs clearly indicating poisonousness. If there is a feature i whose observed values
never occur in edible mushrooms (but do occur in poisonous ones), and x_i has just such a value,
immediately classify the mushroom as poisonous (a decision without further doubt). For example, if
x_odor = "pungent" or x_odor = "fishy", then the mushroom is definitely poisonous (in the UCI
Mushroom dataset, all mushrooms with a pungent or fishy odor are poisonous).</p>
        <p>Step 2: If no obvious poisonous features are found, check for characteristic combinations of
features of edible mushrooms. For example, for some edible mushrooms the combination
k_cap(x, e) ∧ k_odor(x, e) ∧ k_ring(x, e) may be typical for some reference e. In other
words, if a mushroom x satisfies most of the conditions characteristic of a certain edible species (or
group of species), then a reasonable conclusion can be drawn about its edibility.</p>
        <p>Step 3: If doubts remain (there are both edible traits and uncharacteristic abnormalities), a more
refined analysis is performed: comparison with the closest edible and poisonous references (e.g., by
the d(x, y) metric as described above) and analysis of which differences prevent unambiguous
identification. Additional information or an expert may need to be brought in at this step. In logical
terms, step 3 corresponds to evaluating the truth of similarity predicates at some t &gt; 0 and selecting
a class based on the maximum number of fulfilled predicates S_t with the benchmarks of each class.</p>
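<p>The three steps can be sketched as a rule cascade. This is an illustrative toy: the feature names, the decisive odor set, and the threshold t are assumptions made for the example, not the article's exact rules:</p>

```python
def stepwise_classify(x: dict, E: list, t: int = 1) -> str:
    """Toy sketch of the stepwise scheme: decisive poisonous indicator first,
    then a near-match with an edible reference, then deferral."""
    # Step 1: odor values that never occur in edible mushrooms in the dataset
    if x.get("odor") in {"pungent", "fishy"}:
        return "poisonous"
    # Step 2: x agrees with some edible reference on all but at most t features
    for e in E:
        if sum(1 for f in e if x.get(f) != e[f]) <= t:
            return "edible"
    # Step 3: doubts remain; defer to refined analysis or an expert
    return "undefined"

E = [{"cap_shape": "convex", "odor": "none", "ring_type": "pendant"}]
print(stepwise_classify({"cap_shape": "convex", "odor": "none", "ring_type": "none"}, E))
# edible (differs from the reference in one feature only)
print(stepwise_classify({"odor": "pungent"}, E))
# poisonous
```
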
        <p>The logic scheme is thus reduced to a set of rules of the form "IF &lt;conditions of comparison&gt;, THEN
&lt;resolution&gt;". These rules can be extracted from comparisons with the benchmarks and from knowledge
about the diagnostic value of the features. The advantage of comparator identification is that such a
scheme is human-verifiable: the decision is justified by explicitly stating with which known
mushrooms, and on which features, a given specimen matches or diverges. In effect, the method captures
the reasoning of an experienced mushroom picker: for example, "if the mushroom has white
gills and a ring on the stalk, and there is no unpleasant odor, then it looks like a champignon (edible)
and does not look like a death cap (which has a volva and a greenish cap)". All of this reasoning can be
precisely expressed through the similarity predicates and their logical combinations.</p>
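The three-step scheme above can be made concrete with a minimal Python sketch. This is an illustration only: the attribute names follow the UCI Mushroom data set, but the reference specimens and helper names (`classify`, `hamming`) are hypothetical, not part of the original system.

```python
# Hedged sketch of the three-step comparator decision scheme described
# above. Attribute names follow the UCI Mushroom data set; the reference
# specimens passed in are illustrative, not real UCI records.

POISONOUS_ODORS = {"pungent", "fishy"}  # step 1: always poisonous in the UCI data

def hamming(x, ref):
    """Step 3 metric: number of shared attributes whose values differ."""
    shared = set(x) & set(ref)
    return sum(1 for a in shared if x[a] != ref[a])

def classify(x, edible_refs, poisonous_refs):
    # Step 1: obvious poisonous markers.
    if x.get("odor") in POISONOUS_ODORS:
        return "poisonous"
    # Step 2: a characteristic combination of edible features, i.e. every
    # attribute comparable with some edible reference agrees with it.
    for ref in edible_refs:
        shared = set(x) & set(ref)
        if shared and all(x[a] == ref[a] for a in shared):
            return "edible"
    # Step 3: nearest-reference comparison; ties resolve to "poisonous"
    # as the safety-first choice.
    d_e = min(hamming(x, r) for r in edible_refs)
    d_p = min(hamming(x, r) for r in poisonous_refs)
    return "poisonous" if d_e >= d_p else "edible"
```

Each returned verdict can be traced back to the step that produced it, which is exactly the human-verifiable property discussed above.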
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results and discussion</title>
      <p>This section looks at an example of how to apply the proposed approach based on the comparator
model.</p>
      <p>Let x = (x_{1,1}, x_{1,2}, ..., x_{i,j}, ..., x_{22,n_22}) be the binary indicator vector that encodes every categorical
attribute of a mushroom specimen in the UCI data set (Table 1). For each attribute i = 1, 2, ..., 22 and
for each of its n_i admissible categories, a component</p>
      <p>
        x_{i,j} = 1 if the specimen exhibits the j-th category of the i-th attribute, and x_{i,j} = 0 otherwise
(
        <xref ref-type="bibr" rid="ref16">15</xref>
        )
      </p>
      <p>is introduced. The notation x_i without a second index will be reserved for the whole block of components that
belong to attribute i: x_i = (x_{i,1}, x_{i,2}, ..., x_{i,n_i}). Hence, a specimen is mapped to a 111-dimensional binary vector (the sum of all n_i).</p>
      <table-wrap id="table1">
        <label>Table 1</label>
        <caption>
          <p>Attribute values encoding (based on data set description in [20])</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>i</th>
              <th>Attribute name (according to UCI data set)</th>
              <th>Attribute values and their notations</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>1</td><td>cap-shape</td><td>x_{1,1}: bell, x_{1,2}: conical, x_{1,3}: convex, x_{1,4}: flat, x_{1,5}: knobbed, x_{1,6}: sunken</td></tr>
            <tr><td>2</td><td>cap-surface</td><td>x_{2,1}: fibrous, x_{2,2}: grooves, x_{2,3}: scaly, x_{2,4}: smooth</td></tr>
            <tr><td>3</td><td>cap-color</td><td>x_{3,1}: brown, x_{3,2}: buff, x_{3,3}: cinnamon, x_{3,4}: gray, x_{3,5}: green, x_{3,6}: pink, x_{3,7}: purple, x_{3,8}: red, x_{3,9}: white, x_{3,10}: yellow</td></tr>
            <tr><td>4</td><td>bruises</td><td>x_{4,1}: true, x_{4,2}: false</td></tr>
            <tr><td>5</td><td>odor</td><td>x_{5,1}: almond, x_{5,2}: anise, x_{5,3}: creosote, x_{5,4}: fishy, x_{5,5}: foul, x_{5,6}: musty, x_{5,7}: none, x_{5,8}: pungent, x_{5,9}: spicy</td></tr>
            <tr><td>6</td><td>gill-attachment</td><td>x_{6,1}: attached, x_{6,2}: descending, x_{6,3}: free, x_{6,4}: notched</td></tr>
            <tr><td>7</td><td>gill-spacing</td><td>x_{7,1}: close, x_{7,2}: crowded, x_{7,3}: distant</td></tr>
            <tr><td>8</td><td>gill-size</td><td>x_{8,1}: broad, x_{8,2}: narrow</td></tr>
            <tr><td>9</td><td>gill-color</td><td>x_{9,1}: black, x_{9,2}: brown, x_{9,3}: buff, x_{9,4}: chocolate, x_{9,5}: gray, x_{9,6}: green, x_{9,7}: orange, x_{9,8}: pink, x_{9,9}: purple, x_{9,10}: red, x_{9,11}: white, x_{9,12}: yellow</td></tr>
            <tr><td>10</td><td>stalk-shape</td><td>x_{10,1}: enlarging, x_{10,2}: tapering</td></tr>
            <tr><td>11</td><td>stalk-root</td><td>x_{11,1}: bulbous, x_{11,2}: club, x_{11,3}: cup, x_{11,4}: equal, x_{11,5}: rhizomorphs, x_{11,6}: rooted, x_{11,7}: missing</td></tr>
            <tr><td>12</td><td>stalk-surface-above-ring</td><td>x_{12,1}: fibrous, x_{12,2}: scaly, x_{12,3}: silky, x_{12,4}: smooth</td></tr>
            <tr><td>13</td><td>stalk-surface-below-ring</td><td>x_{13,1}: fibrous, x_{13,2}: scaly, x_{13,3}: silky, x_{13,4}: smooth</td></tr>
            <tr><td>14</td><td>stalk-color-above-ring</td><td>x_{14,1}: brown, x_{14,2}: buff, x_{14,3}: cinnamon, x_{14,4}: gray, x_{14,5}: orange, x_{14,6}: pink, x_{14,7}: red, x_{14,8}: white, x_{14,9}: yellow</td></tr>
            <tr><td>15</td><td>stalk-color-below-ring</td><td>x_{15,1}: brown, x_{15,2}: buff, x_{15,3}: cinnamon, x_{15,4}: gray, x_{15,5}: orange, x_{15,6}: pink, x_{15,7}: red, x_{15,8}: white, x_{15,9}: yellow</td></tr>
            <tr><td>16</td><td>veil-type</td><td>x_{16,1}: partial, x_{16,2}: universal (note: in the data set only partial occurs)</td></tr>
            <tr><td>17</td><td>veil-color</td><td>x_{17,1}: brown, x_{17,2}: orange, x_{17,3}: white, x_{17,4}: yellow</td></tr>
            <tr><td>18</td><td>ring-number</td><td>x_{18,1}: none, x_{18,2}: one, x_{18,3}: two</td></tr>
            <tr><td>19</td><td>ring-type</td><td>x_{19,1}: cobwebby, x_{19,2}: evanescent, x_{19,3}: flaring, x_{19,4}: large, x_{19,5}: none, x_{19,6}: pendant, x_{19,7}: sheathing, x_{19,8}: zone</td></tr>
            <tr><td>20</td><td>spore-print-color</td><td>x_{20,1}: black, x_{20,2}: brown, x_{20,3}: buff, x_{20,4}: chocolate, x_{20,5}: green, x_{20,6}: orange, x_{20,7}: purple, x_{20,8}: white, x_{20,9}: yellow</td></tr>
            <tr><td>21</td><td>population</td><td>x_{21,1}: abundant, x_{21,2}: clustered, x_{21,3}: numerous, x_{21,4}: scattered, x_{21,5}: several, x_{21,6}: solitary</td></tr>
            <tr><td>22</td><td>habitat</td><td>x_{22,1}: grasses, x_{22,2}: leaves, x_{22,3}: meadows, x_{22,4}: paths, x_{22,5}: urban, x_{22,6}: waste, x_{22,7}: woods</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>
        Let us consider the typical process of analyzing product descriptions using a comparator model. As
mentioned above, we use the UCI Mushroom dataset and determine the edibility of a mushroom
based on its description. We assume the user is describing the mushrooms directly in front of them
and is reporting all their sensory perceptions (appearance, color, smell, etc.). The comparator then
returns one of the two defined classes as the answer. We interpret the comparator's response as
"mushroom X is similar to edible" (if the comparator response is 1) based on the coincidence of the
features for which (
        <xref ref-type="bibr" rid="ref6">5</xref>
        ) is true. First, we generate various descriptions that a user could provide and
demonstrate which features can be extracted from them.
      </p>
      <p>Below are seven sample descriptions. They are all phrased differently, with some attributes named
explicitly, some hinted at, and some left unspecified – exactly the variety you would face in practice.</p>
      <p>Example 1: “The cap is flat and smooth, light-brown in color; the gills underneath are crowded
and white. I don’t notice any smell at all. There’s one thin pendant ring on the stalk, which tapers
slightly and is white both above and below the ring. No blue bruising when I press it.”</p>
      <p>There are core attributes extracted: cap-shape – “flat”, cap-surface – “smooth”, cap-color –
“brown”, bruises – “false”, odor – “none”, gill-spacing – “crowded”, gill-color – “white”, stalk-shape –
“tapering”, stalk-color-above-ring – “white”, stalk-color-below-ring – “white”, ring-number – “one”,
ring-type – “pendant”. Other values are undefined.</p>
      <p>Example 2: “Tiny purple-red buttons pushing up through grassy soil - the caps look convex and
a bit scaly. When I scratch the flesh it bruises blue-green, and there’s a strong, almost chemical odor.
Can’t see any skirt or ring yet.”</p>
      <p>There are core attributes extracted: cap-shape – “convex”, cap-surface – “scaly”, cap-color –
“purple” or “red”, bruises – “true”, odor-“creosote”, ring-number – “none observed”, ring-type – “none”,
and habitat – “grasses”. Other values are undefined.</p>
      <p>Example 3: “It grows alone on a fallen log in the woods. The top is bell-shaped, kind of
cinnamon-colored, and the stalk widens at the base. The air around it smells spicy – like cloves. I didn’t notice
any spore dust.”</p>
      <p>There are core attributes extracted: cap-shape – “bell-shaped”, cap-color – “cinnamon”,
stalk-shape – “enlarging”, odor – “spicy”, population – “solitary”, and habitat – “woods”. Other values are
undefined.</p>
      <p>Example 4: “These mushrooms form tight clusters on leaf litter. Caps are sunken in the middle,
with a yellow surface that feels fibrous. Gills seem distant and pale gray. The stalk is silky above a
single ring and orangey below. When sliced, nothing turns blue.”</p>
      <p>There are core attributes extracted: population – “clustered”, habitat – “leaves”, cap-shape –
“sunken”, cap-surface – “fibrous”, cap-color – “yellow”, gill-spacing – “distant”, gill-color – “gray”,
bruises – “none”, ring-number – “one”, stalk-surface-above-ring – “silky”, and stalk-color-below-ring
– “orange”. Other values are undefined.</p>
      <p>Example 5: “Cap surface is smooth, pale pink; no scales or grooves. There’s definitely no ring, and
the stem stays the same thickness top to bottom. I get a musty cellar smell but can’t decide if it
bruises— pressing didn’t change the color. Not sure about spore-print yet.”</p>
      <p>There are core attributes extracted: cap-surface – “smooth”, cap-color – “pink”, ring-number –
“none”, ring-type – “none”, odor – “musty”, and bruises – “none”. Other values are undefined.</p>
      <p>Example 6: “Mature mushrooms with broad pink gills and a pleasant almond scent. The overnight
spore print is deep brown. A thin pendant ring encircles the stalk, and the tissue below the ring is perfectly
smooth.”</p>
      <p>There are core attributes extracted: odor – “almond”, spore-print-color – “brown”, gill-color –
“pink”, ring-type – “pendant”, and stalk-surface-below-ring – “smooth”. Other values are undefined.</p>
      <p>Example 7: “A cluster of slick white caps gives off a distinctly fishy smell. The gills are white and the
spore print is also white. A skirt-like pendant ring hangs from the stalk, which feels smooth below the
ring”.</p>
      <p>The core attributes extracted are: odor – “fishy”, spore-print-color – “white”, gill-color – “white”,
ring-type – “pendant”, and stalk-surface-below-ring – “smooth”. Other values are undefined.</p>
      <p>Let x_{i,j} = 1 denote that the mushroom exhibits the j-th value of the i-th attribute (as numbered
above), and x_{i,j} = 0 otherwise. For every description we write a conjunctive clause Φ_k(x) that fixes
only the attribute–value pairs explicitly inferable from the text; all other indicators remain free.
Φ_1(x) = x_{1,4} ∧ x_{2,4} ∧ x_{3,1} ∧ x_{4,2} ∧ x_{5,7} ∧ x_{7,2} ∧ x_{9,11} ∧ x_{10,2} ∧ x_{14,8} ∧ x_{15,8} ∧ x_{18,2} ∧ x_{19,6},
Φ_2(x) = x_{1,3} ∧ x_{2,3} ∧ (x_{3,7} ∨ x_{3,8}) ∧ x_{4,1} ∧ x_{5,3} ∧ x_{18,1} ∧ x_{19,5} ∧ x_{22,1},
Φ_3(x) = x_{1,1} ∧ x_{3,3} ∧ x_{5,9} ∧ x_{10,1} ∧ x_{21,6} ∧ x_{22,7},
Φ_4(x) = x_{1,6} ∧ x_{2,1} ∧ x_{3,10} ∧ x_{4,2} ∧ x_{7,3} ∧ x_{9,5} ∧ x_{12,3} ∧ x_{15,5} ∧ x_{18,2} ∧ x_{21,2} ∧ x_{22,2},
Φ_5(x) = x_{2,4} ∧ x_{3,6} ∧ x_{4,2} ∧ x_{5,6} ∧ x_{18,1} ∧ x_{19,5},
Φ_6(x) = x_{5,1} ∧ x_{9,8} ∧ x_{13,4} ∧ x_{19,6} ∧ x_{20,2},
Φ_7(x) = x_{5,4} ∧ x_{9,11} ∧ x_{13,4} ∧ x_{19,6} ∧ x_{20,8}.</p>
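Such partial conjunctive clauses can be represented simply as sets of fixed attribute–value pairs: a clause holds for a specimen exactly when none of its fixed pairs is contradicted. A minimal sketch, with illustrative helper names (`clause_from_description`, `satisfies`):

```python
# Sketch: a conjunctive clause fixes only the attribute-value pairs that
# were explicitly inferable from the textual description; all other
# indicators remain free (unconstrained).

def clause_from_description(extracted):
    """The extracted attributes (a dict) become the fixed literals of the clause."""
    return dict(extracted)

def satisfies(specimen, clause):
    """True iff the specimen agrees with every fixed literal of the clause.
    Attributes the clause leaves free are ignored."""
    return all(specimen.get(attr) == value for attr, value in clause.items())

# Example 6 from the text: odor, spore-print-color, gill-color,
# ring-type and stalk-surface-below-ring were inferable.
phi6 = clause_from_description({
    "odor": "almond",
    "spore-print-color": "brown",
    "gill-color": "pink",
    "ring-type": "pendant",
    "stalk-surface-below-ring": "smooth",
})
```

A fully observed specimen may carry more attributes than the clause fixes; those extra attributes do not affect satisfaction, mirroring the "all other indicators remain free" convention.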
      <p>We will complete the identification task in two stages. First, we will evaluate the edibility of the
mushroom based on a core set of characteristics. Then, we will refine the results using the full set of
characteristics. The core set of characteristics will be based on the values available in the training
dataset. To do this, we take the conjunction of all true predicates for each reference specimen of edible
mushrooms:</p>
      <p>Φ_E = ⋁_{e ∈ E} ⋀_{(i,j) ∈ A(e)} x_{i,j},</p>
      <p>where A(e) represents the attribute–value pairs of each edible specimen e.</p>
      <p>Similarly, we determine the class of poisonous mushrooms:</p>
      <p>
        Φ_P = ⋁_{p ∈ P} ⋀_{(i,j) ∈ A(p)} x_{i,j}.
(
        <xref ref-type="bibr" rid="ref17">16</xref>
        )
      </p>
      <p>Then the entire training set can be described by the formula</p>
      <p>
        Φ = Φ_E ∨ ¬Φ_P.
(
        <xref ref-type="bibr" rid="ref18">17</xref>
        )
      </p>
      <p>
        Next, we sequentially apply the method of extracting essential features, as proposed in [1]. The
resulting core, based on the solution of equation (
        <xref ref-type="bibr" rid="ref19">18</xref>
        ), contains only the features whose values allow
us to distinguish between edible and poisonous mushrooms. This core ensures the equivalence of
the original and the reduced comparator classifiers. For this dataset, the following kernel was obtained:
spore-print-color, gill-color, ring-type, and stalk-surface-below-ring, i.e. C = { x_20, x_9, x_19, x_13 }.
      </p>
      <p>
        Let us simplify the clauses that describe the unknown mushroom instances by restricting them to
the core features. We obtain:
Φ'_1(x) = x_{9,11} ∧ x_{19,6},
Φ'_2(x) = x_{19,5},
Φ'_3(x) contains no core features,
Φ'_4(x) = x_{9,5},
Φ'_5(x) = x_{19,5},
Φ'_6(x) = x_{20,2} ∧ x_{9,8} ∧ x_{19,6} ∧ x_{13,4},
Φ'_7(x) = x_{20,8} ∧ x_{9,11} ∧ x_{19,6} ∧ x_{13,4}.
It is clear that for examples 1–5 the information obtained is insufficient to assign these samples
to one of the classes. However, for examples 6 and 7 it is possible to determine their proximity to
one of the classes based on the core features. For these examples, we calculate the distances (
        <xref ref-type="bibr" rid="ref12">11</xref>
        ) and
apply the classification rule (
        <xref ref-type="bibr" rid="ref13">12</xref>
        ). We obtain:
      </p>
      <p>( ) = 3,  ( ) = 1,
i.e. sample 6 is closed to edible prototype and sample 7 is closed to poisonous prototype. Then,
analyze those examples using the full attribute space. For the example 1 we have
Φ ( ) =  , ∧  , ∧  , ∧  , ∧  , ∧  , ∧  ,
∧ 
, ∧ 
, ∧ 
, ∧ 
, ∧  , ,
and the closest edible and poisonous prototypes are described as
E ( ) =  , ∧  , ∧  , ∧  , ∧  , ∧  , ∧  , ∧ 
P ( ) =  , ∧  , ∧  , ∧  , ∧  , ∧  , ∧  ,
∧ 
, ∧  , ∧ 
, ∧ 
, ∧ 
, ∧ 
, ∧ 
, ∧  , ,
, ∧ 
( ) = 2. Because 
( ) the specimen is closer to the poisonous
prototype when every available attribute is considered.</p>
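The nearest-prototype comparison used here can be sketched as follows; the prototype records are illustrative stand-ins, not the actual UCI specimens, and the tie-breaking choice is an assumption made for safety.

```python
# Sketch of the nearest-prototype decision: the specimen is compared with
# the closest edible and the closest poisonous prototype over the
# attributes actually defined in the description. The prototypes used in
# the tests are illustrative, not real UCI records.

def distance(x, proto):
    """Hamming-style distance over the attributes defined in x."""
    return sum(1 for a in x if proto.get(a) != x[a])

def nearest_class(x, edible_protos, poisonous_protos):
    d_e = min(distance(x, p) for p in edible_protos)
    d_p = min(distance(x, p) for p in poisonous_protos)
    # Safety-first tie-breaking: prefer "poisonous" when distances tie.
    label = "poisonous" if d_e >= d_p else "edible"
    return label, d_e, d_p
```

Because the distance is computed only over defined attributes, the same rule works for the partial descriptions of examples 1–7.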
      <p>Thus, the combination “no bruises + white gills + crowded gills + pendant ring + odor none”
matches key traits of Amanita phalloides more than of common edible Agaricus; without spore-print
color or chemical tests, the safer classification is poisonous, i.e. avoid consumption. Hence, from a
comparator perspective, Example 1 should be flagged unsafe unless further evidence pushes it toward
the edible region.</p>
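The core (kernel) extraction step can also be sketched in code. The exhaustive search below is a simple stand-in for the predicate-equation method of [1], and the toy training set is purely illustrative.

```python
# Sketch of core extraction: keep the smallest set of attributes that
# still separates every edible training example from every poisonous one.
# Exhaustive search over subsets stands in for the predicate-equation
# method of [1]; the toy training data in the tests are illustrative.
from itertools import combinations

def separates(features, edible, poisonous):
    """True iff no edible/poisonous pair agrees on all chosen features."""
    for e in edible:
        for p in poisonous:
            if all(e[f] == p[f] for f in features):
                return False
    return True

def core(attributes, edible, poisonous):
    """Smallest attribute subset (by exhaustive search) that separates the classes."""
    for size in range(1, len(attributes) + 1):
        for subset in combinations(attributes, size):
            if separates(subset, edible, poisonous):
                return set(subset)
    return set(attributes)
```

On the full UCI data this search would be pruned, but the separation criterion is the same: the reduced classifier built on the core must agree with the original one on every training specimen.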
      <p>Therefore, the comparator identification method applied to the edibility task in fact implements
binary classification, albeit in a different paradigm: it establishes equivalence or similarity with
known samples, and the class of an object is determined through comparative analysis with
representatives of the established classes. From a formal point of view, the result of the comparator
scheme is a function that takes two values (e.g., 1 for edible and 0 for poisonous mushrooms), i.e., it
corresponds to the target variable of the classification task. However, the internal decision-making
logic differs from that of, for example, a decision tree or a neural network: the comparator model
does not derive a formula for this function directly from the features. Rather, it computes the value
through comparisons with reference objects.</p>
      <p>
        It is evident that the rule based on (
        <xref ref-type="bibr" rid="ref12">11</xref>
        ) is a variation of the nearest-neighbor method in feature
space: the class of a new object is determined by the class of its nearest neighbor among the training
data. The distinction is one of emphasis: comparator identification accentuates the explicability and
logical structure of the solution. While the k-nearest-neighbor algorithm provides only the class
label, the comparator scheme can provide a rationale for the classification. For instance, it can report
that the closest edible mushroom lies at distance 2 while the closest poisonous one lies at distance 4,
so classifying the instance as edible is supported by these data. Furthermore, the method enables
the incorporation of a priori rules, giving the classifier the character of an expert
system. Consequently, comparator solutions are directly associated with the outcomes of binary
classification while concurrently offering an interpretation through resemblance to recognized
patterns. The accuracy of classification rests on the assumption that the feature space
adequately differentiates edible species from poisonous ones; this is a prerequisite for the
applicability of comparator identification.
      </p>
      <p>In summary, the application of the comparator identification method to the task of mushroom
edibility demonstrates an alternative, human-understandable approach to binary classification.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>In this work, we have demonstrated a framework for identifying food edibility grounded in
comparator-based predicate logic. To answer the first research question, we have shown how logical
rules for edibility can be formalized as predicate structures based on observable sensory
characteristics. Each perceptual feature – such as cap shape, odor, texture, and color – is encoded as
a finite-valued predicate or an indicator variable. We have also developed a core-extraction
procedure to isolate the minimal subset of features. This yields a compact, human-readable core
feature set that drives the comparator decision rule. To evaluate the second research question, which
concerns the effectiveness of the comparator model versus traditional machine learning approaches,
we conducted experiments on the canonical UCI Mushroom dataset.</p>
      <p>Thus, we have formally described the process by which sensory attributes are converted into a
system of comparisons, the manner in which a metric and logical structure for decision making is
built on this basis, and the manner in which the final verdict ("edible" or "poisonous") is obtained as
a consequence of comparison with already known samples. This approach establishes a connection
between a rigorous mathematical model, characterized by predicates and metrics, and practical
interpretability. This interpretability is particularly valuable in the critical domain of poisonous
mushroom identification.</p>
      <p>A key contribution of this study is the interpretability of comparator solutions. Unlike
"black-box" models, comparator rules provide logical explanations – for each edibility verdict, one can trace
exactly which features matched which reference specimens and which predicate failures tipped the
decision. The core-extraction method ensures that only features with genuine discriminative power
appear in the final rule, which simplifies the explanation further. In user-facing scenarios (e.g.,
mobile identification apps), this transparency allows users to understand and trust the model's
verdicts and supply targeted follow-up descriptions when the model is uncertain.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgements</title>
      <p>The EU NextGenerationEU partially funds the research study depicted in this paper through the
Recovery and Resilience Plan for Slovakia under project No. 09I03-03-V01-00078.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this paper, the authors used Grammarly for grammar and spelling
checking and DeepL for text translation. After using these tools, the authors reviewed and edited
the content as needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref2">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Karataiev</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sitnikov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>N.</given-names>
            <surname>Sharonova</surname>
          </string-name>
          ,
          <article-title>A method for investigating links between discrete data features in knowledge bases in the form of predicate equations</article-title>
          ,
          <source>in: Proc. CEUR Workshop</source>
          , vol.
          <volume>3387</volume>
          , pp.
          <fpage>224</fpage>
          -
          <lpage>235</lpage>
          ,
          <year>2023</year>
          . https://ceur-ws.org/Vol-3387
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T. M.</given-names>
            <surname>Tashu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fattouh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kiss</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Horváth</surname>
          </string-name>
          ,
          <article-title>Multimodal E-Commerce Product Classification Using Hierarchical Fusion</article-title>
          ,
          <source>2022 IEEE 2nd Conference on Information Technology and Data Science (CITDS)</source>
          , Debrecen, Hungary,
          <year>2022</year>
          , pp.
          <fpage>279</fpage>
          -
          <lpage>284</lpage>
          , doi: 10.1109/CITDS54976.2022.9914136
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          et al.,
          <article-title>This looks like that: Deep learning for interpretable image recognition</article-title>
          ,
          <source>in Proc. NeurIPS</source>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zheng</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Xin</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          (
          <year>2024</year>
          ).
          <article-title>Intelligent classification and personalized recommendation of e-commerce products based on machine learning</article-title>
          .
          <source>arXiv preprint arXiv:2403</source>
          .
          <fpage>19345</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>Huanjing</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Beining</given-names>
            <surname>Yang</surname>
          </string-name>
          , Yukuo Cen, Junyu Ren, Chenhui Zhang, Yuxiao Dong, Evgeny Kharlamov,
          <string-name>
            <given-names>Shu</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and Jie</given-names>
            <surname>Tang</surname>
          </string-name>
          .
          <year>2024</year>
          .
          <article-title>Pre-Training and Prompting for Few-Shot Node Classification on Text-Attributed Graphs</article-title>
          .
          <source>In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '24)</source>
          .
          <article-title>Association for Computing Machinery</article-title>
          , New York, NY, USA,
          <fpage>4467</fpage>
          -
          <lpage>4478</lpage>
          . https://doi.org/10.1145/3637528.3671952
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Hu</surname>
          </string-name>
          et al.,
          <article-title>Natural language processing and machine learning approaches for food categorization and nutrition quality prediction compared to traditional methods</article-title>
          ,
          <source>American Journal of Clinical Nutrition</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <surname>M. van Erp</surname>
          </string-name>
          ,
          <string-name>
            <surname>J. van der Sande</surname>
          </string-name>
          , and
          <string-name>
            <surname>C. van Son</surname>
          </string-name>
          ,
          <article-title>Using AI to analyze nutrition and sustainability of recipes</article-title>
          ,
          <source>Frontiers in Artificial Intelligence</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>K.</given-names>
            <surname>Makridis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gkillas</surname>
          </string-name>
          , and G. Sermpinis,
          <article-title>Deep learning with NLP and time-series modeling for enhanced food safety</article-title>
          ,
          <source>Machine Learning</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>B.</given-names>
            <surname>Wagner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rowe</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Gillam</surname>
          </string-name>
          ,
          <article-title>Mushroom data creation, curation, and simulation to support binary classification</article-title>
          ,
          <source>Scientific Reports</source>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Snell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Swersky</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Zemel</surname>
          </string-name>
          ,
          <article-title>Prototypical networks for few-shot learning</article-title>
          ,
          <source>in Proc. NeurIPS</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kohonen</surname>
          </string-name>
          ,
          <source>Self-Organizing Maps</source>
          , 2nd ed. Berlin, Germany: Springer,
          <year>1995</year>
          (chapter on Learning Vector Quantization).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Cherednichenko</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nebesky</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kováč</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <article-title>Gathering and Matching Data from the Web: The Bibliographic Data Collection Case Study</article-title>
          .
          <source>International Conference on Smart Business Technologies</source>
          ,
          <year>2024</year>
          , pp
          <fpage>139</fpage>
          -
          <lpage>146</lpage>
          . DOI: 10.5220/0012863500003764
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M.</given-names>
            <surname>Yu</surname>
          </string-name>
          et al.,
          <article-title>Pre-training Language Models for Comparative Reasoning</article-title>
          ,
          <source>in Proc. EMNLP</source>
          ,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>B.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rudin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <article-title>The Bayesian Case Model: A generative approach for casebased reasoning and prototype inference</article-title>
          ,
          <source>in Proc. ICML</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [15]
          <string-name>
            <surname>Shuhe</surname>
            <given-names>Wang</given-names>
          </string-name>
          , Xiaofei Sun,
          <string-name>
            <given-names>Xiaoya</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Rongbin</given-names>
            <surname>Ouyang</surname>
          </string-name>
          , Fei Wu, Tianwei Zhang, Jiwei Li, Guoyin Wang,
          <article-title>GPT-NER: Named Entity Recognition via Large Language Models</article-title>
          , 2023. doi.org/10.48550/arXiv.2304.10428
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [16]
          <string-name>
            <surname>Baigang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yi</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <article-title>A review: development of named entity recognition (NER) technology for aeronautical information intelligence</article-title>
          .
          <source>Artif Intell Rev</source>
          <volume>56</volume>
          ,
          <fpage>1515</fpage>
          -
          <lpage>1542</lpage>
          (
          <year>2023</year>
          ). https://doi.org/10.1007/s10462-022-10197-2
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [17]
          <string-name>
            <surname>Sowmya</surname>
            <given-names>Vajjala</given-names>
          </string-name>
          , Ramya Balasubramaniam,
          <article-title>What do we Really Know about State of the Art NER?</article-title>
          , 2022. doi.org/10.48550/arXiv.2205.00034
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [18]
          <string-name>
            <surname>Yu</surname>
            <given-names>Wang</given-names>
          </string-name>
          , Hanghang Tong, Ziye Zhu, Yun Li,
          <article-title>Nested Named Entity Recognition: A Survey</article-title>
          ,
          <source>ACM Transactions on Knowledge Discovery from Data (TKDD)</source>
          , Volume
          <volume>16</volume>
          , Issue 6, Article No. 108, Pages 1-29. https://doi.org/10.1145/3522593
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [19]
          <string-name>
            <surname>Mayhew</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blevins</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Šuppa</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonen</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Imperial</surname>
            ,
            <given-names>J. M.</given-names>
          </string-name>
          , ... &amp;
          <string-name>
            <surname>Pinter</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          (
          <year>2023</year>
          ).
          <article-title>Universal NER: A gold-standard multilingual named entity recognition benchmark</article-title>
          .
          <source>arXiv preprint arXiv:2311</source>
          .
          <fpage>09122</fpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>