<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Journal of Machine Learning Research</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.48550/arXiv.1810.04805</article-id>
      <title-group>
        <article-title>Semantic Statistical Model for Assessing Expert Competence in Scientific Competitions</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vitaliy Tsyganok</string-name>
          <email>vitaliytsyganok@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yaroslav Khrolenko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information Technology and Implementation, IT&amp;I-2025</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute for Information Recording of NAS of Ukraine</institution>
          ,
          <addr-line>2 Shpaka Street, Kyiv, 03113</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <volume>22</volume>
      <issue>142</issue>
      <fpage>104</fpage>
      <lpage>116</lpage>
      <abstract>
        <p>This paper addresses the problem of objectively evaluating expert competence during the formation of juries for scientific competitions. It is shown that traditional bibliometric indicators (such as the Hirsch index, number of publications, and citation counts) do not reflect the specific subject domain of a competition, particularly in cases of interdisciplinary studies. To overcome this limitation, a semantic statistical model is proposed, based on describing the competition scope through a set of concepts and its iterative expansion via co-occurrence analysis in bibliographic databases. The integral competence score is defined as a weighted sum of matches across different levels of the model. The experimental validation of the model was carried out using data from the OpenAlex aggregator, which contains over 65,000 concepts and corpora of publications across diverse scientific domains. The study demonstrated that the model is robust to reductions in sample size: even when using only 3% of the full publication corpus, the results remain close to those of the complete model (Jaccard coefficient &gt; 0.9). This indicates the possibility of reducing the volume of processed data without significant quality loss, thereby lowering computational requirements and enabling partial offloading of calculations to the aggregator side. The practical significance of the proposed approach lies in the development of automated decision-support systems for organizers of scientific competitions, conferences, and grant programs. The model enhances the objectivity and transparency of expert selection, accommodates the interdisciplinary nature of research topics, and ensures interpretability of evaluation results.</p>
      </abstract>
      <kwd-group>
        <kwd>expert competence</kwd>
        <kwd>semantic statistical model</kwd>
        <kwd>competitions of scientific papers</kwd>
        <kwd>OpenAlex</kwd>
        <kwd>conceptual units</kwd>
        <kwd>thematic relevance</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The task of forming well-balanced expert committees for student research competitions, grant allocation, and project peer review is becoming increasingly significant. The quality of decisions is directly contingent upon the competence of the selected experts, as they determine whether the submitted works align with the current state of scientific advancement, demonstrate novelty, and possess practical relevance. Conventional approaches to expert selection primarily rely on bibliometric indicators such as the Hirsch index, publication counts, citation metrics, and academic titles. Although these indicators reflect a researcher's overall scientific productivity, they often lack alignment with the specific subject area of the competition. This limitation is particularly critical in interdisciplinary research.</p>
      <p>Emerging domains often integrate knowledge from multiple disciplines, giving rise to novel subject areas that remain insufficiently represented in traditional classification systems. Fields such as bioinformatics, quantum computing, and cognitive science exemplify this trend, as they evolve at disciplinary intersections and cannot be adequately categorized through journal hierarchies or domain-specific taxonomies alone. Consequently, the selection of experts becomes problematic: candidates may hold strong formal qualifications but insufficient specialization to assess submissions in narrow or emerging domains.</p>
      <p>These challenges highlight the need for models that combine semantic analysis of scientific texts with statistical methods for evaluating the co-occurrence of key concepts in global research discourse. Such an integrated approach enables a more precise assessment of thematic proximity between an expert's research profile and the subject matter of a competition.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Analysis of recent research</title>
      <p>
        Over the past decades, the scientific community has proposed a variety of approaches for evaluating expert competence. The most common remain formal bibliometric methods, which are based on quantitative indicators of research productivity. These include the Hirsch index, citation counts, total number of publications, as well as academic degrees and titles. Such metrics have clear advantages: they are easy to compute, well standardized, and allow for quick comparisons between researchers. However, their main drawback lies in the lack of connection to a specific subject domain. As Bornmann and Daniel [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and Waltman and van Eck [2] have shown, formal indicators do not always correlate with actual expertise in narrow thematic fields.
      </p>
      <p>To account for the content-specific nature of research, content-oriented methods have been
developed. These are based on the analysis of keywords, abstracts, and descriptors accompanying
scientific publications. In this case, the comparison of expert profiles and competition materials is
performed by matching or measuring the frequency of recurring terms. The advantage of this
approach lies in its intuitive interpretability and applicability even with small text corpora. Its
weakness, however, stems from the ambiguity of natural language: the same concept can be
expressed by different words or phrases, and author-provided keywords are often subjective and do
not necessarily reflect the true content of the article, as emphasized by Haustein and Larivière [3].</p>
      <p>The next stage of evolution involves semantic methods that employ natural language processing
techniques to determine contextual proximity between terms. Modern vector-based models, such as
Word2Vec (Mikolov et al. [4]), GloVe, or BERT (Devlin et al. [5]), represent words and concepts in a
multidimensional space, where the distance between them corresponds to semantic similarity. This
enables comparison not only of identical terms but also of semantically related concepts. For example, if a competition topic is formulated with one set of terms, an expert whose publications use semantically close terms can still be recognized as relevant even without a direct keyword match.</p>
      <p>Statistical methods, in turn, rely on studying the patterns of co-occurrence of key concepts in
publications. A typical example is the construction of co-occurrence graphs, where vertices represent
concepts and edges reflect their joint frequency of use. Such graphs make it possible not only to
identify directly related concepts but also to study the structure of scientific knowledge at a higher
level, for instance, to detect interdisciplinary links or to form thematic clusters, as discussed by van
Eck and Waltman [6].</p>
      <p>
        The problem of assessing expert competence correlates with the task of assigning reviewers to
scientific articles. For example, Stelmakh, Shah, and Singh [7] proposed the PeerReview4All model,
which addresses the reviewer assignment problem by maximizing topical relevance between
submitted works and expert profiles. A systematic review by Zhao et al. [8] summarizes recent
methods of automatic reviewer assignment, highlighting relevance and scalability as key criteria.
Jovanovic et al. [9], in their review of reviewer assignment approaches, note that the problem of thematic competence remains central to the further development of reviewer assignment models.
Leyton-Brown, Shah, Stelmakh, and Thakur [
        <xref ref-type="bibr" rid="ref6">10</xref>
        ] describe the practical LCM system, implemented at
AAAI-35, which enables scalable reviewer paper matching based on topical scores. Similarly, Anjum
et al. [11] propose a joint topical space model for aligning articles with reviewers, demonstrating
high accuracy on datasets from computer architecture conferences.
      </p>
      <p>In recent years, aggregator services have become widely adopted, providing large-scale access to
scientific works and their associated metrics, while performing classification tasks that require the
identification of conceptual units (concepts, topics, keywords), often on the basis of semantic
similarity. Services such as Scopus, Web of Science, PubMed, or Google Scholar provide access to
massive corpora of publications, along with bibliometric indicators (citations, h-index) and altmetric
measures (social media impact, mentions). Each platform relies on its own system of conceptual
classification: Scopus employs the ASJC journal classification, Web of Science uses its hierarchical
subject categories, PubMed applies the MeSH vocabulary, while OpenAlex introduces a multilayered
taxonomy of Fields of Study. Each of these approaches has advantages and limitations. For instance,
Moed [12] notes that Scopus classification does not always reflect the actual topic of an individual
article since it is tied to journal profiles. PubMed is highly effective in the biomedical domain but not
applicable to other fields. OpenAlex, described by Priem, Piwowar, Orr, and Carbery [13], provides
the broadest coverage, but its concepts are sometimes overly general.</p>
      <p>Nevertheless, the numerical connectivity indicators calculated by aggregators provide information about the proximity of concepts within the entire system of scientific knowledge. However, the task of competence evaluation requires assessing the proximity of an expert's profile to that of a specific competition. This is particularly important when competitions address emerging research areas, which often lie at the intersections of several disciplines and lack a well-established position in the broader system of knowledge.</p>
      <p>Thus, a review of the literature shows that existing approaches can partially address the problem
of expert competence evaluation, but remain either overly formal or narrowly specialized. None of
them ensures accuracy, universality, and adaptability to interdisciplinary challenges simultaneously.
This creates a foundation for developing hybrid models that integrate semantic and statistical
features, thereby enabling a more comprehensive and relevant representation of scientific
competence.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Problem statement</title>
      <p>The object of this study is the process of evaluating expert competence in the context of forming a
jury for scientific competitions. This process directly determines the quality of the evaluation of
competition materials, since the correctness of reviewer selection affects both the objectivity and the
scientific significance of the resulting decisions.</p>
      <p>Recent research emphasizes that the key objective of reviewer assignment systems is to ensure a close topical match between reviewers and the works they evaluate. The scientific problem lies in the lack of a universal competence evaluation model that would accurately capture the topical alignment of experts with the subject matter of competitions. Such a model must account simultaneously for the high degree of specialization of research topics on the one hand, and their interdisciplinary character on the other.</p>
      <p>The aim of this work is to develop a model for evaluating the competence of jury members in scientific competitions, based on the semantic and statistical relationships between the concepts describing the competition and the concepts found in the scientific works of potential experts. The proposed approach is intended to create a universal method that incorporates the interdisciplinary nature of modern scientific domains and enables the evaluation of expert competence with respect to the specific subject domain of the competition. The model should ensure completeness, accuracy, scalability, and practical applicability, while allowing adaptation to different types of content units (keywords, topics, descriptors) with minimal modifications.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodology</title>
      <p>4.1. In this study, a concept is understood as a content unit with a unique identifier assigned by a scientific aggregator (such as OpenAlex, Scopus, or PubMed) as a result of the automatic classification of a publication.</p>
      <p>Concepts possess a number of properties. First, a concept is an element of a controlled vocabulary (taxonomy); for example, in OpenAlex it is an element of the Fields of Study (FoS) taxonomy. Second, a concept is not a free-form keyword provided by the author but the result of machine analysis of publication metadata and/or full text performed by the aggregator using NLP algorithms. The use of standardized concepts eliminates problems of synonymy and homonymy (for instance, the same term carrying different meanings in biology vs. electrical engineering), since each semantic notion corresponds to a unique identifier. Another advantage provided by the use of aggregator platform classifiers is their multilingual capability. For instance, OpenAlex employs NLP algorithms capable of handling various languages. When processing non-English publications, the system attempts to map local terminology onto the global FoS taxonomy, ensuring comparability of works written in different languages within a single semantic space. However, the quality of classification depends on the effectiveness of processing the specific language.</p>
      <p>A key stage in constructing a semantic statistical model for evaluating expert competence is the description of the competition as a set of concepts, i.e., content units that reflect its thematic scope. Concepts may be represented in the form of keywords, descriptors, topics, or other semantic markers that characterize scientific texts. Thus, the set of competition concepts forms a conceptual core, relative to which the subsequent evaluation of expert relevance is performed.</p>
      <p>The source for constructing such a set can be standardized classification systems or taxonomies used in scientific information aggregators. It is recommended to adopt the classifier of the aggregator from which the scientific works will subsequently be retrieved for analysis. The use of such classifiers simplifies subsequent integration with bibliographic databases.</p>
      <p>The formalization of a competition or expert review topic begins with selecting a classification
system from a scientific aggregator (for example, Fields of Study in OpenAlex, ASJC in Scopus, or
MeSH in PubMed), which provides access to a standardized vocabulary of concepts with unique
identifiers. The initial set of competition concepts is determined through programmatic retrieval via
queries through the respective API. The system returns a ranked list of standardized concepts with
relevance scores, which, after expert validation, form the final set of competition concepts.</p>
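      <p>As an illustration of this retrieval step, the following sketch filters a ranked concept list into an initial concept set. It assumes an OpenAlex-style response shape (fields "results", "id", "display_name", "relevance_score"); the helper name and the relevance cutoff are illustrative and not part of the proposed model.</p>
      <preformat><![CDATA[
```python
# Sketch: forming an initial concept set T0 from an aggregator's ranked
# concept search response (OpenAlex-style field names assumed).

def extract_initial_concepts(response, min_relevance=0.5):
    """Return (concept_id, name) pairs whose relevance passes the cutoff."""
    hits = []
    for item in response.get("results", []):
        score = item.get("relevance_score") or 0.0
        if score >= min_relevance:
            hits.append((item["id"], item["display_name"], score))
    # highest relevance first; ties broken by name for determinism
    hits.sort(key=lambda t: (-t[2], t[1]))
    return [(cid, name) for cid, name, _ in hits]

# Example with a stubbed API response:
sample = {
    "results": [
        {"id": "C154945302", "display_name": "Artificial intelligence",
         "relevance_score": 0.92},
        {"id": "C119857082", "display_name": "Machine learning",
         "relevance_score": 0.81},
        {"id": "C41008148", "display_name": "Computer science",
         "relevance_score": 0.30},
    ]
}
print(extract_initial_concepts(sample))
# → [('C154945302', 'Artificial intelligence'), ('C119857082', 'Machine learning')]
```
]]></preformat>
      <p>The resulting list would then be validated by the organizers to form the final set of competition concepts, as described above.</p>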
      <p>In cases where the organizers of a competition cannot employ an existing classifier, or when the
available system proves insufficiently detailed, the set of concepts may be formed directly through
the analysis of competition documentation, abstracts, keywords, and titles of scientific works in the
relevant field. For this purpose, methods of automatic keyword extraction, topic modeling (e.g.,
Latent Dirichlet Allocation [14]), or pre-trained language models (e.g., BERT [15], SciBERT [16]) can
be applied to generate coherent sets of concepts. The construction of a classifier, however, represents
a separate task that lies beyond the scope of this study.</p>
      <p>To construct the semantic space of a competition, viewed as a subspace of the general space of scientific work concepts, a statistical semantic approach is proposed. Within this approach, the statistical semantic proximity of concepts is defined on the basis of the frequency of their co-occurrence in the global corpus of scientific works. Using this method, the competition model is constructed as a localized fragment of the global space of scientific knowledge concepts (i.e., the complete body of scientific publications). The generalized algorithm implementing this approach consists of the following steps.</p>
      <p>Step 1. As a starting point, the set of concepts describing the competition is taken. They form the set of concepts of the initial zero level (layer): T0 = {t_n^0}, n = 1, ..., N, where N is the number of competition concepts and t_n^0 is the n-th concept of the set T0.</p>
      <p>Step 2. A sample of works W^0 = {w_i^0}, i = 1, ..., |W^0|, is then selected, each of which contains at least one competition concept: C(w_i^0) ⋂ T0 ≠ ∅, where C(w_i^0) denotes the set of concepts characterizing the work w_i^0. For each competition concept t_n^0, the set of works in which this concept occurs among their conceptual representations is determined: W_n^0 = {w_i^0 ∈ W^0 | t_n^0 ∈ C(w_i^0)}, and the number of such works is calculated: d_n^0 = |W_n^0|.</p>
      <p>Step 3. Based on the obtained set of works, a set of unique concepts is formed that are co-occurring with the competition concepts. This involves not only the absence of duplicates among the identified concepts, but also the exclusion of concepts from the previous level T0. In this way, the set of first-level (layer) concepts is constructed: T1 = {t_m^1} = (⋃ C(w_i^0)) \ T0, m = 1, ..., |T1|.</p>
      <p>Step 4. On the obtained set of works W^0, the number of works d_m^1 = |W_m^1| in which each concept t_m^1 ∈ T1 occurs is calculated, where W_m^1 = {w ∈ W^0 | t_m^1 ∈ C(w)}.</p>
      <p>Based on the frequency value of concept usage, it is possible to identify and exclude random occurrences of irrelevant concepts. The threshold frequency level for excluding a concept from consideration is determined by the specifics of the subject domain and is defined by the organizers of the competition, for example, 3% of the size of the selected set of works: s1 = s ∙ |W^0|, where s is the chosen threshold share (e.g., s = 0.03). Concepts t_m^1 whose frequency d_m^1 does not reach s1 are excluded from T1.</p>
      <p>
        The use of a 3% threshold for concept filtering is consistent with established practices in
bibliometric mapping, where similar thresholds are applied to identify statistically robust and
semantically meaningful relationships between concepts [
        <xref ref-type="bibr" rid="ref4">17</xref>
        ]. The purpose of this threshold is to eliminate random co-occurrences that do not reflect the genuine scientific discourse surrounding the topic. Empirical studies have shown that thresholds in the range of 1–5% are effective for constructing balanced and interpretable models. The selected 3% value lies in the middle of this range and, as demonstrated by our experiment, provides an optimal balance between precision and completeness of the thematic model. However, the task of determining the appropriate threshold level requires further investigation.
      </p>
      <p>Step 5. Similarly, on the basis of the set T1, the next sample of works W^1 = {w_i^1}, i = 1, ..., |W^1|, is formed, in which at least one concept from the set T1 was used, and a new set of unique concepts is obtained: T2 = {t_m^2} = (⋃ C(w_i^1)) \ (T0 ⋃ T1), m = 1, ..., |T2|. For each concept of the set T2, the frequency of its usage d_m^2 = |W_m^2| is calculated in the sample of works W^1, where W_m^2 = {w ∈ W^1 | t_m^2 ∈ C(w)}. Concepts t_m^2 for which the frequency of occurrence d_m^2 does not satisfy the threshold level s2 = s ∙ |W^1| are excluded from consideration.</p>
      <p>Step 6. In the same way, the set of concepts of the next (n+1)-th layer is formed. The process continues iteratively until the set of concepts of the next level becomes empty.</p>
      <p>Thus, the model is sequentially expanded (Fig. 1) into the multilayer structure T0, T1, ..., TM, where M is the maximum depth of the model, determined either by the criterion of exhausting new concepts or by the established threshold of co-occurrence frequency. As a result, the application of this algorithm yields a model of the competition in which concepts are grouped according to levels of semantic connectivity. The connectivity of concepts is constructed on the basis of statistics of their co-occurrence in scientific works from the global corpus of research.</p>
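      <p>The layered expansion of Steps 1–6 can be sketched over a toy corpus in which each work is represented as a set of concept identifiers. This is a minimal sketch under assumed data structures, not the production pipeline; the function and variable names are illustrative.</p>
      <preformat><![CDATA[
```python
# Sketch of the iterative layer construction: T0 is the competition core,
# each next layer collects concepts that co-occur with the current layer
# and pass the frequency threshold s (e.g., 0.03 for 3%).

from collections import Counter

def build_layers(works, core, threshold=0.03):
    """works: list[set[str]], core: set[str] -> list of concept layers."""
    layers = [set(core)]
    seen = set(core)        # concepts already assigned to some layer
    current = set(core)
    while True:
        # works containing at least one concept of the current layer
        sample = [w for w in works if w & current]
        if not sample:
            break
        counts = Counter()
        for w in sample:
            counts.update(w - seen)          # exclude all previous layers
        cutoff = threshold * len(sample)     # s * |W^k|
        nxt = {c for c, n in counts.items() if n >= cutoff}
        if not nxt:
            break
        layers.append(nxt)
        seen |= nxt
        current = nxt
    return layers

works = [
    {"ai", "ml", "nn"},
    {"ml", "stats"},
    {"nn", "gpu"},
    {"biology", "cells"},
]
print(build_layers(works, {"ai"}, threshold=0.03))
```
]]></preformat>
      <p>On this toy corpus the algorithm yields three layers ({"ai"}, then {"ml", "nn"}, then {"stats", "gpu"}) and terminates when no new concepts pass the threshold, mirroring the stopping criterion described above.</p>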
      <p>The distinctive feature of the proposed model is that it constitutes a system of organizing scientific knowledge, in which concepts are arranged according to their increasing semantic distance (layer/level number) from the conceptual core defined by the competition's thematic scope. In other words, the model implements a procedure of structural localization of general scientific knowledge with respect to the coordinate system represented by the set of competition concepts.</p>
      <sec id="sec-4-1">
        <title>4.2. Properties of the Model</title>
        <p>The proposed semantic statistical model has a number of properties that determine its suitability for
use in systems for forming juries of scientific competitions.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2.1. Completeness</title>
        <p>The model ensures the gradual coverage of the entire relevant subject domain due to its iterative construction principle. At each step, sets Tk are formed, which expand the competition core T0 with new concepts co-occurring in the corpus of scientific publications. This makes it possible to identify not only explicitly stated terms but also related concepts that frequently appear in the same contexts. As a result, the model is not limited to a narrow list of keywords but captures a broader spectrum of notions that genuinely reflect the structure of knowledge in the chosen field.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.2.2. Universality</title>
        <p>The construction methodology is independent of the specific type of content units. Concepts may
take the form of keywords, descriptors, topics, MeSH terms, Fields of Study (FoS) from OpenAlex, or
even phrase vectors generated using transformer-based models (e.g., BERT, SciBERT). This means
that the model is easily integrated with various databases (Scopus, Web of Science, PubMed,
OpenAlex) and can be applied in biomedical research as well as in engineering or the humanities. Its
universality makes it a flexible tool for organizers of competitions with different profiles.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.2.3. Scalability</title>
        <p>The model is applicable to datasets of any size, including large-scale data. The use of statistical methods (once a relevant sample is obtained) does not require processing excessively large volumes and allows for a significant reduction in computational and storage requirements for intermediate results. Since at each iteration of the algorithm the concepts used in previous steps are excluded, the overall size of the model cannot exceed the size of the aggregator's full concept dictionary; in the OpenAlex database the size of such a dictionary is approximately 65,000 concepts across all scientific disciplines.</p>
      </sec>
      <sec id="sec-4-5">
        <title>4.2.4. Objectivity</title>
        <p>Unlike traditional evaluation methods based on bibliometric indicators (e.g., Hirsch index, number of publications), the proposed model evaluates thematic relevance directly. It reduces the impact of subjective factors, since it relies on formalized metrics such as co-occurrence frequency of concepts and distances in vector space. As a result, an expert whose research profile strongly correlates with the thematic scope of the competition receives a correspondingly higher score. This enhances the fairness of jury formation from a scientific and methodological perspective.</p>
      </sec>
      <sec id="sec-4-6">
        <title>4.2.5. Interpretability</title>
        <p>The model enables transparent explanation of results. Each competence evaluation can be detailed by indicating which of the matched concepts belong to the competition core and which belong to more distant levels. This allows organizers to justify why a particular expert is considered relevant and to compare experts based on clear, understandable criteria.</p>
      </sec>
      <sec id="sec-4-7">
        <title>4.2.6. Flexibility</title>
        <p>The algorithm allows for the adjustment of weighting coefficients α_k, which regulate the sensitivity of the model to more distant concepts. If a strict evaluation focused only on the competition core is required, rapidly decaying coefficients may be used (α_k = 1/2^k). If broader context needs to be considered, slower decay may be applied (α_k = 1/(k+1)). This makes the model adaptable to different types of competitions and subject domains.</p>
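        <p>The two decay schedules can be compared directly; this small sketch uses illustrative function names:</p>
        <preformat><![CDATA[
```python
# Two weighting schedules for level k (k = 0 is the competition core):
# geometric decay concentrates weight on the core, while harmonic decay
# retains more influence for distant layers.

def geometric_weight(k):
    return 1 / 2 ** k      # 1, 0.5, 0.25, 0.125, ...

def harmonic_weight(k):
    return 1 / (k + 1)     # 1, 0.5, 0.333..., 0.25, ...

for k in range(4):
    print(k, geometric_weight(k), harmonic_weight(k))
```
]]></preformat>
        <p>From level 2 onward the geometric schedule falls below the harmonic one, which is why it suits strict, core-focused evaluation.</p>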
      </sec>
      <sec id="sec-4-8">
        <title>4.2.7. Accuracy</title>
        <sec id="sec-4-8-1">
          <title>Theoretical accuracy (accuracy by design).</title>
          <p>The model is constructed so that the core T0 carries the maximum weight, while distant levels (T1, T2, ...) gradually lose influence due to decreasing coefficients α_k. This makes the system robust to the inclusion of loosely related concepts while preserving the contribution of thematically relevant notions. In other words, the model is theoretically accurate, provided that the competition core is adequately defined.</p>
          <p>Practical accuracy (evaluation on data).</p>
          <p>The accuracy of the model is determined by its ability to correctly identify experts whose research genuinely matches the competition topic. In practice, it depends on the quality of the initial set of concepts T0, the completeness of the expert publication corpus, and the classification system used for content units.</p>
          <p>A drawback of the proposed model can be considered the fact that, in the pursuit of resource
efficiency and effectiveness, it has lost its autonomy. It requires extensive and labor-intensive data
preprocessing. In particular, this involves assigning thematic concepts to scientific works based on
their titles, abstracts, keywords, and textual content, or constructing dictionaries of key phrases,
topics, and so forth. However, this drawback cannot be regarded as critical, since many modern
aggregators of scientific publications already perform such processing and provide the results in
open access.</p>
        </sec>
      </sec>
      <sec id="sec-4-9">
        <title>4.3. Integral Evaluation of Expert Competence</title>
        <p>The constructed semantic statistical model provides for the formation of a generalized numerical evaluation of the correspondence of an expert's scientific profile to the thematic scope of the competition. Within the framework of the problem being addressed, the scientific profile of an expert is represented as the complete list of their publications.</p>
        <p>The integral indicator of expert competence is defined as a weighted sum of matches between the expert's concepts and the concepts of the multilevel competition model.</p>
        <sec id="sec-4-9-1">
          <title>4.3.1. Model</title>
          <p>Each expert E has a set of scientific works W(E). Each work w_i ∈ W(E) is described by a set of concepts C(w_i), i = 1, ..., |W(E)|. The expert profile is represented by the set of unique concepts derived from their works: P(E) = {p_1, p_2, ..., p_V} = ⋃ C(w_i), where V is the total number of unique concepts.</p>
        </sec>
        <sec id="sec-4-9-2">
          <title>A typical choice is geometric decay:</title>
        </sec>
      </sec>
      <sec id="sec-4-10">
        <title>4.3.2. Matching Function</title>
        <p>For each level  , a matching function is defined:
where
 ( , 
)=
∑ 1 [ ∈  ( )] ∙   ( )
 ∈
(19)
(20)
P(E), it
(21)
(22)
(23)
(24)
should be emphasized that the set of unique concepts associated with an expert is derived exclusively
from the classification provided by the selected scientific aggregator and does not involve the use of
any proprietary extraction algorithms.
condition for the correct computation of the integrated evaluation score  ( ). The methodology is
transparent and reproducible, as it relies entirely on publicly available data and metadata supplied
by the aggregator.</p>
        <p>The set of competition concepts T is represented as a multilayer structure:</p>
        <sec id="sec-4-10-1">
          <title>T, which is a necessary</title>
          <p>associated with the previous level.</p>
          <p>where T0 is the competition core, and each subsequent level T includes concepts statistically
To reflect the varying significance of levels in the competition model, weighting coefficients
  were introduced, which decrease as the distance from the core increases:</p>
          <p>T0, T1, … , TM
 1 &gt;  2 &gt; ⋯ &gt;  
  =
1

2
  ( )</p>
          <p>concept ccc occurs).</p>
          <p>Thus,  ( ,</p>
          <p>) reflects the intensity of the presence of competition concepts from level 
The integral evaluation of expert competence is defined as:</p>
          <p>This indicator generalizes the matches across all levels of the model, assigning the greatest weight
to the concepts of the competition core while reducing the influence of more distant terms.</p>
        </sec>
      </sec>
      <sec id="sec-4-11">
        <title>4.3.3. Normalization</title>
        <p>A direct interpretation of the obtained evaluation results for an individual expert has little meaning without comparison to the results of other experts in the group. Therefore, it is reasonable to normalize the evaluation results, for example by the maximum score within the group of candidate experts: K̂(E) = K(E) / max_j K(E_j). This normalization allows the evaluation to be interpreted on a relative scale [0;1].</p>
      </sec>
      <sec id="sec-4-12">
        <title>4.3.4. Interpretation of Results</title>
        <p>• High values of the normalized score (≥ 0.7) indicate strong thematic relevance of the expert to the competition.</p>
        <p>• Medium values (from 0.4 to 0.7) reflect partial relevance, which may be acceptable in interdisciplinary competitions.</p>
        <p>• Low values (&lt; 0.4) indicate insufficient thematic relevance.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experiment</title>
      <sec id="sec-5-1">
        <title>5.1. Objective and Hypothesis</title>
        <p>The objective of the experiment is to test the hypothesis that a valid semantic statistical model of a competition can be constructed from a substantially reduced sample of publications. Thus, the task of the experiment is to determine the minimal sample sizes that guarantee the stability of the model.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Input Data</title>
        <p>For the experiment, data from the scientific information aggregator OpenAlex were used. The study employed:</p>
        <p>• A corpus of scientific publications in the field of «Computer Science» containing 2 million works.</p>
        <p>• Publication metadata.</p>
        <p>
          • The OpenAlex concept directory (approximately 65,000 concepts) [
          <xref ref-type="bibr" rid="ref5">18</xref>
          ].
        </p>
        <p>5.3. Experimental Scenario</p>
        <p>Initial data formation. A random set of concepts with varying degrees of specialization is selected (from different levels of the hierarchy in the OpenAlex concept taxonomy).</p>
        <p>Sampling of works. For each selected concept, a set of scientific works in which it appears is formed, and the number of such works is determined (see Sec. 4.1, Step 2 of the algorithm).</p>
        <p>Formation of co-occurring concepts. Based on the obtained set of works, the set of unique
concepts co-occurring with the studied concept is computed (see Sec. 4.1, Step 3 of the
algorithm).</p>
        <p>Frequency estimation. For each co-occurring concept, the number of works in which it
appears is counted, and its percentage relative to the sample size is calculated (see Sec. 4.1,
Step 4 of the algorithm).</p>
        <p>Noise filtering. Concepts with a frequency below 1% (scope = 0.01) are
excluded from the set to eliminate random occurrences of irrelevant terms and informational
noise.</p>
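<p>Steps 3-5 (collecting co-occurring concepts, estimating frequencies, and filtering noise) can be sketched as follows; the per-work concept lists are a hypothetical stand-in for OpenAlex work metadata:</p>

```python
from collections import Counter

def relevant_concepts(works, scope=0.01):
    """Return concepts co-occurring in at least a `scope` share of the works.

    `works` is a list of concept-id lists, one list per paper (hypothetical
    structure standing in for OpenAlex publication metadata).
    """
    counts = Counter(c for work in works for c in set(work))
    threshold = scope * len(works)
    return {c for c, n in counts.items() if n >= threshold}

# "noise_term" appears in 1 of 200 works (0.5% < 1%), so the filter drops it.
works = [["ai", "nlp"]] * 199 + [["ai", "noise_term"]]
concepts = relevant_concepts(works)
```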
        <p>Obtaining the set of relevant concepts. As a result, a subset of concepts co-occurring with
the selected concept within the publication database is formed.</p>
        <p>Artificial reduction of the sample. The number of works from Step 2 is intentionally reduced
to 30%, 3%, and 0.1% of the full size.</p>
        <p>For each reduced sample, Steps 3–6 are repeated.</p>
        <p>Evaluation of similarity of results. The sets of concepts obtained from the full and reduced
samples are compared using the Jaccard coefficient:</p>
        <p>J(A, B) = |A ⋂ B| / |A ⋃ B| (27)</p>
        <p>The Jaccard coefficient measures the similarity between sets and is defined as the size of their
intersection divided by the size of their union.</p>
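<p>Equation (27) in code, with a small worked comparison between a full and a reduced concept set (the sets are illustrative):</p>

```python
def jaccard(a, b):
    """Jaccard coefficient of two sets: |A ∩ B| / |A ∪ B|, as in Eq. (27)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 1.0

full = {"machine learning", "statistics", "optimization", "neural network"}
reduced = {"machine learning", "statistics", "optimization"}
similarity = jaccard(full, reduced)  # intersection 3, union 4 -> 0.75
```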
      </sec>
      <sec id="sec-5-3">
        <title>5.4. Results</title>
        <sec id="sec-5-3-1">
          <p>Table: Jaccard similarity between the concept sets obtained from the full sample and from samples reduced to 3% and 0.1% of the full corpus, for the concepts Mathematics (313644 papers in the full sample), Software engineering (30910), Artificial intelligence (369553), and Artificial neural network (130556).</p>
          <p>The analysis of the obtained data shows that the semantic
thematic scope is highly robust to reductions in sample size. Even with a substantial decrease in the
number of works (down to 3% of the full corpus), the resulting sets of concepts remain close to those
constructed using the complete dataset. Significant deviations are observed only for very small
samples (0.1%), particularly in the case of rare or highly specialized concepts.</p>
          <p>Practical implications of the obtained results include:</p>
          <p>• the possibility of significantly reducing the volume of processed data without substantial loss of model quality;</p>
          <p>• reduced computational requirements for implementing the algorithm;</p>
          <p>• potential offloading of the most resource-intensive computations to the aggregator side.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5a">
      <title>6. Discussion</title>
      <p>The competition scope may combine several different directions within one discipline, or even be
interdisciplinary. In such cases, to obtain a correct evaluation, it is necessary to involve experts from
multiple domains. This is confirmed by established practices in forming expert groups. Another
option is to subject each work to independent evaluation by several experts, followed by weighting
of their evaluations, so that the lack of competence of a particular expert is compensated by the
competences of other members of the group.</p>
      <p>From this perspective, the proposed method requires further development toward differentiating
the evaluation for each competition concept individually. This, in turn, enables the formation of
expert groups using multi-criteria optimization methods.</p>
    </sec>
    <sec id="sec-6">
      <title>7. Conclusion</title>
      <p>The developed model for evaluating the competence of jury members in scientific competitions,
based on a semantic statistical approach, provides a universal and objective method for selecting
experts, with adaptability to interdisciplinary and applied domains. The use of semantic models and
statistical methods of co-occurrence frequency analysis makes it possible to construct a
comprehensive and scalable representation of the competition's thematic scope.</p>
      <p>Integration with aggregator services provides access to large volumes of up-to-date and
structured data (scientific articles with performed terminological analysis and concept identification,
dictionaries of concepts, key phrases, and topics), which enables the practical implementation of the
method and ensures its effectiveness.</p>
      <p>The proposed algorithm accounts for both direct and indirect connections by forming a hierarchy
of semantic connectivity levels. The integral competence evaluation of an expert is based on the
intersection of concepts from their scientific works with the levels of the proposed semantic model,
reducing the contribution of concepts from distant levels in a geometric progression, thus ensuring
convergence of the evaluation measure.</p>
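<p>The level-weighted aggregation described above can be sketched as a weighted sum with geometrically decaying level weights (the decay factor q and the match counts are assumed for illustration, not fixed by the paper):</p>

```python
def integral_score(matches_per_level, q=0.5):
    """Weighted sum of concept matches across semantic-connectivity levels.

    Level 0 holds direct matches; each deeper level's contribution is
    damped by q**level (0 < q < 1), so the series converges as the
    number of levels grows.
    """
    assert 0 < q < 1, "geometric decay requires 0 < q < 1"
    return sum((q ** level) * m for level, m in enumerate(matches_per_level))

score = integral_score([4, 2, 1])  # 4*1 + 2*0.5 + 1*0.25 = 5.25
```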
      <p>The method is flexible and adaptable to different types of content units. In addition to concepts,
it can incorporate keywords, topics, and descriptors, making it compatible with diverse approaches
to structuring scientific knowledge.</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT-4 and Grammarly for grammar
and spelling checking. After using these tools/services, the authors reviewed and edited
the content as needed and take full responsibility for the publication's content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>L.</given-names>
            <surname>Bornmann</surname>
          </string-name>
          and
          <string-name>
            <given-names>H.-D.</given-names>
            <surname>Daniel</surname>
          </string-name>
          ,
          <article-title>What do citation counts measure? A review of studies on citing behavior</article-title>
          ,
          <source>Journal of Documentation</source>
          , vol.
          <volume>64</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>45</fpage>
          <lpage>80</lpage>
          , Jan.
          <year>2008</year>
          . doi: 10.1108/00220410810844150.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <article-title>Field-normalized citation impact indicators and the choice of an appropriate counting method</article-title>
          ,
          <source>Journal of Informetrics</source>
          , vol.
          <volume>9</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>872</fpage>
          <lpage>894</lpage>
          , Oct.
          <year>2015</year>
          ; in Incentives and Performance: Governance of Research Organizations,
          <string-name>
            <given-names>I. M.</given-names>
            <surname>Welpe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wollersheim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ringelhan</surname>
          </string-name>
          , and M. Osterloh, Eds. Cham: Springer,
          <year>2015</year>
          , pp.
          <fpage>121</fpage>
          <lpage>139</lpage>
          . doi: 10.1007/978-3-319-09785-5_8.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          et al.,
          <article-title>Efficient estimation of word representations in vector space</article-title>
          , arXiv preprint
          <source>arXiv:1301.3781</source>
          ,
          <year>2013</year>
          ; and in Proc. EMNLP-IJCNLP, Hong Kong, China,
          <year>2019</year>
          , pp.
          <fpage>3613</fpage>
          <lpage>3618</lpage>
          . doi: 10.48550/arXiv.1903.10676.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>L.</given-names>
            <surname>Waltman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. J.</given-names>
            <surname>van Eck</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. C. M.</given-names>
            <surname>Noyons</surname>
          </string-name>
          ,
          <article-title>A unified approach to mapping and clustering of bibliometric networks</article-title>
          ,
          <source>Journal of Informetrics</source>
          , vol.
          <volume>4</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>629</fpage>
          <lpage>635</lpage>
          ,
          <year>2010</year>
          . doi: 10.1016/j.joi.2010.07.002.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [18] OpenAlex Concepts API. [Online]. Available:
          <source>https://api.openalex.org/concepts</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Leyton-Brown</surname>
          </string-name>
          et al.,
          <article-title>…-to-match (LCM) system for reviewer assignment: Design and deployment at AAAI</article-title>
          ,
          <source>AI Magazine</source>
          , vol.
          <volume>45</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>50</fpage>
          <lpage>61</lpage>
          , Mar.
          <year>2024</year>
          . doi: 10.1002/aaai.12139.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <article-title>…reviewer representation towards automatic paper-…</article-title>
          , in
          <source>Proc. ACM/IEEE Joint Conf. on Digital Libraries (JCDL)</source>
          , Champaign, IL, USA,
          <year>2019</year>
          , pp.
          <fpage>185</fpage>
          <lpage>194</lpage>
          . doi: 10.1109/JCDL.2019.00037.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>