<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Mathematical Model of Semantic Kernel of WEB site</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sergey Orekhov</string-name>
          <email>sergey.v.orekhov@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Henadii Malyhon</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tetiana Goncharenko</string-name>
          <email>tatianagoncharenko1806@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>National Technical University “Kharkiv Polytechnic Institute”</institution>
          ,
          <addr-line>Kyrpychova str. 2, Kharkiv, 61002</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Our recent search engine optimization projects reveal the effect of the semantic kernel of a website. The kernel is a unique set of keywords that depends on time and on the current state of the search engine's vector space model. The problem of mathematically modeling changes in the semantic kernel of a website over a period of one month to one year is therefore becoming urgent. The kernel is formed from three clusters of keywords: product (service), geography (location) and time (duration). The article proposes a model for representing the semantic kernel of a website. The model is supported by a description of the kernel based on the semantic web, followed by its presentation in the form of a Resource Description Framework (RDF) schema. An algorithm for forming the kernel based on hierarchical clustering (agglomerative nesting) is also considered.</p>
        <p>Keywords: semantic kernel, search engine optimization, clustering, degree of proximity</p>
        <p>MoMLeT+DS 2021: 3rd International Workshop on Modern Machine Learning Technologies and Data Science, June 5, 2021, Lviv-Shatsk, Ukraine. ORCID: 0000-0002-5040-5861 (S. Orekhov); 0000-0001-5448-2488 (H. Malyhon); 0000-0001-6630-307X (T. Goncharenko)</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Since 2009, our team has completed more than thirty search engine optimization projects in
various fields: pharmacy, marketing research, jewelry production, construction materials, cosmetics,
wood products, furniture production, and automobile spare parts. All of the projects were related to
e-commerce: in one way or another, the main goal of each project was to increase the volume of sales
of goods or services via the Internet. Some of the projects were successful, and some produced
clearly negative results. Analyzing these results, we came to the following conclusion.</p>
      <p>
        There is an interesting effect that can be called the learning of a search engine [
        <xref ref-type="bibr" rid="ref2">1</xref>
        ]. Search engine optimization is, in effect, the process of training a search engine to respond to
given user requests by showing our website at the top of the list of results. For this purpose, according
to the vector space model, the texts on our website that describe a product or service must have the
maximum number of external links (a large citation index). Such links can be divided into two groups:
black (created formally, only to raise the citation index) and white (created by real users who, for
example, have already tried the product or service). In our case, white links are of the greatest
interest, but such links are generated on the basis of semantics.
      </p>
      <p>Let us ask ourselves a question: how does an ordinary user form such a link? The user first gains
experience by using a product or service and then writes a review of that use. In other words, feedback
is generated. In this way, the search engine responds to user feedback on our website, or rather, it
reacts to texts and images. This reaction is shaped by us, but only indirectly. At the first stage, we
read the texts on websites. Analyzing these texts, we either agree with them or not. Our agreement is
expressed as a positive response to the website. Such analysis is necessarily brief. There are a great
many texts and websites, so we react to so-called annotations, that is, short sets of keywords that, in
our opinion, describe the meaning of the text on the website briefly but as completely as possible. We
will call such short descriptions of texts semantic kernels.</p>
      <p>2021 Copyright for this paper by its authors.</p>
      <p>Thus, there is an urgent problem of mathematical modeling of the semantic kernel. It is also
necessary to describe mathematically the algorithm for its formation from the text of the website.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Problem statement</title>
      <p>
        The simplest approach to considering the semantic kernel is to represent the source text as a bag of
terms [
        <xref ref-type="bibr" rid="ref3">2</xref>
        ] with its subsequent clustering. As it was established [
        <xref ref-type="bibr" rid="ref4">3</xref>
        ], we should distinguish three clusters
of terms: geography, product (service) and time.
      </p>
      <p>
        There are three main clustering approaches: probabilistic (statistical), graph and hierarchical
algorithms [
        <xref ref-type="bibr" rid="ref5 ref6">4-5</xref>
        ].
      </p>
      <p>Graph clustering algorithms represent the initial sample in the form of a graph, where the nodes of
the graph are the sample objects themselves, and the edges are pairwise distances between the sample
objects. The main advantage of graph clustering algorithms is clarity and ease of implementation, as
well as the ability to make various improvements using simple geometric assumptions. How are the
edges of graphs constructed? Moreover, what distinguishes one method from another?</p>
      <p>
        Most graph algorithms rely on a hypothesis about the number of clusters that should be
obtained in the end. For example, the Minimum Spanning Tree algorithm connects each new sample
point to the nearest point that is already connected to others; at the end of the algorithm, the k-1
longest edges are removed. The parameter k in this case is a hypothesis about the number of clusters.
Since it is impossible in principle to know how many event clusters can occur in a day, putting forward
any hypothesis about their number is unacceptable and will only distort the clustering results [
        <xref ref-type="bibr" rid="ref1 ref7">6</xref>
        ].
      </p>
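      <p>The Minimum Spanning Tree procedure described above can be sketched in Python as follows. This is only a minimal illustration, not the implementation used in the projects: the function name mst_clusters, the caller-supplied distance function and the toy one-dimensional data are assumptions made for the example. The sketch makes the drawback visible: the number of clusters k has to be supplied in advance.</p>
      <preformat><![CDATA[
```python
def mst_clusters(points, k, dist):
    """Build a minimum spanning tree (Prim) and cut its k-1 longest edges."""
    n = len(points)
    if n == 0:
        return []
    # Prim's algorithm: connect each new point to the nearest connected one.
    in_tree = {0}
    edges = []  # entries are (length, i, j)
    while len(in_tree) != n:
        best = min(
            (dist(points[i], points[j]), i, j)
            for i in in_tree
            for j in range(n)
            if j not in in_tree
        )
        edges.append(best)
        in_tree.add(best[2])
    # Drop the k-1 longest edges; k is the hypothesis about the cluster count.
    edges.sort()
    kept = edges[: max(0, len(edges) - (k - 1))]
    # The connected components of the remaining forest are the clusters.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(points[i])
    return sorted(groups.values())
```
]]></preformat>
      <p>With the toy points 1, 2, 3, 10, 11, 20 and the absolute difference as the distance, the partition changes completely when k is changed from 2 to 3, which is why a wrong hypothesis about k distorts the result.</p>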
      <p>Probabilistic (statistical) algorithms assume that each object is a random variable, so that a
cluster obeys some distribution law. Since the form of the distribution law is not known in advance,
the algorithm requires a hypothesis about which distribution law applies. An incorrect hypothesis can
significantly distort the clustering results.</p>
      <p>Probabilistic algorithms are therefore also ineffective for clustering terms. Firstly, it is
impossible to determine the distribution laws of terms. Secondly, the number of terms is finite and
small in comparison with the number of terms that can be used to describe a given triad (product, time
and location). Thirdly, probabilistic algorithms often also require assumptions about the number of
clusters and their initial centers, and the results of the algorithm depend strongly on these
assumptions.</p>
      <p>Hierarchical (taxonomic) clustering algorithms produce not a single partition of the sample into
disjoint clusters but a system of nested partitions. The result of such an algorithm is presented in the
form of a dendrogram (Figure 1).</p>
      <p>[Figure 1: Dendrogram of hierarchical clustering. Labels recovered from the figure: Root cluster; Empty clusters.]</p>
      <sec id="sec-2-2">
        <p>
          Among the algorithms for hierarchical clustering, there are two main types, depending on the
logic of building clusters: divisive and agglomerative [
          <xref ref-type="bibr" rid="ref1 ref7">6</xref>
          ]. Divisive, or top-down, algorithms split the original sample into smaller and smaller clusters.
Agglomerative, or bottom-up, algorithms combine objects into larger and larger clusters.
        </p>
        <p>In this study, after evaluating the advantages and disadvantages of the different approaches, we
propose to apply agglomerative hierarchical algorithms.</p>
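        <p>At the initial moment of the algorithm, let each object (term) be considered a separate cluster. The merging process then starts: at each iteration, a new cluster <inline-formula><tex-math>A_i \cup A_j</tex-math></inline-formula> is formed in place of the pair of closest clusters <inline-formula><tex-math>A_i</tex-math></inline-formula> and <inline-formula><tex-math>A_j</tex-math></inline-formula>. The quality of the algorithm depends on:</p>
        <p>• the method of determining the distance between <inline-formula><tex-math>A_i</tex-math></inline-formula> and <inline-formula><tex-math>A_j</tex-math></inline-formula>;</p>
        <p>• the method of determining the distance between the new cluster <inline-formula><tex-math>A_i \cup A_j</tex-math></inline-formula> formed in the previous step and some element <inline-formula><tex-math>A_f</tex-math></inline-formula> to be merged.</p>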
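        <p>The merging process can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the authors' code: the function name agglomerate, the choice between single linkage (min) and complete linkage (max) as the distance between clusters, and the stop_distance threshold are all introduced here for illustration only.</p>
        <preformat><![CDATA[
```python
def agglomerate(items, dist, linkage=min, stop_distance=float("inf")):
    """Bottom-up clustering: start from singletons and merge the closest pair.

    linkage=min gives single linkage, linkage=max gives complete linkage.
    Returns the final clusters and the merge history (a flat dendrogram).
    """
    clusters = [[item] for item in items]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(dist(x, y) for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > stop_distance:
            break  # the closest pair is too far apart to be merged
        merged = clusters[a] + clusters[b]
        history.append((d, merged))
        # Replace the pair A_i, A_j by their union.
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return clusters, history
```
]]></preformat>
        <p>Unlike the graph and probabilistic approaches, this sketch needs no hypothesis about the number of clusters; the nested partitions are recorded in the merge history, and the two bullet points above correspond to the dist and linkage arguments.</p>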
        <p>At the same time, we should not forget about the so-called duplication situation, in which the
same terms, viewed from the perspective of different users, describe the same triad: product, time and
geography.</p>
        <p>In addition, when describing a triad, the user relies on the concept of a plot, that is, a certain
scenario within which the user applies a given product or service.</p>
        <p>Consequently, an algorithm for identifying the semantic kernel must take into account the
duplication of terms as well as a possible storyline in which these terms take part.</p>
        <p>To highlight duplicate and plot terms automatically, it is necessary to formulate the
corresponding metric criteria for the proximity of terms.</p>
        <p>We will formulate a vector interpretation of an object (term) based on the data obtained while
processing web content, and we will introduce metric criteria for the proximity of two terms and an
algorithm for assigning terms to one of the clusters: geography, product and time.</p>
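        <p>Each term is assigned to one of three categories that reflect the type of the term. Let us denote the set of terms that form the dictionaries of the three categories by <inline-formula><tex-math>V = \{v_j\},\ j = 1, \dots, J</tex-math></inline-formula>, where <inline-formula><tex-math>v_j</tex-math></inline-formula> is a term in the dictionary. For the three categories, we denote the corresponding dictionaries by <inline-formula><tex-math>V^k = \{v^k_j\},\ k = 1, 2, 3</tex-math></inline-formula>. The same term cannot belong to the dictionaries of different categories, <inline-formula><tex-math>V^k \cap V^r = \varnothing,\ k \ne r,\ r, k \in \{1, 2, 3\}</tex-math></inline-formula>, which makes it possible to distinguish a category unambiguously on the basis of the dictionaries of terms alone. For example, <inline-formula><tex-math>V^P = \{\text{wood}, \text{metal}, \text{auto}, \dots\}</tex-math></inline-formula>, <inline-formula><tex-math>V^T = \{\text{month}, \text{day}, \text{week}, \dots\}</tex-math></inline-formula> and <inline-formula><tex-math>V^G = \{\text{latitude}, \text{longitude}, \dots\}</tex-math></inline-formula>.</p>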
        <p>The required identification accuracy is achieved due to the fact that the number of categories is
strictly limited. Terms that fall into one category are unlikely to fall into any other.</p>
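        <p>The disjointness requirement on the category dictionaries can be checked mechanically. The following Python sketch is an illustration only: the dictionary contents are the example terms given in the text, and the names DICTIONARIES, check_disjoint and categorize are hypothetical.</p>
        <preformat><![CDATA[
```python
from itertools import combinations

# Toy dictionaries built from the example terms; real ones would be larger.
DICTIONARIES = {
    "product": {"wood", "metal", "auto"},
    "time": {"month", "day", "week"},
    "geography": {"latitude", "longitude"},
}


def check_disjoint(dictionaries):
    """Raise ValueError if any term belongs to two category dictionaries."""
    for (k, v_k), (r, v_r) in combinations(dictionaries.items(), 2):
        shared = v_k.intersection(v_r)
        if shared:
            raise ValueError(f"{k} and {r} share terms: {sorted(shared)}")


def categorize(term, dictionaries=DICTIONARIES):
    """Return the single category that contains the term, or None if unknown."""
    for category, vocabulary in dictionaries.items():
        if term in vocabulary:
            return category
    return None
```
]]></preformat>
        <p>Because the dictionaries are pairwise disjoint, categorize can return at most one category for any term, for example "time" for the term "week".</p>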
        <p>Let the semantic kernel of web content be based on these dictionaries. That is, the kernel will
include only terms from these three clusters (dictionaries). Thus, our task is to build a vector of
semantic kernel by clustering existing terms from web content.</p>
        <p>An analysis based on this approach makes it possible, as a first approximation, to evaluate the
kernel described on the website; however, a unique kernel can be identified only after a complete group
of websites on a given topic has been processed.</p>
        <p>The problem with processing a flow of websites is that most of the data is non-metric. It is
therefore necessary to switch from a non-metric representation of terms to a metric one, in which a
universal criterion for the proximity of two terms in one category can be described.</p>
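        <p>In our approach, the processing of web content involves the formation of three vectors: a vector of terms <inline-formula><tex-math>\vec{n}</tex-math></inline-formula>, reflecting information about what the product is, where it is located and at what time; a reconstructed vector <inline-formula><tex-math>\vec{n}'</tex-math></inline-formula>; and a vector of terms <inline-formula><tex-math>\vec{n}''</tex-math></inline-formula> that refines the vector <inline-formula><tex-math>\vec{n}</tex-math></inline-formula>. Together, these vectors describe a semantic kernel.</p>
        <p>The vector <inline-formula><tex-math>\vec{n}</tex-math></inline-formula> always has a fixed structure: <inline-formula><tex-math>\vec{n} = (d, p, G, L, M)</tex-math></inline-formula>, where <inline-formula><tex-math>d</tex-math></inline-formula> contains the date and time (when), <inline-formula><tex-math>p \in P</tex-math></inline-formula> is the product (what), for example a term of the dictionary <inline-formula><tex-math>V^P = \{\text{wood}, \text{metal}, \text{auto}, \dots\}</tex-math></inline-formula>, <inline-formula><tex-math>g \in G</tex-math></inline-formula> is the product geography (where), <inline-formula><tex-math>l \in L</tex-math></inline-formula> is a source from the set of web content sources, and <inline-formula><tex-math>m \in M</tex-math></inline-formula> is an element of the set of related products (services) mentioned in the web content (together with what).</p>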
        <p>All sets P, G, L, M contain terms found in the web content of a given web site. Each element of
such a set takes on the value zero if the term is absent in the web content and the value one if it is
present. Set d contains the date and time of web content creation.</p>
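        <p>As an illustration of this fixed structure, the vector can be held in a small Python data class whose components are 0/1 indicators over ordered vocabularies. This is a sketch under assumptions: the names KernelVector, indicators and build_vector, and the idea of passing the four vocabularies as ordered lists, are introduced here and do not come from the described software.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class KernelVector:
    """One kernel variant n = (d, p, G, L, M)."""

    d: datetime  # date and time of web content creation (when)
    p: tuple     # 0/1 indicators over the product dictionary (what)
    g: tuple     # 0/1 indicators over the geography dictionary (where)
    l: tuple     # 0/1 indicators over the known content sources
    m: tuple     # 0/1 indicators over related products (services)


def indicators(vocabulary, present_terms):
    """Return 1 for each vocabulary term found in the web content, else 0."""
    present = set(present_terms)
    return tuple(1 if term in present else 0 for term in vocabulary)


def build_vector(created, terms, products, places, sources, related):
    """Encode the terms of one page against four fixed, ordered vocabularies."""
    return KernelVector(
        d=created,
        p=indicators(products, terms),
        g=indicators(places, terms),
        l=indicators(sources, terms),
        m=indicators(related, terms),
    )
```
]]></preformat>
        <p>Keeping the vocabularies ordered and fixed means that two vectors built from different pages have the same dimension and can be compared coordinate by coordinate.</p>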
        <p>The vector <inline-formula><tex-math>\vec{n}</tex-math></inline-formula> describes a variant of the semantic kernel. The vector <inline-formula><tex-math>\vec{n}'</tex-math></inline-formula> contains the synonymous terms that may exist in the web content. In addition, we need a third vector <inline-formula><tex-math>\vec{n}''</tex-math></inline-formula>, which contains auxiliary terms that serve to describe the main meaning of the web content.</p>
        <p>We will assume that the web content of a web resource contains several semantic kernels,
possibly close in meaning. Our task is to assemble several kernels from the terms of a bag of words by
clustering and to check them for duplication. In addition, the content management system of a web
resource can, as a rule, contain several versions of the kernels, which may also duplicate each other.
Furthermore, the search server stores several versions of the semantic kernels of this web resource in
its database. A third place where a variant of the semantic kernel of a web resource may be found is a
social network, or rather an account in a social network or marketplace. Thus, by clustering, we form a
set of kernels, and it is necessary to establish whether they are close and how close they are. This
proximity is also important because it echoes the idea of linking to a document in a search engine.</p>
        <p>Therefore, we need a metric that indicates the similarity or duplication of kernels both on the
web resource and on the Internet as a whole. The first parameter to evaluate is probably the date and
time of publication of the web content. Consider a metric based on this assumption.</p>
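        <p>To combine two duplicate kernels in one dictionary, the primary condition is that the categories of their terms coincide. The dates and times at which the terms appear in the web content should differ by no more than the threshold value <inline-formula><tex-math>d_{threshold}</tex-math></inline-formula>, which reflects the dynamics of the market. The degree of proximity of the remaining values is computed by formula (1):</p>
        <p>
          <disp-formula>
            <label>(1)</label>
            <tex-math><![CDATA[F_1(\vec{n}', \vec{n}'') = \Bigl(1 + \sum_{j=0}^{J} (p'_j - p''_j)^2\Bigr)\Bigl(1 + \sum_{j=0}^{J} (m'_j - m''_j)^2\Bigr)\Bigl(1 + \sum_{j=0}^{J} (l'_j - l''_j)^2\Bigr)\Bigl(1 + \sum_{j=0}^{J} (g'_j - g''_j)^2\Bigr) \to \min,]]></tex-math>
          </disp-formula>
          where <inline-formula><tex-math>M', M''</tex-math></inline-formula> are vectors defined over the coordinates <inline-formula><tex-math>M' \cup M''</tex-math></inline-formula>, and each element <inline-formula><tex-math>m'_j \in \{0, 1\}</tex-math></inline-formula> indicates whether the term belongs to the vector <inline-formula><tex-math>\vec{n}'</tex-math></inline-formula>; <inline-formula><tex-math>L', L''</tex-math></inline-formula> are the vectors of web content sources taken from the two kernel variants, respectively; <inline-formula><tex-math>G', G''</tex-math></inline-formula> are the vectors of geography, where the geography values of the terms must be reduced to a common dimension.
        </p>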
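        <p>A small Python sketch shows how criterion (1) can be evaluated. It assumes that formula (1) is read as a product of four factors, each equal to one plus the sum of squared coordinate differences, and that every kernel variant is given as a mapping from the component names p, m, l and g to 0/1 vectors of equal length; the function names are hypothetical.</p>
        <preformat><![CDATA[
```python
def squared_difference(u, v):
    """Sum of squared coordinate differences of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must be reduced to a common dimension first")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def f1(first, second, components=("p", "m", "l", "g")):
    """Primary proximity criterion; equals 1 when all components coincide."""
    result = 1
    for name in components:
        result *= 1 + squared_difference(first[name], second[name])
    return result
```
]]></preformat>
        <p>Two identical variants give the minimum value 1. If the variants differ in one product coordinate and in two geography coordinates, the value is (1 + 1)(1 + 2) = 6, so any disagreement in one component multiplies the penalty contributed by the others.</p>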
        <p>Obviously, in most cases the vectors differ, and complete coincidence, in which the criterion
equals one, is not observed. This problem can be solved either by obtaining expert estimates of the
thresholds for combining terms or by deriving the threshold value analytically.</p>
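        <p>The following analytical method for calculating the error coefficient <inline-formula><tex-math>\alpha</tex-math></inline-formula> is proposed, based on the reconstructed vector <inline-formula><tex-math>\vec{n}'</tex-math></inline-formula> of other terms. The set of such terms should include those that occur in the web content but do not describe the product or service directly; they only clarify its description, in particular by indicating secondary features, characteristics, properties, etc.</p>
        <p>
          For the vectors of other terms, we form a set of dictionaries of synonymous terms mentioned on a given site, <inline-formula><tex-math>O' \cup O''</tex-math></inline-formula>, on the basis of which we form vectors <inline-formula><tex-math>O', O''</tex-math></inline-formula> such that
          <disp-formula>
            <tex-math><![CDATA[O'_i = \begin{cases} 0, & o_i \notin \vec{n}', \\ 1, & o_i \in \vec{n}', \end{cases}]]></tex-math>
          </disp-formula>
          whence the criterion for the proximity of the vectors takes the form (2):
          <disp-formula>
            <label>(2)</label>
            <tex-math><![CDATA[F_{tertiary} = \sum_{j=0}^{J} (O'_j - O''_j)^2 \to \min.]]></tex-math>
          </disp-formula>
        </p>
        <p>
          Based on the coefficient <inline-formula><tex-math>F_{tertiary}</tex-math></inline-formula>, <inline-formula><tex-math>\alpha</tex-math></inline-formula> can be obtained by (3):
          <disp-formula>
            <label>(3)</label>
            <tex-math><![CDATA[\alpha = \frac{\rho(\vec{n}_U)}{F_{tertiary}},]]></tex-math>
          </disp-formula>
          where the vector <inline-formula><tex-math>\vec{n}_U = \vec{n}' \cup \vec{n}''</tex-math></inline-formula> and <inline-formula><tex-math>\rho(\vec{n}_U)</tex-math></inline-formula> is the length of the vector.
        </p>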
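        <p>The meaning of the coefficient <inline-formula><tex-math>\alpha</tex-math></inline-formula> is as follows: the greater the degree of similarity according to the tertiary criterion, the more likely it is that the terms describe the same product or service, and the less strict the requirement for similarity according to the primary and secondary criteria needs to be. However, these two criteria alone are not enough to assess the degree of similarity of terms, since the information about the product is quite often incomplete. The conditions are interpreted as follows:</p>
        <p>• <inline-formula><tex-math>F_1 \le \alpha</tex-math></inline-formula>: the larger the vector is, the less similar the terms are according to the tertiary criterion;</p>
        <p>• <inline-formula><tex-math>F_2 \le \alpha F_1</tex-math></inline-formula> means that the similarity for the reconstructed vector <inline-formula><tex-math>\vec{n}'</tex-math></inline-formula> should be greater (due to the large number of coordinates) than for the original vector <inline-formula><tex-math>\vec{n}</tex-math></inline-formula> within the error <inline-formula><tex-math>\alpha</tex-math></inline-formula>.</p>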
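        <p>The tertiary criterion and the error coefficient can be sketched in Python as follows. Several points are assumptions of this example rather than statements of the method: the union of the two 0/1 vectors is taken coordinate-wise, the length of a vector is taken to be its Euclidean norm, and the coefficient is set to infinity when the two vectors coincide and the tertiary criterion is zero. The function names are hypothetical.</p>
        <preformat><![CDATA[
```python
import math


def f_tertiary(o_first, o_second):
    """Sum of squared differences between two 0/1 vectors of other terms."""
    return sum((a - b) ** 2 for a, b in zip(o_first, o_second))


def alpha(o_first, o_second):
    """Error coefficient: length of the union vector divided by F_tertiary."""
    # Assumption: the union of two indicator vectors is the coordinate-wise max.
    union = [max(a, b) for a, b in zip(o_first, o_second)]
    # Assumption: the vector length is the Euclidean norm.
    length = math.sqrt(sum(x * x for x in union))
    distance = f_tertiary(o_first, o_second)
    # Assumption: identical vectors impose no restriction at all.
    return math.inf if distance == 0 else length / distance
```
]]></preformat>
        <p>The smaller the tertiary distance between the two vectors of other terms, the larger the coefficient and the looser the bound that the primary criterion has to satisfy, which matches the interpretation given above.</p>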
        <p>
          The algorithm for identifying a cluster of terms describing a unique kernel in the web stream is
shown in Figure 2 [
          <xref ref-type="bibr" rid="ref8">7</xref>
          ].
        </p>
        <p>As a result, based on the three vectors that describe the semantic kernel and on the introduced
proximity metrics, it is possible to obtain a single semantic kernel of the web resource as a whole.
However, it should be understood that the kernel is formed for a specific date and time. In addition,
the kernel has a history of earlier versions (a retrospective).</p>
        <p>Let us first consider how the algorithm for constructing the semantic kernel of a web resource
will function, taking into account that the web content itself may contain several kernel variants.
Moreover, if the content management system contains several variants of the web content, the number
of kernel variants increases accordingly.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed algorithm</title>
      <p>For each term (figure 2), a set of potential duplicates is formed from the terms of the same
category (product, time or geography). Then, within one set of duplicates, the terms are checked
pairwise for similarity. If two terms are not similar, the compared term is excluded from the list of
potential duplicates. Otherwise, the vectors of the two terms are combined into a new vector according
to the following rule: if two terms describe one product or service and differ only slightly, their
vectors are combined into a more accurate description of the product or service.</p>
      <p>After considering one term within its set of duplicates, in case a cluster is selected, all vectors of
terms included in the cluster are replaced by a cluster vector, which eliminates the formation of
duplicates.</p>
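      <p>The loop described above can be sketched in Python. This is a simplified illustration, not the authors' implementation: the similarity test and the rule for combining two vectors are passed in as caller-supplied functions (here called similar and combine, both hypothetical), and all input vectors are assumed to belong to one category already.</p>
      <preformat><![CDATA[
```python
def merge_duplicates(vectors, similar, combine):
    """Greedy duplicate elimination among the term vectors of one category."""
    pending = list(vectors)
    unique = []
    while pending:
        current = pending.pop(0)  # select the next unreviewed term
        remaining = []
        for candidate in pending:
            if similar(current, candidate):
                # The combined (cluster) vector replaces both originals.
                current = combine(current, candidate)
            else:
                # Not a duplicate of this term; review it again later.
                remaining.append(candidate)
        pending = remaining
        unique.append(current)
    return unique
```
]]></preformat>
      <p>For example, if vectors are modeled as sets of terms, similar tests for a shared term and combine takes the union, then the input {a, b}, {b, c}, {x}, {c, d} collapses into the two clusters {a, b, c, d} and {x}.</p>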
      <p>Highlighting terms in a story differs from searching for a unique term among duplicates only in
the rule for checking the time at which a term appears in the web content. Obviously, the dates should
be consistent within the same margin of error.</p>
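      <p>
        <disp-formula>
          <label>(5)</label>
          <tex-math><![CDATA[F = \begin{cases} p' = p'', \\ |d' - d''| \le d_{threshold}, \\ F_1 \le \alpha, \\ \alpha F_1 \ge F_2. \end{cases}]]></tex-math>
        </disp-formula>
      </p>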
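      <p>Read as a conjunction of four conditions (same product, dates within the threshold, the primary criterion bounded by the error coefficient, and the secondary criterion bounded by the error coefficient times the primary one), the combined criterion can be sketched in Python as a single predicate. This reading of the criterion, the function name is_duplicate and the representation of a kernel variant as a mapping with the keys p and d are assumptions of the example.</p>
      <preformat><![CDATA[
```python
from datetime import datetime, timedelta


def is_duplicate(first, second, f1_value, f2_value, alpha, d_threshold):
    """Return True when all four conditions of the combined criterion hold."""
    return (
        first["p"] == second["p"]                         # same product
        and abs(first["d"] - second["d"]) <= d_threshold  # dates are close
        and f1_value <= alpha                             # primary criterion
        and f2_value <= alpha * f1_value                  # secondary criterion
    )
```
]]></preformat>
      <p>The criteria values and the coefficient are passed in precomputed, so the predicate itself stays independent of how they are calculated.</p>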
      <p>[Figure 2: Flowchart of the algorithm for identifying a cluster of terms that describes a unique kernel. Box labels recovered from the figure: Select a term in one category; Is there a potential duplicate?; Calculate alpha coefficient and first and second criteria; Criteria are met?; Combine vectors of terms into one; Exclude original vectors from consideration; Add a combined vector; Exclude a potential duplicate from consideration; Highlight the term in question as unique and exclude from consideration; Are there any unreviewed terms in the category?; All duplicates are excluded.]</p>
    </sec>
    <sec id="sec-4">
      <title>4. Software designing</title>
      <p>
        Based on the requirements (figure 2), we proceed to design the software, taking into account that
there are two main roles, user and administrator, which share the functionality. Using UML [
        <xref ref-type="bibr" rid="ref8">7</xref>
        ], the following use case diagram was designed (figure 3).
      </p>
      <p>[Figure 3: Use case diagram. Labels recovered from the figure: Scanning web content; Stop words; Normalization; Lemmatization; «extends» relationships.]</p>
      <p>The main idea is to represent the final result of the algorithm in the form of an RDF schema. It is a convenient tool that includes the basic components of the vector <inline-formula><tex-math>\vec{n}</tex-math></inline-formula>. The operating principle of the software is shown in the diagrams in Figures 4 and 5.</p>
      <p>As we can see, the software takes the web content of the web resource as its input. Next, the
primary processing of the web texts is performed (lemmatization and normalization). Then our
clustering algorithm is applied and, as a result, an XML file containing an RDF schema is produced.</p>
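      <p>The last step, writing the clustered kernel to an XML file, might look like the following Python sketch based on the standard xml.etree.ElementTree module. It produces plain RDF/XML statements about one website rather than a full RDF schema, and the vocabulary namespace http://example.org/kernel#, the property names and the function name kernel_to_rdf are hypothetical placeholders chosen for the example.</p>
      <preformat><![CDATA[
```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
KERNEL_NS = "http://example.org/kernel#"  # hypothetical vocabulary


def kernel_to_rdf(site_url, created, clusters):
    """Serialize kernel clusters (category -> list of terms) as RDF/XML text.

    Category names become element names, so they must be valid XML names.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("k", KERNEL_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": site_url}
    )
    # The kernel is valid for a specific date, so the date is stored with it.
    ET.SubElement(description, f"{{{KERNEL_NS}}}created").text = created
    for category, terms in clusters.items():
        for term in terms:
            ET.SubElement(description, f"{{{KERNEL_NS}}}{category}").text = term
    return ET.tostring(root, encoding="unicode")
```
]]></preformat>
      <p>Each term becomes one property of the website resource, named after its cluster (product, geography or time), so that the three clusters of the kernel remain distinguishable in the resulting file.</p>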
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>Let us look at the potential results of applying the proposed algorithm on the example of web
content on the topic of astrology and psychology. Figure 6 shows an example of such content,
followed by highlighting a vocabulary of terms based on the frequency of their occurrence in the text.</p>
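      <p>Building a vocabulary of terms from their frequency of occurrence can be illustrated with a few lines of Python. The sketch is deliberately simplified: the regular-expression tokenizer handles only lowercase Latin letters, the stop-word list is a small illustrative subset, lemmatization is omitted, and the names STOP_WORDS and term_frequencies are hypothetical.</p>
      <preformat><![CDATA[
```python
import re
from collections import Counter

STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in"}  # illustrative subset


def term_frequencies(text, stop_words=STOP_WORDS):
    """Count how often each non-stop-word token occurs in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token not in stop_words)
```
]]></preformat>
      <p>The most frequent terms returned by the most_common method of the counter are the natural candidates for the kernel clusters.</p>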
      <p>
        The web content of this site has changed three times over the last ten years. As the Google
Analytics data show, each such change was accompanied by a surge in user activity. In the example
(figure 7), the semantic kernel was analyzed at the stage of its first change. The main task of the
kernel was to establish, in the minds of potential users of the site and of the Internet in general,
that the CelestialTiming site [
        <xref ref-type="bibr" rid="ref9">8</xref>
        ] logo is associated with astrology and psychology. Subsequent changes to the kernel served the
next two stages of the marketing of this web project.
      </p>
      <p>The first stage is to promote the services of this website, namely the construction of a
psychological portrait based on the user's personal data.</p>
      <p>The second stage is to promote a new service, the school of psychology and astrology. The
semantic kernel of the website is changed in accordance with these stages.</p>
      <p>The list of keywords</p>
      <p>Unfortunately, the creators of the web project overlooked several requirements. First, kernels
must be linked to one another when they are changed. Second, a kernel should be changed not only on
the website itself but also in the friendly links that point to this kernel. Third, each kernel has its
own life cycle, which differs from those of the other kernels; this can be seen, for example, in
figure 7. Fourth, because kernels age, they need to be changed more often than the creators of this
project change them. Fifth, all kernel changes must fit within a single marketing strategy.</p>
      <p>All these comments were passed on to the developers of this web project.</p>
      <p>Subsequent research on this topic includes:
• development of an alternative approach to the description of the semantic kernel. A promising
idea is to represent the kernel in matrix form using the principles of permutations based on expert
assessments of the relationships between terms. In this case, the matrix contains the same elements as
the search engine matrix, so a vector space model can be used to determine the proximity of kernels.
It is also promising to use a genetic algorithm to obtain an optimal semantic kernel from a set of
existing or prospective ones;</p>
      <p>
        • development and testing of software in the JavaScript language on the WordPos platform [
        <xref ref-type="bibr" rid="ref10">9</xref>
        ].
The software should be designed as a component that can subsequently be integrated into existing
content management systems such as WordPress and OpenCart;
      </p>
      <p>• analysis of profiles in social networks in order to identify semantic kernels and, on their
basis, to search for potential buyers of a product or service.</p>
    </sec>
    <sec id="sec-6">
      <title>7. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>6. Future work</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Orekhov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Godlevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Orekhova</surname>
          </string-name>
          ,
          <article-title>Theoretical fundamentals of search engine optimization based on machine learning</article-title>
          , in:
          <source>Proceedings of the 13th International Conference on ICT in Education, Research and Industrial Applications. Integration, Harmonization and Knowledge Transfer, ICTERI 2017</source>
          , CEUR-WS,
          <year>2017</year>
          , Volume
          <volume>1844</volume>
          , pp.
          <fpage>23</fpage>
          -
          <lpage>32</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Goldberg</surname>
          </string-name>
          .
          <source>Neural Network Methods in Natural Language Processing (Synthesis Lectures on Human Language Technologies)</source>
          .
          Morgan &amp; Claypool Publ., USA
          ,
          <year>2017</year>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Orekhov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Malyhon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Goncharenko</surname>
          </string-name>
          ,
          <article-title>Using Internet News Flows as Marketing Data Component</article-title>
          , in:
          <source>Proceedings of the 4th International Conference on Computational Linguistics and Intelligent Systems. Volume 1: Main Conference, COLINS 2020</source>
          , CEUR-WS,
          <year>2020</year>
          , Volume
          <volume>2604</volume>
          , pp.
          <fpage>358</fpage>
          -
          <lpage>373</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Berry</surname>
          </string-name>
          .
          <source>Survey of Text Mining: Clustering, Classification and Retrieval</source>
          . Springer, USA,
          <year>2004</year>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>E.</given-names>
            <surname>Krupka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Tishby</surname>
          </string-name>
          .
          <article-title>Generalization from Observed to Unobserved Features by Clustering</article-title>
          .
          <source>Journal of Machine Learning Research</source>
          ,
          <year>2008</year>
          , Volume
          <volume>9</volume>
          , pp.
          <fpage>339</fpage>
          -
          <lpage>370</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jain</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Murty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Flynn</surname>
          </string-name>
          .
          <article-title>Data clustering: A review</article-title>
          .
          <source>ACM Computing Surveys</source>
          ,
          <year>1999</year>
          , Volume
          <volume>31</volume>
          , No.
          <issue>3</issue>
          , pp.
          <fpage>264</fpage>
          -
          <lpage>323</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B.</given-names>
            <surname>Rumpe</surname>
          </string-name>
          .
          <source>Agile modeling with UML</source>
          . Springer, Germany,
          <year>2017</year>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [8] Web content source:
          <article-title>Psychological self portrait is the path of self discovery based on celestial timing</article-title>
          / Celestialtiming.com,
          <year>2021</year>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lengstorf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wald</surname>
          </string-name>
          .
          <source>Pro PHP and jQuery</source>
          . APress, USA,
          <year>2016</year>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>