<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Method for Documents Rubrication and Analysis Based on Fuzzy Relations of Difference between Their Syntactical Characteristics</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>V. Borisov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>M. Dli</string-name>
          <email>midli@imail.ru</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>P. Kozlov</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>The Branch of National Research University “Moscow Power Engineering Institute” in Smolensk</institution>
          ,
          <country country="RU">Russia</country>
        </aff>
      </contrib-group>
      <fpage>164</fpage>
      <lpage>173</lpage>
      <abstract>
        <p>The paper formulates the problem of rubricating and analyzing electronic unstructured documents and proposes a method for solving it. Applying the method yields a tree structure of the rubric field based on fuzzy relations of difference between the syntactical characteristics of the rubricated documents. Document analysis determines the fuzzy correspondence between the syntactical characteristics of a document and the centers of the detected clusters, moving sequentially from the root to the leaves of the constructed fuzzy decision tree. Computational experiments have shown that the proposed method reduces the number of erroneously rubricated documents compared with probabilistic and neural network methods.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>The "Electronic government" program envisages the active introduction of
information and communication technologies into the activities of public authorities.
Its main goal is to increase the efficiency of public administration and to
develop partnerships with civil society and business.</p>
      <p>A key task of program implementation is to develop Internet services, which
provide information support and a variety of services in electronic form. Their use can
improve the quality and accessibility of state and municipal services to citizens and
businesses, reduce the cost of their provision and increase the labor productivity in
institutions of government at various levels.</p>
      <p>One of the ways to use information and communication technology to solve this
task is to automate the process of analyzing electronic appeals (applications,
complaints, suggestions) of individuals and legal entities arriving at official websites
and portals of authorities and local self-government.</p>
      <p>Text rubrication plays an important role in the automatic analysis of
incoming electronic appeals. It consists of distributing the appeals among thematic
rubrics that correspond to the areas of activity of the departments responsible for
processing them and preparing a response.</p>
      <p>Today, there are many methodological approaches to the classification of
documents of various types. The choice of a specific method is directly determined by
the characteristics of the rubrication objects (i.e. documents received by public
authorities).</p>
      <p>The analysis has revealed the following specific characteristics of electronic
documents received on official websites and portals of public authorities, which must
be taken into account when choosing a rubrication method:</p>
      <list list-type="bullet">
        <list-item><p>the relatively small size of electronic documents, which impedes their statistical analysis;</p></list-item>
        <list-item><p>the absence of markup in these documents, which complicates identifying their structure and extracting the information relevant to the analysis;</p></list-item>
        <list-item><p>the presence of grammatical and syntactical errors in electronic messages, which necessitates additional processing;</p></list-item>
        <list-item><p>the nonstationarity of the thesaurus (the composition and relevance of the rubric words);</p></list-item>
        <list-item><p>dynamic changes in the legislative and regulatory framework, which can change the distribution of tasks between departments;</p></list-item>
        <list-item><p>the description of several problems in one message (answers may have to be prepared by several specialists or even several departments).</p></list-item>
      </list>
      <p>
        These features significantly limit the applicability of methods based on
probabilistic and statistical approaches to rubric generation and
electronic text analysis [
        <xref ref-type="bibr" rid="ref1 ref27 ref6">1, 6, 27</xref>
        ].
      </p>
      <p>This makes it urgent to develop a new method for rubricating electronic
unstructured documents that takes into account the specific features of text
messages received on official websites and portals of public authorities.</p>
      <p>2 Related works</p>
      <p>At present, there are a variety of methods, models and algorithms for the
classification of text documents written in natural language. However, each of them
has its applicability conditions determined by the statement of the rubrication
problem.</p>
      <p>
        It was shown in articles [
        <xref ref-type="bibr" rid="ref10 ref11 ref12">10, 11, 12</xref>
        ] that the choice of a specific classification
(rubrication) method is determined by such characteristics as the size of the analyzed
document, the degree of rubric thesaurus intersection and the amount of accumulated
statistical information.
      </p>
      <p>Machine learning is a well-known approach to classifying unstructured
documents. It offers the use of artificial intelligence methods that can learn from a set
of precedents.</p>
      <p>
        Artificial neural networks are one of the machine learning methods that have
been successfully used to solve various classification problems. Their application
to text classification is addressed in [
        <xref ref-type="bibr" rid="ref17 ref20 ref21 ref26 ref5">5, 17, 20, 21, 26</xref>
        ]. The main limitation of this approach is that it requires a large amount of
statistical data for training.
      </p>
      <p>
        Another machine learning method that can be used to classify text documents is
fuzzy decision trees. They are based on learning by examples, while the rules are
presented in the form of a hierarchical sequential structure. The issues of using fuzzy
decision trees are considered in the works [
        <xref ref-type="bibr" rid="ref13 ref15 ref16 ref2 ref22 ref23 ref25 ref26 ref9">2, 9, 13, 15, 16, 22, 23, 25, 26</xref>
          ].
      </p>
      <p>3 Statement of the rubrication problem</p>
      <sec id="sec-1-1">
        <title>Initial data</title>
        <p>
          1. For the formalized representation of electronic unstructured documents (EUD),
the set of syntactical characteristics is unified in advance. These
characteristics are extracted by a conventional syntactic analyzer (parser), for example,
LinkGrammar [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]:
        </p>
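        <p><disp-formula><tex-math>S = \{ s_n \mid n = 1..N \},</tex-math></disp-formula>
where, in the typical case, N = 5: s1 – the root word or the predicate; s2 – the subject;
s3 – the adverbial modifier; s4 – the object under the action; s5 – the predicate.</p>
        <p>2. There is a set of EUD
<disp-formula><tex-math>V = \{ V_k \mid k = 1..K \},</tex-math></disp-formula>
in which every document <inline-formula><tex-math>V_k</tex-math></inline-formula> is presented by a set of its relevant words:
<disp-formula><tex-math>\forall k = 1..K: \quad V_k = \{ v_{l_k}^{(k)} \mid l_k = 1..L_k \},</tex-math></disp-formula>
where <inline-formula><tex-math>v_{l_k}^{(k)}</tex-math></inline-formula> – a relevant word of the EUD, <inline-formula><tex-math>L_k</tex-math></inline-formula> – the number of words in the k-th EUD.</p>
        <p>3. The set of EUD V is presented as a set SD of formalized documents:
<disp-formula><tex-math>SD = \{ SD_k \mid k = 1..K \},</tex-math></disp-formula>
in which a formalized document <inline-formula><tex-math>SD_k</tex-math></inline-formula> corresponds to each EUD:</p>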
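        <p>
          <disp-formula><tex-math>\forall k = 1..K: \quad SD_k = \{ SD_n^{(k)} \mid n = 1..N \},</tex-math></disp-formula>
where <inline-formula><tex-math>SD_n^{(k)}</tex-math></inline-formula> – the set of words from EUD <inline-formula><tex-math>V_k</tex-math></inline-formula> corresponding to the syntactical
parameter <inline-formula><tex-math>s_n</tex-math></inline-formula> [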
          <xref ref-type="bibr" rid="ref7">7</xref>
          ].
        </p>
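        <p>The formalized document SDk defined above can be pictured as a mapping from each syntactical characteristic to the set of words assigned to it. The following is only a minimal sketch: the names CHARACTERISTICS, FormalizedDocument and formalize are hypothetical, and the (word, characteristic) pairs are assumed to come from some external parser whose interface is not modelled here.</p>
        <preformat><![CDATA[
```python
from dataclasses import dataclass, field

# Labels of the syntactical characteristics s1..s5 (N = 5 in the typical case).
CHARACTERISTICS = ("s1", "s2", "s3", "s4", "s5")


@dataclass
class FormalizedDocument:
    """SD_k: for each characteristic s_n, the set of words SD_n^(k)."""
    words: dict = field(
        default_factory=lambda: {s: set() for s in CHARACTERISTICS}
    )

    def add(self, characteristic, word):
        """Assign a word to one syntactical characteristic."""
        if characteristic not in self.words:
            raise KeyError(f"unknown syntactical characteristic: {characteristic}")
        self.words[characteristic].add(word.lower())


def formalize(tagged_words):
    """Build SD_k from (word, characteristic) pairs.

    ASSUMPTION: the pairs are produced by a parser; no real parser API is used.
    """
    doc = FormalizedDocument()
    for word, characteristic in tagged_words:
        doc.add(characteristic, word)
    return doc


if __name__ == "__main__":
    sd = formalize([("Administration", "s2"), ("repair", "s1"), ("road", "s4")])
    print(sd.words)
```
]]></preformat>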
      </sec>
      <sec id="sec-1-2">
        <title>Required</title>
        <p>To propose a method for the rubrication and analysis of EUD based on hierarchical
clustering that uses fuzzy relations between the syntactical characteristics of the
rubricated documents.</p>
        <p>4 Method description</p>
        <p>The proposed method for rubrication and analysis of EUD includes the steps
discussed below.</p>
        <p>Step 1. Specify the parameters that determine the degree of correspondence of the
formalized documents to the syntactical characteristics.</p>
        <p>For each formalized document <inline-formula><tex-math>SD_k</tex-math></inline-formula> (k = 1..K), a set of parameter values
<disp-formula><tex-math>\widetilde{SD}_k = \{ (\mu_{SD_n^{(k)}} / s_n) \mid n = 1..N \}</tex-math></disp-formula>
is specified to assess the degree of its correspondence
to each of the syntactical characteristics.</p>
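        <p>Step 2. Determine the degree of difference between all pairs of formalized
documents for all syntactical characteristics.</p>
        <p>Consider a pair of documents <inline-formula><tex-math>SD_k</tex-math></inline-formula> and <inline-formula><tex-math>SD_l</tex-math></inline-formula>, k, l = 1..K:</p>
        <p><disp-formula><tex-math>SD_k = \{ SD_n^{(k)} \mid n = 1..N \} \quad \text{and} \quad SD_l = \{ SD_n^{(l)} \mid n = 1..N \}.</tex-math></disp-formula></p>
        <p>To compare these documents, sets of parameter values are specified for all syntactical
characteristics:</p>
        <p><disp-formula><tex-math>\widetilde{SD}_k = \{ (\mu_{SD_n^{(k)}} / s_n) \mid n = 1..N \} \quad \text{and} \quad \widetilde{SD}_l = \{ (\mu_{SD_n^{(l)}} / s_n) \mid n = 1..N \}.</tex-math></disp-formula></p>
        <p>As a result, a set of parameter values is formed that characterizes the
degrees of difference between documents <inline-formula><tex-math>SD_k</tex-math></inline-formula> and <inline-formula><tex-math>SD_l</tex-math></inline-formula> for all syntactical
characteristics:
<disp-formula><tex-math>d(SD_k, SD_l) = \{ (\mu_d(SD_n^{(k)}, SD_n^{(l)}) / s_n) \mid n = 1..N \},</tex-math></disp-formula>
where, for example,
<disp-formula><tex-math>\mu_d(SD_n^{(k)}, SD_n^{(l)}) = | \mu_{SD_n^{(k)}} - \mu_{SD_n^{(l)}} |.</tex-math></disp-formula></p>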
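        <p>Note. The obtained set of values <inline-formula><tex-math>d(SD_k, SD_l)</tex-math></inline-formula> can be presented in the form of a
fuzzy set and interpreted as a fuzzy difference between the fuzzy sets
<inline-formula><tex-math>\widetilde{SD}_k = \{ (\mu_{SD_n^{(k)}} / s_n) \mid n = 1..N \}</tex-math></inline-formula> and
<inline-formula><tex-math>\widetilde{SD}_l = \{ (\mu_{SD_n^{(l)}} / s_n) \mid n = 1..N \}</tex-math></inline-formula>.
The syntactical characteristics from <inline-formula><tex-math>S = \{ s_n \mid n = 1..N \}</tex-math></inline-formula> are their carriers, and the degrees of
correspondence of the documents to these characteristics, <inline-formula><tex-math>\mu_{SD_n^{(k)}}</tex-math></inline-formula> and <inline-formula><tex-math>\mu_{SD_n^{(l)}}</tex-math></inline-formula>,
are the degrees of membership for the fuzzy set <inline-formula><tex-math>d(SD_k, SD_l)</tex-math></inline-formula>.</p>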
        <p>Example. Consider the comparison of documents <inline-formula><tex-math>SD_k</tex-math></inline-formula> and <inline-formula><tex-math>SD_l</tex-math></inline-formula> with the
following parameter values:</p>
        <p><disp-formula><tex-math>\widetilde{SD}_k = \{ (0.7 / s_1), (0.5 / s_2), (0.3 / s_3), (0.3 / s_4), (0.8 / s_5) \}</tex-math></disp-formula> and</p>
        <p><disp-formula><tex-math>\widetilde{SD}_l = \{ (0.1 / s_1), (0.9 / s_2), (0.2 / s_3), (0.6 / s_4), (0.4 / s_5) \}.</tex-math></disp-formula></p>
        <p>As a result, the following set of parameter values, characterizing the degree of
difference between the documents for each syntactical characteristic, is
formed:</p>
        <p><disp-formula><tex-math>d(SD_k, SD_l) = \{ (0.6 / s_1), (0.4 / s_2), (0.1 / s_3), (0.3 / s_4), (0.4 / s_5) \}.</tex-math></disp-formula></p>
        <p>The degrees of difference for all syntactical
characteristics are calculated for all pairs of formalized documents <inline-formula><tex-math>SD_k</tex-math></inline-formula> and <inline-formula><tex-math>SD_l</tex-math></inline-formula>,
k, l = 1..K.</p>
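        <p>The example of Step 2 can be reproduced with a few lines of Python. This is a sketch that assumes the absolute-difference form of the degree of difference given above as an example; the function name fuzzy_difference and the list representation of the membership degrees over s1..s5 are illustrative choices, not part of the method's definition.</p>
        <preformat><![CDATA[
```python
# Membership degrees of the two documents over s1..s5
# (values taken from the example in the text).
SD_K = [0.7, 0.5, 0.3, 0.3, 0.8]
SD_L = [0.1, 0.9, 0.2, 0.6, 0.4]


def fuzzy_difference(doc_a, doc_b):
    """Per-characteristic degree of difference |mu_a - mu_b|."""
    if len(doc_a) != len(doc_b):
        raise ValueError("documents must cover the same characteristics")
    return [abs(a - b) for a, b in zip(doc_a, doc_b)]


if __name__ == "__main__":
    diff = fuzzy_difference(SD_K, SD_L)
    # Rounded for display only; prints [0.6, 0.4, 0.1, 0.3, 0.4]
    print([round(d, 2) for d in diff])
```
]]></preformat>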
        <p>Step 3. Form a matrix of differences between all pairs of formalized
documents.</p>
        <p>The results of the previous step make it possible to compose a matrix of differences
between all pairs of documents.</p>
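        <p>Figure 1 shows the form of such a matrix.</p>
        <table-wrap>
          <table>
            <thead>
              <tr><th/><th>SD1</th><th>…</th><th>SDl</th><th>…</th><th>SDK</th></tr>
            </thead>
            <tbody>
              <tr><td>SD1</td><td>d(SD1, SD1)</td><td>…</td><td>d(SD1, SDl)</td><td>…</td><td>d(SD1, SDK)</td></tr>
              <tr><td>…</td><td>…</td><td>…</td><td>…</td><td>…</td><td>…</td></tr>
              <tr><td>SDk</td><td>d(SDk, SD1)</td><td>…</td><td>d(SDk, SDl)</td><td>…</td><td>d(SDk, SDK)</td></tr>
              <tr><td>…</td><td>…</td><td>…</td><td>…</td><td>…</td><td>…</td></tr>
              <tr><td>SDK</td><td>d(SDK, SD1)</td><td>…</td><td>d(SDK, SDl)</td><td>…</td><td>d(SDK, SDK)</td></tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Step 4. Perform fuzzy hierarchical clustering of the documents based on the fuzzy relations of
difference between all pairs of formalized documents for all syntactical
characteristics.</p>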
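        <p>The parameters <inline-formula><tex-math>\mu_d(SD_n^{(k)}, SD_n^{(l)})</tex-math></inline-formula> are used as the parameters for the fuzzy hierarchical
clustering of formalized documents; their values characterize the results of the pairwise
comparison of <inline-formula><tex-math>SD_n^{(k)}</tex-math></inline-formula> and <inline-formula><tex-math>SD_n^{(l)}</tex-math></inline-formula> separately for each of the syntactical characteristics
<inline-formula><tex-math>\{ s_n \mid n = 1..N \}</tex-math></inline-formula>.</p>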
        <p>
          It is reasonable to use well-known agglomerative methods as a base for the
hierarchical clustering procedure [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ].
        </p>
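        <p>Steps 3 and 4 can be sketched in plain Python as follows. The text does not fix how the per-characteristic differences are aggregated or which linkage rule is used, so this sketch makes three loudly labelled assumptions: the fuzzy difference is collapsed to its mean value, clusters are merged by average linkage, and a cluster center is the mean membership degree per characteristic. All function names are hypothetical, and the sample documents are invented for illustration.</p>
        <preformat><![CDATA[
```python
from itertools import combinations


def fuzzy_difference(doc_a, doc_b):
    """Per-characteristic degrees of difference |mu_a - mu_b|."""
    return [abs(a - b) for a, b in zip(doc_a, doc_b)]


def difference_matrix(docs):
    """K x K matrix whose cell [k][l] holds the fuzzy set d(SD_k, SD_l)."""
    return [[fuzzy_difference(a, b) for b in docs] for a in docs]


def scalar_distance(doc_a, doc_b):
    """ASSUMPTION: collapse the fuzzy difference to its mean value."""
    diff = fuzzy_difference(doc_a, doc_b)
    return sum(diff) / len(diff)


def agglomerate(docs, n_clusters):
    """Average-linkage agglomerative clustering (ASSUMED linkage rule).

    Returns a list of clusters, each a list of document indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(docs))]

    def linkage(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(scalar_distance(docs[i], docs[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        # Find the two closest clusters and merge them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return clusters


def cluster_center(docs, members):
    """ASSUMPTION: the center is the mean membership per characteristic."""
    n = len(docs[0])
    return [sum(docs[i][j] for i in members) / len(members) for j in range(n)]


if __name__ == "__main__":
    # Hypothetical membership degrees of four documents over s1..s5.
    sample = [
        [0.7, 0.5, 0.3, 0.3, 0.8],
        [0.8, 0.4, 0.3, 0.2, 0.9],
        [0.1, 0.9, 0.2, 0.6, 0.4],
        [0.2, 0.8, 0.1, 0.7, 0.3],
    ]
    for members in agglomerate(sample, 2):
        print(members, [round(c, 2) for c in cluster_center(sample, members)])
```
]]></preformat>
        <p>The sketch stops at a requested number of clusters; recording the sequence of merges instead would give the tree structure that the method builds.</p>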
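        <p>As a result of hierarchical clustering, the clusters <inline-formula><tex-math>Cl = \{ Cl_i \mid i = 1..I \}</tex-math></inline-formula> are detected. Let
the centers of these clusters be <inline-formula><tex-math>\{ \widetilde{Cl}_i \mid i = 1..I \}</tex-math></inline-formula>, where <inline-formula><tex-math>\widetilde{Cl}_i = \{ (\mu_{Cl_n^{(i)}} / s_n) \mid n = 1..N \}</tex-math></inline-formula>.</p>
        <p>The detected clusters <inline-formula><tex-math>Cl = \{ Cl_i \mid i = 1..I \}</tex-math></inline-formula> correspond to the rubrics:</p>
        <p><disp-formula><tex-math>R = \{ R_i \mid i = 1..I \},</tex-math></disp-formula>
where, for all i = 1..I,</p>
        <p>
          <disp-formula><tex-math>R_i = \{ \langle t_{ji}, \{ (w_{jin} / s_n) \mid n = 1..N \} \rangle \mid j = 1..J_i \},</tex-math></disp-formula>
<inline-formula><tex-math>t_{ji}</tex-math></inline-formula> – the j-th relevant word in the rubric <inline-formula><tex-math>R_i</tex-math></inline-formula>,
<inline-formula><tex-math>w_{jin} \in [0, 1]</tex-math></inline-formula> – the degree of correspondence of the word <inline-formula><tex-math>t_{ji}</tex-math></inline-formula> to
the syntactical characteristic <inline-formula><tex-math>s_n</tex-math></inline-formula> in the rubric <inline-formula><tex-math>R_i</tex-math></inline-formula>.
        </p>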
        <p>Thus, the result of the hierarchical clustering of documents is a tree structure
of the rubric field, formed on the basis of the fuzzy relations between the syntactical
characteristics of the rubricated documents.</p>
        <p>Step 5. Documents analysis.</p>
        <p>The proposed analysis procedure compares the
degrees of correspondence <inline-formula><tex-math>\widetilde{SD}_k</tex-math></inline-formula> of the analyzed document <inline-formula><tex-math>SD_k</tex-math></inline-formula> to the
syntactical characteristics with the values of the cluster centers <inline-formula><tex-math>\widetilde{Cl}_i</tex-math></inline-formula>, sequentially
from the root to the leaves of the constructed decision tree. In doing so, the analysis
procedure takes into account the specifics of the detected clusters.</p>
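        <p>The analyzed document <inline-formula><tex-math>SD_k</tex-math></inline-formula> is most relevant to the rubric <inline-formula><tex-math>R_l</tex-math></inline-formula> for which the degree of
fuzzy correspondence is maximal:</p>
        <p><disp-formula><tex-math>R_l : \max_{i = 1..I} \mu(SD_k, Cl_i).</tex-math></disp-formula></p>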
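        <p>
          To calculate the parameter characterizing the degree of fuzzy correspondence of a
formalized document <inline-formula><tex-math>SD_k</tex-math></inline-formula> to the rubric <inline-formula><tex-math>R_i</tex-math></inline-formula>, it is reasonable to use the following expression [
          <xref ref-type="bibr" rid="ref3 ref4">3,
4</xref>
          ]:
          <disp-formula><tex-math>\mu(SD_k, R_i) = 1 - \sqrt{ \frac{1}{N} \sum_{n=1}^{N} \left( \mu_{SD_n^{(k)}} - \mu_{Cl_n^{(i)}} \right)^2 }.</tex-math></disp-formula>
        </p>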
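        <p>The selection of the most relevant rubric in Step 5 can be sketched as below. The sketch assumes that the degree of fuzzy correspondence is one minus the root-mean-square difference between the membership degrees of the document and of the cluster center; it also compares the document with a flat set of centers rather than descending a tree. The names correspondence and most_relevant_rubric, the rubric labels and the center values are hypothetical.</p>
        <preformat><![CDATA[
```python
import math


def correspondence(doc, center):
    """Degree of fuzzy correspondence between a document and a cluster center.

    ASSUMED form: 1 - sqrt(mean squared difference of membership degrees).
    """
    if len(doc) != len(center):
        raise ValueError("document and center must have the same length")
    mean_sq = sum((d - c) ** 2 for d, c in zip(doc, center)) / len(doc)
    return 1.0 - math.sqrt(mean_sq)


def most_relevant_rubric(doc, centers):
    """Return (rubric label, degree) with the maximal correspondence."""
    return max(
        ((label, correspondence(doc, c)) for label, c in centers.items()),
        key=lambda item: item[1],
    )


if __name__ == "__main__":
    # Hypothetical cluster centers over s1..s5.
    centers = {
        "R1": [0.75, 0.45, 0.30, 0.25, 0.85],
        "R2": [0.15, 0.85, 0.15, 0.65, 0.35],
    }
    document = [0.7, 0.5, 0.3, 0.3, 0.8]
    label, degree = most_relevant_rubric(document, centers)
    print(label, round(degree, 3))
```
]]></preformat>
        <p>For a tree-structured rubric field, the same comparison would be repeated at each level, following the branch whose center gives the highest degree.</p>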
        <p>5 The results of the proposed method application</p>
        <p>The proposed rubrication method was programmatically implemented as a
component of the comprehensive information system for the automatic processing of
electronic unstructured documents arriving at official websites and portals of public
authorities.</p>
        <p>This method was tested in the automated processing and analysis of appeals
(applications, complaints or suggestions) from citizens and organizations received by
the Administration of Smolensk region in 2018–2019.</p>
        <p>To carry out the classification of incoming electronic appeals, the experts have
identified 17 interconnected rubrics reflecting the urgent civic problems: general
issues of society and politics (R1), separation of powers and functions in the
Administration (R2), social sphere (R3), education (R4), suggestions for improving the
city of Smolensk (R5), family (R6), culture (R7), physical education and sport (R8),
housing and communal services (R9), maintenance and utilities (R10), housing stock
(R11), non-residential fund (R12), securing the right to housing (R13), economy (R14),
business activities (R15), natural resources (R16) and environmental protection (R17).</p>
        <p>Two well-known methods (probabilistic and neural network) that have been successfully used to
classify unstructured text documents were implemented for
comparison.</p>
        <p>The Bayes classification was chosen as the first alternative method because of its
ease of implementation and minimal human and financial costs for software
implementation. It uses the procedure for classifying documents based on Bayes
formula for conditional probability.</p>
        <p>The input text document V is represented as a sequence of terms {wn}. Each rubric Ri
is characterized by the unconditional probability P(Ri) that a document
belongs to it and by the conditional probability P(w|Ri) of encountering the term w in a document,
given that the rubric is Ri. The probability P(V|Ri) is then understood as the
probability of observing the text document V given
rubric Ri.</p>
        <p>The procedure for document rubrication consists in calculating the probabilities
P(Ri|V) for all rubrics Ri and choosing the rubric for which this probability is
maximal. Classifier training consists of compiling a vocabulary of probabilities of
various terms {wn} for each rubric.</p>
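        <p>A compact illustration of this baseline is given below. It is a generic multinomial naive Bayes sketch, not the implementation used in the experiments: Laplace (add-one) smoothing, log-probabilities, the class name NaiveBayesRubricator and the toy training data are all assumptions made for the example.</p>
        <preformat><![CDATA[
```python
import math
from collections import Counter, defaultdict


class NaiveBayesRubricator:
    """Multinomial naive Bayes over terms (add-one smoothing is an ASSUMPTION)."""

    def fit(self, documents, rubrics):
        """Compile the per-rubric vocabulary of term counts."""
        self.term_counts = defaultdict(Counter)
        self.rubric_counts = Counter(rubrics)
        self.vocabulary = set()
        for terms, rubric in zip(documents, rubrics):
            self.term_counts[rubric].update(terms)
            self.vocabulary.update(terms)
        self.total_docs = len(documents)
        return self

    def log_posterior(self, terms, rubric):
        """log P(R_i) + sum over terms of log P(w | R_i), up to a constant."""
        score = math.log(self.rubric_counts[rubric] / self.total_docs)
        total = sum(self.term_counts[rubric].values())
        for term in terms:
            count = self.term_counts[rubric][term]
            score += math.log((count + 1) / (total + len(self.vocabulary)))
        return score

    def predict(self, terms):
        """Choose the rubric with the maximal posterior probability."""
        return max(self.rubric_counts, key=lambda r: self.log_posterior(terms, r))


if __name__ == "__main__":
    # Toy training set, invented for illustration.
    train_docs = [
        ["school", "teacher", "exam"],
        ["kindergarten", "school", "queue"],
        ["heating", "pipe", "repair"],
        ["roof", "repair", "leak"],
    ]
    train_rubrics = ["education", "education", "housing", "housing"]
    model = NaiveBayesRubricator().fit(train_docs, train_rubrics)
    print(model.predict(["school", "teacher"]))
```
]]></preformat>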
        <p>
          The methods of using probabilistic algorithms for the classification of text
documents are considered in more detail in [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
        <p>Convolutional neural networks were used as the second alternative method for
document rubrication.</p>
        <p>Convolutional networks are feedforward artificial neural networks, in which the
signal travels sequentially through the neurons from the first layer to the last. These
networks were originally developed for image analysis. Good results in this area have
led to their application to other classification tasks, including the classification of unstructured
documents.</p>
        <p>Such a network alternates convolutional, subsampling and
fully connected layers. A text document arrives at the network input, where each word is
represented by a vector (obtained, e.g., with the word2vec algorithm). The softmax
function, which implements multiclass classification, is used for the output layer of the
neural network.</p>
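        <p>Only the output layer is illustrated here, since the network architecture itself is not specified in detail. The sketch shows how a softmax function turns raw output scores into a probability distribution over rubrics; the function name and the three sample scores are hypothetical.</p>
        <preformat><![CDATA[
```python
import math


def softmax(scores):
    """Turn raw output-layer scores into rubric probabilities."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical scores for three rubrics.
    probabilities = softmax([2.0, 1.0, 0.1])
    best = probabilities.index(max(probabilities))
    print(best, [round(p, 3) for p in probabilities])
```
]]></preformat>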
        <p>
          Convolutional neural networks for the classification of text documents are
considered in more detail in [
          <xref ref-type="bibr" rid="ref18 ref19 ref28">18, 19, 28</xref>
          ].
        </p>
        <p>During the preliminary analysis, the authors identified four typical situations,
distinguished by three indicators: the size of the received document, the degree
of intersection of the rubrics, and the amount of accumulated statistics for training
the models.</p>
        <p>For these typical situations, Table 1 shows the results of the comparative
assessment of correct rubrication and analysis on a sample of more than
10 thousand messages.</p>
        <p>In these typical situations, the proposed classification method
reduced the number of erroneously rubricated text documents by 7% on
average compared with the probabilistic method and by 6.3% compared with the
neural network method.</p>
      </sec>
      <sec id="sec-1-3">
        <title>Proposed method</title>
        <p>Table 1 (proposed method, typical situations 1–4): 65; 79; 90; 89.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>6 Conclusion</title>
      <p>Applying the proposed method yields a tree structure of the rubric field
based on the fuzzy relations between the syntactical characteristics of
the rubricated documents. Document analysis is based on determining the
fuzzy correspondence between the syntactical characteristics of a document
and the centers of the detected clusters, sequentially from the root to the leaves of
the constructed decision tree.</p>
      <p>The proposed method for the rubrication and analysis of electronic unstructured text
documents was implemented in software and tested during the automated processing
of appeals (applications, complaints or suggestions) from citizens and organizations
received by the Administration of Smolensk region. It has made it possible to keep
the rubrics up to date and to analyze documents efficiently and with high quality under
the conditions of a nonstationary thesaurus composition and changing relevance of the
words in rubrics.</p>
    </sec>
    <sec id="sec-3">
      <title>7 Acknowledgment</title>
      <p>The reported study was funded by RFBR according to the research project
No 18-01-00558.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <article-title>Analytical report on the work of Administration of Smolensk region with citizens' appeals</article-title>
          . URL: https://www.adminsmolensk.ru/obrascheniya_grazhdan/obzori_ obrascheniy/news_16096.html.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Avdeenko</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Makarova</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <article-title>Acquisition of knowledge in the form of fuzzy rules for cases classification</article-title>
          .
          <source>Lecture Notes in Computer Science. Data Mining and Big Data</source>
          , vol.
          <volume>10387</volume>
          , pp.
          <fpage>536</fpage>
          -
          <lpage>544</lpage>
          (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Batyrshin</surname>
          </string-name>
          , I.:
          <article-title>On definition and construction of association measures</article-title>
          .
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          , vol.
          <volume>29</volume>
          , pp.
          <fpage>2319</fpage>
          -
          <lpage>2326</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Batyrshin</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          :
          <article-title>Towards a general theory of similarity and association measures: Similarity, dissimilarity and correlation functions</article-title>
          .
          <source>Journal of Intelligent &amp; Fuzzy Systems</source>
          , vol.
          <volume>36</volume>
          , pp.
          <fpage>2977</fpage>
          -
          <lpage>3004</lpage>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ducharme</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vincent</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jauvin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>A Neural Probabilistic Language Model</article-title>
          .
          <source>JMLR 3</source>
          , pp.
          <fpage>1137</fpage>
          -
          <lpage>1155</lpage>
          (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Borisov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Analysis and monitoring of electronic text documents rubrication</article-title>
          .
          <source>MPIE Bulletin</source>
          , vol.
          <volume>4</volume>
          , pp.
          <fpage>121</fpage>
          -
          <lpage>127</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Borisov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>The method of fuzzy analysis of texts and their rubrics actualization</article-title>
          .
          <source>Fuzzy Technologies in the Industry - FTI</source>
          <year>2018</year>
          :
          <article-title>Proceedings of the II International Scientific</article-title>
          and Practical Conference. Ulyanovsk, pp.
          <fpage>259</fpage>
          -
          <lpage>263</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Burlakov</surname>
            ,
            <given-names>M.E.</given-names>
          </string-name>
          :
          <article-title>Using optimize naïve bayes classifier in problem of sms classification</article-title>
          .
          <source>Izvestia of Samara Scientific Center of the Russian Academy of Sciences</source>
          , vol.
          <volume>18</volume>
          , no.
          <issue>4</issue>
          , pp.
          <fpage>705</fpage>
          -
          <lpage>709</lpage>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Cordon</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Herrera</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hoffmann</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Magdalena</surname>
          </string-name>
          , L.:
          <article-title>Genetic Fuzzy Systems: Evolutionary Tuning and learning of Fuzzy Knowledge Bases</article-title>
          . Singapore, New Jersey, London, Hong Kong, World Scientific Publishing,
          <volume>462</volume>
          p. (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Dli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bulygina</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ross</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Developing the economic information system for automated analysis of unstructured text documents</article-title>
          .
          <source>Journal of Applied Informatics</source>
          , vol.
          <volume>13</volume>
          , no.
          <volume>5</volume>
          (
          <issue>77</issue>
          ), pp.
          <fpage>51</fpage>
          -
          <lpage>57</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Dli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bulygina</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Development of multimethod approach to rubrication of unstructed electronic text documents in various conditions</article-title>
          .
          <source>Proceedings of the International Russian Automation Conference (RusAutoCon)</source>
          ,
          <source>Sochi</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Dli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bulygina</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kozlov</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Formation of the structure of the intellectual system of analyzing and rubricating unstructured text information in different situations</article-title>
          .
          <source>Journal of Applied Informatics</source>
          , vol.
          <volume>13</volume>
          , no.
          <volume>4</volume>
          (
          <issue>76</issue>
          ), pp.
          <fpage>111</fpage>
          -
          <lpage>123</lpage>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Faifer</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Janikow</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Bottom-up Partitioning in Fuzzy Decision Trees</article-title>
          .
          <source>Proceedings of the 19th International Conference of the North American Fuzzy Information Society</source>
          . IEEE, pp.
          <fpage>326</fpage>
          -
          <lpage>330</lpage>
          (
          <year>2000</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Jambu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Hierarchical cluster analysis and correspondences</article-title>
          . Moscow: Finance and statistics (
          <year>1988</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Janikow</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>Fuzzy Decision Trees: Issues and Methods</article-title>
          .
          <source>IEEE Transactions on Systems, Man, and Cybernetics</source>
          , vol.
          <volume>28</volume>
          (
          <issue>1</issue>
          ), pp.
          <fpage>1</fpage>
          -
          <lpage>14</lpage>
          (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Kaftannikov</surname>
            ,
            <given-names>I.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Parasich</surname>
            ,
            <given-names>A.V.</given-names>
          </string-name>
          :
          <article-title>Decision Tree's Features of Application in Classification Problems</article-title>
          .
          <source>Bulletin of the South Ural State University. Ser. Computer Technologies, Automatic Control, Radio Electronics</source>
          , vol.
          <volume>15</volume>
          , no.
          <issue>3</issue>
          , pp.
          <fpage>26</fpage>
          -
          <lpage>32</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Kalchbrenner</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blunsom</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          :
          <article-title>Recurrent convolutional neural networks for discourse compositionality</article-title>
          . Workshop on CVSC, pp.
          <fpage>119</fpage>
          -
          <lpage>126</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <surname>Kim</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Convolutional neural networks for sentence classification</article-title>
          .
          <source>EMNLP, September</source>
          , pp.
          <fpage>1746</fpage>
          -
          <lpage>1751</lpage>
          (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <surname>Krizhevsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hinton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          :
          <article-title>Imagenet classification with deep convolutional neural networks</article-title>
          .
          <source>NIPS</source>
          , pp.
          <fpage>1106</fpage>
          -
          <lpage>1114</lpage>
          (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <surname>Kruglov</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dli</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Golunov</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          :
          <source>Fuzzy logic and artificial neural networks</source>
          . Moscow: Nauka, Fizmatlit
          (
          <year>2001</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <surname>LeCun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Text understanding from scratch</article-title>
          . Computer Science Department (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <surname>Nakagawa</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Inui</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kurohashi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Dependency tree-based sentiment classification using CRFs with hidden variables</article-title>
          .
          <source>Proceedings of ACL 2010</source>
          (
          <year>2010</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <surname>Passino</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yurkovich</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <source>Fuzzy Control</source>
          .
          Addison-Wesley, NJ, 522 p. (
          <year>1998</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <surname>Protasov</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : LinkGrammar. URL: http://sz.ru/parser/doc/
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Quinlan</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Induction of decision trees</article-title>
          .
          <source>Machine Learning</source>
          , vol.
          <volume>1</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>81</fpage>
          -
          <lpage>106</lpage>
          (
          <year>1986</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <surname>Shevelyov</surname>
            ,
            <given-names>O.G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Petrakov</surname>
            ,
            <given-names>A.V.</given-names>
          </string-name>
          :
          <article-title>Text classification with decision trees and feedforward neural networks</article-title>
          .
          <source>Tomsk State University Journal</source>
          , vol.
          <volume>290</volume>
          , pp.
          <fpage>300</fpage>
          -
          <lpage>307</lpage>
          (
          <year>2006</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <surname>Uchitelev</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>Classification of text information with the use of SVM</article-title>
          .
          <source>Information technologies and system</source>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>335</fpage>
          -
          <lpage>340</lpage>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>LeCun</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Character-level convolutional networks for text classification</article-title>
          .
          <source>Advances in Neural Information Processing Systems</source>
          , February, pp.
          <fpage>649</fpage>
          -
          <lpage>657</lpage>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>