<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Transformations of Texts into the Complex Network with Applying Visibility Graphs Algorithms</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute for Information Recording of National Academy of Sciences of Ukraine</institution>
          ,
          <addr-line>Kyiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”</institution>
          ,
          <addr-line>Kyiv</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <fpage>0000</fpage>
      <lpage>0003</lpage>
      <abstract>
        <p>In this article, visibility algorithms for transforming texts into a complex network are proposed. Keywords and concepts are extracted from a set of documents that describe a subject domain. A numeric value is assigned to each word or phrase using the GTF metric, which is proposed in this article in place of the ordinary TF-IDF metric, a statistic intended to reflect how important a word is to a document in a collection or corpus. As a result, a time series is constructed. A tool from time series analysis, the visibility graph algorithm, is then used to construct a graph of the subject domain. Two subject domains (“Information extraction” and “Complex network”) are considered as examples. The corpora of documents related to these subject domains were taken from arXiv (https://arxiv.org), an open-access repository of electronic preprints, and the proposed algorithm is applied to them. The article shows that the GTF metric is more expedient than the TF-IDF metric when the set of documents describes a single subject domain. The results of applying the visibility graph algorithm and the compactified horizontal visibility graph algorithm are also compared: in some cases, the compactified horizontal visibility graph algorithm gives a network of words with more connections between concepts than the visibility graph algorithm does. Gephi, an open-source visualization and exploration software for all kinds of graphs and networks, and an original package of specially developed Python modules are used for simulation and visualization. The proposed algorithm can be used to visualize a subject domain and in information support systems, where it helps to reveal the key components of a subject domain. The results of this article can also be used for building the UI of information retrieval systems, making the search for relevant information easier.</p>
      </abstract>
      <kwd-group>
        <kwd>Set of Documents</kwd>
        <kwd>Subject Domain</kwd>
        <kwd>Time Series</kwd>
        <kwd>Network of Words</kwd>
        <kwd>TF-IDF</kwd>
        <kwd>Visibility Graph</kwd>
        <kwd>Compactified Horizontal Visibility Graph</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The development of the Internet has caused a number of problems, related first
of all to the massive quantity of data in the Web space, including needless data.</p>
      <p>
        Today the Internet holds a huge and dynamic information base that is
available for research and analysis. It turns out that many tasks that arise when
working with the network information space have much in common with the mathematical
sciences. This fact opens wide opportunities for applying powerful mathematical tools
[
        <xref ref-type="bibr" rid="ref1 ref2">1,2</xref>
        ]. Given the huge dimensionality and the dynamics
of information resources in global networks, knowledge from discrete
mathematics (graph theory, network theory), pattern recognition (classification, clustering),
linguistics, digital signal processing, wavelet analysis, and fractal analysis is applied.
      </p>
      <p>Because terabytes of textual data are distributed across networks and keep
accumulating dynamically, new methods and algorithms for analyzing
these data need to be developed. The advantages and disadvantages of existing
information retrieval algorithms must also be considered.</p>
      <p>Modern technologies make it possible, in some cases, to find relevant
information. However, the problems of further analytical processing of this information,
selection of the necessary factual data, detection of development trends in a selected subject
domain, identification of relations between concepts and events, and forecasting remain unresolved.
Most of these problems are open challenges in the semantic processing of huge
dynamic sets of textual data.</p>
    </sec>
    <sec id="sec-2">
      <title>Analysis of Recent Researches and Publications</title>
      <p>
        The subject of this study is topical and is frequently addressed in articles by
foreign and domestic scientists. For example, the works [
        <xref ref-type="bibr" rid="ref3 ref4">3,4</xref>
        ] focus
on developing new methods and algorithms for the analytical
processing of huge sets of textual data. In the works [
        <xref ref-type="bibr" rid="ref5 ref6">5,6</xref>
        ] the authors consider the linguistic
processing of natural language texts as one of the central problems in the
intellectualization of information technologies.
      </p>
      <p>
        In particular, the visibility graph algorithm is proposed in the works [
        <xref ref-type="bibr" rid="ref10 ref7 ref8 ref9">7-10</xref>
        ]. A
method of constructing networks based on the visibility graph algorithm is
presented in the works [
        <xref ref-type="bibr" rid="ref11 ref12 ref13 ref14 ref15">11-15</xref>
        ].
      </p>
    </sec>
    <sec id="sec-3">
      <title>Review of Some Visibility Algorithms</title>
      <p>In this work, a network of connections between the terms and concepts that occur in
textual data is built. Building networks of words, whose nodes are elements
of the text, makes it possible to reveal the key components of the text. At the same time, the task of
determining which of the structurally important elements of the text are also
informationally important remains topical.</p>
      <p>
        There are several approaches to constructing networks from texts (so-called
language networks) and different ways of interpreting nodes and connections, which lead,
accordingly, to different representations of such networks. Nodes are connected if
the corresponding words are either adjacent in the text [
        <xref ref-type="bibr" rid="ref16 ref17">16, 17</xref>
        ], or occur in the same sentence
[
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], or are syntactically [
        <xref ref-type="bibr" rid="ref19 ref20">19, 20</xref>
        ] or semantically [
        <xref ref-type="bibr" rid="ref21 ref22">21, 22</xref>
        ] connected.
      </p>
      <sec id="sec-3-1">
        <title>Visibility Graph Algorithm (VG)</title>
        <p>
          In this article, a tool from time series analysis, the visibility graph algorithm [
          <xref ref-type="bibr" rid="ref23 ref24 ref7">7, 23, 24</xref>
          ],
is used to convert a time series into a graph, that is, to map a time series
onto a network.
        </p>
        <p>For example, the visibility graph derived from the time series {0.125, 0.063,
0.042, 0.104, 0.125, 0.063, 0.042, 0.104, 0.125, 0.063, 0.042, 0.104} is presented in
Fig. 1. Every node of the graph corresponds, in
the same order, to a value of the series. The visibility rays between the data values define the links
connecting the nodes of the graph.</p>
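        <p>Two nodes are connected if they are in the “line of sight” of each other,
i.e., if they can be joined by a straight line that does not cross any other histogram bar.
More formally, the visibility criterion is as follows: two arbitrary data values
(<italic>t<sub>a</sub></italic>, <italic>y<sub>a</sub></italic>) and (<italic>t<sub>b</sub></italic>, <italic>y<sub>b</sub></italic>) have visibility, and consequently become two connected
nodes of the associated graph, if every other data value (<italic>t<sub>c</sub></italic>, <italic>y<sub>c</sub></italic>) placed between them fulfills:</p>
        <disp-formula><tex-math>y_c &lt; y_b + (y_a - y_b)\,\frac{t_b - t_c}{t_b - t_a}.</tex-math></disp-formula>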
        <p>
          It is also shown in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ] that the structure of the time series is conserved in
the graph topology: periodic series convert into regular graphs, random series into
random graphs, and fractal series into scale-free graphs.
        </p>
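        <p>The line-of-sight rule described in this subsection can be sketched in Python as follows. This is a minimal brute-force illustration under stated assumptions, not the package used in this work: the function name visibility_graph is hypothetical, and the data points are assumed to be equally spaced in time, so that the index of a value plays the role of its time stamp.</p>
        <preformat>
```python
from itertools import combinations


def visibility_graph(series):
    """Return the edges (a, b), with a before b, of the visibility graph.

    Assumption: points are equally spaced, so index i stands for time t_i.
    """
    edges = set()
    for a, b in combinations(range(len(series)), 2):
        ya, yb = series[a], series[b]
        # a and b see each other if every intermediate point c lies strictly
        # below the straight line that joins (a, ya) and (b, yb).
        visible = all(
            yb + (ya - yb) * (b - c) / (b - a) > series[c]
            for c in range(a + 1, b)
        )
        if visible:
            edges.add((a, b))
    return edges


# The periodic example series from the text (period of four values).
series = [0.125, 0.063, 0.042, 0.104] * 3
edges = visibility_graph(series)
```
</preformat>
        <p>In this sketch neighbouring values are always linked, because no point lies between them, while two equal peaks are linked only if no other peak of the same height stands between them.</p>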
      </sec>
      <sec id="sec-3-2">
        <title>Compactified Horizontal Visibility Graph Algorithm (CHVG)</title>
        <p>
          In the works [
          <xref ref-type="bibr" rid="ref11 ref12 ref13 ref25 ref26 ref27">11, 12, 13, 25-27</xref>
          ] another algorithm for constructing networks of words,
the compactified horizontal visibility graph (CHVG) algorithm, is proposed. In
general, constructing a language network with the CHVG
algorithm consists of three stages (Fig. 2). At the first stage, the
nodes, which correspond to the words in order of their occurrence in the text,
are marked on the horizontal axis. At the second stage, the horizontal visibility graph
is built: two observations made at times <italic>t<sub>i</sub></italic> and <italic>t<sub>j</sub></italic> are connected in a horizontal
visibility graph (HVG) if and only if
        </p>
        <disp-formula><tex-math>x_k &lt; \min\{x_i, x_j\} \quad \text{for all } t_k \text{ with } t_i &lt; t_k &lt; t_j.</tex-math></disp-formula>
        <p>At the third stage, the network obtained at the previous stages is
compactified. As a result, a new network of words, the compactified horizontal
visibility graph, is obtained.</p>
        <p>In this manner, the compactified horizontal visibility graph algorithm makes it possible to
construct network structures from texts in which a numeric value is assigned
in some manner to each word or phrase.</p>
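        <p>The three stages can be sketched in Python as follows. The text does not spell out the compactification step, so this sketch assumes that it merges all nodes carrying the same word into one node and collapses the resulting duplicate links and self-loops. The horizontal visibility test is taken as a strict inequality, and the function names, the toy word sequence and its weights are all hypothetical.</p>
        <preformat>
```python
def horizontal_visibility_edges(values):
    """Edges (i, j), with i before j, of the horizontal visibility graph."""
    edges = set()
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            # Every value between i and j must stay below both end points.
            lower_end = min(values[i], values[j])
            if all(lower_end > values[k] for k in range(i + 1, j)):
                edges.add((i, j))
    return edges


def compactify(words, edges):
    """Merge all positions that carry the same word into a single node."""
    merged = set()
    for i, j in edges:
        u, v = words[i], words[j]
        if u != v:                          # drop self-loops created by merging
            merged.add(frozenset((u, v)))   # undirected; duplicates collapse
    return merged


# Stage 1: words in order of occurrence, each with an assigned weight.
words = ["graph", "of", "words", "of", "graph"]
weights = [3.0, 1.0, 2.0, 1.0, 3.0]
# Stage 2: horizontal visibility graph over the positions.
hvg = horizontal_visibility_edges(weights)
# Stage 3: compactification into a network of distinct words.
chvg = compactify(words, hvg)
```
</preformat>
        <p>For this toy input the seven position-level links of the HVG collapse into three links between the three distinct words.</p>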
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Forming of the Time Series</title>
      <p>
        In this article, the TF-IDF numeric metric (TF: Term Frequency, IDF: Inverse
Document Frequency) is used to form the time series. It is an example of a function
that assigns a number to a word in the text. TF-IDF is one of the most frequently applied
weighting schemes. This numerical statistic is intended to estimate how
important a word is to a document in a collection or corpus [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ]. The TF-IDF value
increases proportionally to the number of times a word appears in the document and is offset
by the number of documents in the corpus that contain the word, which helps to adjust
for the fact that some words appear more frequently in general. It is often used as a
weighting factor in text mining and information retrieval. It can also be
used as one of the criteria for estimating the relevance of a document to a search query
[
        <xref ref-type="bibr" rid="ref29">29</xref>
        ].
      </p>
      <p>
        TF (term frequency) is the ratio of the number of occurrences of a word in a document to
the total number of words in the document. In this manner, the weight of a term
(word) <italic>t<sub>i</sub></italic> that occurs in a document is simply proportional to its term frequency. The
term was proposed by Karen Spärck Jones [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ].
      </p>
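      <disp-formula><tex-math>\mathrm{TF} = \frac{n_i}{\sum_k n_k},</tex-math></disp-formula>
      <p>where <italic>n<sub>i</sub></italic> is the number of occurrences of the term (word) <italic>i</italic> in the document, and
<inline-formula><tex-math>\sum_k n_k</tex-math></inline-formula> is the total number of words in the document.</p>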
      <p>IDF (inverse document frequency) is an inverse function of the number of
documents in which a term occurs. It is the logarithmically scaled inverse fraction of the
documents that contain the word, obtained by dividing the total number of documents
by the number of documents containing the term and then taking the logarithm of that
quotient. Using IDF reduces the weight of widely used terms (words).</p>
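      <disp-formula><tex-math>\mathrm{IDF} = \log \frac{|D|}{|(d_i \supset t_i)|},</tex-math></disp-formula>
      <p>where <inline-formula><tex-math>|D|</tex-math></inline-formula> is the total number of documents in the corpus, and
<inline-formula><tex-math>|(d_i \supset t_i)|</tex-math></inline-formula> is the number
of documents that contain the term <italic>t<sub>i</sub></italic> (<inline-formula><tex-math>n_i \neq 0</tex-math></inline-formula>).</p>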
      <p>Thus, the TF-IDF metric is the product of two factors, TF and IDF:</p>
      <disp-formula><tex-math>\text{TF-IDF} = \mathrm{TF} \times \mathrm{IDF}.</tex-math></disp-formula>
      <p>A word has a high TF-IDF score in a document if it appears many times in that
document but occurs in relatively few documents of the corpus.</p>
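      <p>As an illustration, the TF, IDF and TF-IDF definitions of this section can be written as a short Python sketch. The function names and the toy corpus are hypothetical, the logarithm is assumed to be natural because the text does not fix its base, and every queried term is assumed to occur in at least one document.</p>
      <preformat>
```python
import math
from collections import Counter


def tf(term, doc_tokens):
    """Occurrences of the term in the document / total words in the document."""
    return Counter(doc_tokens)[term] / len(doc_tokens)


def idf(term, corpus):
    """Log of (number of documents / number of documents containing the term)."""
    containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / containing)


def tf_idf(term, doc_tokens, corpus):
    """Product of the two factors."""
    return tf(term, doc_tokens) * idf(term, corpus)


# Hypothetical toy corpus: each document is a list of tokens.
corpus = [
    ["information", "extraction", "from", "text"],
    ["information", "retrieval", "systems"],
    ["graph", "of", "information", "flow"],
]
common = tf_idf("information", corpus[0], corpus)   # occurs in every document
rare = tf_idf("extraction", corpus[0], corpus)      # occurs in one document
```
</preformat>
      <p>In this toy corpus the word that occurs in every document receives an IDF of zero, and therefore a TF-IDF of zero, regardless of how often it is used.</p>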
      <p>After the corpus of documents is represented in vector form (the number of
words determines the dimension of the vector), the visibility graph algorithm
described above is applied.</p>
    </sec>
    <sec id="sec-5">
      <title>Presentation of the Basic Material of the Research</title>
      <p>Before applying the method for constructing networks from texts, we
propose to remove stop words, that is, words that are
informationally unimportant. We use a stop dictionary compiled from several stop dictionaries
available at the following links:
https://code.google.com/archive/p/stopwords/downloads/; http://www.textfixer.com/tutorials/common-english-words.php.</p>
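      <p>We also propose a global TF metric (GTF), defined as</p>
      <disp-formula><tex-math>\mathrm{GTF} = \frac{n_i}{\sum_k n_k},</tex-math></disp-formula>
      <p>where <italic>n<sub>i</sub></italic> is the number of occurrences of the term (word) <italic>i</italic> in the documents of the
corpus, and <inline-formula><tex-math>\sum_k n_k</tex-math></inline-formula> is the total number of words in the documents of the corpus.</p>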
      <p>Words that do not occur often within a single document have a low TF
value. But if they occur in every document of the corpus, they are in fact informationally
important in the global context of the considered subject domain. That is why
we use the GTF metric in this article.</p>
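      <p>A minimal Python sketch of the stop-word removal and the GTF computation is given below. It is an illustration under stated assumptions rather than the package used in this work: the stop list and the corpus are hypothetical, the total word count is taken after stop-word removal, and the time series is assumed to be formed by replacing each remaining word, in order of occurrence, with its GTF value.</p>
      <preformat>
```python
from collections import Counter

# Hypothetical, deliberately tiny stop list.
STOP_WORDS = frozenset({"the", "of", "a", "in", "is"})


def gtf(corpus, stop_words=frozenset()):
    """Global term frequency: occurrences in the whole corpus / total words."""
    tokens = [w for doc in corpus for w in doc if w not in stop_words]
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}


# Hypothetical toy corpus: each document is a list of tokens.
corpus = [
    ["the", "extraction", "of", "information"],
    ["information", "extraction", "in", "text"],
    ["information", "is", "a", "network"],
]
scores = gtf(corpus, STOP_WORDS)
# Assumed construction of the time series: one GTF value per remaining word.
series = [scores[w] for doc in corpus for w in doc if w in scores]
```
</preformat>
      <p>Unlike IDF, this weighting does not penalize a word for occurring in every document, so a domain keyword such as “information” keeps the highest value in the toy series.</p>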
      <sec id="sec-5-1">
        <title>Example 1</title>
        <p>In this article, a corpus of 292 documents related to the subject domain
“Information extraction” was taken from arXiv (https://arxiv.org), an open-access repository of
electronic preprints, for the period 2000-2010.</p>
        <p>As a result of applying the proposed method of constructing networks from
texts, a network of keywords, which are important structural elements of the subject
domain, was obtained (Fig. 3).</p>
        <p>Based on the results presented in Table 1, we can see that the number of
informationally important keywords is larger when only the GTF
metric is applied to a set of documents that describe one subject domain. Keywords such
as “information” and “extraction”, which are informationally important for the
considered subject domain, are missed when the TF-IDF metric is used (these keywords
have a low TF-IDF). We therefore conclude
that applying only the GTF metric is more expedient than applying the
TF-IDF metric when the set of documents describes one subject domain. This can
be explained by the fact that words that are key for the considered subject domain
and occur in every document of the corpus have a low IDF (and, as a result, a low TF-IDF).
In fact, however, these words are informationally important and define the structure of the
text.</p>
        <p>To compare the results of applying the visibility graph algorithm and the
compactified horizontal visibility graph algorithm, a corpus of 2901 documents
related to the subject domain “Complex network” was taken
from arXiv (https://arxiv.org) for the
period 2000-2010. As a result of applying the visibility graph algorithms, two
different networks of words for the considered subject domain were obtained (Fig. 4,
Fig. 5).</p>
        <p>
          After the associated graphs are derived with the visibility algorithms, weight values of the
corresponding CHVG and VG nodes are calculated according to the number of their connections
with other nodes, and all the terms are sorted in descending order of weight. As the weight, for
example, the authority (or hub) score calculated by the HITS algorithm [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] is used. Because
the graph is undirected, the choice between these two forms of the weight does not matter.
        </p>
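        <p>The ranking step can be illustrated with a plain power-iteration version of HITS in Python. This is a sketch under stated assumptions: the function name, the number of iterations and the four-node toy graph are hypothetical, and the graph is chosen to be connected and to contain a triangle, which is the situation in which the hub and authority scores of an undirected graph are expected to converge to the same values.</p>
        <preformat>
```python
def hits(adjacency, iterations=50):
    """Power iteration for HITS; adjacency maps each node to its neighbours."""
    nodes = list(adjacency)
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of n: sum of the hub scores of the nodes that link to n.
        auth = {n: sum(hub[m] for m in nodes if n in adjacency[m]) for n in nodes}
        total = sum(auth.values())
        auth = {n: a / total for n, a in auth.items()}
        # Hub of n: sum of the authority scores of the nodes that n links to.
        hub = {n: sum(auth[m] for m in adjacency[n]) for n in nodes}
        total = sum(hub.values())
        hub = {n: h / total for n, h in hub.items()}
    return hub, auth


# Hypothetical undirected toy graph: a triangle a-b-c with a pendant node d.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
hub, auth = hits(graph)
# Terms sorted in descending order of weight.
ranking = sorted(graph, key=auth.get, reverse=True)
```
</preformat>
        <p>On this toy graph the two scores agree up to rounding, and the node with the most connections is ranked first.</p>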
        <p>Comparing the results (Table 2), one may notice that with the
compactified horizontal visibility graph algorithm (Fig. 5) many words
have more links than with the visibility graph algorithm (Fig.
4).</p>
        <p>The total number of links is 768 with the compactified
horizontal visibility graph algorithm, compared with 703 with the ordinary visibility
graph algorithm. It should be noted that the
obtained networks are very complex. That is why we plan to continue our research in
this area.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusion</title>
      <p>A method of constructing networks from texts, so-called language networks,
was proposed. Keywords and concepts were extracted from a set of documents that describe
a subject domain. A numeric value was assigned to each word or
phrase using the GTF metric, which was proposed in this article in place of the ordinary TF
metric. After analyzing the results of the research, we concluded that applying
only the GTF metric is more expedient than applying the TF-IDF metric when
the set of documents describes one subject domain. As a result, a time series was
constructed. A tool from time series analysis, the visibility graph algorithm, was used
to construct the graph of the subject domain. The analysis of the results
revealed the important structural elements of the text. It should be noted that
these elements are also informationally important and define the structure
of the text. It was found that, in some cases, the compactified horizontal
visibility graph algorithm gives a network of words with more connections
between concepts than the visibility graph algorithm. Because of the
complexity of the obtained networks, we plan to continue our research in this area.</p>
      <p>The proposed method can be used to visualize a subject domain and
in information support systems, where it helps to reveal the key components of a subject
domain. The results of this article can also be used for building the UI of information
retrieval systems, making the search for relevant information easier.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Snarskii</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.V.</given-names>
            <surname>Bezsudnov</surname>
          </string-name>
          ,
          <article-title>Internetika: Navigation in complex networks: models and algorithms</article-title>
          , Moscow, Russia: Librokom, Editorial URSS
          (in Russian) (
          <year>2009</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <article-title>Knowledge Search in INTERNET. Professional work</article-title>
          .
          <source>Dialectics</source>
          , Moscow (in Russian) (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>C.C.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.X.</given-names>
            <surname>Zhai</surname>
          </string-name>
          ,
          <source>Mining text data</source>
          . Springer Science &amp; Business Media
          (
          <year>2012</year>
          )
          <fpage>77</fpage>
          -
          <lpage>128</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>G.</given-names>
            <surname>Miner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Elder</surname>
            <suffix>IV</suffix>
          </string-name>
          , and
          <string-name>
            <given-names>T.</given-names>
            <surname>Hill</surname>
          </string-name>
          ,
          <article-title>Practical text mining and statistical analysis for nonstructured text data applications</article-title>
          . Academic Press (
          <year>2012</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>V. Yu.</given-names>
            <surname>Taranukha</surname>
          </string-name>
          ,
          <article-title>Intelligent processing of texts</article-title>
          , Kiev: electronic publication on the website of the faculty (in Ukrainian) (
          <year>2014</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>E.I.</given-names>
            <surname>Bol'shakova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. S.</given-names>
            <surname>Klyshinsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Noskov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O. V.</given-names>
            <surname>Peskova</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.V.</given-names>
            <surname>Yagunova</surname>
          </string-name>
          ,
          <article-title>Automatic processing of texts in a natural language and computational linguistics</article-title>
          , Moscow: MIEM Publ (in Russian) (
          <year>2011</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>L.</given-names>
            <surname>Lacasa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Luque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ballesteros</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Luque</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.C.</given-names>
            <surname>Nuño</surname>
          </string-name>
          ,
          <article-title>From time series to complex networks: the visibility graph</article-title>
          ,
          <source>Proc. Natl. Acad. Sci. USA</source>
          <volume>105</volume>
          (
          <year>2008</year>
          )
          <fpage>4972</fpage>
          -
          <lpage>4975</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>A.M.</given-names>
            <surname>Nunez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lacasa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.P.</given-names>
            <surname>Gomez</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Luque</surname>
          </string-name>
          ,
          <article-title>Visibility algorithms: A short review</article-title>
          ,
          <source>Frontiers in Graph Theory. InTech</source>
          , (
          <year>2012</year>
          )
          <fpage>119</fpage>
          -
          <lpage>152</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>B.</given-names>
            <surname>Luque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lacasa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Ballesteros</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Luque</surname>
          </string-name>
          ,
          <article-title>Horizontal visibility graphs: Exact results for random time series</article-title>
          ,
          <source>Physical Review E</source>
          <volume>80</volume>
          (
          <issue>4</issue>
          ) (
          <year>2009</year>
          )
          <fpage>046103</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>G.</given-names>
            <surname>Gutin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mansour</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Severini</surname>
          </string-name>
          ,
          <article-title>A characterization of horizontal visibility graphs and combinatorics on words</article-title>
          ,
          <source>Physica A</source>
          <volume>390</volume>
          (
          <year>2011</year>
          )
          <fpage>2421</fpage>
          -
          <lpage>2428</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Snarskii</surname>
          </string-name>
          ,
          <article-title>Compactified HVG for the Language Network</article-title>
          .
          <source>In: Proceedings of the International Conference on Intelligent Information Systems: The Conference is dedicated to the 50th anniversary of the Institute of Mathematics and Computer Science</source>
          ,
          20-23 Aug.
          <year>2013</year>
          , Chisinau,
          <source>Moldova: Proceedings IIS, Institute of Mathematics and Computer Science</source>
          (
          <year>2013</year>
          )
          <fpage>108</fpage>
          -
          <lpage>113</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Snarskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.V.</given-names>
            <surname>Yagunova</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.</given-names>
            <surname>Pronoza</surname>
          </string-name>
          ,
          <article-title>The Use of Horizontal Visibility Graphs to Identify the Words that Define the Informational Structure of a Text</article-title>
          .
          <source>In: Proceedings of the 12th Mexican International Conference on Artificial Intelligence</source>
          (
          <year>2013</year>
          )
          <fpage>209</fpage>
          -
          <lpage>215</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Snarskii</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E.V.</given-names>
            <surname>Yagunova</surname>
          </string-name>
          ,
          <article-title>Application of the CHVG-algorithm for scientific texts</article-title>
          .
          <source>In: Proceedings of the Open Semantic Technologies for Intelligent Systems (OSTIS)</source>
          ,
          February 20-22
          , Minsk (
          <year>2014</year>
          )
          <fpage>199</fpage>
          -
          <lpage>204</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Snarskii</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D. Yu.</given-names>
            <surname>Manko</surname>
          </string-name>
          ,
          <article-title>The Model of Words Cumulative Influence in a Text</article-title>
          .
          <source>In: XVIII International Conference on Data Science and Intelligent Analysis of Information</source>
          . Springer, Cham (
          <year>2018</year>
          )
          <fpage>249</fpage>
          -
          <lpage>256</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>D.V.</given-names>
            <surname>Lande</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Snarskii</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.V.</given-names>
            <surname>Yagunova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Pronoza</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Volskaya</surname>
          </string-name>
          ,
          <article-title>Hierarchies of Terms on the Euromaidan Events: Networks and Respondents Perception</article-title>
          ,
          <source>12th International Workshop on Natural Language Processing and Cognitive Science NLPCS</source>
          (
          <year>2015</year>
          )
          <fpage>127</fpage>
          -
          <lpage>139</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>R.</given-names>
            <surname>Ferrer-i-Cancho</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.V.</given-names>
            <surname>Solé</surname>
          </string-name>
          ,
          <article-title>The Small World of Human Language</article-title>
          ,
          <source>Proceedings of the Royal Society of London B: Biological Sciences 268.1482</source>
          (
          <year>2001</year>
          )
          <fpage>2261</fpage>
          -
          <lpage>2265</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>S.N.</given-names>
            <surname>Dorogovtsev</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.F.F.</given-names>
            <surname>Mendes</surname>
          </string-name>
          ,
          <article-title>Language as an Evolving Word Web</article-title>
          ,
          <source>Proceedings of the Royal Society of London B: Biological Sciences 268.1485</source>
          (
          <year>2001</year>
          )
          <fpage>2603</fpage>
          -
          <lpage>2606</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>S.M.G.</given-names>
            <surname>Caldeira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.C.</given-names>
            <surname>Petit Lobao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.F.S.</given-names>
            <surname>Andrade</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Neme</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.G.</given-names>
            <surname>Miranda</surname>
          </string-name>
          ,
          <article-title>The network of concepts in written texts</article-title>
          ,
          <source>Preprint physics/0508066</source>
          (
          <year>2005</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>R.</given-names>
            <surname>Ferrer-i-Cancho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.V.</given-names>
            <surname>Solé</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Kohler</surname>
          </string-name>
          ,
          <article-title>Patterns in syntactic dependency networks</article-title>
          ,
          <source>Physical Review E 69.5</source>
          (
          <year>2004</year>
          )
          <fpage>051915</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>R.</given-names>
            <surname>Ferrer-i-Cancho</surname>
          </string-name>
          ,
          <article-title>The variation of Zipf's law in human language</article-title>
          ,
          <source>The European Physical Journal B-Condensed Matter and Complex Systems</source>
          , (
          <year>2005</year>
          )
          <fpage>249</fpage>
          -
          <lpage>257</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>A.E.</given-names>
            <surname>Motter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.P.S.</given-names>
            <surname>De Moura</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.C.</given-names>
            <surname>Lai</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Dasgupta</surname>
          </string-name>
          ,
          <article-title>Topology of the conceptual network of language</article-title>
          ,
          <source>Physical Review E</source>
          ,
          <volume>65</volume>
          (
          <issue>6</issue>
          ) (
          <year>2002</year>
          )
          <fpage>065102</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>M.</given-names>
            <surname>Sigman</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.A.</given-names>
            <surname>Cecchi</surname>
          </string-name>
          ,
          <article-title>Global Organization of the Wordnet Lexicon</article-title>
          ,
          <source>Proceedings of the National Academy of Sciences 99.3</source>
          (
          <year>2002</year>
          )
          <fpage>1742</fpage>
          -
          <lpage>1747</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>I.V.</given-names>
            <surname>Bezsudnov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.A.</given-names>
            <surname>Snarskii</surname>
          </string-name>
          .
          <article-title>From the time series to the complex networks: The parametric natural visibility graph</article-title>
          ,
          <source>Physica A: Statistical Mechanics and its Applications</source>
          <volume>414</volume>
          (
          <year>2014</year>
          )
          <fpage>53</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Han</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>The parametric modified limited penetrable visibility graph for constructing complex networks from time series</article-title>
          ,
          <source>Physica A: Statistical Mechanics and its Applications</source>
          ,
          <volume>492</volume>
          (
          <year>2018</year>
          )
          <fpage>1097</fpage>
          -
          <lpage>1106</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tian</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.E.</given-names>
            <surname>Stanley</surname>
          </string-name>
          ,
          <article-title>Degree distributions and motif profiles of limited penetrable horizontal visibility graphs</article-title>
          .
          <source>Physica A: Statistical Mechanics and its Applications</source>
          (
          <year>2018</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.L.</given-names>
            <surname>Vilela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tian</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.E.</given-names>
            <surname>Stanley</surname>
          </string-name>
          ,
          <article-title>Exact results of the limited penetrable horizontal visibility graph associated to random time series and its application</article-title>
          .
          <source>Scientific Reports</source>
          ,
          <volume>8</volume>
          (
          <issue>1</issue>
          ) (
          <year>2018</year>
          )
          <fpage>5130</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>M.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.L.</given-names>
            <surname>Vilela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Dong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Tian</surname>
          </string-name>
          , and
          <string-name>
            <given-names>H.E.</given-names>
            <surname>Stanley</surname>
          </string-name>
          ,
          <article-title>Topological properties of the limited penetrable horizontal visibility graph family</article-title>
          ,
          <source>Physical Review E</source>
          ,
          <volume>97</volume>
          (
          <issue>5</issue>
          ) (
          <year>2018</year>
          )
          <fpage>052117</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <given-names>J.D.</given-names>
            <surname>Ullman</surname>
          </string-name>
          ,
          <article-title>Data Mining</article-title>
          ,
          <source>Mining of Massive Datasets</source>
          . Cambridge University Press (
          <year>2011</year>
          )
          <fpage>1</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <given-names>J.</given-names>
            <surname>Beel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Gipp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Langer</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Breitinger</surname>
          </string-name>
          ,
          <article-title>Research-paper recommender systems: a literature survey</article-title>
          ,
          <source>International Journal on Digital Libraries</source>
          .
          <volume>17</volume>
          (
          <issue>4</issue>
          ), (
          <year>2016</year>
          )
          <fpage>305</fpage>
          -
          <lpage>338</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <given-names>K.S.</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <article-title>A statistical interpretation of term specificity and its application in retrieval</article-title>
          ,
          <source>Journal of Documentation</source>
          , MCB University Press,
          <volume>60</volume>
          (
          <year>2004</year>
          )
          <fpage>493</fpage>
          -
          <lpage>502</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <given-names>J.M.</given-names>
            <surname>Kleinberg</surname>
          </string-name>
          ,
          <article-title>Authoritative sources in a hyperlink environment</article-title>
          .
          <source>Journal of the ACM (JACM)</source>
          .
          <volume>46</volume>
          (
          <issue>5</issue>
          ) (
          <year>1999</year>
          )
          <fpage>604</fpage>
          -
          <lpage>632</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>