<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Multi-Feature Fusion TextRank Algorithm for Sentence-Oriented Keyword Extraction</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Shuo-shuo Meng</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guo-sheng Hao</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Zi-hao Yang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computer Science and Technology, Jiangsu Normal University</institution>
          ,
          <addr-line>Xuzhou</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>12</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>Most keyword extraction algorithms focus on extraction from a document rather than from a sentence. A document carries more information than a sentence, so keyword extraction from a sentence is a challenging task. In addition, keyword extraction from a sentence has potential applications in many fields, such as question answering systems, text search, and recommendation systems. Therefore, this paper proposes the multi-feature fusion TextRank algorithm for sentence-oriented keyword extraction, which integrates knowledge of features into the initial scores of keywords and into the calculation of the probability transfer matrix. The initial scores of candidate keywords are adjusted by fusing the term frequency and part of speech features in the sentence, and the probability transfer matrix used to calculate the scores is tuned with the semantic and syntactic features among the candidate keywords. Based on the scores of the candidate keywords, the top K words are selected as the keywords of the sentence. The experiments show that our method outperforms the baselines in precision (P), recall (R), and F-measure.</p>
      </abstract>
      <kwd-group>
        <kwd>sentence</kwd>
        <kwd>keyword extraction</kwd>
        <kwd>TextRank</kwd>
        <kwd>feature fusion</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>Keyword extraction is one of the popular topics in the field of natural language processing[1]. It is also widely applied in our daily life, since keywords help us save time by conveying the main ideas of documents in fewer words.</p>
      <p>Keyword extraction from a document, rather than from a sentence, has been widely studied, such as from paper abstracts[2], news[3], and patent texts[4]. In document keyword extraction, the knowledge in titles[3], paragraphs[5], and the positions of words[6] can be used, and some deep learning methods for end-to-end extraction have also been presented[2].</p>
      <p>Keyword extraction methods are divided into supervised and unsupervised types[1]. In the supervised type, keyword extraction is regarded as a binary or multi-class classification task[1]. In the unsupervised type, methods can be summarized into three kinds: keyword extraction based on statistical features, on the topic model, and on the graph model.</p>
      <p>Among the keyword extraction methods based on statistical features, TF-IDF (Term Frequency-Inverse Document Frequency)[7] is well-known for its simplicity and efficiency. The topic model based methods extract keywords according to the topic distribution of documents[8]. Among the graph model based methods, a popular one is the TextRank algorithm[9]. Inspired by the PageRank algorithm[10], TextRank includes three steps: (1) construct a graph model according to the co-occurrence relationships between words, (2) adjust the scores iteratively, and (3) select the top K words with the highest scores as keywords. Improvements of TextRank mainly focus on two aspects: the score initialization of candidate keywords, and the construction of the probability transfer matrix.</p>
      <p>For the improvement of the score initialization of candidate keywords, many features of the document have been introduced, such as term frequency[11], the length of words, the position of words, the part of speech[12], the narrative table[11], and the importance of words in the document's title[3]. For the improvement of the construction of the probability transfer matrix, the included features are as follows: the similarity calculated with Word2Vec[13], and the reduction of matrix sparsity based on Doc2vec[5].</p>
      <p>Compared with a document, a sentence is shorter, and it does not have adequate structural information. The challenge of sentence keyword extraction is the sparsity of text semantics[5]. Most of the existing keyword extraction algorithms are studied against the background of documents. This paper designs an algorithm suitable for sentence keyword extraction.</p>
      <p>However, the TextRank algorithm has the following shortcomings when applied to keyword extraction from a sentence: (1) It does not consider the inherent features of candidate keywords, such as term frequency and part of speech. (2) It is difficult to obtain deeper relations among words because its probability transfer matrix only makes use of the co-occurrence relationships among words.</p>
      <p>This paper proposes the multi-feature fusion (MFF) TextRank algorithm for sentence-oriented keyword extraction to address the above shortcomings. For shortcoming (1), the term frequency and part of speech features of words in the sentence are used to assign initial scores to candidate keywords. For shortcoming (2), besides the co-occurrence relationship, we also take the semantic and syntactic features into consideration to obtain deeper relationships between words.</p>
      <p>The rest of the paper is organized as follows: Section 1 points out the shortcomings of the state of the art and reviews the related work on keyword extraction. Section 2 gives the main idea and the key elements of this paper. The experiments are shown in Section 3, and the paper is concluded in Section 4.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Multi-Feature Fusion TextRank Algorithm for Sentence-Oriented Keyword Extraction</title>
      <p>This section firstly introduces the main idea of the algorithm. Secondly, a method is proposed to improve the candidate keyword scores. Thirdly, the algorithm to improve the construction of the probability transfer matrix is given. Finally, the multi-feature fusion TextRank algorithm for sentence-oriented keyword extraction is presented.</p>
    </sec>
    <sec id="sec-4">
      <title>2.1. Main idea of the algorithm</title>
      <p>Compared with a document, a sentence has less information that can be made use of. Therefore, when applying TextRank to sentence-oriented keyword extraction, the available information has to be fully used. In this paper, we propose two ideas to make use of the information in the sentence: (1) when assigning initial scores to the candidate keywords, the term frequency and part of speech features are fused into the calculation, and (2) when constructing the probability transfer matrix, both semantic and syntactic features are considered by summing the semantic similarity matrix and the dependency relevance matrix. Among them, the semantic similarity matrix is composed of the semantic similarities among the candidate keywords obtained from a trained Word2vec model.</p>
      <p>The above two ideas are embedded in the score initialization of the candidate keywords and in the construction of the probability transfer matrix, respectively, as shown in Figure 1, which gives the framework of MFF TextRank. Six parts are included in this framework, organized according to their relationship in the keyword extraction. Firstly, an undirected graph is constructed based on the candidate keyword set and the co-occurrence relationships. Secondly, the initial scores of the keywords are assigned. Thirdly, the probability transfer matrix is constructed. Fourthly, the candidate keywords' scores are calculated iteratively. Fifthly, the candidate keywords are ranked according to their scores, and finally the top K candidate keywords with the highest scores are taken as the result.</p>
      <p>The model of the TextRank algorithm can be formally expressed as:
st(vi) = (1 − d) + d * Σvj∈In(vi) [ wji / Σvk∈Out(vj) wjk ] * st-1(vj)    (1)</p>
      <p>Fig. 1 MFF TextRank Framework</p>
      <p>where In(vi) denotes the set of all nodes (words) that have edges heading to node vi; Out(vj) denotes the set of nodes that have edges tailing from node vj; st(vi) denotes the TextRank score of node vi in the t-th iteration; wji denotes the weight of the edge between nodes vj and vi; wji is taken as the element value in the probability transfer matrix; and d∈[0,1] is a damping factor, generally set to 0.85 [9]. The vector form of TextRank can be rewritten as:
st = (1 − d) * e + d * M * st-1    (2)
where st denotes the vector of the scores of all keywords; M denotes the probability transfer matrix; and e is a unit vector. Eq. (2) presents the iteration of TextRank, i.e., the iterative computation continues until a termination condition, such as |st − st-1| &lt; ε, is satisfied, where ε is a given threshold. Then all candidate keyword scores are ranked, and the top K words are selected as the keywords of the sentence.</p>
      <p>The ideas to make use of the knowledge in a sentence to initialize the score s0(vi), and to calculate the probability transfer matrix, are explained separately below.</p>
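      <p>The iteration in Eq. (2) can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation; it assumes the probability transfer matrix M and the initial score vector are already given.</p>

```python
import numpy as np

def textrank_scores(M, s0, d=0.85, eps=1e-6, max_iter=100):
    """Iterate s_t = (1 - d) * e + d * M @ s_{t-1}, Eq. (2), until convergence."""
    s = np.asarray(s0, dtype=float)
    e = np.ones_like(s)  # unit vector e
    for _ in range(max_iter):
        s_next = (1 - d) * e + d * (M @ s)
        # termination condition: |s_t - s_{t-1}| below the threshold eps
        if eps > np.abs(s_next - s).sum():
            return s_next
        s = s_next
    return s
```

      <p>For a uniform transfer matrix, every node ends up with the same score, which is a quick sanity check for the fixed point of the iteration.</p>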
    </sec>
    <sec id="sec-5">
      <title>2.2. Scores initialization of candidate keywords</title>
      <p>The canonical TextRank algorithm assigns the initial score of each candidate keyword to 1 or 1/N (N is the number of candidate keywords) by default, and it ignores the knowledge contained in each candidate keyword. In this paper, we integrate various knowledge and assign a different initial score to each candidate keyword accordingly. The initial score is calculated as:</p>
      <p>s0(vi) = s1(vi) * s2(vi)    (3)
where vi is the i-th candidate keyword; s1(vi) denotes the term frequency of vi; s2(vi) denotes the part of speech score of vi. The higher s1(vi) is, the more important the word is. Similarly, since the parts of speech of candidate keywords differ, their scores should differ. s1(vi) and s2(vi) are described as follows. From the perspective of part of speech, keywords in a sentence are often nouns, verbs, and adjectives[14], therefore s2(vi) is set according to the corresponding part of speech: 2 for a noun, 1.5 for a verb, 1 for an adjective, and 0.5 for others.</p>
      <p>The initial scores of all candidate keywords can be represented by a vector S0 with dimension n, as shown in equation (4).</p>
      <p>S0 = ( s0(v1), s0(v2), …, s0(vi), …, s0(vn) )    (4)</p>
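      <p>The initialization in equations (3) and (4) can be sketched as follows. This is a minimal illustration, not the paper's implementation; the coarse tags "n", "v", "a" are an assumption about the tagger's output format.</p>

```python
from collections import Counter

# Part of speech weights from Section 2.2: noun 2, verb 1.5, adjective 1, others 0.5.
POS_WEIGHT = {"n": 2.0, "v": 1.5, "a": 1.0}

def initial_scores(tagged_words):
    """tagged_words: list of (word, pos) pairs for one sentence's candidate keywords.
    Returns {word: s0(vi)} with s0(vi) = s1(vi) * s2(vi), Eq. (3)."""
    tf = Counter(word for word, _ in tagged_words)  # s1: term frequency
    return {word: tf[word] * POS_WEIGHT.get(pos, 0.5)  # s2: part of speech weight
            for word, pos in tagged_words}
```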
    </sec>
    <sec id="sec-6">
      <title>2.3. Construction of probability transfer matrix</title>
      <p>On top of the canonical TextRank algorithm, this paper calculates, in terms of semantics, the semantic similarity based on Word2vec and, in terms of syntax, the relevance between words based on dependency parsing. They are described separately below.</p>
      <p>In terms of semantics, if the semantic similarity between two candidate keywords is high, the weight between them is high. For two candidate keywords vi and vj, their vectors can be obtained from Word2vec. The semantic similarity mij between them is calculated by cosine similarity, and the semantic similarity matrix is shown in equation (5).</p>
      <p>Mα = [ mij ]    (5)</p>
      <p>In terms of syntax, if the dependency relevance[15] between two candidate keywords is high, their
weight will be high. Although the Word2vec-based TextRank algorithm[13] can achieve good results on
some publicly available datasets, it may not always be valid when it comes to a single sentence.
Therefore, from the syntactic perspective, we calculate the relevance by dependency parsing. This paper
takes LTP[16] (Language Technology Platform) as the tool for dependency parsing.</p>
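      <p>Given a dependency parse (e.g., produced by LTP), the length of the dependency relation path between two words can be computed by walking the head chains to their lowest common ancestor. The sketch below is hypothetical: it assumes the parse is available as a simple 0-based head-index array, which is not LTP's native output format.</p>

```python
def dep_path_length(heads, i, j):
    """Length len(vi, vj) of the path between tokens i and j in a dependency tree.
    heads[k] is the head index of token k (0-based); the root token has head -1."""
    def chain(k):
        nodes = [k]
        while heads[k] != -1:
            k = heads[k]
            nodes.append(k)
        return nodes
    up_i, up_j = chain(i), chain(j)
    depth = {node: d for d, node in enumerate(up_i)}
    for d, node in enumerate(up_j):
        if node in depth:  # lowest common ancestor of i and j
            return depth[node] + d
    return len(up_i) + len(up_j)  # not reached for a well-formed tree
```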
      <p>Therefore, based on the length of the dependency relation path, the dependency relevance is calculated according to:
lij = 1 / len(vi, vj)    (6)
where len(vi, vj) denotes the length of the dependency relation path between vi and vj. The dependency relevance matrix is shown in equation (7).</p>
      <p>Mβ = [ lij ]    (7)</p>
      <p>The probability transfer matrix M is shown in equation (8).</p>
      <p>M = [ pij ] = [ mij + lij ] = Mα + Mβ    (8)
where pij denotes the transfer probability from node vi to node vj, and Σj=1..n pji = 1.</p>
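      <p>The construction in equations (5)-(8) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the candidate-keyword vectors and the pairwise dependency path lengths are taken as given, and the final normalization enforces the constraint that each column of M sums to 1 (Σ pji = 1).</p>

```python
import numpy as np

def transfer_matrix(vectors, dep_path_len):
    """Build M = M_alpha + M_beta, Eq. (5)-(8), then normalize columns.
    vectors: (n, d) array of candidate-keyword embeddings (e.g. from Word2vec).
    dep_path_len: (n, n) array of dependency path lengths, at least 1 off-diagonal."""
    n = len(vectors)
    # M_alpha: cosine similarity m_ij between the keyword vectors, Eq. (5)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    M_alpha = unit @ unit.T
    # M_beta: dependency relevance l_ij = 1 / len(vi, vj), Eq. (6)-(7)
    M_beta = np.zeros((n, n))
    off = ~np.eye(n, dtype=bool)
    M_beta[off] = 1.0 / dep_path_len[off]
    M = M_alpha + M_beta                   # Eq. (8)
    np.fill_diagonal(M, 0.0)               # no self-transfer
    M = M / M.sum(axis=0, keepdims=True)   # enforce sum over j of p_ji = 1
    return M
```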
    </sec>
    <sec id="sec-7">
      <title>2.4. Multi Feature Fusion TextRank algorithm</title>
      <p>With the initial word scores in (4) and the probability transfer matrix in (8), the MFF (Multi-Feature Fusion) TextRank algorithm iteratively calculates the scores of the nodes according to (2). Then the top K candidate words are selected as keywords.</p>
      <p>The MFF TextRank algorithm is shown in Algorithm 1. The initial scores are assigned to the
candidate keywords according to the word frequency and lexical features. Then the probability transfer
matrix is constructed by considering both the semantic relationship characteristics between words and
the syntactic relation among words to make the extracted keywords more accurate.</p>
      <p>Algorithm 1 Multi-Feature Fusion TextRank
Input: A sentence; the number of keywords to be extracted, K
Output: Top K keywords
Step 1: Pre-process the sentence: segment it, remove the stop-words, and construct a candidate keyword set;
Step 2: Calculate the initial score for each candidate keyword vi:
(1) Calculate the term frequency s1(vi);
(2) Calculate the part of speech score s2(vi);
(3) Get the initial score s0(vi) = s1(vi) * s2(vi);
Step 3: Construct the graph with all candidate keywords as nodes; if two candidate keywords appear in a co-occurrence window, there is an edge between them;
Step 4: Construct the probability transfer matrix according to the semantic similarity and the dependency relevance:
(1) Calculate the semantic similarity between candidate keywords based on vectors gained with Word2vec and construct the semantic similarity matrix Mα, equation (5);
(2) Calculate the dependency relevance, equation (6), and construct the dependency relevance matrix Mβ, equation (7);
(3) Obtain the final probability transfer matrix M = Mα + Mβ, equation (8);
Step 5: Iteratively calculate the candidate keywords' scores until the termination condition is satisfied;</p>
      <p>Step6: Sort the candidate keywords in descending order of score and extract the top K words as the keywords.</p>
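      <p>The final ranking step amounts to a straightforward sort; a minimal sketch:</p>

```python
def top_k_keywords(words, scores, k):
    """Rank candidate keywords by score in descending order and take the top K."""
    ranked = sorted(zip(words, scores), key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ranked[:k]]
```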
    </sec>
    <sec id="sec-8">
      <title>3. Experiments and Results Analysis</title>
      <p>The dataset in the experiments is from SogouCA[17], which is in Chinese, is about 1.4 GB in size, and covers 18 fields, including military, sports, society, entertainment, etc. The preprocessing of the dataset includes: (1) word segmentation and stop-word removal; (2) training a word vector model with Gensim, which contains the Word2vec tool; the model we obtained is about 160 MB in size.</p>
      <p>We randomly crawled 500 sentences on hot topics from Baidu Knows (https://zhidao.baidu.com/) and Zhihu (https://www.zhihu.com/). In order to validate the reliability of the keyword extraction results, the keywords were extracted and manually cross-labeled. In the analysis of the experimental results, the extracted keywords are compared with the manually labeled keywords. The indices used in the experiments are the precision P, the recall R, and the F-measure.</p>
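      <p>The indices P, R, and F-measure can be computed per sentence as follows (a minimal sketch, treating the extracted and labeled keywords as sets):</p>

```python
def prf(extracted, labeled):
    """Precision P, recall R, and F-measure against the manually labeled keywords."""
    extracted, labeled = set(extracted), set(labeled)
    hits = len(extracted.intersection(labeled))
    p = hits / len(extracted) if extracted else 0.0
    r = hits / len(labeled) if labeled else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f
```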
      <p>The benchmark algorithms for experimental comparison include the TF-IDF algorithm (A1), the TextRank algorithm (A2), the TextRank algorithm with improved initial scores of candidate keywords (A3), the TextRank algorithm with improved probability transfer matrix by dependency parsing (A4), the TextRank algorithm with improved probability transfer matrix by Word2vec (A5), and the MFF TextRank (A6). When the number of extracted keywords N is 1, 2, 3, and 4, the indices P, R, and F-measure are calculated, and the experimental results are shown in Table 1 and Figure 2.
Table 1 Experimental results</p>
      <p>Algorithm   P (N=1)   R (N=1)
A1   0.744   0.200
A2   0.853   0.230
A3   0.867   0.233
A4   0.855   0.230
A5   0.884   0.238
A6   0.894   0.240</p>
      <p>It can be seen from Figure 2 that the TextRank algorithm, which extracts keywords through the co-occurrence relationships of words in the sentence, outperforms the TF-IDF algorithm, which selects keywords based on word frequency. After assigning the initial scores of candidate keywords based on term frequency and part of speech in TextRank, the performance of the algorithm is improved. It is further improved after integrating the dependency relevance into the probability transfer matrix, and improves again after integrating the semantic similarity into the probability transfer matrix. The MFF TextRank outperforms the above five algorithms in terms of precision P, recall R, and F-measure.</p>
      <p>The reasons that our algorithm outperforms the other algorithms mainly include: (1) more knowledge of words is integrated into the initial scores of candidate keywords, namely the term frequency and the part of speech; (2) more knowledge of the relationships between words is integrated into the construction of the probability transfer matrix, namely the semantic relationship and the dependency relevance.</p>
    </sec>
    <sec id="sec-9">
      <title>4. Conclusion</title>
      <p>This paper proposes a multi-feature fusion TextRank algorithm for sentence-oriented keyword extraction. To address the shortcoming that the canonical TextRank algorithm ignores the knowledge of keywords and the relationships between keywords, the TextRank algorithm is improved in two aspects: (1) the initial scores of candidate keywords are assigned by fusing the knowledge of term frequency and part of speech; (2) the probability transfer matrix is calculated by fusing the knowledge of the semantic relations and the dependency relevance among words. The experiments show that our algorithm achieves better results in sentence keyword extraction.</p>
      <p>Although the algorithm in this paper outperforms the other five algorithms in terms of P, R, and F, there is still room for improvement in time complexity. Integrating further suitable features to make the TextRank algorithm extract sentence keywords with higher performance is left as future work.</p>
    </sec>
    <sec id="sec-10">
      <title>5. Acknowledgments</title>
      <p>This work was partly supported by the National Natural Science Foundation of China (No. 62077029,
61673196, 62277030), Society Development Foundation of Xuzhou under Grant No. KC19213, Jiangsu
Normal University Postgraduate Research and Practice Innovation Program Project (2021XKT1391).</p>
    </sec>
    <sec id="sec-11">
      <title>6. References</title>
      <p>Communications (ICCC). Chengdu, pp. 2109-2113.
[14] Zhang, J.-E. (2013) Method for the Extraction of Chinese Text Keywords Based on Multi-Feature Fusion. Information Studies: Theory &amp; Application., 36(10): 105-108.
[15] Zhang, W.-N., Ming, Z.-Y., Zhang, Y., Nie, L.-Q., Liu, T., Chua, T.-S. (2012) The use of dependency relation graph to enhance the term weighting in question retrieval. In: Proceedings of COLING. Mumbai, pp. 3105-3120.
[16] Che, W.-X., Feng, Y.-L., Qin, L.-B., Liu, T. (2010) LTP: A Chinese language technology platform. In: Proceedings of Coling 2010: Demonstrations. Beijing, pp. 13-16.
[17] Wang, C., Zhang, M., Ma, S., Ru, L. (2008) Automatic online news issue construction in web environment. In: Proceedings of the 17th International Conference on World Wide Web. Beijing, pp. 457-466.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>Firoozeh</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nazarenko</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Alizon</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daille</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          (
          <year>2020</year>
          )
          <article-title>Keyword extraction: Issues and methods</article-title>
          .,
          <source>Natural Language Engineering</source>
          .,
          <volume>26</volume>
          (
          <issue>3</issue>
          ):
          <fpage>259</fpage>
          -
          <lpage>291</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name><surname>Yang</surname>, <given-names>D.-H.</given-names></string-name>
          ,
          <string-name><surname>Wu</surname>, <given-names>Y.-X.</given-names></string-name>
          ,
          <string-name><surname>Fan</surname>, <given-names>C.-X.</given-names></string-name>
          (
          <year>2020</year>
          )
          <article-title>Chinese Short Text Keyphrase Extraction Model Based on Attention</article-title>
          . Computer Science.,
          <volume>47</volume>
          (
          <issue>1</issue>
          ):
          <fpage>193</fpage>
          -
          <lpage>198</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name><surname>Zhang</surname>, <given-names>P.-Z.</given-names></string-name>
          ,
          <string-name><surname>Zhang</surname>, <given-names>C.</given-names></string-name>
          (
          <year>2019</year>
          )
          <article-title>Research on news keyword extraction technology based on TF-IDF and TextRank</article-title>
          .
          <source>In: Proceedings of the 2019 IEEE/ACIS 18th International Conference on Computer and Information Science (ICIS)</source>
          . Beijing, pp.
          <fpage>452</fpage>
          -
          <lpage>455</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name><surname>Yao</surname>, <given-names>Y.</given-names></string-name>
          ,
          <string-name><surname>Yu</surname>, <given-names>L.-Y.</given-names></string-name>
          ,
          <string-name><surname>Yang</surname>, <given-names>G.-C.</given-names></string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>J.-J.</given-names>
          </string-name>
          (
          <year>2018</year>
          )
          <article-title>Patent keyword extraction algorithm based on distributed representation for patent classification</article-title>
          . Entropy.,
          <volume>20</volume>
          (
          <issue>2</issue>
          ):
          <fpage>104</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>G.-M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fan</surname>
            ,
            <given-names>C.-L.</given-names>
          </string-name>
          ,
          <string-name><surname>Sun</surname>, <given-names>Z.-L.</given-names></string-name>
          ,
          <string-name><surname>Zhu</surname>, <given-names>H.-T.</given-names></string-name>
          (
          <year>2019</year>
          )
          <article-title>Keyword extraction for short text via word2vec, doc2vec, and textrank</article-title>
          .
          <source>Turkish Journal of Electrical Engineering &amp; Computer Sciences.</source>
          ,
          <volume>27</volume>
          (
          <issue>3</issue>
          ):
          <fpage>1794</fpage>
          -
          <lpage>1805</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ,
          <string-name><surname>Hao</surname>, <given-names>X.-Y.</given-names></string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            <given-names>X.-Z.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>Y.-W.</given-names>
          </string-name>
          (
          <year>2016</year>
          )
          <article-title>Research on the Strategy of Keyword Extraction</article-title>
          . Journal Of Taiyuan University Of Technology.,
          <volume>47</volume>
          (
          <issue>02</issue>
          ):
          <fpage>228</fpage>
          -
          <lpage>232</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Salton</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Buckley</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          (
          <year>1988</year>
          )
          <article-title>Term-weighting approaches in automatic text retrieval</article-title>
          .
          <source>Information processing &amp; management.</source>
          ,
          <volume>24</volume>
          (
          <issue>5</issue>
          ):
          <fpage>513</fpage>
          -
          <lpage>523</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>Blei</surname>
            ,
            <given-names>D. M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>A. Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jordan</surname>
            ,
            <given-names>M. I.</given-names>
          </string-name>
          (
          <year>2003</year>
          )
          <article-title>Latent dirichlet allocation</article-title>
          .
          <source>Journal of machine Learning research.</source>
          ,
          <volume>3</volume>
          (
          <issue>Jan</issue>
          ):
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <surname>Mihalcea</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tarau</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2004</year>
          )
          <article-title>Textrank: Bringing order into text</article-title>
          .
          <source>In: Proceedings of the 2004 conference on empirical methods in natural language processing. Barcelona</source>
          , pp.
          <fpage>404</fpage>
          -
          <lpage>411</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <surname>Page</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Brin</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Motwani</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Winograd</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>1999</year>
          )
          <article-title>The PageRank citation ranking: Bringing order to the web</article-title>
          . Stanford Digital Libraries Working Paper.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Z.-F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>P.-P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>X.-S.</given-names>
          </string-name>
          (
          <year>2021</year>
          )
          <article-title>Keywords extraction algorithm of railway literature based on improved TextRank</article-title>
          . Journal Of Beijing Jiaotong University.,
          <volume>45</volume>
          (
          <issue>02</issue>
          ):
          <fpage>80</fpage>
          -
          <lpage>86</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (
          <year>2019</year>
          )
          <article-title>Text Keyword Extraction Method Based on Weighted TextRank</article-title>
          .
          <source>Computer Science</source>
          .,
          <volume>46</volume>
          (
          <issue>S1</issue>
          ):
          <fpage>142</fpage>
          -
          <lpage>145</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <surname>Wen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yuan</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name><surname>Zhang</surname>, <given-names>P.</given-names></string-name>
          (
          <year>2016</year>
          )
          <article-title>Research on keyword extraction based on word2vec weighted textrank</article-title>
          .
          <source>In: Proceedings of the 2016 2nd IEEE International Conference on Computer and</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>