S. Krajči (ed.): ITAT 2018 Proceedings, pp. 28–34
CEUR Workshop Proceedings Vol. 2203, ISSN 1613-0073, c 2018 Matej Gallo, Luboš Popelínský, and Karel Vaculík


                              To text summarization by dynamic graph mining

                                           Matej Gallo, Luboš Popelínský, and Karel Vaculík

                                                           KD Lab FI MU Brno
                                                          popel@fi.muni.cz

      Abstract: We show that frequent patterns can contribute          ensure that only significant rules are extracted, the al-
      to the quality of text summarization. Here we focus              gorithm incorporates measuring support and confidence.
      on single-document extractive summarization in English.          While support expresses what portion of the graph is af-
      Performance of the frequent patterns based model ob-             fected by the rule, confidence measures the occurrence
      tained with DGRMiner yields the most relevant sentences          frequency of a specific change, given that a particular pat-
      of all compared methods. Two out of three proposed meth-         tern was observed. Time abstraction is an extension to the
      ods outperform other methods if compared on ROUGE                DGRMiner that allows us to analyse broader class of dy-
      data.                                                            namic graphs. The abstraction lies in the use of signum
                                                                       function on relative timestamps. All the negative times-
                                                                       tamps become −1, all the positive timestamps become 1
      1 Introduction                                                   and the timestamp of 0 remains 0. DGRMiner allows for
                                                                       two types of abstraction. One affects only timestamps of
      Extractive summarization assigns a score to each text            vertices and should be used when most changes are caused
      unit (phrase, sentence, paragraph, passage) and then picks       by edges and vertices remain static. The second one also
      the most informative ones to form a summary called ex-           affects the edges and is useful when there are too few pat-
      tract. The selected sentences are used verbatim [1, 25].         terns with exact timestamps.
         Graph representation is a common way for represent-
      ing data in automatic text summarization tasks. We intro-
      duce a new method for single-document extractive sum-            3 Method
      marization in English language where a text is represented
      as a dynamic graph and each sentence corresponds to a dy-        The initial idea was to take a text as a temporal (i.e. dy-
      namic graph snapshot, i.e. the graph in a partivular time.       namic) graph where each sentence represent a graph snap-
      E.g. the first sentence is the oldest snapshot while the last    shot at a particular time and tokens (a lemma together with
      sentence of a text is the newest one. This method is based       a part-of-speech tag) were nodes and edges connected all
      on the principle of mining frequent patterns from such a         pairs of tokens. The label of an edge was equal to a num-
      dynamic graph and using the resulting patterns as indica-        ber of appearances of this pair of tokens. In the following
      tors of sentence importance.                                     subsections we describe particular steps of the algorithm.
         The structure of this text is following. In Section 2.
      we introduce DGRMiner, a tool for mining in dynamic
                                                                       3.1 Transforming Text to graphs
      graphs. Section 3. describes the algorithm. In Sections
      4 we briefly introduce the data used for evaluation. Sec-        For pre-processing we employed CoreNLP [15]. CoreNLP
      tions 5 and 6 explain a way of summarization evaluation          provides a set of natural language analysis tools: POS
      and sentence scoring. Section 7 displays the main results        tagger, named entity recognition, parser, coreference res-
      of text summarization by means of frequent patterns mined        olution system, sentiment analysis, bootstrapped pattern
      from dynamic graphs. Related work is a contents of Sec-          learning, and open information extraction. We used four
      tion 8. We conclude with concluding Section 9.                   annotators: sentence split, tokenize, lemma, and POS.
                                                                          Using the coreNLP package for R, we split the text into
                                                                       sentences. In every iteration we removed the stop words
      2 DGRMiner                                                       from a sentence, lemmatize the remaining words and as-
                                                                       signed the POS tags. If a lemma-tag pair was not already
      We used DGRMiner tool to extract these patterns from             in the graph, we added it in and created edges between all
      the dynamic graphs. DGRMiner was proposed in [26]                words in a same sentence. Otherwise, we only notified the
      for mining frequent patterns that capture various changes        DGRMiner that a particular word had appeared again. We
      in dynamic graphs in the form of predictive rules. The           parametrized context c. Therefore, in each step we mod-
      found predictive graph rules express what way the graph          elled at most c sentences.
      changes. The rules capture patterns such as addition, dele-         We will show it on simple example that contains a single
      tion, and transformation of the subgraph. The use of rel-        sentence.
      ative timestamps for rules allows for mining general pat-
      terns while including time information simultaneously. To        <sentence position = " 9 " labelers
To text summarization by dynamic graph mining                                                                                       29

     = "l, 2 , 3, 4">The great thing about                           4 Data
     Aspen is that it has at least one of
     each of the really useful stores that                           To assess the performance of our method we used the Blog
     you need, sellingstuff at regular                               summarization dataset [10, 11, 23]. It consists of 100 posts
     prices. </sentence>                                             annotated in XML that were randomly chosen from two
                                                                     blogs (half from each blog), Cosmic Variance and Inter-
     After removing stop words and labeling words with a
                                                                     net Explorer Blog. Each of the four human summariz-
     lemma-tag pair, we receive a graph t # 9.
                                                                     ers picked approximately 7 sentences to form 4 reference
     t # 9                                                           summaries in total. We manually restored apostrophes for
     an 74 great#ADJ                                                 shortened forms of to be and to have verbs, and in posses-
     an 75 thing#NOUN                                                sive nouns. Punctuation within sentences was omitted as
     an 11 Aspen#NOUN                                                the coreNLP sentence split annotator often wrongly split
     an 76 least#ADJ                                                 the sentences in the middle. We decided not to use CNN
     an 77 one#NUM                                                   Dataset, nor SUMMAC dataset mentioned in [9] because
     an 78 really#ADV                                                the provided reference summaries were not extracted, ver-
     an 79 useful#ADJ                                                batim sentences. This would require us to manually find
     an 66 store#NOUN                                                the best matching sentence from within the document.
     an 5 need#VERB
     ae u 48 5 11 [need#VERB]-[Aspen#NOUN]
     ae u 432 5 66 [need#VERB]-[store#NOUN]                          5 Semi-automatic evaluation
     ae u 433 5 67 [need#VERB]-[sell#VERB]
     ae u 434 5 74 [need#VERB]-[great#ADJ]                           Evaluating summaries on sentence level can be done semi-
     ae u 435 5 75 [need#VERB]-[thing#NOUN]                          automatically by measuring content overlap with preci-
     ae u 436 5 76 [need#VERB]-[least#ADJ]                           sion, recall, and F1 measure. An extracted sentence is
     ae u 437 5 77 [need#VERB]-[one#NUM]                             considered acceptable if the same sentence was extracted
                                                                     in a reference summary. This process cannot be fully
     In the first column, an, ae stand for "add a node" and          automatized because reference summaries are created by
     "add an edge" respectively.                                     human judges. Other semi-automatic evaluation methods
                                                                     used nowadays are: ROUGE, P YRAMID and BASIC E LE -
                                                                     MENTS . We will discuss only the ROUGE method [25, 14]
     3.2   Pattern Mining
                                                                     as it was used in this work.
     Then we applied the DGRMiner on the data from previ-
     ous step to obtain predictive patterns, i.e. rules that de-
                                                                     5.1 ROUGE
     scribe frequent (or rare) changes between past and future
     graphs – sentences. We received two types of patterns:          The ROUGE (Recall-Oriented Understudy for Gisting
     frequent single-vertex patterns corresponding to nodes and      Evaluation) measure is based on the B LEU metrics used
     rare patterns corresponding to edges. We modified the sup-      in machine translation tasks. The idea is to compare the
     port parameter to change the threshold on frequency of ob-      differences between the distribution of words in the can-
     served patterns. When the support parameter was set too         didate summary and the distribution of words in the ref-
     low, the DGRMiner considered a pattern anything that ap-        erence summaries. Given h reference summaries and a
     peared at least once in the text – every word, every possible   candidate summary they are split into n-grams to calculate
     word combination. By setting the confidence parameter           the intersection of n-grams between the references and the
     in DGRMiner to 0, the DGRMiner assigned confidence              candidate. This process is illustrated in figure 1.
     (as discussed in section 2) to each extracted pattern. We          Given its correlation with manual judgments ROUGE
     also played with time abstraction which allowed us to ig-       almost seems to have become a standard. [13] reports
     nore the preset context c as discussed in section 2. There-     Pearson coefficient for the most commonly used variations
     fore, patterns were observed in sentences that were more        (ROUGE -2 and ROUGE-SU4 [14]) at a value of between
     than c − 2 sentences apart. Although this gave us more          0.94 and 0.99. Generally, the ROUGE-n is calculated from
     patterns, many of them were uninformative. An example           the co-occurrences of n-grams between the candidate and
     of observed single-vertex pattern is "+supplies#NOUN".          reference summaries as shown by formula 1 in [25]:
     This pattern indicates that the noun supplies appeared at
     least n times in the text, where the n is determined by the
     support parameter. An example of multi-vertex pattern is                             ∑n-grams ∈ {Sumcan ∩ Sumref }
     [store#NOUN]-[sell#VERB], which tells us that the noun                 ROUGE -n =                                       (1)
                                                                                                ∑n-grams ∈ Sumref
     store is frequently closely accompanied by the verb sell.
     The proximity of these two words is given by context c          where the numerator is the maximum number of co-
     and the frequency is given by the support parameter.            occurrences of n-grams in both reference and candidate
30                                                                                        Matej Gallo, Luboš Popelínský, and Karel Vaculík

                                                                        The frequency method (FRQ) as given by formula 5,
                                                                     is the simplest of the three. For a sentence s and a set of
                                                                     patterns P the score is calculated as weighted average.

                                                                                     ScoreF (s) = ∑ c(p)w(p)                    (5)
                                                                                                   p∈P

                                                                     where c(p) is the frequency of the pattern in the sentence
                                                                     and w(p) is its associated weight.
                                                                        The density method (DEN) as given by formula 6,
                                                                     counts the patterns in a sentence and normalizes by the
                                                                     length of the sentence. This method is parameter-free.
     Figure 1: The basic idea of ROUGE for evaluation of sum-
     maries [25]                                                                                       | S∩P |
                                                                                       ScoreD (s) =                             (6)
                                                                                                      length(s)
     summary and the denominator is the total sum of the num-           For multi-vertex patterns we tested frequency and den-
     ber of n-grams present in the reference summaries.              sity methods. The only difference between the methods is
        The ROUGE-SUγ is an adaptation of ROUGE-2 using              that instead of length of sentences we use the number of
                                                                                                             
     skip units (SU) of a size ≤ γ. SU4 considers the bigrams        all possible word pair combinations |S|2 .
     and the bigrams SU to be arbitrary in a maximum window             We built the summary in a greedy fashion. In every it-
     length of γ = 4 words. The number of bigrams with an            eration we picked the highest scoring sentence. Patterns
     arbitrary size of length γ is given by formula 2:               observed in the sentence were penalized by a parameter
                                k−γ                                λ . The scores were recomputed and the process continued
                               n                                     until a desired number of sentences was picked or until all
              Count(k, n) = C    − ∑ (k − γ); γ ≤ 1           (2)
                               k   0
                                                                     the sentences were used. For every method we discovered
                                                                     the optimal value of λ . We tuned the parameter on 10% of
     where n is the n-gram length and k is the sentence length       the entire dataset. The optimal values for λ are presented
     in words.                                                       in the tables 2 and 1. The ordering of sentences in the fi-
         The ROUGE metrics are not flawless. Firstly, they have      nal summary maintains the relative ordering in the original
     a problem with the representation of content. Secondly,         document.
     they will not consider chains of words such as "MUNI"
     6= "Masaryk University" and "FI" 6=" Faculty of Informat-
      ics". A study [22] found that the system can be tricked into   7   Results
      generating a summary with high ROUGE score.
                                                                     We evaluated the model using the ROUGE-1 metric,
                                                                     which is recommended for short summaries [14]. Every
     6   Sentence Scoring                                            blog post was compared against four reference summaries
                                                                     as described in chapter 4. The results of all three methods
     In this step we used the observed patterns as indicators to     can be seen in tables 1, 2 and 3. The first three columns de-
     score the sentences. The final score was obtained accord-       note whether time abstraction on both vertices and edges
     ing to formula 3 as a sum of single-vertex and multi-vertex     is used, whether the patterns were used to score sentences,
     scores:                                                         and whether the initial weights were determined by the
                                                                     confidence of DGRMiner for the given pattern, respec-
              Score(s) = Scoresingle (s) + Scoremulti (s)     (3)    tively. The last three columns correspond to the ROUGE-1
                                                                     metrics – precision (what portion of sentences we selected
     Three different metrics were implemented to calculate           were part of the reference summary), recall (what portion
     single-vertex score.                                            of sentences in reference summary we extracted), and f1
        The Jaccard coefficient (JAC) as given by formula 4,         measure (harmonic mean of precision and recall).
     expresses the overlap of two sets. In our case, between the        The FRQ method marked the most accurate sentences
     set of patterns P and the set of words in the sentence S.       among all the presented algorithms as can be seen from
                                                                     figure 2. JAC method picked fewer correct sentences than
                                     wsum (S ∩ P)
                      ScoreJ (s) =                            (4)    FRQ method but still more than any traditional approach.
                                     wsum (S ∪ P)                    Both FRQ and DEN methods ranked first in terms of pre-
     where wsum assigns weight to every element of the set and       cision.
     then sums over them. A question arises regarding what              The frequency-based method incorporating patterns
     weight we should assign to non-indicators. Empirically,         with no confidence weighting (identified as FRQ) achieved
     we received the best results for the value of 0.001.            the highest recall, the highest f1 score, and the highest
To text summarization by dynamic graph mining                                                                              31


       binall   patterns    weights     precision     recall   f1 score
         F         F          F         0.643         0.694    0.664
         F         T          F         0.643         0.686    0.659
         F         T          T         0.641         0.684    0.657
         T         F          F         0.641         0.683    0.657
         F         F          T         0.640         0.691    0.660
         T         T          F         0.640         0.678    0.654
         T         T          T         0.640         0.678    0.654
         T         F          T         0.639         0.681    0.655

            Table 1: Results for blog summarization dataset using jaccard method with parameters λ = 0.55, w0 = 0.001


       binall   patterns    weights     precision     recall   f1 score
         F         T          F         0.645         0.702    0.668
         F         F          T         0.645         0.700    0.667
         F         F          F         0.644         0.701    0.665
         F         T          T         0.643         0.693    0.662
         T         F          F         0.642         0.691    0.661
         T         T          F         0.642         0.691    0.661
         T         F          T         0.642         0.690    0.660
         T         T          T         0.642         0.688    0.660

                  Table 2: Results for blog summarization dataset using frequency method with parameter λ = 0.20


       binall   patterns    precision    recall     f1 score
         F         T          0.651      0.514      0.562
         F         F          0.650      0.514      0.562
         T         T          0.649      0.516      0.563
         T         F          0.649      0.512      0.562

                                Table 3: Results for blog summarization dataset using density method


     Figure 2: Comparison of number of correctly chosen sentences – using blog summarization dataset (our algorithms are
     displayed yellow)
32                                                                                        Matej Gallo, Luboš Popelínský, and Karel Vaculík

     number of correctly chosen sentences. The highest pre-          Markov model allows modeling these dependencies. Con-
     cision was achieved by density-based method incorporat-         roy and O’leary [6] use a joint distribution for the features
     ing patterns with no confidence weighting (identified as        set, unlike the independence-of-features assumption used
     DEN). We chose these two models and the highest scoring         in naïve Bayesian methods. The HMM was trained us-
     Jaccard method for comparison together with algorithms          ing five features: position of the sentence in the document
     presented in article [9]. We could see that FRQ behaves         (number of states); number of terms in a sentence, and
     similarly to Word frequency (WF) algorithm proposed in          likeliness of the sentence terms given the document terms.
     the paper. This is not surprising, because the single vertex-      Summarizer NetSum presented by Svore et al. [24] uses
     frequent patterns correspond to words that appear at least      an artificial neural network (ANN) called RankNet to rank
     n-times in the text. Where n is determined by the support       the sentences. RankNet is a pair-based neural network al-
     parameter in the DGRMiner tool as described in section          gorithm for ranking a set of inputs. It is trained on pairs
     3.2. The multi-vertex patterns improved the performance         of sentences (Si , S j ), such that the ROUGE score for Si
     of the summarizer in density models and in one case (the        should be higher than S j . Pairs are only generated in a
     highest scoring model) in frequency model.                      single document, not across documents. The cost function
                                                                     for RankNet is the probabilistic cross-entropy cost func-
                                                                     tion. Training is performed using a modified version of
     8 Related work                                                  back-propagation algorithm for two-layer networks, which
                                                                     is based on optimizing the cost function by gradient de-
     Kupiec et al. introduce in their work [12] method inspired      scent. The system significantly outperforms the standard
     by Edmundson’s [7]. They approached the summarization           baseline in the ROUGE-1 measure. No past system could
     as a statistical classification problem. A Bayes classifier     outperform the baseline with statistical significance.
     was trained to estimate the probability that a given sen-          A system for generating product category-based (topic-
     tence would be included in the summary. They used 6             based) extractive summarization was proposed by [4, 5].
     discrete features (presented in order of importance): para-     The collection of 45 news items corresponding to various
     graph feature (the position of sentence in paragraph s),        products are pre-processed using standard techniques: to-
     fixed-phrase feature (the sentence contains a phrase from       kenization, stopword removal, stemming. The final corpus
     a list), sentence length cutoff feature (threshold u1 = 5,      contains around 1500 features and is represented by a bag-
     length(s) > u1 ), thematic word feature (the presence of        of-words VSM based on these features. To identify the
     thematic terms), uppercase word feature (the presence of        topics, the news items about specific categories of prod-
     words in capital letters). The best results were obtained       ucts are segregated into separate clusters using K-Means
     using the first three features.                                 and then an extractive summary is generated from each of
        Aone et al. [2] built on Kupiec’s work [12] and ex-          these topical clusters. The K number of cluster is deter-
     panded the feature set of their system, called D IM S UM,       mined by a Self-Organizing Map (SOM).
     with signature terms, which indicate key concepts for a            Chakraborti and Dey [5] assigned a score to the en-
     given document. Another advantage over [12]’s system            tire summary as a single unit. The total summary score
     is the use of multi-word phrases – statistically derived        (TSS) is taken as a combination of cosine similarity be-
     collocation phrases (e.g. "omnibus bill", "crime bill",         tween centroid of corpus and the summary as a whole;
     "brady bill") and associated words ("camera" and "ob-           relative length of the summary; and redundancy penalty.
     scura", "Columbia River" and "gorge"), as well as the use       To maximize TSS constrained by the number of lines in
     of WordNet [16] to identify possible synonyms of found          summary (τ = 5 − 7%), they opted for quick Artificial Bee
     signature terms. He applied a shallow discourse analysis to     Colony optimization, a global optimization technique, for
     resolve co-references and maintain cohesion – only name         sentence selection. The summary with the highest score is
     aliases were resolved such as UK to United Kingdom.             then chosen.
        Osborne in his work [18] disagrees with the traditional         In Muresan’s paper [17], a system called GIST-IT used
     assumption of feature independence and shows empiri-            for email-summarization task is discussed. First, noun
     cally that the maximum entropy (MaxEnt) model produces          phrases (NPs) are extracted as they carry the most content-
     better extracts than the naïve Bayes model with similarly       ful information. Subsequently, machine learning is used to
     optimized prior appended to both models. Unlike naïve           select the most salient NPs. A set of nine features, divided
     Bayes, MaxEnt does not make unnecessary feature inde-           into three categories (head of the NP, whole NP, combina-
     pendence assumptions. Let c be a binary label (binary:          tion of head and modifiers of NP) were used: head of the
     part of summary or not), s the item we are interested in la-    NP TF*IDF, position of first occurrence (focc) of the head
     beling, fi the i-th feature, and ωi the corresponding feature   in text, TF*IDF of entire NP, focc of entire NP, length of
     weight.                                                         the NP in words, length of the NP in characters, position
        Hidden Markov Models (HMM), similar to MaxEnt,               of the NP in the sentence, position of the NP in the para-
     have weaker assumptions of independence. There are              graph, and combination of the TF*IDF scores of head of
     three types of dependencies: positional dependence, fea-        the NP and its modifiers.
     ture dependence, and Markovity dependence. A first-order           To find the most salient NPs, three machine learning al-
To text summarization by dynamic graph mining                                                                                            33

     gorithms were applied: decision trees (axis-parallel trees –    capturing the high-order relations between both sentences
     C4.5 and oblique trees – OC1), rule induction (production       and words. [3] proposed a hypergraph-based model for
     rule – C4.5rules and propositional rules – RIPPER), and         generic summarization based on [27]’s hypergraph-based
     decision forests (DFC using information gain ratio).            model for query-focused summarization. The ranking
        Muresan claims that shallow linguistic filtering applied     uses a semi-supervised approach to order sentences. They
     to NPs improved the classifiers accuracy [17]. The filter-      model the words as vertices and sentences as hyperedges
     ing consisted of four steps: grouping inflectional variants,    and then approach the problem as a random walk over hy-
     removing unimportant modifiers, filtering stopwords, and        peredges.
     removing empty nouns.
        A modification of PageRank algorithm called LexRank
     is used to weight sentences. The undirected graph of sen-       9 Conclusion
     tences is constructed from symmetrical similarities (mod-
                                                                     We showed that frequent patterns can contribute to the
     ified cosine measure). The score of each vertex s is cal-
                                                                     quality of text summarization. The comparison supports
     culated iteratively until the values of the vertices have not
                                                                     our statement that the performance of our frequent patterns
     been modified by more than ε = 0.0001. LexRank algo-
                                                                     based model is comparable to the simpler word frequency
     rithm is used as a component of the M EAD [8].
                                                                     method and yielded the most relevant sentences of all com-
        Unlike LexRank, TextRank uses the similarities of
                                                                     pared methods. Our methods outperformed other methods
     edges to weight the vertices. The score of each sentence si
                                                                     in precision but lacked in recall. We attribute the similarity
     is calculated iteratively until convergence is reached.
                                                                     to word frequency method to inadequate graph represen-
        As the graph is constructed from inter-sentence sim-
                                                                     tation – instead of interconnecting all the words within a
     ilarity measure, the choice of the method for sentence-
                                                                     sentence, suggest connecting them according to the parse
     weighting has significant impact. One approach is to use
                                                                     tree. Another consideration is to use a graph mining tool
     a bag-of-words to represent the sentences. The similarity
                                                                     that searches for more specific types of patterns than DGR-
     is obtained by calculating the cosine similarity weighted
                                                                     Miner.
     by inverse document frequency [1] between their vectorial
     representations. Another approach suggests using word
     overlap between sentences instead. The weak point of all        Acknowledgments
     similarity measures that use words (cosine similarity, word
     overlap, longest common subsequence) is the dependency          This work has been supported by Faculty of Informatics,
     on the lexicon of the document. The solution to this is         Masaryk University. We would like to thank to ITAT re-
     its combining with similarity measure based on chains of        viewers for their suggestions and comments.
     characters. Therefore, sentences that do not share a single
     word but contain a number of words that are close mor-
                                                                     References
     phologically can be compared [25].
        Patil [19] proposed a new graph-based model called            [1] Charu C. AGGARWAL and ChengXiang ZHAI. Mining
     S UM G RAPH. First the text is pre-processed (stemmed)               Text Data. Springer, New York, 2012.
     and represented as a VSM with TF*IDF weights. He then            [2] Chinatsu AONE, James GORLINSKY, and Bjornar
     computes pair-wise cosine similarities and subtracts the             LARSEN. A trainable summarizer with knowledge ac-
     values from 1 to obtain dissimilarities. The resulting ma-           quired from robust nlp techniques. Advances in Automatic
     trix of intra-sentence dissimilarities is then used to model         Text Summarization, 71, 1999.
     the document as graph. The vertices represent the sen-           [3] Abdelghani BELLAACHIA and Mohammed AL-
     tences and edges are weighted by intra-sentence dissimi-             DHELAAN. Multi-document hyperedge-based ranking
     larities.                                                            for text summarization. In Proceedings of the 23rd ACM
        The novel idea is the use of link reduction technique             International Conference on Conference on Information
     known as Pathfinder Network Scaling (PFnet) [21, 20] to              and Knowledge Management, pages 1919–1922, New
     scale the graph. PFnet models the human aspects of se-               York, 2014 [cit. 2016-05-10]. ACM.
     mantic memory. The centrality of a sentence and its po-          [4] Swapnajit CHAKRABORTI. Multi-document text summa-
     sition in a document are used to compute the importance              rization for competitor intelligence: A methodology based
                                                                          on topic identification and artificial bee colony optimiza-
     of sentences. Four different centrality measure were tested
                                                                          tion. In Proceedings of the 30th Annual ACM Symposium
     and closeness centrality showed to perform best. Finally,
                                                                          on Applied Computing, pages 1110–1111, New York, 2015
     the sentences are ranked according to their importance and           [cit. 2016-05-10]. ACM.
     first n highest-scoring sentences were picked.
                                                                      [5] Swapnajit CHAKRABORTI and Shubhamoy DEY. Prod-
        The approaches based on similarity graphs solely model            uct news summarization for competitor intelligence using
     the similarity between pairs of sentences with no clear rep-         topic identification and artificial bee colony optimization.
     resentation of word relations. Therefore, it is not clear            In Proceedings of the 2015 Conference on Research in
     if they adequately cover all topical information. The                Adaptive and Convergent Systems, pages 1–6, New York,
     hypergraph-based approach is to remedy this problem by               2015 [cit. 2016-05-10]. ACM.
34                                                                                             Matej Gallo, Luboš Popelínský, and Karel Vaculík

      [6] John M. CONROY and Dianne P. O’LEARY. Text sum-                      extraction. In Proceedings of the ACL’02 Workshop on Au-
          marization via hidden markov models. In Proceedings of               tomatic Summarization, Stroudsburg, 2002 [cit. 2016-05-
          the 24th Annual International ACM SIGIR Conference on                10]. Association for Computational Linguistics.
          Research and Development in Information Retrieval, pages        [19] Kaustubh PATIL and Pavel BRAZDIL. Text summariza-
          406–407, New York, 2001 [cit. 2016-05-10]. ACM.                      tion: Using centrality in the pathfinder network. Int. J.
      [7] Harold P. EDMUNDSON. New methods in automatic ex-                    Comput. Sci. Inform. Syst [online], 2:18–32, 2007 [cit.
          tracting. J. ACM [online], 16(2):264–285, april 1969 [cit.           2016-05-12].
          2016-05-10].                                                    [20] Roger W. SCHVANEVELDT, editor. Pathfinder Associa-
      [8] Günes ERKAN and Dragomir R. RADEV. Lexrank:                          tive Networks: Studies in Knowledge Organization. Ablex
          Graph-based lexical centrality as salience in text summa-            Publishing Corp., Norwood, 1990.
          rization. J. Artif. Int. Res. [online], 22(1):457–479, decem-   [21] Roger W. SCHVANEVELDT, D.W. DEARHOLT, and F.T.
          ber 2004 [cit. 2016-05-11].                                          DURSO. Graph theoretic foundations of pathfinder net-
      [9] Rafael FERREIRA, Luciano CABRAL, and Rafael D.                       works. Computers & mathematics with applications [on-
          LINS. Assessing sentence scoring techniques for extrac-              line], 15(4):337–345, 1988 [cit. 2016-05-11].
          tive text summarization. Expert Systems with Applications       [22] Jonas SJÖBERGH. Older versions of the rougeeval sum-
          [online], 40(14):5755 – 5764, october 2013 [cit. 2016-05-            marization evaluation system were easier to fool. Inf.
          10].                                                                 Process. Manage., 43(6):1500–1505, november 2007 [cit.
     [10] Meishan HU, Aixin SUN, and Ee-Peng LIM. Comments-                    2016-05-12].
          oriented blog summarization by sentence extraction. In          [23] Dr.       Aixin       SUN.                    Comments-
          Proceedings of the Sixteenth ACM Conference on Confer-               oriented       document        summarization,      2015.
          ence on Information and Knowledge Management, pages                  http://www.ntu.edu.sg/home/axsun/datasets.html,      vis-
          901–904, New York, 2007 [cit. 2016-05-10]. ACM.                      ited 2016-05-20.
     [11] Meishan HU, Aixin SUN, and Ee-Peng LIM. Comments-               [24] Krysta M. SVORE, Lucy VANDERWENDE, and Christo-
          oriented document summarization: Understanding docu-                 pher J.C. BURGES. Enhancing single-document summa-
          ments with readers’ feedback. In Sung-Hyon MYAENG,                   rization by combining RankNet and third-party sources.
          Douglas W. OARD, Fabrizio SEBASTIANI, Tat-Seng                       In Proceedings of the 2007 Joint Conference on Empir-
          CHUA, and Mun-Kew LEONG, editors, Proceedings of the                 ical Methods in Natural Language Processing and Com-
          31st Annual International ACM SIGIR Conference on Re-                putational Natural Language Learning (EMNLP-CoNLL),
          search and Development in Information Retrieval, pages               pages 448–457, Prague, 2007 [cit. 2016-05-10]. Associa-
          291–298, New York, 2008 [cit. 2016-05-10]. ACM.                      tion for Computational Linguistics.
     [12] Julian KUPIEC, Jan PEDERSEN, and Francine CHEN. A               [25] Juan-Manuel TORRES-MORENO. Automatic Text Sum-
          trainable document summarizer. In Proceedings of the 18th            marization. ISTE, London, 2014.
          Annual International ACM SIGIR Conference on Research           [26] Karel Vaculík and Lubomír Popelínský.           Dgrminer:
          and Development in Information Retrieval, pages 68–73,               Anomaly detection and explanation in dynamic graphs. In
          New York, 1995 [cit. 2016-05-10]. ACM.                               Knobbe-A. Soares C. Papapetrou P. Boström, H., editor,
     [13] Chin-Yew LIN. Rouge: a package for automatic eval-                   Advances in Intelligent Data Analysis XV - 15th Interna-
          uation of summaries. In Stan SZPAKOWICZ Marie-                       tional Symposium, IDA 2016, pages 308–319, Neuveden,
          Francine MOENS, editor, Workshop Text summarization                  2016. Springer.
          Branches Out (ACL ’04) [online], pages 74–81, Barcelona,        [27] Wei WANG, Furu WEI, Wenjie LI, and Sujian LI. Hy-
          2004 [cit. 2016-05-10]. ACL.                                         persum: Hypergraph based semi-supervised sentence rank-
     [14] Petr MACHOVEC. Automatická sumarizace textu [on-                     ing for query-oriented summarization. In Proceedings
          line]. Master’s thesis, Masaryk University, Faculty of In-           of the 18th ACM Conference on Information and Knowl-
          formatics, Brno, 2015 [cit. 2016-05-10].                             edge Management, pages 1855–1858, New York, 2009 [cit.
     [15] Christopher D. Manning, Mihai Surdeanu, John Bauer,                  2016-05-10]. ACM.
          Jenny Finkel, Steven J. Bethard, and David McClosky. The
          Stanford CoreNLP natural language processing toolkit. In
          Association for Computational Linguistics (ACL) System
          Demonstrations, pages 55–60, 2014.
     [16] George A. MILLER. Wordnet: A lexical database for en-
          glish. Commun. ACM [online], 38(11):39–41, november
          1995 [cit. 2016-05-12].
     [17] Smaranda MURESAN, Evelyne TZOUKERMANN, and
          Judith L. KLAVANS. Combining linguistic and machine
          learning techniques for email summarization. In Proceed-
          ings of the 2001 Workshop on Computational Natural Lan-
          guage Learning - Volume 7, pages 19:1–19:8, Stroudsburg,
          2001 [cit. 2016-05-10]. Association for Computational Lin-
          guistics.
     [18] Miles OSBORNE. Using maximum entropy for sentence