S. Krajči (ed.): ITAT 2018 Proceedings, pp. 28–34 CEUR Workshop Proceedings Vol. 2203, ISSN 1613-0073, c 2018 Matej Gallo, Luboš Popelínský, and Karel Vaculík To text summarization by dynamic graph mining Matej Gallo, Luboš Popelínský, and Karel Vaculík KD Lab FI MU Brno popel@fi.muni.cz Abstract: We show that frequent patterns can contribute ensure that only significant rules are extracted, the al- to the quality of text summarization. Here we focus gorithm incorporates measuring support and confidence. on single-document extractive summarization in English. While support expresses what portion of the graph is af- Performance of the frequent patterns based model ob- fected by the rule, confidence measures the occurrence tained with DGRMiner yields the most relevant sentences frequency of a specific change, given that a particular pat- of all compared methods. Two out of three proposed meth- tern was observed. Time abstraction is an extension to the ods outperform other methods if compared on ROUGE DGRMiner that allows us to analyse broader class of dy- data. namic graphs. The abstraction lies in the use of signum function on relative timestamps. All the negative times- tamps become −1, all the positive timestamps become 1 1 Introduction and the timestamp of 0 remains 0. DGRMiner allows for two types of abstraction. One affects only timestamps of Extractive summarization assigns a score to each text vertices and should be used when most changes are caused unit (phrase, sentence, paragraph, passage) and then picks by edges and vertices remain static. The second one also the most informative ones to form a summary called ex- affects the edges and is useful when there are too few pat- tract. The selected sentences are used verbatim [1, 25]. terns with exact timestamps. Graph representation is a common way for represent- ing data in automatic text summarization tasks. We intro- duce a new method for single-document extractive sum- 3 Method marization in English language where a text is represented as a dynamic graph and each sentence corresponds to a dy- The initial idea was to take a text as a temporal (i.e. dy- namic graph snapshot, i.e. the graph in a partivular time. namic) graph where each sentence represent a graph snap- E.g. the first sentence is the oldest snapshot while the last shot at a particular time and tokens (a lemma together with sentence of a text is the newest one. This method is based a part-of-speech tag) were nodes and edges connected all on the principle of mining frequent patterns from such a pairs of tokens. The label of an edge was equal to a num- dynamic graph and using the resulting patterns as indica- ber of appearances of this pair of tokens. In the following tors of sentence importance. subsections we describe particular steps of the algorithm. The structure of this text is following. In Section 2. we introduce DGRMiner, a tool for mining in dynamic 3.1 Transforming Text to graphs graphs. Section 3. describes the algorithm. In Sections 4 we briefly introduce the data used for evaluation. Sec- For pre-processing we employed CoreNLP [15]. CoreNLP tions 5 and 6 explain a way of summarization evaluation provides a set of natural language analysis tools: POS and sentence scoring. Section 7 displays the main results tagger, named entity recognition, parser, coreference res- of text summarization by means of frequent patterns mined olution system, sentiment analysis, bootstrapped pattern from dynamic graphs. Related work is a contents of Sec- learning, and open information extraction. We used four tion 8. We conclude with concluding Section 9. annotators: sentence split, tokenize, lemma, and POS. Using the coreNLP package for R, we split the text into sentences. In every iteration we removed the stop words 2 DGRMiner from a sentence, lemmatize the remaining words and as- signed the POS tags. If a lemma-tag pair was not already We used DGRMiner tool to extract these patterns from in the graph, we added it in and created edges between all the dynamic graphs. DGRMiner was proposed in [26] words in a same sentence. Otherwise, we only notified the for mining frequent patterns that capture various changes DGRMiner that a particular word had appeared again. We in dynamic graphs in the form of predictive rules. The parametrized context c. Therefore, in each step we mod- found predictive graph rules express what way the graph elled at most c sentences. changes. The rules capture patterns such as addition, dele- We will show it on simple example that contains a single tion, and transformation of the subgraph. The use of rel- sentence. ative timestamps for rules allows for mining general pat- terns while including time information simultaneously. To The great thing about 4 Data Aspen is that it has at least one of each of the really useful stores that To assess the performance of our method we used the Blog you need, sellingstuff at regular summarization dataset [10, 11, 23]. It consists of 100 posts prices. annotated in XML that were randomly chosen from two blogs (half from each blog), Cosmic Variance and Inter- After removing stop words and labeling words with a net Explorer Blog. Each of the four human summariz- lemma-tag pair, we receive a graph t # 9. ers picked approximately 7 sentences to form 4 reference t # 9 summaries in total. We manually restored apostrophes for an 74 great#ADJ shortened forms of to be and to have verbs, and in posses- an 75 thing#NOUN sive nouns. Punctuation within sentences was omitted as an 11 Aspen#NOUN the coreNLP sentence split annotator often wrongly split an 76 least#ADJ the sentences in the middle. We decided not to use CNN an 77 one#NUM Dataset, nor SUMMAC dataset mentioned in [9] because an 78 really#ADV the provided reference summaries were not extracted, ver- an 79 useful#ADJ batim sentences. This would require us to manually find an 66 store#NOUN the best matching sentence from within the document. an 5 need#VERB ae u 48 5 11 [need#VERB]-[Aspen#NOUN] ae u 432 5 66 [need#VERB]-[store#NOUN] 5 Semi-automatic evaluation ae u 433 5 67 [need#VERB]-[sell#VERB] ae u 434 5 74 [need#VERB]-[great#ADJ] Evaluating summaries on sentence level can be done semi- ae u 435 5 75 [need#VERB]-[thing#NOUN] automatically by measuring content overlap with preci- ae u 436 5 76 [need#VERB]-[least#ADJ] sion, recall, and F1 measure. An extracted sentence is ae u 437 5 77 [need#VERB]-[one#NUM] considered acceptable if the same sentence was extracted in a reference summary. This process cannot be fully In the first column, an, ae stand for "add a node" and automatized because reference summaries are created by "add an edge" respectively. human judges. Other semi-automatic evaluation methods used nowadays are: ROUGE, P YRAMID and BASIC E LE - MENTS . We will discuss only the ROUGE method [25, 14] 3.2 Pattern Mining as it was used in this work. Then we applied the DGRMiner on the data from previ- ous step to obtain predictive patterns, i.e. rules that de- 5.1 ROUGE scribe frequent (or rare) changes between past and future graphs – sentences. We received two types of patterns: The ROUGE (Recall-Oriented Understudy for Gisting frequent single-vertex patterns corresponding to nodes and Evaluation) measure is based on the B LEU metrics used rare patterns corresponding to edges. We modified the sup- in machine translation tasks. The idea is to compare the port parameter to change the threshold on frequency of ob- differences between the distribution of words in the can- served patterns. When the support parameter was set too didate summary and the distribution of words in the ref- low, the DGRMiner considered a pattern anything that ap- erence summaries. Given h reference summaries and a peared at least once in the text – every word, every possible candidate summary they are split into n-grams to calculate word combination. By setting the confidence parameter the intersection of n-grams between the references and the in DGRMiner to 0, the DGRMiner assigned confidence candidate. This process is illustrated in figure 1. (as discussed in section 2) to each extracted pattern. We Given its correlation with manual judgments ROUGE also played with time abstraction which allowed us to ig- almost seems to have become a standard. [13] reports nore the preset context c as discussed in section 2. There- Pearson coefficient for the most commonly used variations fore, patterns were observed in sentences that were more (ROUGE -2 and ROUGE-SU4 [14]) at a value of between than c − 2 sentences apart. Although this gave us more 0.94 and 0.99. Generally, the ROUGE-n is calculated from patterns, many of them were uninformative. An example the co-occurrences of n-grams between the candidate and of observed single-vertex pattern is "+supplies#NOUN". reference summaries as shown by formula 1 in [25]: This pattern indicates that the noun supplies appeared at least n times in the text, where the n is determined by the support parameter. An example of multi-vertex pattern is ∑n-grams ∈ {Sumcan ∩ Sumref } [store#NOUN]-[sell#VERB], which tells us that the noun ROUGE -n = (1) ∑n-grams ∈ Sumref store is frequently closely accompanied by the verb sell. The proximity of these two words is given by context c where the numerator is the maximum number of co- and the frequency is given by the support parameter. occurrences of n-grams in both reference and candidate 30 Matej Gallo, Luboš Popelínský, and Karel Vaculík The frequency method (FRQ) as given by formula 5, is the simplest of the three. For a sentence s and a set of patterns P the score is calculated as weighted average. ScoreF (s) = ∑ c(p)w(p) (5) p∈P where c(p) is the frequency of the pattern in the sentence and w(p) is its associated weight. The density method (DEN) as given by formula 6, counts the patterns in a sentence and normalizes by the length of the sentence. This method is parameter-free. Figure 1: The basic idea of ROUGE for evaluation of sum- maries [25] | S∩P | ScoreD (s) = (6) length(s) summary and the denominator is the total sum of the num- For multi-vertex patterns we tested frequency and den- ber of n-grams present in the reference summaries. sity methods. The only difference between the methods is The ROUGE-SUγ is an adaptation of ROUGE-2 using that instead of length of sentences we use the number of  skip units (SU) of a size ≤ γ. SU4 considers the bigrams all possible word pair combinations |S|2 . and the bigrams SU to be arbitrary in a maximum window We built the summary in a greedy fashion. In every it- length of γ = 4 words. The number of bigrams with an eration we picked the highest scoring sentence. Patterns arbitrary size of length γ is given by formula 2: observed in the sentence were penalized by a parameter   k−γ λ . The scores were recomputed and the process continued n until a desired number of sentences was picked or until all Count(k, n) = C − ∑ (k − γ); γ ≤ 1 (2) k 0 the sentences were used. For every method we discovered the optimal value of λ . We tuned the parameter on 10% of where n is the n-gram length and k is the sentence length the entire dataset. The optimal values for λ are presented in words. in the tables 2 and 1. The ordering of sentences in the fi- The ROUGE metrics are not flawless. Firstly, they have nal summary maintains the relative ordering in the original a problem with the representation of content. Secondly, document. they will not consider chains of words such as "MUNI" 6= "Masaryk University" and "FI" 6=" Faculty of Informat- ics". A study [22] found that the system can be tricked into 7 Results generating a summary with high ROUGE score. We evaluated the model using the ROUGE-1 metric, which is recommended for short summaries [14]. Every 6 Sentence Scoring blog post was compared against four reference summaries as described in chapter 4. The results of all three methods In this step we used the observed patterns as indicators to can be seen in tables 1, 2 and 3. The first three columns de- score the sentences. The final score was obtained accord- note whether time abstraction on both vertices and edges ing to formula 3 as a sum of single-vertex and multi-vertex is used, whether the patterns were used to score sentences, scores: and whether the initial weights were determined by the confidence of DGRMiner for the given pattern, respec- Score(s) = Scoresingle (s) + Scoremulti (s) (3) tively. The last three columns correspond to the ROUGE-1 metrics – precision (what portion of sentences we selected Three different metrics were implemented to calculate were part of the reference summary), recall (what portion single-vertex score. of sentences in reference summary we extracted), and f1 The Jaccard coefficient (JAC) as given by formula 4, measure (harmonic mean of precision and recall). expresses the overlap of two sets. In our case, between the The FRQ method marked the most accurate sentences set of patterns P and the set of words in the sentence S. among all the presented algorithms as can be seen from figure 2. JAC method picked fewer correct sentences than wsum (S ∩ P) ScoreJ (s) = (4) FRQ method but still more than any traditional approach. wsum (S ∪ P) Both FRQ and DEN methods ranked first in terms of pre- where wsum assigns weight to every element of the set and cision. then sums over them. A question arises regarding what The frequency-based method incorporating patterns weight we should assign to non-indicators. Empirically, with no confidence weighting (identified as FRQ) achieved we received the best results for the value of 0.001. the highest recall, the highest f1 score, and the highest To text summarization by dynamic graph mining 31 binall patterns weights precision recall f1 score F F F 0.643 0.694 0.664 F T F 0.643 0.686 0.659 F T T 0.641 0.684 0.657 T F F 0.641 0.683 0.657 F F T 0.640 0.691 0.660 T T F 0.640 0.678 0.654 T T T 0.640 0.678 0.654 T F T 0.639 0.681 0.655 Table 1: Results for blog summarization dataset using jaccard method with parameters λ = 0.55, w0 = 0.001 binall patterns weights precision recall f1 score F T F 0.645 0.702 0.668 F F T 0.645 0.700 0.667 F F F 0.644 0.701 0.665 F T T 0.643 0.693 0.662 T F F 0.642 0.691 0.661 T T F 0.642 0.691 0.661 T F T 0.642 0.690 0.660 T T T 0.642 0.688 0.660 Table 2: Results for blog summarization dataset using frequency method with parameter λ = 0.20 binall patterns precision recall f1 score F T 0.651 0.514 0.562 F F 0.650 0.514 0.562 T T 0.649 0.516 0.563 T F 0.649 0.512 0.562 Table 3: Results for blog summarization dataset using density method Figure 2: Comparison of number of correctly chosen sentences – using blog summarization dataset (our algorithms are displayed yellow) 32 Matej Gallo, Luboš Popelínský, and Karel Vaculík number of correctly chosen sentences. The highest pre- Markov model allows modeling these dependencies. Con- cision was achieved by density-based method incorporat- roy and O’leary [6] use a joint distribution for the features ing patterns with no confidence weighting (identified as set, unlike the independence-of-features assumption used DEN). We chose these two models and the highest scoring in naïve Bayesian methods. The HMM was trained us- Jaccard method for comparison together with algorithms ing five features: position of the sentence in the document presented in article [9]. We could see that FRQ behaves (number of states); number of terms in a sentence, and similarly to Word frequency (WF) algorithm proposed in likeliness of the sentence terms given the document terms. the paper. This is not surprising, because the single vertex- Summarizer NetSum presented by Svore et al. [24] uses frequent patterns correspond to words that appear at least an artificial neural network (ANN) called RankNet to rank n-times in the text. Where n is determined by the support the sentences. RankNet is a pair-based neural network al- parameter in the DGRMiner tool as described in section gorithm for ranking a set of inputs. It is trained on pairs 3.2. The multi-vertex patterns improved the performance of sentences (Si , S j ), such that the ROUGE score for Si of the summarizer in density models and in one case (the should be higher than S j . Pairs are only generated in a highest scoring model) in frequency model. single document, not across documents. The cost function for RankNet is the probabilistic cross-entropy cost func- tion. Training is performed using a modified version of 8 Related work back-propagation algorithm for two-layer networks, which is based on optimizing the cost function by gradient de- Kupiec et al. introduce in their work [12] method inspired scent. The system significantly outperforms the standard by Edmundson’s [7]. They approached the summarization baseline in the ROUGE-1 measure. No past system could as a statistical classification problem. A Bayes classifier outperform the baseline with statistical significance. was trained to estimate the probability that a given sen- A system for generating product category-based (topic- tence would be included in the summary. They used 6 based) extractive summarization was proposed by [4, 5]. discrete features (presented in order of importance): para- The collection of 45 news items corresponding to various graph feature (the position of sentence in paragraph s), products are pre-processed using standard techniques: to- fixed-phrase feature (the sentence contains a phrase from kenization, stopword removal, stemming. The final corpus a list), sentence length cutoff feature (threshold u1 = 5, contains around 1500 features and is represented by a bag- length(s) > u1 ), thematic word feature (the presence of of-words VSM based on these features. To identify the thematic terms), uppercase word feature (the presence of topics, the news items about specific categories of prod- words in capital letters). The best results were obtained ucts are segregated into separate clusters using K-Means using the first three features. and then an extractive summary is generated from each of Aone et al. [2] built on Kupiec’s work [12] and ex- these topical clusters. The K number of cluster is deter- panded the feature set of their system, called D IM S UM, mined by a Self-Organizing Map (SOM). with signature terms, which indicate key concepts for a Chakraborti and Dey [5] assigned a score to the en- given document. Another advantage over [12]’s system tire summary as a single unit. The total summary score is the use of multi-word phrases – statistically derived (TSS) is taken as a combination of cosine similarity be- collocation phrases (e.g. "omnibus bill", "crime bill", tween centroid of corpus and the summary as a whole; "brady bill") and associated words ("camera" and "ob- relative length of the summary; and redundancy penalty. scura", "Columbia River" and "gorge"), as well as the use To maximize TSS constrained by the number of lines in of WordNet [16] to identify possible synonyms of found summary (τ = 5 − 7%), they opted for quick Artificial Bee signature terms. He applied a shallow discourse analysis to Colony optimization, a global optimization technique, for resolve co-references and maintain cohesion – only name sentence selection. The summary with the highest score is aliases were resolved such as UK to United Kingdom. then chosen. Osborne in his work [18] disagrees with the traditional In Muresan’s paper [17], a system called GIST-IT used assumption of feature independence and shows empiri- for email-summarization task is discussed. First, noun cally that the maximum entropy (MaxEnt) model produces phrases (NPs) are extracted as they carry the most content- better extracts than the naïve Bayes model with similarly ful information. Subsequently, machine learning is used to optimized prior appended to both models. Unlike naïve select the most salient NPs. A set of nine features, divided Bayes, MaxEnt does not make unnecessary feature inde- into three categories (head of the NP, whole NP, combina- pendence assumptions. Let c be a binary label (binary: tion of head and modifiers of NP) were used: head of the part of summary or not), s the item we are interested in la- NP TF*IDF, position of first occurrence (focc) of the head beling, fi the i-th feature, and ωi the corresponding feature in text, TF*IDF of entire NP, focc of entire NP, length of weight. the NP in words, length of the NP in characters, position Hidden Markov Models (HMM), similar to MaxEnt, of the NP in the sentence, position of the NP in the para- have weaker assumptions of independence. There are graph, and combination of the TF*IDF scores of head of three types of dependencies: positional dependence, fea- the NP and its modifiers. ture dependence, and Markovity dependence. A first-order To find the most salient NPs, three machine learning al- To text summarization by dynamic graph mining 33 gorithms were applied: decision trees (axis-parallel trees – capturing the high-order relations between both sentences C4.5 and oblique trees – OC1), rule induction (production and words. [3] proposed a hypergraph-based model for rule – C4.5rules and propositional rules – RIPPER), and generic summarization based on [27]’s hypergraph-based decision forests (DFC using information gain ratio). model for query-focused summarization. The ranking Muresan claims that shallow linguistic filtering applied uses a semi-supervised approach to order sentences. They to NPs improved the classifiers accuracy [17]. The filter- model the words as vertices and sentences as hyperedges ing consisted of four steps: grouping inflectional variants, and then approach the problem as a random walk over hy- removing unimportant modifiers, filtering stopwords, and peredges. removing empty nouns. A modification of PageRank algorithm called LexRank is used to weight sentences. The undirected graph of sen- 9 Conclusion tences is constructed from symmetrical similarities (mod- We showed that frequent patterns can contribute to the ified cosine measure). The score of each vertex s is cal- quality of text summarization. The comparison supports culated iteratively until the values of the vertices have not our statement that the performance of our frequent patterns been modified by more than ε = 0.0001. LexRank algo- based model is comparable to the simpler word frequency rithm is used as a component of the M EAD [8]. method and yielded the most relevant sentences of all com- Unlike LexRank, TextRank uses the similarities of pared methods. Our methods outperformed other methods edges to weight the vertices. The score of each sentence si in precision but lacked in recall. We attribute the similarity is calculated iteratively until convergence is reached. to word frequency method to inadequate graph represen- As the graph is constructed from inter-sentence sim- tation – instead of interconnecting all the words within a ilarity measure, the choice of the method for sentence- sentence, suggest connecting them according to the parse weighting has significant impact. One approach is to use tree. Another consideration is to use a graph mining tool a bag-of-words to represent the sentences. The similarity that searches for more specific types of patterns than DGR- is obtained by calculating the cosine similarity weighted Miner. by inverse document frequency [1] between their vectorial representations. Another approach suggests using word overlap between sentences instead. The weak point of all Acknowledgments similarity measures that use words (cosine similarity, word overlap, longest common subsequence) is the dependency This work has been supported by Faculty of Informatics, on the lexicon of the document. The solution to this is Masaryk University. We would like to thank to ITAT re- its combining with similarity measure based on chains of viewers for their suggestions and comments. characters. Therefore, sentences that do not share a single word but contain a number of words that are close mor- References phologically can be compared [25]. Patil [19] proposed a new graph-based model called [1] Charu C. AGGARWAL and ChengXiang ZHAI. Mining S UM G RAPH. First the text is pre-processed (stemmed) Text Data. Springer, New York, 2012. and represented as a VSM with TF*IDF weights. He then [2] Chinatsu AONE, James GORLINSKY, and Bjornar computes pair-wise cosine similarities and subtracts the LARSEN. A trainable summarizer with knowledge ac- values from 1 to obtain dissimilarities. The resulting ma- quired from robust nlp techniques. Advances in Automatic trix of intra-sentence dissimilarities is then used to model Text Summarization, 71, 1999. the document as graph. The vertices represent the sen- [3] Abdelghani BELLAACHIA and Mohammed AL- tences and edges are weighted by intra-sentence dissimi- DHELAAN. Multi-document hyperedge-based ranking larities. for text summarization. In Proceedings of the 23rd ACM The novel idea is the use of link reduction technique International Conference on Conference on Information known as Pathfinder Network Scaling (PFnet) [21, 20] to and Knowledge Management, pages 1919–1922, New scale the graph. PFnet models the human aspects of se- York, 2014 [cit. 2016-05-10]. ACM. mantic memory. The centrality of a sentence and its po- [4] Swapnajit CHAKRABORTI. Multi-document text summa- sition in a document are used to compute the importance rization for competitor intelligence: A methodology based on topic identification and artificial bee colony optimiza- of sentences. Four different centrality measure were tested tion. In Proceedings of the 30th Annual ACM Symposium and closeness centrality showed to perform best. Finally, on Applied Computing, pages 1110–1111, New York, 2015 the sentences are ranked according to their importance and [cit. 2016-05-10]. ACM. first n highest-scoring sentences were picked. [5] Swapnajit CHAKRABORTI and Shubhamoy DEY. Prod- The approaches based on similarity graphs solely model uct news summarization for competitor intelligence using the similarity between pairs of sentences with no clear rep- topic identification and artificial bee colony optimization. resentation of word relations. Therefore, it is not clear In Proceedings of the 2015 Conference on Research in if they adequately cover all topical information. The Adaptive and Convergent Systems, pages 1–6, New York, hypergraph-based approach is to remedy this problem by 2015 [cit. 2016-05-10]. ACM. 34 Matej Gallo, Luboš Popelínský, and Karel Vaculík [6] John M. CONROY and Dianne P. O’LEARY. Text sum- extraction. In Proceedings of the ACL’02 Workshop on Au- marization via hidden markov models. In Proceedings of tomatic Summarization, Stroudsburg, 2002 [cit. 2016-05- the 24th Annual International ACM SIGIR Conference on 10]. Association for Computational Linguistics. Research and Development in Information Retrieval, pages [19] Kaustubh PATIL and Pavel BRAZDIL. Text summariza- 406–407, New York, 2001 [cit. 2016-05-10]. ACM. tion: Using centrality in the pathfinder network. Int. J. [7] Harold P. EDMUNDSON. New methods in automatic ex- Comput. Sci. Inform. Syst [online], 2:18–32, 2007 [cit. tracting. J. ACM [online], 16(2):264–285, april 1969 [cit. 2016-05-12]. 2016-05-10]. [20] Roger W. SCHVANEVELDT, editor. Pathfinder Associa- [8] Günes ERKAN and Dragomir R. RADEV. Lexrank: tive Networks: Studies in Knowledge Organization. Ablex Graph-based lexical centrality as salience in text summa- Publishing Corp., Norwood, 1990. rization. J. Artif. Int. Res. [online], 22(1):457–479, decem- [21] Roger W. SCHVANEVELDT, D.W. DEARHOLT, and F.T. ber 2004 [cit. 2016-05-11]. DURSO. Graph theoretic foundations of pathfinder net- [9] Rafael FERREIRA, Luciano CABRAL, and Rafael D. works. Computers & mathematics with applications [on- LINS. Assessing sentence scoring techniques for extrac- line], 15(4):337–345, 1988 [cit. 2016-05-11]. tive text summarization. Expert Systems with Applications [22] Jonas SJÖBERGH. Older versions of the rougeeval sum- [online], 40(14):5755 – 5764, october 2013 [cit. 2016-05- marization evaluation system were easier to fool. Inf. 10]. Process. Manage., 43(6):1500–1505, november 2007 [cit. [10] Meishan HU, Aixin SUN, and Ee-Peng LIM. Comments- 2016-05-12]. oriented blog summarization by sentence extraction. In [23] Dr. Aixin SUN. Comments- Proceedings of the Sixteenth ACM Conference on Confer- oriented document summarization, 2015. ence on Information and Knowledge Management, pages http://www.ntu.edu.sg/home/axsun/datasets.html, vis- 901–904, New York, 2007 [cit. 2016-05-10]. ACM. ited 2016-05-20. [11] Meishan HU, Aixin SUN, and Ee-Peng LIM. Comments- [24] Krysta M. SVORE, Lucy VANDERWENDE, and Christo- oriented document summarization: Understanding docu- pher J.C. BURGES. Enhancing single-document summa- ments with readers’ feedback. In Sung-Hyon MYAENG, rization by combining RankNet and third-party sources. Douglas W. OARD, Fabrizio SEBASTIANI, Tat-Seng In Proceedings of the 2007 Joint Conference on Empir- CHUA, and Mun-Kew LEONG, editors, Proceedings of the ical Methods in Natural Language Processing and Com- 31st Annual International ACM SIGIR Conference on Re- putational Natural Language Learning (EMNLP-CoNLL), search and Development in Information Retrieval, pages pages 448–457, Prague, 2007 [cit. 2016-05-10]. Associa- 291–298, New York, 2008 [cit. 2016-05-10]. ACM. tion for Computational Linguistics. [12] Julian KUPIEC, Jan PEDERSEN, and Francine CHEN. A [25] Juan-Manuel TORRES-MORENO. Automatic Text Sum- trainable document summarizer. In Proceedings of the 18th marization. ISTE, London, 2014. Annual International ACM SIGIR Conference on Research [26] Karel Vaculík and Lubomír Popelínský. Dgrminer: and Development in Information Retrieval, pages 68–73, Anomaly detection and explanation in dynamic graphs. In New York, 1995 [cit. 2016-05-10]. ACM. Knobbe-A. Soares C. Papapetrou P. Boström, H., editor, [13] Chin-Yew LIN. Rouge: a package for automatic eval- Advances in Intelligent Data Analysis XV - 15th Interna- uation of summaries. In Stan SZPAKOWICZ Marie- tional Symposium, IDA 2016, pages 308–319, Neuveden, Francine MOENS, editor, Workshop Text summarization 2016. Springer. Branches Out (ACL ’04) [online], pages 74–81, Barcelona, [27] Wei WANG, Furu WEI, Wenjie LI, and Sujian LI. Hy- 2004 [cit. 2016-05-10]. ACL. persum: Hypergraph based semi-supervised sentence rank- [14] Petr MACHOVEC. Automatická sumarizace textu [on- ing for query-oriented summarization. In Proceedings line]. Master’s thesis, Masaryk University, Faculty of In- of the 18th ACM Conference on Information and Knowl- formatics, Brno, 2015 [cit. 2016-05-10]. edge Management, pages 1855–1858, New York, 2009 [cit. [15] Christopher D. Manning, Mihai Surdeanu, John Bauer, 2016-05-10]. ACM. Jenny Finkel, Steven J. Bethard, and David McClosky. The Stanford CoreNLP natural language processing toolkit. In Association for Computational Linguistics (ACL) System Demonstrations, pages 55–60, 2014. [16] George A. MILLER. Wordnet: A lexical database for en- glish. Commun. ACM [online], 38(11):39–41, november 1995 [cit. 2016-05-12]. [17] Smaranda MURESAN, Evelyne TZOUKERMANN, and Judith L. KLAVANS. Combining linguistic and machine learning techniques for email summarization. In Proceed- ings of the 2001 Workshop on Computational Natural Lan- guage Learning - Volume 7, pages 19:1–19:8, Stroudsburg, 2001 [cit. 2016-05-10]. Association for Computational Lin- guistics. [18] Miles OSBORNE. Using maximum entropy for sentence