Towards Accessible Abstractive Text Summarization
Hacia los resúmenes abstractivos y accesibles

Tatiana Vodolazova
Departamento de Lenguajes y Sistemas Informáticos
Universidad de Alicante
tvodolazova@dlsi.ua.es

Abstract: In Natural Language Processing, text summarization and text simplification are the two areas that improve information access for the user. This PhD research project investigates the possibility of integrating text simplification into the abstractive text summarization framework for the purpose of adapting generated summaries to user language proficiency and cognitive ability.
Keywords: Language generation, abstractive text summarization, readability

Resumen: En el Procesamiento de Lenguaje Natural, el resumen y la simplificación de texto son dos áreas cuyo objetivo es mejorar el acceso del usuario a la información. Esta tesis doctoral investiga la posibilidad de integrar la simplificación del texto en el marco de generación de resúmenes abstractivos como componente esencial para adaptar los resúmenes generados a la competencia lingüística y cognitiva del usuario.
Palabras clave: Generación de lenguaje, resúmenes abstractivos, legibilidad

1 Motivation

The right to information is defined as a basic human right by UNESCO (https://en.unesco.org/themes/access-information), but in the age of data overload accessing the required information is not a straightforward task. The amount of data and its semantic and syntactic complexity require the development of automatic methods capable of representing information in both a compact and comprehensible way. Within the field of natural language processing (NLP), text summarization and text simplification are the two areas of text-to-text generation that can tackle this task.

Text simplification aims to transform complex text into a more comprehensible version while preserving its underlying meaning. Its main tasks include readability assessment, lexical simplification and syntactic simplification (Saggion, 2017). They encompass a broad range of techniques, from designing readability formulas to developing complex word substitution and sentence simplification algorithms.

The main goal of text summarization is to generate a shorter version of the original data while preserving its main concepts, cohesion and grammatical accuracy (Gupta and Gupta, 2018). Text summarization is classified into extractive and abstractive. Extractive approaches produce summaries through the selection and concatenation of original text segments. These approaches reveal a number of weaknesses, including:

• concatenation of non-adjacent text segments increases the risk of "dangling anaphora" (i.e. pronouns without referents or with incorrect ones) and of misleading temporal expressions (Steinberger et al., 2007; Smith, Danielsson, and Jönsson, 2012);
• a tendency to include lengthy sentences that, apart from the essential information, carry irrelevant text segments (McKeown et al., 2005);
• highly incoherent summaries that fail to convey the gist, especially when non-adjacent text segments are concatenated in documents with a high degree of polarized opinion (Cheung, 2008);
• information is represented exactly as in the original text; in the worst-case scenario, where essential knowledge is scattered across all text segments, the generated summary would contain all the original text segments.
These deficiencies hinder the extraction of the key concepts and, at the same time, they affect the readability of the generated summaries, making them less comprehensible.

In recent years, interest in text summarization has shifted towards abstractive approaches (Gupta and Gupta, 2018). Unlike extractive summarization, abstractive summarization methods aim to generate partially or completely novel text segments. Abstractive text summarization methods that involve natural language generation tools, such as sentence realizers, can generate sentences with resolved agreements (Genest and Lapalme, 2012). As input, a sentence realizer requires the base forms of words and a sentence structure expressed in terms of syntactic constituents. Control over sentence realization addresses the limitations faced by the extractive approach: anaphoric expressions and the relative importance of information are resolved at the representation level, while sentence length depends on the chosen sentence structure.
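To make this kind of realizer input concrete, the following is a minimal illustrative sketch, not the interface of any existing realizer nor of the system proposed here: a clause is specified through base word forms plus a few constituent-level features, and agreement and tense are resolved only at realization time. All names and the toy inflection rules are assumptions made for illustration.

```python
# Toy constituent-style input and "realizer": base forms go in, an inflected
# surface sentence comes out. Illustrative sketch only.
from dataclasses import dataclass

@dataclass
class Clause:
    subject: str                 # base form of the subject head, e.g. "committee"
    verb: str                    # base form of the verb, e.g. "approve"
    obj: str = ""                # optional object constituent
    tense: str = "present"       # hypothetical feature set: "present" or "past"
    plural_subject: bool = False

def realise(clause: Clause) -> str:
    """Naive surface realization: agreement and tense are resolved here, so the
    summarizer only decides *what* to say, not *how* to say it."""
    verb = clause.verb
    if clause.tense == "past":
        verb += "d" if verb.endswith("e") else "ed"
    elif not clause.plural_subject:
        verb += "s"                      # third person singular agreement
    parts = [f"the {clause.subject}", verb]
    if clause.obj:
        parts.append(f"the {clause.obj}")
    return " ".join(parts).capitalize() + "."

print(realise(Clause("committee", "approve", "proposal", tense="past")))
# -> The committee approved the proposal.
```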
Both text summarization and text simplification are designed to improve information accessibility, but from different perspectives: summarization reduces the text to its key concepts, while simplification makes it more comprehensible.

This PhD research project explores the possibility of integrating text simplification within the framework of abstractive text summarization in order to generate summaries adapted to user language proficiency, knowledge and cognitive ability. The proposal is designed around the hypothesis that the abstractive paradigm provides deep control over the summarization process, which in turn provides the flexibility required to incorporate simplification techniques of both a syntactic and a lexical nature. The focus is on examining, applying and analyzing the impact of different techniques and approaches in order to detect the most promising ones. Through their optimal combination, the objective is to develop an accessible abstractive text summarization approach.

One of the possible research scenarios focuses on second language (L2) learners, who would benefit from the proposed summarization approach. Since the introduction of the Common European Framework of Reference for Languages (CEFR) grading scale (Council of Europe, 2001), texts for L2 learners have been graded according to these proficiency guidelines. Such texts are written in a clear style and include the main communication concepts together with only those linguistic features that correspond to each proficiency level. They offer an ideal environment for experiments with readability assessment and text summarization.

2 Background and Related Work

Due to the exponential growth of textual data on the web, manual filtering and extraction of the necessary information is a tedious and time-consuming task. The first attempt to tackle this task occurred in the mid-twentieth century, when Luhn (1958) designed the first extractive approach to text summarization. Since then, this area of NLP has been extensively researched, exploiting a wide range of both extractive and abstractive techniques (Gupta and Gupta, 2018; Gambhir and Gupta, 2017).

Automatic text simplification, on the other hand, has become an established NLP field only recently. It was originally designed to address the problem of reduced literacy, but it has also been shown to benefit L2 learners, children, people with limited domain knowledge and people with cognitive difficulties, such as dyslexia or aphasia (Siddharthan, 2014). Text simplification involves a number of transformations, including sentence splitting, sentence deletion, insertion, reordering and substitution, among others (Saggion, 2017).

To the best of our knowledge there have been very few studies, all conducted by the same authors, that aim to generate accessible summaries through the integration of text simplification into the summarization process. These authors designed an extractive summarization approach based on a differential evolution algorithm (Nandhini and Balasundaram, 2014). Their method represents each sentence as a set of 4 informativeness features (sentence position, title similarity, etc.) and 5 readability features (word length, sentence length, etc.). Summarization is treated as an optimization problem that aims to maximize both the informativeness and the readability scores. However, their approach is based on extractive summarization techniques and does not involve any simplification; at the same time, their set of readability features is small.

3 Main Hypothesis and Objectives

This PhD research project explores the possibility of integrating text simplification into the framework of abstractive text summarization in order to generate summaries adapted to user language proficiency, domain knowledge and cognitive ability. It is based on the hypothesis that natural language generation plays a key role in this integration by providing access to, and manipulation of, both deep semantic and syntactic data structures. Evaluating this hypothesis requires research, analysis and development of summarization, natural language generation, simplification and readability assessment techniques, followed by their application to generate accessible abstractive summaries.

To achieve this goal, the following sub-objectives are proposed:

• to conduct exhaustive research in text summarization, language generation, text simplification and readability assessment, analyzing current approaches;
• to investigate, propose and analyze new approaches for these tasks and for the intelligent representation of the extracted information using techniques based on NLP, focusing on syntactic and semantic knowledge;
• to design the application of the proposed approach for automatic summary generation following an abstractive paradigm;
• to exhaustively evaluate both the proposed approach and the produced summaries, using intrinsic and extrinsic, quantitative and qualitative techniques;
• to analyze possible extensions of the proposed approach to other languages and different user profiles; and,
• to draw conclusions and outline the benefits of this research together with a proposal for future work.
4 Methodology and the proposed experiments

Since this research encompasses a number of NLP areas, designing an accurate set of experiments requires a clear understanding of where and how these areas interact. For this purpose we need to identify the stages of the process and define the workflow direction. The approach proposed by this PhD research project consists of 6 main stages, namely information extraction, storage, scoring, text planning, adaptation and text generation. Each of these stages poses a set of corresponding questions, including the following:

1. What features should be identified during information extraction?
2. How should the extracted information be stored?
3. How should each piece of coherent information be ranked?
4. How should the selected pieces of information be combined?
5. How should readability be assessed, and what kind of simplification techniques should be used?
6. How should text be generated from the selected information?

Though defined as separate issues, all of these questions are interrelated and cannot be handled in a strictly linear order. For example, depending on the required level of simplification, a different piece of duplicated information may be selected during the text planning stage on the basis of its readability level.
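A minimal sketch of how these six stages could be wired together is given below; every function and type name is a hypothetical placeholder used for illustration, not part of an existing implementation.

```python
# Hypothetical skeleton of the six-stage workflow (extraction, storage, scoring,
# text planning, adaptation, text generation). Stage bodies are left as stubs
# on purpose: each one corresponds to an open question listed above.
from typing import Any, Dict, List

Document = str
InIt = Dict[str, Any]          # an "information item": one piece of coherent information
UserProfile = Dict[str, Any]   # e.g. target CEFR level, domain knowledge

def extract_information(doc: Document) -> List[InIt]: ...                  # Q1: which features?
def store(items: List[InIt]) -> List[InIt]: ...                            # Q2: how to store them?
def score(items: List[InIt], profile: UserProfile) -> List[InIt]: ...      # Q3: how to rank?
def plan_text(items: List[InIt], profile: UserProfile) -> List[InIt]: ...  # Q4: how to combine?
def adapt(items: List[InIt], profile: UserProfile) -> List[InIt]: ...      # Q5: readability/simplification
def generate_text(items: List[InIt]) -> str: ...                           # Q6: surface realization

def summarize(doc: Document, profile: UserProfile) -> str:
    # The stages are interdependent rather than strictly linear: the planner,
    # for instance, may prefer a more readable duplicate of an information item.
    items = store(extract_information(doc))
    return generate_text(adapt(plan_text(score(items, profile), profile), profile))
```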
4.1 Relevant Features

The first set of experiments aims to identify the key features that need to be extracted from the raw text in order to benefit both the summarization and the simplification process. In our initial research we analyzed whether semantic information such as word senses, anaphora resolution and textual entailment improves the informativeness of extractive summaries (Vodolazova et al., 2012). The experiments showed that the combination of the 3 techniques outperforms the baseline and some existing summarization systems and, at the same time, benefits the summarization process more than each technique applied individually.

This analysis was followed by a closely related experiment that considered whether every type of text benefits equally from these techniques. The experimental setup involved evaluating certain linguistic properties of the original text related to anaphora resolution and textual entailment, such as the proper noun, pronoun and noun ratios, and how they affect the informativeness of extractive summaries (Vodolazova et al., 2013a). As expected, the results showed that high ratios of at least 2 of these linguistic properties introduce a degree of ambiguity that the available tools could not handle. This decrease in the quality of the generated summaries emphasized the need for an additional text analysis stage that would help to identify the most favourable summarization technique depending on the linguistic properties of the original text.

The informativeness of the generated summaries is not the only goal of our approach. While semantic information, under certain conditions, benefits informativeness, it may not necessarily benefit readability. An additional experiment studied how the same semantic techniques, within the framework of extractive summarization, affected the readability of the generated summaries (Lloret et al., 2019). It was shown that, depending on the chosen readability metric, the evaluation of informativeness versus readability can produce conflicting results. Out of the 8 readability metrics tested, the extractive summarization approach that combines word sense disambiguation, anaphora resolution and textual entailment scored best on only 3 of them. At the same time, the summarization approach that is based on anaphora resolution, and that delivered the worst ROUGE (Lin, 2004) results, scored best on the other 3 readability metrics.
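The text-analysis stage motivated above could start from very simple surface statistics. The sketch below computes the noun, pronoun and proper-noun ratios from POS-tagged tokens and flags texts where the semantic tools are likely to struggle; the tag names (Universal-POS style) and the threshold are illustrative assumptions, not values established by these experiments.

```python
# Illustrative sketch only: estimate the linguistic-property ratios discussed in
# Section 4.1 and use them to decide whether anaphora resolution and textual
# entailment are likely to help or to introduce noise.
from collections import Counter
from typing import List, Tuple

def linguistic_ratios(tagged_tokens: List[Tuple[str, str]]) -> dict:
    """tagged_tokens: (token, coarse POS tag) pairs, e.g. ("Alice", "PROPN")."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = max(len(tagged_tokens), 1)
    return {"proper_noun_ratio": counts["PROPN"] / total,
            "pronoun_ratio": counts["PRON"] / total,
            "noun_ratio": counts["NOUN"] / total}

def semantic_tools_advisable(ratios: dict, high: float = 0.15) -> bool:
    # Heuristic: if at least two of the ratios are unusually high for the corpus,
    # the anaphora/entailment tools tend to misfire, so a purely statistical
    # summarizer may be the safer choice for this text.
    return sum(r > high for r in ratios.values()) < 2

tags = [("Alice", "PROPN"), ("met", "VERB"), ("Bob", "PROPN"),
        ("and", "CCONJ"), ("she", "PRON"), ("liked", "VERB"), ("him", "PRON")]
print(linguistic_ratios(tags), semantic_tools_advisable(linguistic_ratios(tags)))
```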
4.2 Information Representation

Both simplification and abstractive summarization methods require a deep analysis of the original data in order to extract its semantic and syntactic information. This information can then be manipulated to generate adapted summaries while maintaining the original meaning and correct grammar. In our initial research within the framework of extractive summarization we experimented with a simplified abstract data representation in the form of a bag of enriched words (Vodolazova et al., 2013b). Each word was an instance of either a function word or a content word, with the latter carrying information about its word sense, concept frequency, part of speech and other properties.

However, for a fully abstractive summarization approach that uses a sentence realizer for text generation, this representation lacks information about semantic roles, voice, etc. This information will also be required during the readability adjustment stage in order to, for example, convert passive constructions into active ones for the purpose of syntactic simplification. In our first approximation of an abstractive method we therefore designed an abstract representation based on the concept of subject-verb-object triplets. We adapted the terminology proposed by Genest and Lapalme (2011) and refer to them as information items (InIts). Each InIt represents a piece of coherent information. This representation was used to generate ultra-concise summaries within an abstractive summarization method that obtained better results in terms of informativeness than other summarization approaches. The next step would include the readability evaluation of summaries generated from this abstract representation.

4.3 Scoring

The scoring stage may be considered a component of the actual summarization process rather than of the preprocessing stage described so far. Its aim is to rank the informativeness of InIts with respect to the selected set of features. Our experiments with extractive summarization methods showed that scoring based on concept frequency, with resolved anaphoric relations and disambiguated word senses, improves the informativeness of summaries (Vodolazova et al., 2012). Similarly, a different experiment with an abstractive summarization approach that scores InIts on subject-verb-object and named entity frequencies was shown to outperform other summarization systems (Lloret et al., 2015).

We plan to design the next set of experiments for the scoring stage around the combination of all the features that we have tested. Assigning weights to different features according to their impact on informativeness and readability may also benefit the summarization process. Another extension of the scoring stage may involve the integration of readability, either as a separate feature or, following the example of Nandhini and Balasundaram (2014), by combining it with the informativeness features in a composite score.
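As a concrete illustration of the two previous subsections, the sketch below pairs a possible InIt structure with a frequency-based scorer; the field names and the scoring formula are assumptions made for this example rather than the project's actual schema or system.

```python
# Illustrative sketch: an information item (InIt) as an enriched subject-verb-object
# triplet (Section 4.2) plus a concept/named-entity frequency scorer (Section 4.3).
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class InIt:
    subject: str                    # base form of the subject head
    verb: str                       # base form of the main verb
    obj: Optional[str] = None       # base form of the object head, if any
    voice: str = "active"           # kept so passives can later be rewritten as actives
    tense: str = "present"
    entities: Tuple[str, ...] = ()  # named entities mentioned in the triplet

    def concepts(self) -> List[str]:
        parts = [self.subject, self.verb, self.obj, *self.entities]
        return list(dict.fromkeys(c for c in parts if c))   # deduplicated concepts

def score_inits(inits: List[InIt]) -> List[float]:
    """Score each InIt by the document-level frequency of the concepts and named
    entities it mentions: items about recurring concepts rank higher."""
    freq = Counter(c for it in inits for c in it.concepts())
    return [float(sum(freq[c] for c in it.concepts())) for it in inits]

items = [InIt("council", "approve", "budget", tense="past", entities=("council",)),
         InIt("council", "delay", "vote", entities=("council",)),
         InIt("weather", "stay", "sunny")]
print(score_inits(items))   # the two council-related items outrank the third
```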
4.4 Text Planning

Once the InIts have been scored, it may be sufficient to generate a summary by selecting the top-ranked InIts individually and converting each one to text until the required summary size has been reached. In this case, the present stage may be omitted. However, mere scoring may be insufficient to produce well-formed summaries. The text planning stage therefore raises a number of challenges that require additional experiments, such as:

• Redundancy detection is the most evident one. Redundancy may be present both in the form of identical (or semantically very closely related) InIts and in the form of repeated subjects in adjacent sentences;
• Context information, namely, whether the method should select only the highest ranked InIts or whether it should also include the InIts that precede the highest ranked ones in the original text; and,
• Compression rate considerations, whereby for higher compression rates (i.e. shorter summaries) it may be beneficial to include more InIts by reducing noun phrases to their head nouns.

4.5 Readability Assessment

Once the simplification requirement is provided, the readability of each InIt needs to be assessed. This may involve the conversion of passive constructions into active ones, or the substitution of long and infrequent words with their shorter and more frequent counterparts. The exact simplification techniques involved at this stage will be governed by the simplification requirements. The readability metrics that we used in our initial research are of a general nature and all belong to the same family of superficial, length-based metrics (Lloret et al., 2019); they reflect neither syntactic nor lexical complexity. Future experiments in the field of readability assessment will include in-depth research on readability metrics and the corresponding simplification techniques for each of the possible target groups, including (but not limited to) L2 learners.
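To make the adaptation step more tangible, the sketch below combines a superficial length-based readability check with a simple lexical substitution pass; the word list, the threshold and the frequency source are placeholders, and a real system would rely on graded lexical resources (e.g. CEFR-levelled word lists) and richer readability metrics.

```python
# Illustrative sketch only: a crude length-based readability proxy plus lexical
# substitution of "hard" words with shorter, more frequent counterparts.
from typing import Dict, List

SIMPLE_SYNONYMS: Dict[str, str] = {   # hypothetical complex-word -> simple-word lexicon
    "commence": "start", "utilize": "use", "approximately": "about"}

def mean_word_length(tokens: List[str]) -> float:
    return sum(len(t) for t in tokens) / max(len(tokens), 1)

def simplify_lexically(tokens: List[str], max_mean_length: float = 5.0) -> List[str]:
    """If the sentence looks too hard for the target reader according to the
    length-based proxy, replace known complex words with simpler ones."""
    if mean_word_length(tokens) <= max_mean_length:
        return tokens
    return [SIMPLE_SYNONYMS.get(t.lower(), t) for t in tokens]

print(simplify_lexically(["We", "will", "commence", "the", "meeting",
                          "at", "approximately", "noon"]))
# -> ['We', 'will', 'start', 'the', 'meeting', 'at', 'about', 'noon']
```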
4.6 Text Generation

The final stage of our proposal uses a text realizer to generate sentences from the selected InIts. To evaluate the quality of the generated sentences we conducted some initial experiments with our first approximation to abstractive summarization. The results showed a decrease in the informativeness of the generated summaries when compared to the original sentences (Lloret et al., 2015). However, this method still outperformed some other summarization approaches. We will repeat this experiment once all the aforementioned stages are fully developed and integrated with redundancy detection, anaphora resolution and the other open issues mentioned previously.

5 Issues to Discuss

This paper describes a research proposal that focuses on examining how text summarization and text simplification can be combined in order to make information accessible through the adaptation of summaries to users with different language proficiency levels and cognitive abilities. The outlined approach raises the following issues for discussion:

• A general structure of the approach, describing the stages involved and the workflow, has been defined. Each stage will trigger a series of experiments in order to analyze and determine its most promising implementation. However, is this the optimal combination of stages? Does each stage meet its objective, or should it integrate additional functions?
• If we had to choose between syntactic and lexical simplification of summaries, which would be the most appropriate for the L2 learner target group?
• Which semantic readability metrics would be the most representative of the target group, and what source can be used to gauge their distribution across the different levels?

Acknowledgements

This research work has been partially funded by the University of Alicante (Spain), the Generalitat Valenciana and the Spanish Government through the projects SIIA (PROMETEU/2018/089), LIVING-LANG (RTI2018-094653-B-C22), INTEGER (RTI2018-094649-B-I00) and Red iGLN (TIN2017-90773-REDT).

References

Cheung, J. C. 2008. Comparing abstractive and extractive summarization of evaluative text: controversiality and content selection. Ph.D. thesis, Department of Computer Science, Faculty of Science, University of British Columbia.

Council of Europe. 2001. The Common European Framework of Reference for Languages. Cambridge University Press.

Gambhir, M. and V. Gupta. 2017. Recent automatic text summarization techniques: A survey. Artificial Intelligence Review, 47(1):1–66, January.

Genest, P.-E. and G. Lapalme. 2011. Framework for abstractive summarization using text-to-text generation. In Proceedings of the Workshop on Monolingual Text-To-Text Generation, pages 64–73, Portland, Oregon, June. Association for Computational Linguistics.

Genest, P.-E. and G. Lapalme. 2012. Fully abstractive approach to guided summarization. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers, volume 2 of ACL '12, pages 354–358, Stroudsburg, PA, USA. Association for Computational Linguistics.

Gupta, S. and S. Gupta. 2018. Abstractive summarization: An overview of the state of the art. Expert Systems with Applications, 121:49–65.

Lin, C.-Y. 2004. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out: Proceedings of the ACL-04 Workshop, pages 74–81, Barcelona, Spain, July. Association for Computational Linguistics.

Lloret, E., E. Boldrini, T. Vodolazova, P. Martínez-Barco, R. Muñoz, and M. Palomar. 2015. A novel concept-level approach for ultra-concise opinion summarization. Expert Systems with Applications, 42(20):7148–7156.

Lloret, E., T. Vodolazova, P. Moreda, R. Muñoz, and M. Palomar. 2019. Are better summaries also easier to understand? Analyzing text complexity in automatic summarization. In M. Litvak and N. Vanetik, editors, Multilingual Text Analysis: Challenges, Models, and Approaches, pages 337–369. World Scientific, New Jersey.

Luhn, H. P. 1958. The automatic creation of literature abstracts. IBM Journal of Research and Development, 2(2):159–165, April.

McKeown, K., J. Hirschberg, M. Galley, and S. Maskey. 2005. From text to speech summarization. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'05), volume 5, pages 997–1000. IEEE, March.

Nandhini, K. and S. R. Balasundaram. 2014. Extracting easy to understand summary using differential evolution algorithm. Swarm and Evolutionary Computation, 16:19–27.

Saggion, H. 2017. Automatic text simplification. Synthesis Lectures on Human Language Technologies, 10(1):1–137.

Siddharthan, A. 2014. A survey of research on text simplification. ITL - International Journal of Applied Linguistics, 165(2):259–298.

Smith, C., H. Danielsson, and A. Jönsson. 2012. A more cohesive summarizer. In Proceedings of COLING 2012: Posters, pages 1161–1170.

Steinberger, J., M. Poesio, M. A. Kabadjov, and K. Ježek. 2007. Two uses of anaphora resolution in summarization. Information Processing & Management, 43(6):1663–1680, November.

Vodolazova, T., E. Lloret, R. Muñoz, and M. Palomar. 2012. A comparative study of the impact of statistical and semantic features in the framework of extractive text summarization. In Text, Speech and Dialogue - 15th International Conference, TSD 2012, Brno, Czech Republic, pages 306–313.

Vodolazova, T., E. Lloret, R. Muñoz, and M. Palomar. 2013a. Extractive text summarization: Can we use the same techniques for any text? In E. Métais, F. Meziane, M. Saraee, V. Sugumaran, and S. Vadera, editors, Natural Language Processing and Information Systems, pages 164–175, Berlin, Heidelberg. Springer Berlin Heidelberg.

Vodolazova, T., E. Lloret, R. Muñoz, and M. Palomar. 2013b. The role of statistical and semantic features in single-document extractive summarization. Artificial Intelligence Research, 2(3):35–44.