<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Automatic Generation of Research Highlights from Scientific Abstracts</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Tohida</forename><surname>Rehman</surname></persName>
							<email>tohida.rehman@gmail.com</email>
							<affiliation key="aff0">
								<orgName type="institution">Jadavpur University Kolkata</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Debarshi</forename><forename type="middle">Kumar</forename><surname>Sanyal</surname></persName>
							<email>debarshisanyal@gmail.com</email>
							<affiliation key="aff1">
								<orgName type="institution">Indian Association for the Cultivation of Science Kolkata</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Samiran</forename><surname>Chattopadhyay</surname></persName>
							<email>samirancju@gmail.com</email>
							<affiliation key="aff2">
								<orgName type="laboratory">TCG CREST</orgName>
								<orgName type="institution">Jadavpur University Kolkata</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Plaban</forename><forename type="middle">Kumar</forename><surname>Bhowmick</surname></persName>
							<email>plaban@cet.iitkgp.ac.in</email>
							<affiliation key="aff3">
								<orgName type="institution">IIT Kharagpur</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Partha</forename><forename type="middle">Pratim</forename><surname>Das</surname></persName>
							<affiliation key="aff4">
								<orgName type="institution">IIT Kharagpur</orgName>
								<address>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Automatic Generation of Research Highlights from Scientific Abstracts</title>
					</analytic>
					<monogr>
						<imprint>
							<date/>
						</imprint>
					</monogr>
					<idno type="MD5">B90CCB098B5502D8099872C403997D55</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-24T15:20+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Information systems → Information extraction</term>
					<term>Summarization</term>
					<term>Pointer-generator network</term>
					<term>Deep learning</term>
					<term>Natural language generation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The rapid growth in scientific publications makes it difficult for researchers to keep track of new research, even in narrow sub-fields. While an abstract is the traditional way to present a high-level view of a paper, it is now increasingly supplemented with research highlights that explicitly identify the important findings of the paper. In this poster, we aim to automatically construct research highlights given the abstract of a paper. We use deep neural network-based models for this purpose and achieve high ROUGE and METEOR scores on a large corpus of computer science papers.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1">INTRODUCTION</head><p>The count of scientific publications doubles roughly every 9 years <ref type="bibr" target="#b9">[10]</ref>, making it hard for researchers to track even their own fields. One recent trend is to provide research highlights (a bulleted list of the main contributions of the paper) along with the abstract and the main text. They are potentially easier to read than abstracts, especially on mobile devices, and focus more on findings than on background. Additionally, research highlights could be useful for other tasks such as finding surrogates for access-restricted papers <ref type="bibr" target="#b4">[5,</ref><ref type="bibr" target="#b6">7]</ref> and keyphrase extraction <ref type="bibr" target="#b5">[6]</ref>. We use a pointer-generator network with a coverage mechanism to automatically generate highlights given the abstract of a research paper. Unlike prior work <ref type="bibr" target="#b1">[2]</ref> that classifies sentences in the full text as highlights or not, our focus is on the generation of highlights.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2">METHODOLOGY</head><p>We use a dataset released by Collins et al. <ref type="bibr" target="#b1">[2]</ref> containing URLs of 10142 computer science publications from ScienceDirect<note place="foot" n="1">https://www.sciencedirect.com/</note>. Each example in the dataset is organized as a pair (abstract, author-written research highlights): 8115 pairs are used for training, 1014 pairs for validation and 1013 pairs for testing. In this dataset, the average abstract length is 186 words while that of the highlights is 52; for 98% of the papers, the abstract is at least 1.5 times as long as the highlights.</p><p>We have used three deep learning-based models to generate research highlights. Model 1 is the sequence-to-sequence (seq2seq) model with attention <ref type="bibr" target="#b2">[3]</ref>. Each abstract is tokenized and the tokens are converted to 128-dimensional GloVe vectors <ref type="bibr" target="#b3">[4]</ref> that are sequentially fed into the encoder, which is a single-layer bidirectional Long Short-Term Memory (BiLSTM) network. The decoder is a single-layer unidirectional LSTM. The model uses neural attention <ref type="bibr" target="#b0">[1]</ref> to attend to the words in the source document while generating the target words for the summary. Model 2 is a pointer-generator network <ref type="bibr" target="#b7">[8]</ref>, which augments the above seq2seq model with a copying mechanism. When generating each word, the decoder probabilistically decides between generating a new word from the vocabulary (i.e., from the training corpus) and copying a word from the input abstract (by sampling from the attention distribution). While the generator enables novel paraphrasing, copying helps to handle out-of-vocabulary (OOV) words. Model 3 augments the second model with the coverage mechanism of Tu et al. <ref type="bibr" target="#b8">[9]</ref> to avoid erroneously repeating the same words during decoding. For all the models, we used the same vocabulary of around 50K tokens, beam search in the decoder with beam size 4, a maximum input size of 400 tokens and a maximum output size of 100 tokens.</p><note type="venue">EEKE '21, September 30, 2021, Online</note></div>
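<div xmlns="http://www.tei-c.org/ns/1.0"><p>The copy-versus-generate decision of Model 2 and the coverage idea of Model 3 can be illustrated with a small sketch. It is not the implementation used in this work: it assumes the formulation of See et al. <ref type="bibr" target="#b7">[8]</ref>, in which the final word distribution mixes the vocabulary distribution and the attention distribution with a generation probability, and in which repetition is penalized by the overlap between the current attention and the attention accumulated over earlier decoder steps. The function names, the toy vocabulary and all numbers are hypothetical.</p>
<p><code lang="python">
```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix generation and copy probabilities into one word distribution.

    p_gen         -- probability of generating from the vocabulary (0..1)
    vocab_dist    -- dict mapping each vocabulary word to its probability
    attention     -- attention weights, one per source token
    source_tokens -- tokens of the input abstract (may include OOV words)
    """
    mixed = defaultdict(float)
    for word, prob in vocab_dist.items():
        mixed[word] += p_gen * prob
    # Copying: every source position adds its attention mass to its token.
    for token, weight in zip(source_tokens, attention):
        mixed[token] += (1.0 - p_gen) * weight
    return dict(mixed)


def coverage_loss(attention_steps):
    """Sum over decoder steps of sum_i min(a_i, c_i), where c is the
    attention accumulated over all earlier steps (assumed formulation)."""
    coverage = [0.0] * len(attention_steps[0])
    loss = 0.0
    for attention in attention_steps:
        loss += sum(min(a, c) for a, c in zip(attention, coverage))
        coverage = [c + a for c, a in zip(coverage, attention)]
    return loss


# Toy example: "multiscale" is not in the vocabulary but occurs in the abstract.
vocab_dist = {"a": 0.5, "method": 0.3, "analysis": 0.2}
source_tokens = ["multiscale", "analysis"]
attention = [0.75, 0.25]
dist = final_distribution(0.5, vocab_dist, attention, source_tokens)
```
</code></p>
<p>In this toy case the OOV word "multiscale" receives probability 0.375 purely through copying, which is how a pointer-generator network can emit words outside its vocabulary. The coverage loss is zero when two decoder steps attend to different source positions and grows when they attend to the same ones, which discourages repetition.</p></div>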
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3">RESULTS &amp; ANALYSIS</head><p>Results are shown in Table <ref type="table" target="#tab_0">1</ref> for ROUGE-1, ROUGE-2, ROUGE-L and METEOR as (R)ecall, (P)recision and (F1)-score. Author-written highlights are used as the gold-standard output. Model 3 (the pointer-generator model with coverage mechanism) achieved the highest F1-score on every metric. In the case study in Fig. 1, Model 1 generated many OOV words and factual errors. Model 2 generates more meaningful research highlights and even relevant novel words that capture the context of the paper much better. Model 2 sometimes outputs repeated words, but Model 3 reduces them. The first sentence from Model 3 contains words ('without object ... properties') that do not fit into the context, but its other highlights are meaningful.</p><note place="foot">Copyright 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</note></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4">CONCLUSION</head><p>We applied three different deep neural models to generate research highlights from the abstract of a research paper. The pointer-generator network with coverage mechanism achieved the best performance.</p><p>However, the predicted research highlights are not yet perfect. A simple post-processing operation could be to remove sentences that contain entities that are absent from the given abstract. We are currently exploring this and other techniques to improve the system.</p></div>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Original abstract, author-written research highlights and model-generated research highlights. The meaning of the colors (e.g., green = correct) is explained in the main text. Abstract taken from https://www.sciencedirect.com/science/article/abs/pii/S0168874X15000621</figDesc>
<p>Title: Multiscale computation for transient heat conduction problem with radiation boundary condition in porous materials</p>
<p>Abstract: This paper reports a multiscale asymptotic analysis and computation for predicting heat transfer performance of periodic porous materials with radiation boundary condition. In these porous materials thermal radiation effect at micro-scale have an important impact on the macroscopic temperature field, which is our particular interest in this study. The multiscale asymptotic expansions for computing temperature field of the problem are constructed, and associated explicit convergence rates are obtained on some regularity hypothesis. Finally, the corresponding finite element algorithms based on the multiscale method are brought forward and some numerical results are given in details. The numerical tests indicate that the developed method is feasible and valid for predicting the heat transfer performance of periodic porous materials, and support the approximate convergence results proposed in this paper.</p>
<p>Author-written highlights: A novel multiscale analysis and computation is proposed. Heat transfer problem of periodic porous materials with radiation boundary condition are considered. Error estimates of the multiscale approximate solution are derived on some regularity hypothesis. Some numerical results are given in details to validate the multiscale method.</p>
<p>Output of Model 1: A non-intrusive numerical tool is developed for solar artery supply planning . The results were analyzed, based on the fe modeling of finite element model . The approaches provides practical advantages of the cohort and accuracy of concrete equipment . Agent-based fe technology neural network procedures are used for assessment assessment. Results obtained from a real composite sample are considered and discussed.</p>
<p>Output of Model 2: This paper reports a multiscale asymptotic analysis and developed protocol. The proposed approach is based on regularity hypothesis expansions . The proposed method is robust and can achieve predicting heat transfer performance . The proposed method is robust and efficient for given bone microstructure samples .</p>
<p>Output of Model 3: Reports a multiscale asymptotic analysis without object propagation using minimal porous properties . Predicting heat transfer performance of periodic porous materials with radiation boundary condition. Finite element algorithms and computation of approximate convergence results .</p></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1:</head><label>1</label><figDesc>Comparison of models for research highlight generation.</figDesc><table>
<row><cell></cell><cell cols="9">ROUGE</cell><cell cols="4">METEOR</cell></row>
<row><cell></cell><cell cols="3">ROUGE-1</cell><cell cols="3">ROUGE-2</cell><cell cols="3">ROUGE-L</cell><cell cols="3">Synonym/paraphrase/stem</cell><cell></cell></row>
<row><cell></cell><cell>R</cell><cell>P</cell><cell>F1</cell><cell>R</cell><cell>P</cell><cell>F1</cell><cell>R</cell><cell>P</cell><cell>F1</cell><cell>R</cell><cell>P</cell><cell>F1</cell><cell>Final Score</cell></row>
<row><cell>Model 1</cell><cell>20.90</cell><cell>20.47</cell><cell>19.90</cell><cell>2.02</cell><cell>2.02</cell><cell>1.93</cell><cell>19.49</cell><cell>19.16</cell><cell>18.58</cell><cell>17.86</cell><cell>17.69</cell><cell>17.78</cell><cell>7.39</cell></row>
<row><cell>Model 2</cell><cell>30.99</cell><cell>32.07</cell><cell>30.9</cell><cell>7.48</cell><cell>8.06</cell><cell>7.55</cell><cell>28.66</cell><cell>30.34</cell><cell>28.62</cell><cell>25.53</cell><cell>26.61</cell><cell>26.06</cell><cell>11.04</cell></row>
<row><cell>Model 3</cell><cell>31.6</cell><cell>33.32</cell><cell>31.46</cell><cell>8.52</cell><cell>9.2</cell><cell>8.57</cell><cell>29.2</cell><cell>30.9</cell><cell>29.14</cell><cell>27.64</cell><cell>29.26</cell><cell>28.43</cell><cell>12.01</cell></row>
</table></figure>
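<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the recall, precision and F1 columns reported for ROUGE in Table 1 concrete, the following is a simplified sketch of ROUGE-N for a single pair of texts. It assumes whitespace tokenization and lowercasing only, with no stemming or punctuation handling, so it is not the official ROUGE toolkit and would not reproduce the table values. The function names are hypothetical. The two sentences are taken from the case study: the first author-written highlight and the first sentence produced by Model 2.</p>
<p><code lang="python">
```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (recall, precision, f1) based on clipped n-gram overlap."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Each candidate n-gram counts at most as often as it occurs in the reference.
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    total = recall + precision
    f1 = 2 * recall * precision / total if total else 0.0
    return recall, precision, f1


reference = "A novel multiscale analysis and computation is proposed."
candidate = "This paper reports a multiscale asymptotic analysis and developed protocol."
recall, precision, f1 = rouge_n(candidate, reference, n=1)
# Four unigrams overlap ("a", "multiscale", "analysis", "and"):
# recall = 4/8 = 0.5, precision = 4/10 = 0.4
```
</code></p>
<p>Recall measures how much of the author-written highlight is recovered, precision measures how much of the generated text is supported by it, and F1 balances the two. Calling the same function with n=2 gives the bigram variant.</p></div>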
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>ACKNOWLEDGMENTS</head><p>This work is supported by a research grant from the Department of Science and Technology, Government of India, at the Indian Association for the Cultivation of Science, Kolkata, and by the National Digital Library of India Project sponsored by the Ministry of Education, Government of India, at IIT Kharagpur.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Neural machine translation by jointly learning to align and translate</title>
		<author>
			<persName><forename type="first">Dzmitry</forename><surname>Bahdanau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Kyunghyun</forename><surname>Cho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yoshua</forename><surname>Bengio</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICLR</title>
				<imprint>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A supervised approach to extractive summarisation of scientific papers</title>
		<author>
			<persName><forename type="first">Ed</forename><surname>Collins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Isabelle</forename><surname>Augenstein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Sebastian</forename><surname>Riedel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CoNLL</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Abstractive text summarization using sequence-to-sequence RNNs and beyond</title>
		<author>
			<persName><forename type="first">Ramesh</forename><surname>Nallapati</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bowen</forename><surname>Zhou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Caglar</forename><surname>Gulcehre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Bing</forename><surname>Xiang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CoNLL</title>
				<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="280" to="290" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">GloVe: Global vectors for word representation</title>
		<author>
			<persName><forename type="first">Jeffrey</forename><surname>Pennington</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Richard</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EMNLP</title>
				<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1532" to="1543" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Surrogator: A tool to enrich a digital library with open access surrogate resources</title>
		<author>
			<persName><forename type="first">TYSS</forename><surname>Santosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Debarshi</forename><forename type="middle">Kumar</forename><surname>Sanyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Plaban</forename><forename type="middle">Kumar</forename><surname>Bhowmick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Partha</forename><forename type="middle">Pratim</forename><surname>Das</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">JCDL</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="379" to="380" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">DAKE: Document-Level Attention for Keyphrase Extraction</title>
		<author>
			<persName><forename type="first">Tokala</forename><forename type="middle">Yaswanth Sri Sai</forename><surname>Santosh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Debarshi</forename><forename type="middle">Kumar</forename><surname>Sanyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Plaban</forename><forename type="middle">Kumar</forename><surname>Bhowmick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Partha</forename><forename type="middle">Pratim</forename><surname>Das</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ECIR</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Enhancing access to scholarly publications with surrogate resources</title>
		<author>
			<persName><forename type="first">Debarshi</forename><forename type="middle">Kumar</forename><surname>Sanyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Plaban</forename><forename type="middle">Kumar</forename><surname>Bhowmick</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Partha</forename><forename type="middle">Pratim</forename><surname>Das</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Samiran</forename><surname>Chattopadhyay</surname></persName>
		</author>
		<author>
			<persName><forename type="first">TYSS</forename><surname>Santosh</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Scientometrics</title>
		<imprint>
			<biblScope unit="volume">121</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="1129" to="1164" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Get to the point: Summarization with pointer-generator networks</title>
		<author>
			<persName><forename type="first">Abigail</forename><surname>See</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Peter</forename><forename type="middle">J</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Christopher</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACL</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Modeling coverage for neural machine translation</title>
		<author>
			<persName><forename type="first">Zhaopeng</forename><surname>Tu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Zhengdong</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Yang</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Xiaohua</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Hang</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ACL</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Global scientific output doubles every nine years</title>
		<author>
			<persName><forename type="first">Richard</forename><surname>Van Noorden</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Nature news blog</title>
		<imprint>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
