<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Enhancing Essay Argument Persuasiveness Prediction Using a RoBERTa-LSTM Hybrid Model</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Fahad</forename><forename type="middle">M</forename><surname>Alzaidee</surname></persName>
							<affiliation key="aff0">
<orgName type="institution">University of York</orgName>
								<address>
									<postCode>YO10 5DD</postCode>
									<settlement>Heslington</settlement>
									<region>York</region>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Tommy</forename><surname>Yuan</surname></persName>
							<email>tommy.yuan@york.ac.uk</email>
							<affiliation key="aff0">
<orgName type="institution">University of York</orgName>
								<address>
									<postCode>YO10 5DD</postCode>
									<settlement>Heslington</settlement>
									<region>York</region>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Peter</forename><surname>Nightingale</surname></persName>
							<email>peter.nightingale@york.ac.uk</email>
							<affiliation key="aff0">
<orgName type="institution">University of York</orgName>
								<address>
									<postCode>YO10 5DD</postCode>
									<settlement>Heslington</settlement>
									<region>York</region>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Khaled</forename><surname>El Ebyary</surname></persName>
							<email>khaled.elebyary@york.ac.uk</email>
							<affiliation key="aff0">
<orgName type="institution">University of York</orgName>
								<address>
									<postCode>YO10 5DD</postCode>
									<settlement>Heslington</settlement>
									<region>York</region>
									<country key="GB">United Kingdom</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Enhancing Essay Argument Persuasiveness Prediction Using a RoBERTa-LSTM Hybrid Model</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">FC3D320A86D08FDB1A07C3E8A54C80A3</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T20:16+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Natural Language Processing</term>
					<term>Persuasiveness scoring</term>
<term>Argument evaluation</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Over the past five decades, automated essay scoring has been a significant focus in both research and industry, capturing the interest of the NLP community due to its potential to provide valuable educational tools that save time for educators worldwide. The persuasiveness of arguments is a key aspect of argumentative essay quality. Despite its importance, however, argument persuasiveness has often been overlooked, and research on it is still in its infancy. In this paper, we introduce several neural models aimed at improving the prediction of an argument's persuasiveness score. Our proposed model improves prediction accuracy compared to the approach suggested in [1].</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In our globalizing world, learning English has become essential. Writing, a fundamental aspect of language learning, requires accurate assessment. Automated Essay Scoring (AES) offers a solution to the complex and time-consuming task of manual essay scoring. Even with standardised rubrics, manual scoring is often subjective and unreliable due to individual factors like mood and personality. AES provides an objective and efficient alternative, facilitating consistent evaluation of writing skills and supporting self-study through automated, unbiased, and instant feedback.</p><p>There are many types of essays, each serving different purposes and engaging readers in unique ways. This paper focuses on persuasive and argumentative essays, which are designed to convince the reader of a particular viewpoint through well-reasoned arguments and evidence. Argumentation, which can take various forms but generally involves presenting and defending a claim with supporting evidence or reasoning, is a critical skill for students to master. Effective argumentation not only strengthens academic performance but also equips students with essential communication skills for the real world. AES systems can play a significant role in this learning process by providing immediate, objective feedback, enabling students to refine their persuasive writing skills and develop stronger, more compelling arguments over time. Although previous studies (e.g., <ref type="bibr" target="#b0">[1]</ref>, <ref type="bibr" target="#b1">[2]</ref>) have explored automated essay scoring and feedback, further research is necessary to fully realize this vision.</p><p>In this study, we explore new NLP model architectures for automatically scoring argumentative essays based on their persuasiveness. 
We also augmented the standard dataset used in <ref type="bibr" target="#b0">[1]</ref> by extracting arguments from each essay, paraphrasing them, and enriching them with supplementary information. These enhancements improve the dataset's overall quality and depth.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Prior work has annotated detailed argumentative discourse units to examine persuasive strategies. Schaefer et al. <ref type="bibr" target="#b5">[6]</ref> identify key factors that influence the persuasiveness of a text, including the usage patterns of argument components, the structure of the essay, the flow and sequence of argument types, and the impact of the essay prompt and the individual author's style. Additionally, Wachsmuth et al. <ref type="bibr" target="#b6">[7]</ref> identify and annotate 15 dimensions (logical, rhetorical, and dialectical) relevant for automatically evaluating argument quality. Ke et al. <ref type="bibr" target="#b0">[1]</ref> introduce a bidirectional LSTM model with attention mechanisms to score metrics such as persuasiveness, specificity, and strength on their annotated dataset of 102 essays <ref type="bibr" target="#b7">[8]</ref>. Toledo et al. <ref type="bibr" target="#b8">[9]</ref> publish a new dataset with arguments annotated for quality and compare arguments in pairs to determine the stronger one. They utilise a BERT language model to generate numerical representations for the words in both arguments and then fine-tune it for the classification and ranking tasks. In 2023, another study <ref type="bibr" target="#b9">[10]</ref> used the PERSUADE dataset to predict persuasiveness ratings for discourse elements based on their type labels. Previous studies, such as <ref type="bibr" target="#b1">[2]</ref>, that evaluate the overall persuasiveness of an entire essay often provide generalized feedback, lacking the granularity needed to highlight specific areas for improvement. 
On the other hand, the study in <ref type="bibr" target="#b0">[1]</ref> offers feedback on different traits impacting the persuasiveness of various essay sections but still demonstrates only modest performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Dataset</head><p>The aspect of persuasiveness in essays has been annotated in several available datasets, such as <ref type="bibr" target="#b10">[11]</ref>, <ref type="bibr" target="#b11">[12]</ref>, and <ref type="bibr" target="#b7">[8]</ref>. However, datasets <ref type="bibr" target="#b10">[11]</ref> and <ref type="bibr" target="#b11">[12]</ref> lack the granularity required to train our model effectively. Therefore, we decided to use the dataset <ref type="bibr" target="#b7">[8]</ref>, which comprises 102 essays from the Annotated Essays corpus by Stab and Gurevych <ref type="bibr" target="#b12">[13]</ref>. Each essay is annotated with an argument tree that mirrors the natural flow of argumentative essays, avoiding cycles and maintaining clarity. These trees typically have three to four levels, beginning with the Major Claim, followed by Claims and Premises that support or challenge their parent nodes. The dataset includes 1,459 components: 185 Major Claims, 567 Claims, and 707 Premises, each of which is assigned various score metrics, including persuasiveness, the focus of our work. The Krippendorff's α values for the persuasiveness annotations (0.739 for Major Claims, 0.701 for Claims, and 0.552 for Premises) indicate that the dataset is well-suited for training our model. The dataset was split into training and testing sets, with the training set representing 80% of the essays.</p><p>To address the dataset's limited size, we identified all possible arguments in each set, generating 1,459 distinct arguments. We created two different sequences for each argument: one based on the order of appearance in its original essay and the other using postorder traversal. Inspired by <ref type="bibr" target="#b13">[14]</ref>, we enriched each component with lexical and structural features, as illustrated in Figure <ref type="figure" target="#fig_0">1</ref>, increasing the maximum length from 58 to 85 words. 
Additionally, we paraphrased each argument component in the training set five times using a ChatGPT paraphraser built on a fine-tuned T5 (Text-to-Text Transfer Transformer) model. This resulted in four forms of input data: plain and enriched arguments in their order of appearance, and plain and enriched arguments in postorder traversal. </p></div>
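The two input orderings described above can be sketched as follows. This is an illustrative sketch only: the `Component` schema, field names, and enrichment tags are assumptions for demonstration, not the authors' actual implementation or feature set.

```python
# Hypothetical sketch: building the two component sequences used as model
# input (order of appearance vs. postorder traversal of the argument tree).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Component:
    text: str       # argument component text
    kind: str       # "MajorClaim", "Claim", or "Premise"
    position: int   # order of appearance in the essay
    children: List["Component"] = field(default_factory=list)

def appearance_order(root: Component) -> List[Component]:
    """Flatten the tree, then sort components by essay position."""
    nodes, stack = [], [root]
    while stack:
        node = stack.pop()
        nodes.append(node)
        stack.extend(node.children)
    return sorted(nodes, key=lambda n: n.position)

def postorder(root: Component) -> List[Component]:
    """Children before parents, so premises precede the claims they support."""
    out: List[Component] = []
    for child in root.children:
        out.extend(postorder(child))
    out.append(root)
    return out

def enrich(c: Component) -> str:
    """Prepend simple structural tags to the text (a guess at the kind of
    enrichment described; the real lexical/structural features may differ)."""
    return f"[{c.kind}] [pos={c.position}] {c.text}"
```

For a claim with two supporting premises, `postorder` emits the premises first and the claim last, while `appearance_order` follows the original essay layout.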
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Proposed Methodology</head><p>In this study, we design four different neural models and compare their accuracy with two baseline models. For the baselines, we fine-tuned the Longformer language model and built a second model on the Hierarchical BERT Model (HBM) framework <ref type="bibr" target="#b14">[15]</ref>, designed for classifying long documents with limited labelled data. In the HBM-based model, we identify the argument components in each argument and independently convert their tokens, which can be words or subwords, into numerical vectors with the RoBERTa encoder. We use mean pooling to average these vectors, creating a single representation for each argument component. The resulting vectors are then input into the sentence-level Hierarchical BERT encoder, generating an intermediate representation of the entire argument. We adapted this model to predict persuasiveness scores as continuous values by adding a sigmoid activation function after the linear layer, multiplying its output by 6, and rounding to the nearest integer.</p><p>We designed four neural models, illustrated in Figures <ref type="figure" target="#fig_1">2</ref> and <ref type="figure" target="#fig_2">3</ref>, combining a transformer model with an LSTM layer. RoBERTa and Longformer were used as embedding layers to generate a representation for each argument component in each argument, while the LSTM layer captured the dependency among the argument components. We refer to these models as LONG-LSTM, LONG-LSTM-TAG, ROB-LSTM, and ROB-LSTM-TAG. The term "TAG" in a model name indicates that the model uses a multi-output approach. In LONG-LSTM and LONG-LSTM-TAG, tokens for all argument components are embedded in a single pass using Longformer. Token embeddings are extracted and mean pooled to create a single representation for each component, which is then fed into an LSTM layer. 
In LONG-LSTM, the final hidden state is passed through a linear layer followed by a sigmoid activation function; the output is scaled by 6 and rounded to the nearest integer, producing a persuasiveness score between 0 and 6. The MAE is then computed. In LONG-LSTM-TAG, each encoded argument component's hidden state is used to predict its own persuasiveness score. ROB-LSTM and ROB-LSTM-TAG follow the same process, except that each argument component is encoded independently using RoBERTa. </p></div>
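A minimal PyTorch sketch of the ROB-LSTM idea described above: each component is encoded independently, component vectors are mean-pooled, an LSTM captures dependencies among components, and a sigmoid head scaled by 6 yields the score. Random tensors stand in for per-component RoBERTa token embeddings, and all dimensions and names are illustrative assumptions, not the authors' configuration.

```python
# Sketch only: a stand-in for per-component RoBERTa embeddings feeds an
# LSTM whose final hidden state is mapped to a persuasiveness score in [0, 6].
import torch
import torch.nn as nn

class RobLstmScorer(nn.Module):
    def __init__(self, emb_dim: int = 32, hidden: int = 16):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, token_embs):
        # token_embs: one (n_tokens, emb_dim) tensor per argument component,
        # as if each component were encoded independently by RoBERTa.
        comp_vecs = torch.stack([t.mean(dim=0) for t in token_embs])  # mean pooling
        _, (h_n, _) = self.lstm(comp_vecs.unsqueeze(0))  # final hidden state
        return (torch.sigmoid(self.head(h_n[-1])) * 6).squeeze()  # scale to [0, 6]

model = RobLstmScorer()
components = [torch.randn(5, 32), torch.randn(8, 32), torch.randn(3, 32)]
score = model(components)
rounded = int(torch.round(score))  # integer persuasiveness score, 0..6
```

The TAG variant would instead apply the linear-plus-sigmoid head to every per-component hidden state in the LSTM output, producing one score per component.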
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Experiment Setup</head><p>We begin by randomly dividing our training set into five parts and perform five-fold cross-validation. In each experiment, four parts are used for training and one for development. After each iteration, we test the resulting model on the test set.</p><p>For training the HBM-based model, we used a learning rate of 2 × 10⁻⁵, a dropout rate of 0.01, 50 epochs, a batch size of 4, and the Adam optimizer with a learning rate decay of 1 × 10⁻⁸. For the other models, we used a learning rate of 1 × 10⁻³, a dropout rate of 0.7, 50 epochs with a batch size of 4, and the Adam optimizer.</p><p>To evaluate the models' prediction accuracy, we used Mean Absolute Error (MAE) and Pearson's Correlation Coefficient (PC), computed after rounding the predicted scores. MAE measures the average distance between the predicted and actual scores. PC reflects the consistency and directionality of the predictions. We use MAE instead of the Mean Squared Error (MSE) for its equal treatment of errors and reduced sensitivity to outliers. Additionally, it facilitates direct comparison with the models in <ref type="bibr" target="#b0">[1]</ref>.</p></div>
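The two metrics, computed after rounding as described, can be sketched in plain Python. The example scores below are made up for illustration; the actual evaluation pipeline is not specified beyond the rounding step.

```python
# Sketch of the evaluation metrics: MAE and Pearson's correlation (PC),
# both computed on predictions rounded to integer scores.
import math

def mae(preds, golds):
    return sum(abs(p - g) for p, g in zip(preds, golds)) / len(golds)

def pearson(preds, golds):
    n = len(golds)
    mp, mg = sum(preds) / n, sum(golds) / n
    cov = sum((p - mp) * (g - mg) for p, g in zip(preds, golds))
    sp = math.sqrt(sum((p - mp) ** 2 for p in preds))
    sg = math.sqrt(sum((g - mg) ** 2 for g in golds))
    return cov / (sp * sg)

raw_preds = [2.4, 5.1, 3.6, 1.2]       # hypothetical model outputs
golds = [2, 5, 3, 1]                   # hypothetical gold scores
preds = [round(p) for p in raw_preds]  # round first, as in the evaluation
m, r = mae(preds, golds), pearson(preds, golds)
```

MAE treats every unit of error equally, so a single far-off prediction cannot dominate the score the way it would under MSE.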
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Evaluating the Effectiveness of Models</head><p>Table <ref type="table" target="#tab_0">1</ref> presents a summary of the evaluation experiments conducted on the test set. The leftmost column lists the various modeling approaches, while the top row identifies the different types of input data used. The HBM-based model shows the strongest correlations (PC) for plain arguments (0.309) and plain arguments (Postorder) (0.322). For rich arguments, the fine-tuned Longformer achieves the highest correlation (0.702), while the ROB-LSTM-TAG model has the highest correlation (0.696) for rich arguments (Postorder).</p><p>The table clearly illustrates that incorporating rich features alongside argument content significantly improves model performance. Among the models, ROB-LSTM stands out for its balanced performance, achieving a low MAE of 0.743 and a high PC of 0.691 in the 'Rich Argument (Postorder)' category. This suggests the potential effectiveness of leveraging hierarchical dependencies between argument components to predict persuasiveness. We also trained all models on the training set after paraphrasing each argument component, but this did not lead to improved results when tested on the test set.</p><p>The improvement in RoBERTa-based models is attributed to generating representations for each argument component independently, which reduces noise from other components in the same argument. In contrast, encoding the entire argument using the Longformer introduces more noise. Adding an LSTM layer further helps by separating the content of the argument component from its contextual dependency, thus reducing noise.</p><p>In comparison to the closest related work by <ref type="bibr" target="#b0">[1]</ref>, which was also trained on the same dataset and reported an MAE of 0.983 and a PC of 0.353, our ROB-LSTM model demonstrates a substantial improvement, achieving a 24.4% reduction in MAE and a 95.8% increase in PC.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion and Future Work</head><p>In this preliminary study, we explored different models to predict the persuasiveness score of arguments with varying complexity and structures. The RoBERTa-LSTM model demonstrated a balanced performance where it achieved a low MAE and a relatively high PC. The addition of rich features and the consideration of hierarchical order relations highlighted the potential benefits of these factors in improving persuasiveness prediction. A significant challenge we face is incorporating the types of relationships between arguments within an essay. We aim to understand how the persuasiveness of lower-level arguments in the argument tree affects the overall persuasiveness of related higher-level arguments. This understanding is crucial for developing a feedback component in our system that effectively leverages these relationships.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: Enriching an argument</figDesc><graphic coords="2,72.00,610.39,203.07,97.11" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2:</head><label>2</label><figDesc>Figure 2: Longformer-based models</figDesc><graphic coords="3,72.00,427.01,203.08,213.66" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3:</head><label>3</label><figDesc>Figure 3: RoBERTa-based models</figDesc><graphic coords="4,72.00,65.61,203.07,187.18" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Persuasiveness scores of all models on the test set</figDesc><table><row><cell></cell><cell cols="2">Plain</cell><cell cols="2">Plain Argument</cell><cell></cell><cell>Rich</cell><cell cols="2">Rich Argument</cell></row><row><cell>Model</cell><cell cols="2">Argument</cell><cell cols="2">(Postorder)</cell><cell cols="2">Argument</cell><cell cols="2">(Postorder)</cell></row><row><cell></cell><cell>MAE</cell><cell>PC</cell><cell>MAE</cell><cell>PC</cell><cell>MAE</cell><cell>PC</cell><cell>MAE</cell><cell>PC</cell></row><row><cell>HBM-based model</cell><cell>2.478</cell><cell>0.309</cell><cell>2.532</cell><cell>0.322</cell><cell>2.078</cell><cell>0.441</cell><cell>2.025</cell><cell>0.443</cell></row><row><cell cols="2">Fine-tuned Longformer 1.297</cell><cell>0.284</cell><cell>1.351</cell><cell>0.238</cell><cell>0.919</cell><cell>0.702</cell><cell>0.932</cell><cell>0.664</cell></row><row><cell>LONG-LSTM</cell><cell>1.419</cell><cell>0.125</cell><cell>1.257</cell><cell>0.320</cell><cell>1.000</cell><cell>0.566</cell><cell>1.041</cell><cell>0.513</cell></row><row><cell>LONG-LSTM-TAG</cell><cell>1.311</cell><cell>0.124</cell><cell>1.311</cell><cell>0.159</cell><cell>0.835</cell><cell>0.612</cell><cell>0.880</cell><cell>0.598</cell></row><row><cell>ROB-LSTM</cell><cell>1.230</cell><cell>0.251</cell><cell>1.203</cell><cell>0.223</cell><cell>0.757</cell><cell>0.685</cell><cell>0.743</cell><cell>0.691</cell></row><row><cell>ROB-LSTM-TAG</cell><cell>1.137</cell><cell>0.213</cell><cell>1.146</cell><cell>0.198</cell><cell>0.790</cell><cell>0.604</cell><cell>0.747</cell><cell>0.646</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Learning to give feedback: Modeling attributes affecting argument persuasiveness in student essays</title>
		<author>
			<persName><forename type="first">Z</forename><surname>Ke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Carlile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Gurrapadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence</title>
				<meeting>the Twenty-Seventh International Joint Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="4130" to="4136" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Al: An adaptive learning support system for argumentation skills</title>
		<author>
			<persName><forename type="first">T</forename><surname>Wambsganss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Niklaus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cetto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Söllner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Handschuh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">M</forename><surname>Leimeister</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems</title>
				<meeting>the 2020 CHI Conference on Human Factors in Computing Systems</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="1" to="14" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Why can&apos;t you convince me? modelling weaknesses in unpersuasive arguments</title>
		<author>
			<persName><forename type="first">I</forename><surname>Persing</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th International Joint Conference on Artificial Intelligence</title>
				<meeting>the 26th International Joint Conference on Artificial Intelligence</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="page" from="4082" to="4088" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Recognizing insufficiently supported arguments in argumentative essays</title>
		<author>
			<persName><forename type="first">C</forename><surname>Stab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
		<ptr target="https://api.semanticscholar.org/CorpusID:6801402" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics</title>
				<meeting>the 15th Conference of the European Chapter of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">A news editorial corpus for mining argumentation strategies</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">Al</forename><surname>Khatib</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wachsmuth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Kiesel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Hagen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers</title>
				<meeting>COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers</meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="3433" to="3443" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Towards fine-grained argumentation strategy analysis in persuasive essays</title>
		<author>
			<persName><forename type="first">R</forename><surname>Schaefer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Knaebel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Stede</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2023.argmining-1.8</idno>
		<ptr target="https://aclanthology.org/2023.argmining-1.8" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th Workshop on Argument Mining, Association for Computational Linguistics</title>
				<editor>
			<persName><forename type="first">M</forename><surname>Alshomary</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C.-C</forename><surname>Chen</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Muresan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Park</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Romberg</surname></persName>
		</editor>
		<meeting>the 10th Workshop on Argument Mining, Association for Computational Linguistics<address><addrLine>Singapore</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="76" to="88" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Computational argumentation quality assessment in natural language</title>
		<author>
			<persName><forename type="first">H</forename><surname>Wachsmuth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Naderi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Hou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Bilu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Prabhakaran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Thijm</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Hirst</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics</title>
				<meeting>the 15th Conference of the European Chapter of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="176" to="187" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Give me more feedback: Annotating argument persuasiveness and related attributes in student essays</title>
		<author>
			<persName><forename type="first">W</forename><surname>Carlile</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Gurrapadi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Ke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ng</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<meeting>the 56th Annual Meeting of the Association for Computational Linguistics</meeting>
		<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="621" to="631" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">A</forename><surname>Toledo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gretz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Cohen-Karlik</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Friedman</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Venezian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lahav</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jacovi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Aharonov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Slonim</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1909.01007</idno>
		<title level="m">Automatic argument quality assessment-new datasets and methods</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Hicke</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Tian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Jha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">H</forename><surname>Kim</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2307.04276</idno>
		<title level="m">Automated essay scoring in argumentative writing: Deberteachingassistant</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Recognizing insufficiently supported arguments in argumentative essays</title>
		<author>
			<persName><forename type="first">C</forename><surname>Stab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/E17-1092" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics</title>
		<title level="s">Long Papers</title>
		<editor>
			<persName><forename type="first">M</forename><surname>Lapata</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Blunsom</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Koller</surname></persName>
		</editor>
		<meeting>the 15th Conference of the European Chapter of the Association for Computational Linguistics<address><addrLine>Valencia, Spain</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computational Linguistics</publisher>
			<date type="published" when="2017">2017</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="980" to="990" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">ICLE++: Modeling fine-grained traits for holistic essay scoring</title>
		<author>
			<persName><forename type="first">S</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Ng</surname></persName>
		</author>
		<idno type="DOI">10.18653/v1/2024.naacl-long.468</idno>
		<ptr target="https://aclanthology.org/2024.naacl-long.468" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</title>
		<title level="s">Long Papers</title>
		<editor>
			<persName><forename type="first">K</forename><surname>Duh</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Gomez</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Bethard</surname></persName>
		</editor>
		<meeting>the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies<address><addrLine>Mexico City, Mexico</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="page" from="8465" to="8486" />
		</imprint>
	</monogr>
	<note>Association for Computational Linguistics</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Annotating argument components and relations in persuasive essays</title>
		<author>
			<persName><forename type="first">C</forename><surname>Stab</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Gurevych</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers</title>
				<meeting>COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="1501" to="1510" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Argument classification with BERT plus contextual, structural and syntactic features as text</title>
		<author>
			<persName><forename type="first">U</forename><surname>Mushtaq</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Cabessa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Neural Information Processing</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="622" to="633" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A sentence-level hierarchical BERT model for document classification with limited labelled data</title>
		<author>
			<persName><forename type="first">J</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Henchion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Bacher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Mac Namee</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Discovery Science: 24th International Conference, DS 2021</title>
				<meeting><address><addrLine>Halifax, NS, Canada</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2021">October 11-13, 2021</date>
			<biblScope unit="page" from="231" to="241" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
