<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Eevvgg at CheckThat! 2024: Evaluative Terms, Pronouns and Modal Verbs as Markers of Subjectivity in Text Notebook for the CheckThat! Lab at CLEF 2024</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author role="corresp">
							<persName><forename type="first">Ewelina</forename><surname>Gajewska</surname></persName>
							<email>gajewska.dokt@pw.edu.pl</email>
							<affiliation key="aff0">
								<orgName type="institution">Warsaw University of Technology</orgName>
								<address>
									<addrLine>Plac Politechniki 1</addrLine>
									<postCode>00-661</postCode>
									<settlement>Warsaw</settlement>
									<country key="PL">Poland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Eevvgg at CheckThat! 2024: Evaluative Terms, Pronouns and Modal Verbs as Markers of Subjectivity in Text Notebook for the CheckThat! Lab at CLEF 2024</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">2966F7BC34F22A13DDA7EDF4D9169A74</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:58+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>stance</term>
					<term>subjectivity</term>
					<term>fake news</term>
					<term>text classification</term>
					<term>information extraction</term>
					<term>opinion mining</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This work tests the performance of simple machine learning algorithms against large language models (LLMs) utilising transfer learning for binary detection of subjectivity in news articles. Second, the influence of feature normalisation on classification performance is examined. Third, the work measures the impact of training data size on subjectivity extraction. The proposed BERTd model, which makes use of additional information about stance markers in news articles, placed 8th in the official ranking of the CLEF 2024 CheckThat! lab Task 2: Subjectivity in News Articles competition for English data, achieving a macro-averaged 𝐹1 of 0.70. Models that distinguish subjective opinions from objective facts could be utilised in studies on information verification (detection of fake news, understood as a mixture of subjective opinion and facts).</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Subjectivity is inherently encoded in language and involves expressions of the speaker's position, attitude, and feelings towards the uttered message <ref type="bibr" target="#b16">[17]</ref>. Identifying articles written from the subjective perspective of the author therefore involves detecting stance markers: words that express some form of evaluation or judgement (e.g. words denoting emotional valence), pronouns, modal verbs and passive constructions <ref type="bibr" target="#b15">[16]</ref>. This work makes use of syntactic and semantic features that are fed to machine learning algorithms and large language models (LLMs) for binary detection of news articles written from a subjective versus an objective perspective <ref type="bibr" target="#b3">[4]</ref>. The datasets were provided by the organizers of the CLEF 2024 CheckThat! lab <ref type="bibr" target="#b4">[5]</ref>, an international contest on challenging classification and retrieval problems whose aim is to advance the field of information retrieval from text. To this end, a new approach to subjectivity detection is proposed: a Transformer-based model that makes use of both text content and additional meta features derived from the articles. This approach shows more consistent results than simple transfer learning (TL; adding a classification layer on top of a BERT encoder) fed with textual content only. The current work describes the approach of the eevvgg team for Task 2: Subjectivity in News Articles of the CLEF 2024 CheckThat! lab on English news articles.</p><p>It is a common approach in information retrieval research to experiment with different text representation models, as in <ref type="bibr" target="#b5">[6]</ref>, where conversion of text with the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm outperformed models using the so-called Count Vectorizer, that is, a simple frequency count of particular terms in each text sample. The influence of preprocessing techniques on the classification performance of deep learning models was tested in <ref type="bibr" target="#b2">[3]</ref>. Machine learning methods are among the fundamental methods used in the field of natural language processing <ref type="bibr" target="#b1">[2]</ref>. They are used, for example, in web search engines for information retrieval <ref type="bibr" target="#b6">[7]</ref>, so that a user looking for specific information gets results relevant to the searched topic. Previous studies investigated the performance of several machine learning models (logistic regression, SVM, and Naive Bayes) on text classification tasks such as recognition of the political affiliation of US presidential candidates from their campaign speeches <ref type="bibr" target="#b0">[1]</ref>. The contribution of this work is three-fold: first, an examination of machine learning vs. transfer learning approaches for information extraction; second, the impact of feature normalisation on classification performance; third, the influence of training data size on detection performance. Models for the detection of subjective opinions versus objective facts could be utilised, for example, in studies on information verification (detection of fake news, which is a mixture of subjective opinion and facts <ref type="bibr" target="#b7">[8]</ref>).</p></div>
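As an illustration of the representation choice discussed above, the sketch below contrasts simple term counts (Count Vectorizer) with TF-IDF weighting in scikit-learn. The toy documents are hypothetical, not drawn from the task data.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "The government announced a new policy today.",
    "I think this policy is a terrible mistake.",
    "Officials confirmed the figures on Monday.",
]

# Raw term counts vs. TF-IDF, which down-weights terms that
# appear in many documents and up-weights distinctive ones.
count_X = CountVectorizer().fit_transform(docs)
tfidf_X = TfidfVectorizer().fit_transform(docs)

# Same vocabulary and matrix shape, different cell values.
print(count_X.shape, tfidf_X.shape)
```

Both representations produce one row per document over the same vocabulary; only the weighting of each term differs.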
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Differentiation between subjective opinions and objective facts composes the basis of proper journalism. Subjective reporting of news might reflect the bias of the author, which automatic solutions can identify and tag properly, for example to reduce the spread of fake news. Several works have experimented with methods for detecting subjectivity in text. A unified method for the detection of subjectivity in multilingual text content was proposed in <ref type="bibr" target="#b11">[12]</ref>. A fine-tuned ELECTRA large model outperformed other large language models (such as BERT and RoBERTa) for subjectivity analysis in the context of fake news detection, achieving 0.983 accuracy <ref type="bibr" target="#b19">[20]</ref>. Such analyses have proved useful for detecting fake news with a lexicon-based approach <ref type="bibr" target="#b10">[11]</ref> as well as in opinion mining using deep learning techniques <ref type="bibr" target="#b17">[18]</ref>. The current work extends these studies by combining previous approaches and testing the proposed model in several experimental settings. Team eevvgg proposes a deep learning model that combines a fine-tuned BERT encoder <ref type="bibr" target="#b8">[9]</ref> with a lexicon-based method for extracting linguistic markers of subjectivity. The proposed architecture, called BERTd, is tested against other BERT-based models and machine learning algorithms in two experimental conditions: on a smaller vs. a larger set of data (n=667 vs. n=1511 training samples, and n=166 vs. n=484 test samples, respectively), and on raw vs. normalised values of the linguistic meta features.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Methodology</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Material</head><p>This work deals with binary detection of subjectivity from text: whether a sentence expresses a subjective view of the author (SUBJ) or presents an objective view on the topic (OBJ). The paper proposes solutions for subjectivity extraction from English data. The initial training data comprises 833 text samples (data units) from newspaper articles: short pieces of text of up to 100 words. Development of a text classification system starts with pre-processing of the data, followed by splitting it into training and testing sets, extraction of features, model training and model validation. The dataset was divided into two splits: 80% for training purposes and 20% for testing. Results of training and evaluation on this set (called the small set) are reported in Table <ref type="table" target="#tab_0">1</ref>. Training and evaluation are also conducted on a bigger set: the official test set released by the CheckThat! organizers after the submission deadline, with training data comprising all available train and dev sets (1511 samples in total); evaluation is conducted on 484 samples from the official test set. Results of training and evaluation on this set (called the big set) are reported in Table <ref type="table">2</ref>.</p></div>
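The 80/20 split described above can be sketched with scikit-learn's train_test_split; the sample texts and labels here are placeholders for the 833 labelled sentences, not the actual CheckThat! data.

```python
from sklearn.model_selection import train_test_split

# Hypothetical placeholders for the 833 labelled samples.
texts = [f"sample {i}" for i in range(833)]
labels = ["SUBJ" if i % 3 == 0 else "OBJ" for i in range(833)]

# 80% for training, 20% for testing; stratification keeps the
# SUBJ/OBJ proportions similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)
print(len(X_train), len(X_test))
```

With 833 samples a 20% test fraction yields roughly 166-167 test items, matching the small-set sizes reported above up to rounding.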
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Text Preprocessing</head><p>Usually, the first step in information retrieval tasks is to represent the text using a certain model. A common approach is to represent a document as a vector of features; the simplest representations of text include Bag-of-Words (BOW) models. For the machine learning algorithms, the TF-IDF method is employed (with max_df=0.75, min_df=2). Transfer learning approaches utilise the BERT encoder (BERT base uncased) as a text representation method. In order to improve the predictive performance of these models, several meta features were constructed from the textual content of news articles using the concept of feature engineering <ref type="bibr" target="#b13">[14]</ref>, which involves the application of transformation functions on given features to generate new ones. In order to extract such features, the text needs to be normalised. Data cleaning involved three steps: conversion of text into lowercase, removal of stop-words, and removal of punctuation symbols. Then, the text was lemmatised, that is, words were converted to their dictionary forms. Finally, linguistic features were extracted from the clean and lemmatised text of the articles. In total, 16 syntactic features (stance markers) were extracted from the text samples, specifically, the frequency of occurrence of each category of terms in a given text sample. The terms that belong to each category are specified below:</p><p>1. Subject pronouns: I, you, he, she, it, we, you, they; 2. Object pronouns: me, you, him, her, it, us, you, them; 3. Possessive pronouns: mine, yours, his, hers, its, ours, yours, theirs; 4. Demonstrative pronouns: this, these, that, those; 5. Interrogative pronouns: who, whom, which, what; 6. Relative pronouns: who, whom, that, which, whoever, whichever, whomever; 7. Indefinite pronouns: all, another, any, anybody, anyone, anything, each, everybody, everyone, everything, few, many, nobody, none, one, several, some, somebody, someone; 8. Reflexive pronouns: myself, yourself, himself, herself, ourselves, yourselves, themselves; 9. Modal verbs: must, shall, will, should, would, can, could, may, might; 10. Obligation verbs: need, have to, must, might, may, has to, shall; 11. Frequency adverbs: hardly, ever, rarely, scarcely, seldom, never, sometimes, often, always, usually, normally; 12. Comparison adverbs: bad, badly, worse, worst, good, better, well, best, far, farther, further, farthest, furthest, little, less, least, few, somehow; 13. Reporting verbs: advise, agree, challenge, claim, decide, demand, encourage, invite, offer, persuade, promise, refuse, remind, say; 14. Pronouns: a sum of features 1-8; 15. Emotive words: words associated with an expression of emotions, marked as such in the lexicon of emotion-laden terms <ref type="bibr" target="#b12">[13]</ref>; 16. Polarising words: words associated with inducing social polarisation (dividing society into 'us' versus 'them' groups), marked as such in the lexicon of polarising terms <ref type="bibr" target="#b18">[19]</ref>.</p><p>A summary of the preprocessing procedure is illustrated in Figure <ref type="figure" target="#fig_0">1</ref>.</p></div>
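A minimal sketch of the marker-frequency extraction described above, using abbreviated term lists for three of the 16 categories (the full lexicons, including the emotion and polarisation lexicons, are as specified in the paper; the function name and tokenisation are illustrative, not the authors' exact implementation):

```python
import re

# Abbreviated subsets of three stance-marker categories from the list above.
MARKERS = {
    "subject_pronouns": {"i", "you", "he", "she", "it", "we", "they"},
    "modal_verbs": {"must", "shall", "will", "should", "would",
                    "can", "could", "may", "might"},
    "demonstrative_pronouns": {"this", "these", "that", "those"},
}

def marker_frequencies(text: str) -> dict:
    """Relative frequency of each marker category in a lowercased sample."""
    tokens = re.findall(r"[a-z']+", text.lower())
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    return {cat: sum(t in terms for t in tokens) / n
            for cat, terms in MARKERS.items()}

print(marker_frequencies("I think this policy could fail."))
```

Each text sample is thus mapped to a small numeric vector of marker frequencies, which is what gets concatenated with the TF-IDF or BERT representation downstream.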
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Tools</head><p>Following the scikit-learn map of estimators, four algorithms were chosen: Naive Bayes (NB), logistic regression (LR), decision trees (DT) and random forests (RF). Naive Bayes is one of the simplest and most popular models in the field of supervised machine learning <ref type="bibr" target="#b1">[2]</ref> and amongst the most efficient and effective classifiers <ref type="bibr" target="#b9">[10]</ref>. Classification with this estimator is based on calculated probabilities: the probability of a certain label for a given data point is estimated from the prior probability of that label combined with the probabilities of the features describing the data point given the label <ref type="bibr" target="#b6">[7]</ref>. The NB algorithm learns these probabilities from the training data under the assumption that all features are independent. SVM models learn to categorise data into separate classes by finding a boundary in the feature space that maximises the margin between the classes <ref type="bibr" target="#b20">[21]</ref>. The logistic regression algorithm calculates the log-odds (converted into probabilities) of an event as a linear combination of one or more independent variables. The decision tree model predicts the value of a target variable by learning decision rules, which are inferred from the training data. A DT builds a tree-like structure from these rules by splitting data into subsets based on the values of particular features until a stopping criterion is met. DTs have the main advantages of being simple to understand and interpret <ref type="bibr" target="#b14">[15]</ref>. A random forest is an ensemble learning method that combines the outputs of multiple decision trees to reach the final result. 
Finally, the suitability of large language models for subjectivity detection is investigated. Specifically, the BERT base uncased model<ref type="foot" target="#foot_0">1</ref> is used; it produces contextualised text embeddings and achieves state-of-the-art results in many information retrieval tasks. The transfer learning paradigm is utilised for training the BERT-based models; the Tensorflow and transformers libraries are employed for their implementation.</p></div>
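The four estimators above might be trained on TF-IDF features roughly as follows; this is a toy sketch with made-up sentences, using scikit-learn defaults (the actual runs also concatenate the 16 meta features with the text representation):

```python
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Toy corpus standing in for the news-sentence data.
texts = ["I absolutely loved it", "The report was released today",
         "This is outrageous", "The committee met on Tuesday"] * 10
labels = ["SUBJ", "OBJ", "SUBJ", "OBJ"] * 10

scores = {}
for clf in (MultinomialNB(), LogisticRegression(),
            DecisionTreeClassifier(), RandomForestClassifier()):
    # Default hyper-parameters, as in the paper; the TF-IDF
    # min_df/max_df settings are omitted on this tiny corpus.
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(texts, labels)
    scores[type(clf).__name__] = model.score(texts, labels)
print(scores)
```

Wrapping the vectoriser and classifier in one pipeline keeps the vocabulary fitted on the training split only, which matters once a held-out test set is used.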
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Experimental Settings</head><p>Machine Learning. The default hyper-parameter settings of the scikit-learn library are employed for the machine learning estimators. TF-IDF is utilised as the text representation method.</p><p>Textual features are combined with the (normalised) meta features described in Section 3.2 and fed to these algorithms.</p><p>Transfer Learning. Regarding TL with LLMs, BERT is utilised as a text encoder, specifically the CLS token embedding. Additional layers are then added on top of the embeddings returned by the BERT encoder. Two BERT-based models are proposed: BERTs comprises two fully-connected layers<ref type="foot" target="#foot_1">2</ref> of size 128 with a 0.5 dropout rate between them; in BERTd, two fully-connected layers of size 246 and 32, separated by a 0.4 dropout layer, are attached to the BERT encoder. In addition, BERTb is utilised as a baseline model with only a classification layer added on top of the BERT encoder. Cross-entropy is employed as the loss function and the rectified linear unit (ReLU) as the activation function in all hidden layers. The classification layer comprises two units, which in combination with the softmax function return the probabilities that a given text sample belongs to class "OBJ" or "SUBJ". The learning rate is set to 5e-5, as advised by the authors of the transformers library. BERT-based models are trained for either 2, 3 or 4 epochs and the best result is reported. The Tensorflow implementation of these networks and functions is employed.</p><p>Evaluation metrics. Three metrics are employed for evaluation purposes: weighted 𝐹1 (Eq. 2), macro-averaged 𝐹1 (Eq. 3) and accuracy (Eq. 4), where the per-class 𝐹1 is given in Eq. 1.</p><formula xml:id="formula_0">𝐹1 = 2 × 𝑇𝑃 / (2 × 𝑇𝑃 + 𝐹𝑃 + 𝐹𝑁)<label>(1)</label></formula><formula xml:id="formula_1">𝑊𝑒𝑖𝑔ℎ𝑡𝑒𝑑 𝐹1 = ∑_{i=1}^{n} 𝑤_i × 𝐹1_i<label>(2)</label></formula><formula xml:id="formula_2">𝑀𝑎𝑐𝑟𝑜 𝐹1 = (1/𝑛) ∑_{i=1}^{n} 𝐹1_i<label>(3)</label></formula><formula xml:id="formula_3">𝐴𝑐𝑐𝑢𝑟𝑎𝑐𝑦 = (𝑇𝑃 + 𝑇𝑁) / (𝑇𝑃 + 𝑇𝑁 + 𝐹𝑃 + 𝐹𝑁)<label>(4)</label></formula><p>where TP: true positives, FP: false positives, TN: true negatives, FN: false negatives; 𝑛 is the number of classes and 𝑤_i is the proportion of samples belonging to class 𝑖.</p></div>
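The three evaluation metrics can be implemented directly from their definitions and checked against scikit-learn; a small sketch with made-up predictions (weights w_i taken as per-class support proportions):

```python
from collections import Counter
from sklearn.metrics import f1_score, accuracy_score

y_true = ["SUBJ", "OBJ", "OBJ", "SUBJ", "OBJ", "SUBJ"]
y_pred = ["SUBJ", "OBJ", "SUBJ", "SUBJ", "OBJ", "OBJ"]

def f1_per_class(y_true, y_pred, label):
    # Per-class F1: 2*TP / (2*TP + FP + FN)
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0

classes = sorted(set(y_true))
per_class = {c: f1_per_class(y_true, y_pred, c) for c in classes}
support = Counter(y_true)

macro_f1 = sum(per_class.values()) / len(classes)                 # unweighted mean
weighted_f1 = sum(support[c] / len(y_true) * per_class[c]
                  for c in classes)                               # support-weighted
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Sanity-check against the library implementations.
assert abs(macro_f1 - f1_score(y_true, y_pred, average="macro")) < 1e-9
assert abs(weighted_f1 - f1_score(y_true, y_pred, average="weighted")) < 1e-9
assert abs(accuracy - accuracy_score(y_true, y_pred)) < 1e-9
print(macro_f1, weighted_f1, accuracy)
```

Macro F1 treats both classes equally regardless of size, which is why it is the headline metric on a class-imbalanced task like this one.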
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Results</head><p>Results on the small train set, reported in Table <ref type="table" target="#tab_0">1</ref>, indicate that BERTb, a simple transfer learning approach, outperforms all other classifiers. Nonetheless, all tested models perform above a baseline using the prior label distribution. Normalisation of the additional features boosts performance for the ML algorithms (NB, LR, DT, RF), while BERT-based models show only minor differences in performance. All models developed with a TL approach outperform all ML algorithms, achieving, on average, 27% higher results (23% for normalised meta features and 30% for models without normalisation or meta features) in terms of macro 𝐹1.</p><p>The proposed BERTd model (officially submitted to the task), consisting of two hidden layers and fed with data without normalisation of meta features, in turn outperforms the other solutions on the bigger set (see Table <ref type="table">2</ref>). The difference in performance between the ML and TL approaches decreases to 20%, due to the lower macro 𝐹1 of two TL models: BERTb and BERTs. Compared to BERTb and BERTs, BERTd shows smaller differences in performance between the small and big training sets; its performance is thus more stable across data than that of BERTb and BERTs. Nonetheless, all models outperform a baseline classifier using the class prior distribution. All TL models and the LR classifier also achieve a higher macro 𝐹1 than the baseline provided by the CheckThat! organisers.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>Results: big set. Classification performance of machine learning (ML) and transfer learning (TL) models. A single asterisk signals models that perform above the baseline in all three metrics (accuracy, weighted (w-𝐹1) and macro (m-𝐹1) 𝐹1 scores). A double asterisk signals the best model in each category of machine learning (ML) and transfer learning (TL) models. A plus indicates models that achieve higher results than the baseline provided by the CheckThat! organizers. The best performing model is marked in bold.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Category Model</head><p>Meta features w-𝐹1 m-𝐹1 Accuracy</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>Once again, transfer learning outperformed simpler machine learning approaches for information extraction from text. All BERT-based models (with or without syntactic features) achieved substantially higher results for binary detection of subjectivity than the four ML algorithms. BERTd, fed with additional features (syntactic stance-marker features), shows more consistent performance than the baseline BERTb model comprising a BERT encoder and a classification layer. Normalisation of meta features was found to boost performance for ML models. Increasing the training data size had almost no impact on prediction performance for ML models and a negative influence on TL solutions in terms of the macro 𝐹1 metric.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Limitations</head><p>Baselines. The current work would benefit from a more thorough comparison of the results of the proposed models against other systems developed on the employed dataset.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Generalisability of performance.</head><p>The proposed architecture achieves satisfactory performance on the utilised dataset (ranking 8th in the official leaderboard of the competition on English data); however, its robustness is yet to be tested, for example, in scenarios with different training datasets or multilingual data.</p><p>Ablation studies. Ablation analysis in future work could measure the impact of individual features on the final performance of the proposed subjectivity detectors.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Pipeline for extraction of meta features (linguistic markers of subjectivity).</figDesc><graphic coords="3,87.80,517.80,419.69,54.82" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Results: small set. Classification performance of machine learning (ML) and transfer learning (TL) models. A single asterisk signals models that perform above baseline in all three metrics (accuracy, weighted (w-𝐹 1 ) and macro (m-𝐹 1 ) 𝐹 1 scores). Double asterisk signals best model in each category of machine learning (ML) and transfer learning (TL) models. In bold the best performing model is marked.</figDesc><table><row><cell cols="2">Category Model</cell><cell>Meta features</cell><cell cols="3">w-𝐹 1 m-𝐹 1 Accuracy</cell></row><row><cell>Baseline</cell><cell>Stratified</cell><cell>-</cell><cell>0.54</cell><cell>0.51</cell><cell>0.54</cell></row><row><cell></cell><cell>NB *</cell><cell>1-16</cell><cell>0.56</cell><cell>0.51</cell><cell>0.63</cell></row><row><cell></cell><cell>LR *</cell><cell>1-16</cell><cell>0.57</cell><cell>0.53</cell><cell>0.61</cell></row><row><cell></cell><cell>DT *</cell><cell>1-16</cell><cell>0.57</cell><cell>0.55</cell><cell>0.57</cell></row><row><cell>ML</cell><cell>RF * NB *</cell><cell cols="2">1-16 normalised 1-16 0.68 0.53</cell><cell>0.48 0.57</cell><cell>0.59 0.68</cell></row><row><cell></cell><cell>LR **</cell><cell cols="2">normalised 1-16 0.70</cell><cell>0.60</cell><cell>0.71</cell></row><row><cell></cell><cell>DT *</cell><cell cols="2">normalised 1-16 0.62</cell><cell>0.52</cell><cell>0.61</cell></row><row><cell></cell><cell>RF *</cell><cell cols="2">normalised 1-16 0.70</cell><cell>0.57</cell><cell>0.72</cell></row><row><cell></cell><cell>BERTb **</cell><cell>-</cell><cell>0.78</cell><cell>0.76</cell><cell>0.78</cell></row><row><cell></cell><cell>BERTs *</cell><cell>1-16</cell><cell>0.76</cell><cell>0.75</cell><cell>0.76</cell></row><row><cell>TL</cell><cell>BERTd *</cell><cell>1-16</cell><cell>0.75</cell><cell>0.73</cell><cell>0.75</cell></row><row><cell></cell><cell>BERTs *</cell><cell cols="2">normalised 
1-16 0.76</cell><cell>0.75</cell><cell>0.76</cell></row><row><cell></cell><cell>BERTd *</cell><cell cols="2">normalised 1-16 0.74</cell><cell>0.73</cell><cell>0.74</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://huggingface.co/google-bert/bert-base-uncased</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://keras.io/api/layers/core_layers/dense/</note>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">A nation divided: Classifying presidential speeches</title>
		<author>
			<persName><forename type="first">A</forename><surname>Acharya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Crawford</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Maduabum</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<title level="m" type="main">Mining text data</title>
		<editor>Aggarwal, C. C., &amp; Zhai, C.</editor>
		<imprint>
			<date type="published" when="2012">2012</date>
			<publisher>Springer Science &amp; Business Media</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">The effect of preprocessing techniques, applied to numeric features, on classification algorithms&apos; performance</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">A</forename><surname>Alshdaifat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Alshdaifat</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Alsarhan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Hussein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M D F S</forename><surname>El-Salhi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data</title>
		<imprint>
			<biblScope unit="volume">6</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page">11</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A Corpus for Sentence-Level Subjectivity Detection on English News Articles</title>
		<author>
			<persName><forename type="first">F</forename><surname>Antici</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Ruggeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Galassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Korre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Muti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Bardi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING</title>
				<meeting>the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING</meeting>
		<imprint>
			<date type="published" when="2024-05">2024. May. 2024</date>
			<biblScope unit="page" from="273" to="285" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">The CLEF-2024 CheckThat! Lab: Check-Worthiness, Subjectivity, Persuasion, Roles, Authorities, and Adversarial Robustness</title>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-56069-9_62</idno>
		<ptr target="https://doi.org/10.1007/978-3-031-56069-9_62" />
	</analytic>
	<monogr>
		<title level="m">Advances in Information Retrieval. ECIR 2024</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">N</forename><surname>Goharian</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
			<biblScope unit="volume">14612</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<title level="m" type="main">Document Classification using Machine Learning</title>
		<author>
			<persName><forename type="first">A</forename><surname>Basarkar</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Natural language processing with Python: analyzing text with the natural language toolkit</title>
		<author>
			<persName><forename type="first">S</forename><surname>Bird</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Klein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Loper</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2009">2009</date>
			<publisher>O&apos;Reilly Media, Inc</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Multi-criteria decision making and supervised learning for fake news detection in microblogging</title>
		<author>
			<persName><forename type="first">M</forename><surname>De Grandis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Pasi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Viviani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Workshop on Reducing Online Misinformation Exposure</title>
				<imprint>
			<date type="published" when="2019-07">2019. July</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<title level="m" type="main">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Text as data: The promise and pitfalls of automatic content analysis methods for political texts</title>
		<author>
			<persName><forename type="first">J</forename><surname>Grimmer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">M</forename><surname>Stewart</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Political analysis</title>
		<imprint>
			<biblScope unit="volume">21</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="267" to="297" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Fake news classification based on subjective language</title>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L M</forename><surname>Jeronimo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">B</forename><surname>Marinho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">E</forename><surname>Campelo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Veloso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">S</forename><surname>Da Costa Melo</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21st International Conference on Information Integration and Web-based Applications &amp; Services</title>
				<meeting>the 21st International Conference on Information Integration and Web-based Applications &amp; Services</meeting>
		<imprint>
			<date type="published" when="2019-12">December 2019</date>
			<biblScope unit="page" from="15" to="24" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">A language-model-based approach for subjectivity detection</title>
		<author>
			<persName><forename type="first">S</forename><surname>Karimi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Shakery</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Information Science</title>
		<imprint>
			<biblScope unit="volume">43</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="356" to="377" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Crowdsourcing a word-emotion association lexicon</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Mohammad</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><forename type="middle">D</forename><surname>Turney</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Computational Intelligence</title>
		<imprint>
			<biblScope unit="volume">29</biblScope>
			<biblScope unit="issue">3</biblScope>
			<biblScope unit="page" from="436" to="465" />
			<date type="published" when="2013">2013</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Learning Feature Engineering for Classification</title>
		<author>
			<persName><forename type="first">F</forename><surname>Nargesian</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Samulowitz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">U</forename><surname>Khurana</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">B</forename><surname>Khalil</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">S</forename><surname>Turaga</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IJCAI</title>
		<imprint>
			<biblScope unit="volume">17</biblScope>
			<biblScope unit="page" from="2529" to="2535" />
			<date type="published" when="2017-08">August 2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Scikit-learn: Machine learning in Python</title>
		<author>
			<persName><forename type="first">F</forename><surname>Pedregosa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Varoquaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gramfort</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Michel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Thirion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Grisel</surname></persName>
		</author>
		<ptr target="https://scikit-learn.org/stable/" />
	</analytic>
	<monogr>
		<title level="j">The Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2825" to="2830" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">Acquiring perspective in English: the development of stance</title>
		<author>
			<persName><forename type="first">J</forename><surname>Reilly</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Zamora</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">F</forename><surname>McGivern</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Pragmatics</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="issue">2</biblScope>
			<biblScope unit="page" from="185" to="208" />
			<date type="published" when="2005">2005</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">On the Definition of Prescriptive Annotation Guidelines for Language-Agnostic Subjectivity Detection</title>
		<author>
			<persName><forename type="first">F</forename><surname>Ruggeri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Antici</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Galassi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Korre</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Muti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Barrón-Cedeño</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Text2Story at ECIR</title>
		<imprint>
			<biblScope unit="volume">3370</biblScope>
			<biblScope unit="page" from="103" to="111" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<analytic>
		<title level="a" type="main">An attention-based CNN-LSTM model for subjectivity detection in opinion-mining</title>
		<author>
			<persName><forename type="first">S</forename><surname>Sagnika</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">S P</forename><surname>Mishra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">K</forename><surname>Meher</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Neural Computing and Applications</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">24</biblScope>
			<biblScope unit="page" from="17425" to="17438" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Troll and divide: the language of online polarization</title>
		<author>
			<persName><forename type="first">A</forename><surname>Simchon</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">J</forename><surname>Brady</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Van Bavel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">PNAS Nexus</title>
		<imprint>
			<biblScope unit="volume">1</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">19</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Analysis of the subjectivity level in fake news fragments</title>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">L</forename><surname>Vieira</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">L M</forename><surname>Jeronimo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">E</forename><surname>Campelo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">B</forename><surname>Marinho</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Brazilian Symposium on Multimedia and the Web</title>
				<meeting>the Brazilian Symposium on Multimedia and the Web</meeting>
		<imprint>
			<date type="published" when="2020-11">November 2020</date>
			<biblScope unit="page" from="233" to="240" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Sentiment analysis and opinion mining: a survey</title>
		<author>
			<persName><forename type="first">G</forename><surname>Vinodhini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">M</forename><surname>Chandrasekaran</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="282" to="292" />
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
