<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Oppositional Thinking Analysis: Conspiracy Theories vs Critical Thinking Narratives Notebook for PAN at CLEF 2024</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Prabavathy</forename><surname>Balasundaram</surname></persName>
							<email>prabavathyb@ssn.edu.in</email>
							<affiliation key="aff0">
								<orgName type="department">Department of CSE</orgName>
								<orgName type="institution">SSN College of Engineering</orgName>
								<address>
									<addrLine>Rajiv Gandhi Salai</addrLine>
									<settlement>Chennai</settlement>
									<region>Tamil Nadu</region>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Karthikeyan</forename><surname>Swaminathan</surname></persName>
							<email>karthikeyan2210394@ssn.edu.in</email>
							<affiliation key="aff0">
								<orgName type="department">Department of CSE</orgName>
								<orgName type="institution">SSN College of Engineering</orgName>
								<address>
									<addrLine>Rajiv Gandhi Salai</addrLine>
									<settlement>Chennai</settlement>
									<region>Tamil Nadu</region>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Oviasree</forename><surname>Sampath</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of CSE</orgName>
								<orgName type="institution">SSN College of Engineering</orgName>
								<address>
									<addrLine>Rajiv Gandhi Salai</addrLine>
									<settlement>Chennai</settlement>
									<region>Tamil Nadu</region>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Pradeep</forename><surname>Km</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Department of CSE</orgName>
								<orgName type="institution">SSN College of Engineering</orgName>
								<address>
									<addrLine>Rajiv Gandhi Salai</addrLine>
									<settlement>Chennai</settlement>
									<region>Tamil Nadu</region>
									<country key="IN">India</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Oppositional Thinking Analysis: Conspiracy Theories vs Critical Thinking Narratives Notebook for PAN at CLEF 2024</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">A6B5FC6107BC89766488F4B2686BD1BA</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:59+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>BERT</term>
					<term>Multi-label classification</term>
					<term>Conspiracy Theories (CTs)</term>
					<term>Tokenizer</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Conspiracy theories <ref type="bibr" target="#b0">[1]</ref> are complex narratives that attempt to explain the ultimate causes of significant events as covert plots orchestrated by secret, powerful, and malicious groups, whereas critical thinking is the process of objectively analyzing and evaluating information to form a reasoned judgment and presenting it for public view. Identifying conspiracy theories with Natural Language Processing (NLP) models is challenging because they are hard to distinguish from critical thinking. Mislabeling critical messages as conspiratorial can push curious individuals towards conspiracy communities, so accuracy in such classifications is highly important. The task involves distinguishing between two types of oppositional narratives:</p><p>(1) conspiracy narratives, which suggest secret plots by powerful, malicious groups, and (2) critical thinking narratives, which question major decisions without implying a conspiracy. For subtask 1, a pre-trained BERT classifier with a sigmoid activation function is employed to differentiate between the two classes. For subtask 2, a pre-trained BERT-based sequence classifier is fine-tuned for multi-label classification, enabling span-level classification of oppositional narratives. This working note presents the results of the Kaprov team in the Oppositional thinking analysis: Conspiracy theories vs critical thinking narratives task [2] of PAN at CLEF 2024 [3], which comprises the two subtasks.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In the realm of Natural Language Processing, the computational detection and analysis of conspiracy theories (CTs) within textual data has gained significant momentum <ref type="bibr" target="#b3">[4]</ref>. CTs are elaborate narratives attributing significant events to covert actions by powerful clandestine groups, contrasting with critical thinking, which challenges mainstream beliefs without endorsing conspiracies. Differentiating between these is crucial, as mislabeling opposing views as conspiratorial may sway individuals towards extreme viewpoints. Current research predominantly focuses on binary classification tasks aimed at accurately distinguishing between conspiratorial and critical texts. Existing methodologies typically leverage advanced natural language processing techniques and machine learning models. Some common approaches include:</p><p>• Feature-based Classification: Using algorithms like SVMs (Support Vector Machines) <ref type="bibr" target="#b4">[5]</ref> or logistic regression, which analyze word frequencies, n-grams, and syntax to classify texts.</p><p>• Graph-based Methods: Representing texts as graphs, where nodes represent entities (e.g., words or phrases) and edges represent relationships (e.g., co-occurrence). Graph-based methods can capture structural patterns and semantic relationships indicative of conspiratorial or critical narratives.</p><p>• Sentiment Analysis: Analyzing the sentiment expressed in texts can provide insights into whether the text is promoting conspiratorial beliefs (e.g., distrust, fear) or engaging in critical discourse (e.g., skepticism, questioning).</p><p>Subtask 2 focuses on token-level classification within oppositional narratives, distinguishing between conspiracy theories and critical thinking. It aims to identify specific text segments (goals, effects, agents, facilitators, objectives, and negative effects) using advanced NLP techniques. This approach enhances understanding of nuanced narrative elements for effective content moderation and societal discourse analysis. The approach includes:</p><p>• Topic Modeling: Techniques like Latent Dirichlet Allocation (LDA) <ref type="bibr" target="#b5">[6]</ref> or Non-Negative Matrix Factorization (NMF) <ref type="bibr" target="#b6">[7]</ref>, which uncover the latent themes that such narratives are built around.</p><p>The two tasks discussed in this paper and their successful implementation collectively advance the field by enabling automated detection and analysis of conspiratorial narratives, facilitating nuanced understanding and effective management of such discourse in various domains, and thereby creating stable and peaceful platforms for discussion on public health issues.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Task and Dataset Description</head><p>Two tasks are addressed: the first involves distinguishing between critical and conspiracy texts, while the second focuses on detecting elements within oppositional narratives. The dataset contains Telegram messages in English and Spanish; both tasks are performed exclusively on the English data. Subtask 1 involves classifying texts into two categories: (1) messages that critically question public health decisions without promoting conspiracy theories, and (2) messages that attribute pandemic or health decisions to secret, influential conspiracies. Each text in the dataset is labeled as either CONSPIRACY or CRITICAL. Model performance is evaluated with the Matthews Correlation Coefficient (MCC) <ref type="bibr" target="#b7">[8]</ref>, with a baseline established by a BERT classifier <ref type="bibr" target="#b8">[9]</ref>.</p><p>Subtask 2 involves a token-level classification challenge where the goal is to identify specific text segments that represent essential elements in oppositional narratives. Each entry in the input dataset contains span texts along with their starting and ending positions and the type of narrative element each span belongs to, out of: AGENT, FACILITATOR, VICTIM, CAMPAIGNER, OBJECTIVE, and NEGATIVE EFFECT. Model performance is evaluated using the macro-averaged span-F1 score, which assesses overall accuracy across all span categories.</p></div>
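The MCC used to evaluate subtask 1 can be sketched in a few lines of Python from binary confusion-matrix counts; the counts below are invented for illustration, not results from the task.

```python
import math

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; 0 is returned when any marginal is empty,
    matching the common convention for a degenerate denominator.
    """
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return num / den if den else 0.0

# Illustrative counts only (not the task's actual confusion matrix).
score = mcc(tp=40, tn=35, fp=10, fn=15)
```

Unlike plain accuracy, MCC stays informative under class imbalance, which is why it is a common choice for binary tasks like this one.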
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Data Pre-Processing</head><p>This section outlines the process of preparing data for the two tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Subtask 1: Distinguishing between critical and conspiracy texts</head><p>In the data pre-processing stage for subtask 1, the dataset is initially split into two subsets: one for critical messages and another for conspiracy messages. Each subset is filtered based on the "category" column values. Exploratory Data Analysis (EDA) <ref type="bibr" target="#b9">[10]</ref> begins with a count plot to visualize the distribution of categories ("CRITICAL" and "CONSPIRACY") [Fig. <ref type="figure" target="#fig_0">1</ref>]. This provides an initial understanding of the dataset's class distribution. Following EDA, data cleansing involves checking for missing values. Addressing any missing data ensures the dataset is ready for subsequent steps such as tokenization, feature extraction, and model training for binary classification.</p></div>
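The splitting, class-distribution, and missing-value checks above can be sketched as follows; the paper names the "category" column, while the sample rows and the "text" field are illustrative assumptions.

```python
from collections import Counter

# Illustrative rows; the real dataset holds Telegram messages with a
# "category" column labelled CRITICAL or CONSPIRACY.
rows = [
    {"text": "Why was this policy rushed?", "category": "CRITICAL"},
    {"text": "A hidden group planned it all.", "category": "CONSPIRACY"},
    {"text": "The data behind the mandate is weak.", "category": "CRITICAL"},
]

# Split into the two subsets by filtering on the "category" column.
critical = [r for r in rows if r["category"] == "CRITICAL"]
conspiracy = [r for r in rows if r["category"] == "CONSPIRACY"]

# Class distribution that the count plot in Figure 1 visualizes.
distribution = Counter(r["category"] for r in rows)

# Basic missing-value check before tokenization and training.
missing = [r for r in rows if not r.get("text") or not r.get("category")]
```

A plotting library (e.g. seaborn's countplot) would render `distribution` as the bar chart shown in Figure 1.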
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Subtask 2: Detecting elements of the oppositional narratives</head><p>Pre-processing for subtask 2 starts with extracting annotations from each JSON (JavaScript Object Notation) entry, gathering crucial details about the relevant text spans and their corresponding categories. This step prepares the dataset for subsequent pre-processing, ensuring its alignment with the machine learning pipeline. After annotation extraction, the Hugging Face AutoTokenizer <ref type="bibr" target="#b10">[11]</ref> tailored for BERT models is employed to tokenize the dataset. Tokenization converts raw text sequences into numerical token IDs suitable for ingestion by the BERT-based model. To meet BERT's input specifications, a truncation strategy is applied to handle sequences exceeding the model's maximum input length. This approach maintains consistency in sequence lengths across the dataset, optimizing computational efficiency during training and evaluation.</p></div>
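The annotation-extraction and truncation steps can be sketched as below; the JSON field names ("annotations", "span_text", "category", "start_char", "end_char") are assumptions about the task files' schema, and whitespace splitting stands in for the actual WordPiece tokenizer.

```python
import json

# Illustrative subtask 2 entry; field names are assumed for demonstration.
entry = json.loads("""
{"text": "They say the agency hid the results to protect its backers.",
 "annotations": [
   {"span_text": "the agency", "category": "AGENT",
    "start_char": 9, "end_char": 19},
   {"span_text": "its backers", "category": "FACILITATOR",
    "start_char": 47, "end_char": 58}
 ]}
""")

# Gather the span/category pairs used to build training examples,
# sanity-checking that each offset pair really covers its span text.
spans = []
for a in entry["annotations"]:
    assert entry["text"][a["start_char"]:a["end_char"]] == a["span_text"]
    spans.append((a["span_text"], a["category"]))

# Truncation strategy: cap sequences at BERT's maximum input length
# (512 tokens); a simple whitespace split stands in for WordPiece here.
MAX_LEN = 512
tokens = entry["text"].split()[:MAX_LEN]
```

With the real pipeline, `AutoTokenizer.from_pretrained("bert-base-uncased")` with `truncation=True` performs the same capping on subword token IDs.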
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Methodologies Used</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Tiny BERT Text Classifier</head><p>The Tiny BERT Text Classifier model <ref type="bibr" target="#b11">[12]</ref> is a variant of BERT optimized for English text classification tasks, specifically focusing on the SST (Stanford Sentiment Treebank)-2 dataset <ref type="bibr" target="#b12">[13]</ref> for sentiment analysis. Built on transformer architecture, this model enables bidirectional understanding of language nuances, enhancing accuracy in classifying sentences as either critical or conspiracy in nature. This capability is crucial for distinguishing between texts that question public health decisions (critical) and those that attribute them to malevolent conspiracies (conspiracy). By leveraging bidirectional context, these models can capture subtle linguistic cues that differentiate between these two types of narratives effectively.</p></div>
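The binary decision produced by such a sigmoid-output classifier can be illustrated minimally; which class the positive probability maps to is an assumption for demonstration, not stated by the paper.

```python
import math

def sigmoid(z: float) -> float:
    """Squash a raw logit into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def classify(logit: float, threshold: float = 0.5) -> str:
    """Map the classifier's single sigmoid output to a task label.

    Orienting the positive class towards CONSPIRACY is illustrative;
    the actual label order depends on how the training data was encoded.
    """
    return "CONSPIRACY" if sigmoid(logit) >= threshold else "CRITICAL"
```

The model's head emits one logit per text; thresholding its sigmoid at 0.5 yields the binary label.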
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Enhanced Multi-label BERT Classifier</head><p>Methodologies of subtask 2 typically involve initial dataset preparation by sourcing annotated text spans and categorizing them for training, validation, and test sets to ensure unbiased model evaluation. Utilizing tools like AutoTokenizer from Hugging Face's Transformers library <ref type="bibr" target="#b10">[11]</ref>, raw text sequences are tokenized into numerical token IDs, with strategies like truncation and padding managing sequence lengths. Model selection focuses on transformer-based architectures pretrained on extensive text corpora, fine-tuned for span-level classification using transfer learning techniques. Training optimizes model parameters with Adam optimizer and Binary Cross-Entropy loss <ref type="bibr" target="#b13">[14]</ref>, while evaluation metrics such as span-level F1-score, precision, recall, and micro-averaged F1-score assess model performance.</p></div>
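The Binary Cross-Entropy objective for the multi-label setting can be sketched in plain Python: each of the six categories gets an independent sigmoid output, and the losses are averaged. The probabilities below are invented for illustration.

```python
import math

def bce_multilabel(y_true: list[float], y_prob: list[float],
                   eps: float = 1e-7) -> float:
    """Binary cross-entropy averaged over independent labels, as used for
    multi-label classification (one sigmoid output per category)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Six task categories: AGENT, FACILITATOR, VICTIM, CAMPAIGNER,
# OBJECTIVE, NEGATIVE EFFECT. Targets and probabilities are illustrative.
y_true = [1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.1, 0.7, 0.3, 0.05]
loss = bce_multilabel(y_true, y_prob)
```

Confident, correct predictions drive the loss towards zero, while confident mistakes are penalized heavily, which is what the Adam optimizer minimizes during fine-tuning.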
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Implementation</head><p>To implement subtask 1, the dataset is structured into a format where each text sample is categorized either as "CRITICAL" or "CONSPIRACY". The BertClassifier model from keras-nlp.models is then employed with specific configurations for binary classification. Pre-trained weights are loaded, and a sigmoid activation function is utilized to facilitate binary output. The model is trained on the training data to distinguish between critical viewpoints and conspiracy theories regarding public health decisions. Evaluation is performed on the test set to assess the model's capability in accurately classifying these texts. This approach leverages the capabilities of BERT for semantic understanding, thereby supporting the task's objective of discerning between critical analyses and conspiratorial narratives in the domain of public health.</p><p>The BERT-Based Multi-Label Text Classifier was implemented in Python using the bert-base-uncased model architecture from Hugging Face's Transformers library. The dataset, sourced from JSON files, contained annotated text spans (span text) categorized into specific classes (category). After partitioning the dataset into training (70%), validation (10%), and test (20%) sets, annotations were extracted to prepare the data for tokenization. The Hugging Face AutoTokenizer <ref type="bibr" target="#b10">[11]</ref> was employed to tokenize the text sequences into numerical token IDs, with a truncation strategy applied to handle sequences longer than BERT's maximum input length. The model was fine-tuned for multi-label classification, optimizing with the Adam optimizer and Binary Cross-Entropy loss function <ref type="bibr" target="#b13">[14]</ref> over multiple epochs. Evaluation on the validation set involved monitoring metrics such as accuracy, precision, recall, and F1-score to ensure model performance. 
Finally, the trained model and tokenizer were saved for deployment, emphasizing reproducibility and scalability in future applications.</p></div>
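The 70/10/20 partitioning step described above can be sketched as follows; the fixed seed is an assumption added for reproducibility.

```python
import random

def split_dataset(items: list, train: float = 0.7, val: float = 0.1,
                  seed: int = 42):
    """Shuffle and partition items into train/validation/test splits
    (70/10/20 as used in the paper); the remainder goes to the test set."""
    rng = random.Random(seed)  # fixed seed so splits are reproducible
    shuffled = items[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

train_set, val_set, test_set = split_dataset(list(range(100)))
```

Keeping the test partition untouched until final evaluation is what makes the reported metrics an unbiased estimate of generalization.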
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Results and Analysis</head><p>Based on the results for subtask 1 and subtask 2 in English, the performance of the submitted models was evaluated. For subtask 1 [Table <ref type="table" target="#tab_1">1</ref>], focusing on conspiracy versus critical categorization, the model achieved an MCC of 0.3700 and an F1-macro of 0.6240, with per-class F1 scores of 0.4224 for conspiracy texts and 0.8255 for critical texts, indicating stronger performance in identifying critical texts than conspiracy-related ones. In subtask 2 [Table <ref type="table" target="#tab_2">2</ref>], which evaluated the span-level F1-score and micro-averaged F1, the model attained scores of 0.0150 and 0.0600, respectively, suggesting challenges in precise span-level predictions. The implementation utilized Python with the bert-base-uncased model from Hugging Face's Transformers library, leveraging AutoTokenizer for tokenization and fine-tuning with the Adam optimizer and Binary Cross-Entropy loss. The results underscore the model's effectiveness in critical text classification but highlight areas for improvement in span-level prediction accuracy.</p></div>
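The macro-averaged span-F1 reported in Table 2 can be sketched as below under an exact-match simplification (the official PAN scorer may also credit partial overlaps); the gold and predicted spans are invented for illustration.

```python
def span_f1(gold: set, pred: set) -> float:
    """Exact-match F1 over (start, end, category) span tuples."""
    if not gold and not pred:
        return 1.0
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

def macro_span_f1(gold: set, pred: set, categories: list) -> float:
    """Macro average of per-category span F1, the subtask 2 metric."""
    scores = []
    for c in categories:
        g = {s for s in gold if s[2] == c}
        p = {s for s in pred if s[2] == c}
        scores.append(span_f1(g, p))
    return sum(scores) / len(scores)

# Illustrative spans only: one exact hit, one complete miss.
gold = {(0, 10, "AGENT"), (20, 30, "OBJECTIVE")}
pred = {(0, 10, "AGENT"), (40, 50, "OBJECTIVE")}
score = macro_span_f1(gold, pred, ["AGENT", "OBJECTIVE"])
```

Because every category contributes equally to the macro average, a model that misses rare categories entirely is penalized hard, which helps explain the low span-F1 observed here.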
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="7.">Conclusion</head><p>The task Oppositional Thinking Analysis: Conspiracy vs Critical tackles the challenge of distinguishing conspiratorial from critical narratives in oppositional texts, especially regarding COVID-19. Conspiracy theories, often depicting events as manipulated by secretive, powerful groups, are complex and hard to separate from genuine critical thinking. The competition aims to enhance understanding and automatic detection of these narratives, crucial for content moderation on social media. Differentiating conspiratorial messages from critical ones is essential, as mislabeling can push individuals toward conspiracy communities. This task involved developing sophisticated NLP models to discern these nuances for accurate classification and better content management. The approach included preprocessing steps like text cleaning and feature extraction using TF-IDF (Term Frequency-Inverse Document Frequency) <ref type="bibr" target="#b14">[15]</ref> and word embeddings. Both traditional machine learning algorithms, such as logistic regression and support vector machines, and advanced deep learning models, like LSTM (Long Short-Term Memory) and BERT, were used. Evaluations with metrics such as accuracy, precision, recall, and F1-score showed deep learning models, especially BERT, outperformed traditional ones. Cross-validation ensured robustness and mitigated overfitting. The methodologies from this competition promise to improve automatic detection of conspiratorial versus critical narratives, aiding effective content moderation on digital platforms.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Visualization of category distribution in the dataset.</figDesc><graphic coords="3,172.36,145.29,250.56,196.56" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 1</head><label>1</label><figDesc>Subtask 1: Distinguishing between critical and conspiracy texts.</figDesc><table><row><cell>Metric</cell><cell>Value</cell></row><row><cell>MCC</cell><cell>0.3700</cell></row><row><cell>F1-MACRO</cell><cell>0.6240</cell></row><row><cell cols="2">F1-CONSPIRACY 0.4224</cell></row><row><cell>F1-CRITICAL</cell><cell>0.8255</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc>Subtask 2: Detecting elements of the oppositional narratives.</figDesc><table><row><cell>Metric</cell><cell>Value</cell></row><row><cell>span-F1</cell><cell>0.0150</cell></row><row><cell>span-P</cell><cell>0.0261</cell></row><row><cell>span-R</cell><cell>0.0165</cell></row><row><cell cols="2">micro-span-F1 0.0600</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">What are conspiracy theories? a definitional approach to their correlates, consequences, and communication</title>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">M</forename><surname>Douglas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">M</forename><surname>Sutton</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Annual review of psychology</title>
		<imprint>
			<biblScope unit="volume">74</biblScope>
			<biblScope unit="page" from="271" to="298" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of the oppositional thinking analysis pan task at clef</title>
		<author>
			<persName><forename type="first">D</forename><surname>Korenčić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Bonet Casals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Taulé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
	</analytic>
	<monogr>
<title level="m">Working Notes of CLEF 2024 - Conference and Labs of the Evaluation Forum</title>
				<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<editor>
<persName><forename type="first">P</forename><surname>Galuščáková</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">G S</forename><surname>Herrera</surname></persName>
		</editor>
		<imprint>
			<publisher>Springer</publisher>
<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Overview of pan 2024: Multi-author writing style analysis, multilingual text detoxification, oppositional thinking analysis, and generative ai authorship verification -condensed lab overview</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">A</forename><surname>Ayele</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Babakov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Bevendorff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><forename type="middle">B</forename><surname>Casals</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Chulvi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Dementieva</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Elnagar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Freitag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Fröbe</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Korenčić</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mayerl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Moskovskiy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mukherjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Panchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Potthast</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Rangel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Rizwan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Rosso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Schneider</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Smirnova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Stamatatos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Stein</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Taulé</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Ustalov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Wiegmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">M</forename><surname>Yimam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Zangerle</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association CLEF-2024</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Ghasemizade</surname></persName>
		</author>
		<title level="m">A computational journey through conspiracy theories: A genealogical approach</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<title level="m" type="main">Support vector machine, Machine learning models and algorithms for big data classification: thinking with examples for effective learning</title>
		<author>
			<persName><forename type="first">S</forename><surname>Suthaharan</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="207" to="235" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Latent Dirichlet allocation</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Blei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Y</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">I</forename><surname>Jordan</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">3</biblScope>
			<biblScope unit="page" from="993" to="1022" />
			<date type="published" when="2003">2003</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<title level="m" type="main">Non-negative matrix factorization (nmf), Machine Learning for Adaptive Many-Core Machines-A Practical Approach</title>
		<author>
			<persName><forename type="first">N</forename><surname>Lopes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Ribeiro</surname></persName>
		</author>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="127" to="154" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The matthews correlation coefficient (mcc) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation</title>
		<author>
			<persName><forename type="first">D</forename><surname>Chicco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Tötsch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Jurman</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">BioData mining</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="page" from="1" to="22" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Exploratory data analysis (eda)</title>
		<author>
			<persName><forename type="first">E</forename><surname>Camizuli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">J</forename><surname>Carranza</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The encyclopedia of archaeological sciences</title>
				<imprint>
			<date type="published" when="2018">2018</date>
			<biblScope unit="page" from="1" to="7" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Debut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chaumond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Delangue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cistac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Rault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Louf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Funtowicz</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1910.03771</idno>
		<title level="m">HuggingFace&apos;s Transformers: State-of-the-art natural language processing</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b11">
	<monogr>
		<author>
			<persName><forename type="first">X</forename><surname>Jiao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Shang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Liu</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1909.10351</idno>
		<title level="m">TinyBERT: Distilling BERT for natural language understanding</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Recursive deep models for semantic compositionality over a sentiment treebank</title>
		<author>
			<persName><forename type="first">R</forename><surname>Socher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Perelygin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chuang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">D</forename><surname>Manning</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">Y</forename><surname>Ng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Potts</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2013 conference on empirical methods in natural language processing</title>
		<meeting>the 2013 conference on empirical methods in natural language processing</meeting>
		<imprint>
			<date type="published" when="2013">2013</date>
			<biblScope unit="page" from="1631" to="1642" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Addressing imbalance in multi-label classification using weighted cross entropy loss function</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">R</forename><surname>Rezaei-Dastjerdehei</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mijani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Fatemizadeh</surname></persName>
		</author>
		<idno type="DOI">10.1109/ICBME51989.2020.9319440</idno>
	</analytic>
	<monogr>
		<title level="m">27th National and 5th International Iranian Conference on Biomedical Engineering (ICBME)</title>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="333" to="338" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">A simple probabilistic explanation of term frequency-inverse document frequency (tf-idf) heuristic (and variations motivated by this explanation)</title>
		<author>
			<persName><forename type="first">L</forename><surname>Havrlant</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Kreinovich</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of General Systems</title>
		<imprint>
			<biblScope unit="volume">46</biblScope>
			<biblScope unit="page" from="27" to="36" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
