<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">CLEF 2024 JOKER Task 2: Using BERT and Random Forest Classifier for Humor Classification According to Genre and Technique</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">M</forename><surname>Saipranav</surname></persName>
							<email>saipranav2310324@ssn.edu.in</email>
							<affiliation key="aff0">
								<orgName type="institution">Sri Sivasubramaniya Nadar College Of Engineering</orgName>
								<address>
									<settlement>Chennai</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jaswanth</forename><surname>Sridharan</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Sri Sivasubramaniya Nadar College Of Engineering</orgName>
								<address>
									<settlement>Chennai</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Gautham</forename><surname>Narayan</surname></persName>
							<email>gauthamnarayan2310332@ssn.edu.in</email>
							<affiliation key="aff0">
								<orgName type="institution">Sri Sivasubramaniya Nadar College Of Engineering</orgName>
								<address>
									<settlement>Chennai</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Angel</forename><surname>Deborah</surname></persName>
							<email>angeldeborahs@ssn.edu.in</email>
							<affiliation key="aff0">
								<orgName type="institution">Sri Sivasubramaniya Nadar College Of Engineering</orgName>
								<address>
									<settlement>Chennai</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Samyuktaa</forename><surname>Sivakumar</surname></persName>
							<email>samyuktaa2210189@ssn.edu.in</email>
							<affiliation key="aff0">
								<orgName type="institution">Sri Sivasubramaniya Nadar College Of Engineering</orgName>
								<address>
									<settlement>Chennai</settlement>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="department">Evaluation Forum</orgName>
								<address>
									<addrLine>September 09-12</addrLine>
									<postCode>2024</postCode>
									<settlement>Grenoble</settlement>
									<country key="FR">France</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">CLEF 2024 JOKER Task 2: Using BERT and Random Forest Classifier for Humor Classification According to Genre and Technique</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">A5C335067DDA3131E5C08D3152B110E1</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:54+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Humor</term>
					<term>Genre Classification</term>
					<term>BERT</term>
					<term>TF-IDF Vectors</term>
					<term>Sentence Embedding</term>
					<term>Random Forest</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>In this paper, we present our work for the Automatic Humour Analysis (JOKER) Lab at CLEF 2024. The objective of the JOKER Lab is to research the automated processing of humour, covering tasks such as the retrieval, classification, and interpretation of various forms of humorous text. Our task involved the classification of humorous texts into different genres, for which we took two different approaches: fine-tuning BERT (a transformer architecture) and training a traditional machine learning model, a Random Forest classifier. Of the two models, BERT achieved the higher accuracy score of 0.6731, from which we conclude that BERT is better suited for most natural language processing tasks. We describe our experiments on the training data and present the results obtained on the provided test dataset.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Humor plays a crucial role in human communication and social interaction. However, it is multifaceted and elicits different responses from different audiences. Accurate classification of humor not only enhances our understanding of its various forms but also has practical applications in fields such as sentiment analysis, human-computer interaction, and social media content moderation.</p><p>Traditional humor classification techniques can be labor- and time-consuming. Automating this process through NLP and ML techniques can improve the efficiency and accuracy of humor classification, benefiting academic research. With the proliferation of digital media, humor is more pervasive and varied than ever, presenting a challenge even to state-of-the-art models in discerning the differences between various genres of humor.</p><p>The CLEF 2024 JOKER <ref type="bibr" target="#b0">[1]</ref><ref type="bibr" target="#b1">[2]</ref> <ref type="bibr" target="#b2">[3]</ref> Track comprised three tasks: Task 1, humour-aware information retrieval <ref type="bibr" target="#b0">[1]</ref>; Task 2, humour classification according to genre and technique <ref type="bibr" target="#b0">[1]</ref>; and Task 3, translation of puns from English to French <ref type="bibr" target="#b0">[1]</ref>. We participated in Task 2.</p><p>By leveraging advanced natural language processing techniques and fine-tuning well-known pre-trained models, this study aims to develop a system capable of accurately classifying text into the following humor categories:</p><p>• IR -Irony relies on a gap between the literal meaning and the intended meaning, creating a humorous twist or reversal. 
• SC -Sarcasm involves using irony to mock, criticize, or convey contempt.</p><p>• EX -Exaggeration involves magnifying or overstating something beyond its normal or realistic proportions.</p><p>• AID -Incongruity refers to the unexpected or contradictory elements that are combined in a humorous way and Absurdity involves presenting situations, events, or ideas that are inherently illogical, irrational, or nonsensical. • SD -Self-deprecating humor involves making fun of oneself or highlighting one's own flaws, weaknesses, or embarrassing situations in a lighthearted manner. • WS -Wit refers to clever, quick, and intelligent humor and Surprise in humor involves introducing unexpected elements, twists, or punchlines that catch the audience off guard.</p><p>This automated approach significantly benefits various fields by providing deeper insights into the mechanics of humor and enhancing the way machines understand and respond to human emotions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Approach</head><p>We took two approaches to the humor classification task: multiclass classification using BERT-base-uncased, and classification using a Random Forest classifier. The data was preprocessed differently for each method.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Data Preparation</head><p>The provided dataset <ref type="bibr" target="#b2">[3]</ref> consisted of 1742 examples of text to be classified into the 7 genres of humor mentioned above. We partitioned the dataset into an 80% training set and a 20% validation set. The content of the dataset was of the following format (see Table <ref type="table" target="#tab_0">1</ref>); for example: "The winter drive-by shooting was a slay ride."</p></div>
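The 80/20 partition described above can be sketched with scikit-learn's train_test_split; the example rows and the label mapping below are illustrative stand-ins, not the actual dataset.

```python
from sklearn.model_selection import train_test_split

# Illustrative stand-ins for rows of the provided dataset (text, genre code),
# repeated so a stratified 80/20 split is exact
samples = [
    ("Honesty may be the best policy, but insanity is the best defense.", "SC"),
    ("no more instagram. we must all return to scrapbooks.", "IR"),
    ("All my imaginary friends tell me that I need therapy.", "SD"),
    ("The winter drive-by shooting was a slay ride.", "WS"),
    ("Knock knock. Who's there? Tank. Tank who? You're welcome.", "AID"),
] * 5

texts = [t for t, _ in samples]
labels = [c for _, c in samples]

# Map class identifiers to numerical values, as done before training
label2id = {code: i for i, code in enumerate(sorted(set(labels)))}
y = [label2id[c] for c in labels]

# 80% training / 20% validation partition
X_train, X_val, y_train, y_val = train_test_split(
    texts, y, test_size=0.2, random_state=42, stratify=y
)
print(len(X_train), len(X_val))
```

Stratifying on the labels keeps the class proportions of the full dataset in both partitions, which matters when some genres have far fewer examples than others.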
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Basic text preprocessing was applied to the provided dataset. Firstly, the class identifiers for each humorous text were mapped to numerical values. All texts were stripped of punctuation, stop words, and other special characters, and then lemmatized. This preprocessed dataset was used directly for BERT (see figure <ref type="figure" target="#fig_0">1</ref>).</p><p>For the approach involving the Random Forest classifier, the preprocessed text data were further prepared by combining SentenceTransformer, a pre-trained model, and TfidfVectorizer, a scikit-learn tool, to generate sentence embeddings and TF-IDF feature vectors, respectively (see figure <ref type="figure" target="#fig_1">2</ref>).</p><p>SentenceTransformer: this pre-trained model (multi-qa-mpnet-base-dot-v1) from the sentence-transformers library is utilized to generate sentence embeddings. It captures the semantic meaning of text at the sentence level, effectively embedding the contextual nuances and relationships between words within sentences. The target labels (classes) are extracted from the data frame to prepare the target variable for model training and evaluation. This extraction isolates the dependent variable, which the machine learning model will learn to predict based on the input feature set, a combination of the TF-IDF vectors and sentence embeddings.</p></div>
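The feature preparation can be sketched as follows. TfidfVectorizer is the actual scikit-learn tool; to keep the sketch self-contained, the sentence embeddings that multi-qa-mpnet-base-dot-v1 would produce (one 768-dimensional vector per sentence) are replaced here by random placeholder vectors.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

texts = [
    "the winter drive-by shooting was a slay ride",
    "all my imaginary friends tell me that i need therapy",
    "no more instagram we must all return to scrapbooks",
]

# TF-IDF feature vectors (sparse matrix made dense for concatenation)
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(texts).toarray()

# Placeholder for SentenceTransformer("multi-qa-mpnet-base-dot-v1").encode(texts),
# which returns a 768-dimensional embedding per sentence
rng = np.random.default_rng(0)
X_emb = rng.normal(size=(len(texts), 768))

# Combined input feature set: TF-IDF vectors concatenated with sentence embeddings
X = np.hstack([X_tfidf, X_emb])
print(X.shape)
```

Concatenating the two representations lets the classifier see both surface-level term weights and sentence-level semantics in a single feature vector.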
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Methodology</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1.">BERT</head><p>BERT <ref type="bibr" target="#b3">[4]</ref> stands for Bidirectional Encoder Representations from Transformers. It is faster and better at capturing context than Long Short-Term Memory (LSTM) networks and other traditional models. BERT is pretrained on a large corpus of text using two unsupervised learning tasks, namely Masked Language Model (MLM) and Next Sentence Prediction (NSP). In MLM, a percentage of the input tokens are randomly masked, and the model is trained to predict the original tokens from the context of the surrounding words. This bidirectional context allows BERT to learn representations that capture deeper semantic meaning. For NSP, pairs of sentences are sampled from the corpus, and the model is trained to predict whether the second sentence follows the first. This task helps BERT understand relationships between sentences and improves its ability to handle tasks like question answering and natural language inference.</p><p>BERT <ref type="bibr" target="#b4">[5]</ref> consists of a stack of Transformer encoder layers; BERT Base Uncased has 12 such layers. Each layer contains self-attention mechanisms and feedforward neural networks.</p><p>At every layer, BERT calculates attention scores for each token in the input sequence, indicating the importance of the other tokens relative to it. This allows BERT to capture contextual information by attending to all tokens in the input sequence simultaneously, in both directions. After self-attention, the output is passed through a feedforward neural network, typically with a ReLU activation function. This network helps find complex patterns in the data and further improves the representations learned by the self-attention mechanism (see figure <ref type="figure" target="#fig_2">3</ref>).</p><p>Before inputting text into BERT, it undergoes tokenization into subword units using WordPiece tokenization. 
This allows BERT to handle out-of-vocabulary words effectively. Each input sequence is then represented as a combination of three types of embeddings, namely token, segment, and positional embeddings. Token embeddings represent the identity of each token in the input sequence; these are learned during pre-training and capture the semantic meaning of individual words. Segment embeddings indicate whether a token belongs to the first or the second sentence in a sentence pair, helping BERT understand the relationship between sentences, especially in tasks like question answering and natural language inference. Positional embeddings encode the position of each token in the input sequence, allowing BERT to capture sequential information and understand the order of words in a sentence.</p><p>After pre-training, BERT can be fine-tuned on specific tasks using task-specific labeled data <ref type="bibr" target="#b5">[6]</ref>. During fine-tuning, the pre-trained parameters are adjusted to optimize performance on the task, which enables BERT to achieve state-of-the-art results across various natural language processing tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2.">Random Forest</head><p>Random Forest <ref type="bibr" target="#b6">[7]</ref> is an ensemble classifier that contains several decision trees. Instead of using a single decision tree, this ensemble method leverages the decision-making ability of multiple decision trees, and the final output is predicted by majority vote. The prepared input feature set is passed to a Random Forest classifier comprising 1500 decision trees (see figure <ref type="figure" target="#fig_4">5</ref>). The use of out-of-bag samples is also enabled to estimate the generalization accuracy of the model, providing an internal cross-validation measure of model performance.</p><p>Decision trees are the most fundamental component of the Random Forest classifier. Each decision tree works to find the best split to divide the data into multiple subsets and is trained through the Classification and Regression Tree (CART) algorithm. Gini impurity, information gain, and mean square error are commonly used metrics to evaluate the quality of a split. A single decision tree can be prone to bias and over-fitting, hence an ensemble of multiple decision trees is used to improve the accuracy of the predictions. The Random Forest algorithm (see figure <ref type="figure" target="#fig_3">4</ref>) makes use of bagging and feature randomness to create an uncorrelated forest of decision trees. Each tree in the ensemble is built from a data sample drawn from the provided training set with replacement, with one-third of it set aside as the out-of-bag sample. The diversity of the dataset is increased and the correlation among the decision trees is reduced through feature bagging. For a classification task, such as the one performed here, the most frequent predicted class yields the final prediction. Finally, the out-of-bag sample is used for cross-validation.</p></div>
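The classifier configuration described above can be sketched with scikit-learn; the feature matrix below is a random placeholder standing in for the combined TF-IDF/embedding features, and the label values are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Placeholder feature matrix and labels standing in for the real
# TF-IDF + sentence-embedding inputs and the genre classes
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = rng.integers(0, 7, size=200)

clf = RandomForestClassifier(
    n_estimators=1500,  # 1500 decision trees, as in our run
    oob_score=True,     # out-of-bag estimate of generalization accuracy
    random_state=42,
)
clf.fit(X, y)

# Internal cross-validation measure from the out-of-bag samples
print(f"OOB accuracy estimate: {clf.oob_score_:.3f}")
```

With oob_score enabled, each tree is evaluated on the roughly one-third of training samples it never saw during bagging, giving a generalization estimate without a separate hold-out set.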
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Results</head><p>The metrics of precision, recall, accuracy, and F1-score are reported for the two models used to complete the given task. Precision is calculated as the ratio of true positives to the sum of true and false positives. Accuracy is the ratio of the number of correct predictions to the total number of data points. Recall is the ratio of true positives to the sum of true positives and false negatives. The F1-score is calculated from precision and recall: mathematically, it is twice the ratio of the product of precision and recall to their sum.</p><p>Tables <ref type="table" target="#tab_1">2 and 3</ref> summarise the results of our runs, as returned by the JOKER lab, for the aforementioned approaches. These were carried out on the provided test dataset. Using a transformer architecture model such as BERT gave a higher accuracy of 0.6731 compared to a traditional machine learning model such as the Random Forest classifier, which gave an accuracy of 0.5235. </p></div>
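The metric definitions above can be checked with a short scikit-learn sketch; the label vectors here are illustrative, not our actual predictions.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Illustrative gold labels and predictions for a 3-class problem
y_true = [0, 1, 2, 2, 1, 0, 2, 1]
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]

# Accuracy: correct predictions over total data points (here 6 of 8)
acc = accuracy_score(y_true, y_pred)

# Per-class F1 = 2PR/(P+R); "macro" averages classes equally,
# "weighted" weights each class by its support
for avg in ("macro", "weighted"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=avg)
    print(f"{avg}: precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
print(f"accuracy={acc:.4f}")
```

The macro and weighted averaging schemes correspond to the two rows reported per model in Table 3.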
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusions</head><p>As mentioned before, two different approaches were used to solve the given task. The first involved a transformer architecture, BERT; the second, a traditional machine learning model, a Random Forest classifier. The higher accuracy of BERT (0.6731) suggests that transformer architectures like BERT prove more accurate for this classification task than the traditional, feature-dependent machine learning models commonly used for classification. Overall, it can be concluded that BERT's deep contextual and language understanding, together with its ability to leverage transfer learning, makes it better suited to the nuanced task of humor classification according to genre.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Data Preprocessing</figDesc><graphic coords="3,139.69,65.61,315.89,172.96" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Sentence Embedding/TFidf Vectorisation of Preprocessed Data</figDesc><graphic coords="3,128.41,277.76,338.47,89.24" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: BERT Classification Process</figDesc><graphic coords="4,128.41,65.61,338.46,72.01" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Random Forest Classification Process</figDesc><graphic coords="4,128.41,569.10,338.46,72.01" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: A Random Forest comprising three decision trees.</figDesc><graphic coords="5,128.41,65.61,338.46,367.95" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Different Classes of Humor from the given Train Dataset</figDesc><table><row><cell>Id</cell><cell>Text</cell><cell>Class</cell><cell>Number of texts available per class</cell></row><row><cell>1112</cell><cell>Honesty may be the best policy, but insanity is the best defense.</cell><cell>SC</cell><cell>356</cell></row><row><cell>782</cell><cell>no more instagram. we must all return to scrapbooks.</cell><cell>IR</cell><cell>212</cell></row><row><cell>484</cell><cell>The answer is going to a grocery store during a pandemic. That's what I'd do for a Klondike bar</cell><cell>EX</cell><cell>125</cell></row><row><cell>1613</cell><cell>Knock knock. Who's there? Tank. Tank who? You're welcome.</cell><cell>AID</cell><cell>232</cell></row><row><cell>167</cell><cell>All my imaginary friends tell me that I need therapy.</cell><cell>SD</cell><cell>169</cell></row><row><cell>2140</cell><cell>The winter drive-by shooting was a slay ride.</cell><cell>WS</cell><cell>537</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc></figDesc><table><row><cell>Accuracy Metrics</cell><cell></cell><cell></cell><cell></cell></row><row><cell></cell><cell>Model</cell><cell></cell><cell cols="2">Accuracy</cell></row><row><cell></cell><cell>BERT</cell><cell></cell><cell>0.6731</cell></row><row><cell></cell><cell cols="2">Random Forest</cell><cell>0.5235</cell></row><row><cell>Table 3</cell><cell></cell><cell></cell><cell></cell></row><row><cell>Precision, Recall and F1 scores</cell><cell></cell><cell></cell><cell></cell></row><row><cell>Model</cell><cell>Type</cell><cell cols="3">Precision Recall</cell><cell>F1</cell></row><row><cell>BERT</cell><cell>macro</cell><cell></cell><cell>0.6024</cell><cell>0.6027 0.6006</cell></row><row><cell></cell><cell>weighted</cell><cell></cell><cell>0.6662</cell><cell>0.6731 0.6687</cell></row><row><cell>Random Forest</cell><cell>macro</cell><cell></cell><cell>0.5353</cell><cell>0.3736 0.3742</cell></row><row><cell></cell><cell>weighted</cell><cell></cell><cell>0.5278</cell><cell>0.5223 0.4583</cell></row></table></figure>
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Overview of JOKER @ CLEF-2024: Automatic humour analysis</title>
		<author>
			<persName><forename type="first">L</forename><surname>Ermakova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-G</forename><surname>Bosser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><forename type="middle">M</forename><surname>Palma Preciado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sidorov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jatowt</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction: Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF 2024)</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<editor>
			<persName><forename type="first">L</forename><surname>Goeuriot</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Mulhem</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Quénot</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">D</forename><surname>Schwab</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">L</forename><surname>Soulier</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><forename type="middle">M D</forename><surname>Nunzio</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Galuščáková</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><forename type="middle">G S</forename><surname>De Herrera</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">G</forename><surname>Faggioli</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">N</forename><surname>Ferro</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note>To appear</note>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">The joker corpus: English-french parallel data for multilingual wordplay recognition</title>
		<author>
			<persName><forename type="first">L</forename><surname>Ermakova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-G</forename><surname>Bosser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jatowt</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Miller</surname></persName>
		</author>
		<idno type="DOI">10.1145/3539618.3591885</idno>
		<ptr target="https://doi.org/10.1145/3539618.3591885" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;23</title>
				<meeting>the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR &apos;23<address><addrLine>New York, NY, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Association for Computing Machinery</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="2796" to="2806" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Ermakova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A.-G</forename><surname>Bosser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Miller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Thomas-Young</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Preciado</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Sidorov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Jatowt</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-031-56072-9_5</idno>
		<title level="m">CLEF 2024 JOKER Lab: Automatic Humour Analysis</title>
				<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="36" to="43" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<title level="m" type="main">Bert: A review of applications in natural language processing and understanding</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">V</forename><surname>Koroteev</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2103.11943</idno>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">Bert: Pre-training of deep bidirectional transformers for language understanding</title>
				<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Prabhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Mohamed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Misra</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2104.14289</idno>
		<title level="m">Multi-class text classification using bert-based active learning</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A random forest guided tour</title>
		<author>
			<persName><forename type="first">G</forename><surname>Biau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Scornet</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Test</title>
		<imprint>
			<biblScope unit="volume">25</biblScope>
			<biblScope unit="page" from="197" to="227" />
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
