<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">ELiRF-VRAIN at eRisk 2024: Using LongFormers for Early Detection of Signs of Anorexia</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Andreu</forename><surname>Casamayor</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">Valencian Research Institute for Artificial Intelligence (VRAIN)</orgName>
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<addrLine>Camino de Vera s/n</addrLine>
									<postCode>46022</postCode>
									<settlement>Valencia</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Vicent</forename><surname>Ahuir</surname></persName>
							<email>vahuir@dsic.upv.es</email>
							<affiliation key="aff0">
								<orgName type="department">Valencian Research Institute for Artificial Intelligence (VRAIN)</orgName>
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<addrLine>Camino de Vera s/n</addrLine>
									<postCode>46022</postCode>
									<settlement>Valencia</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Antonio</forename><surname>Molina</surname></persName>
							<email>amolina@dsic.upv.es</email>
							<affiliation key="aff0">
								<orgName type="department">Valencian Research Institute for Artificial Intelligence (VRAIN)</orgName>
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<addrLine>Camino de Vera s/n</addrLine>
									<postCode>46022</postCode>
									<settlement>Valencia</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Lluís-Felip</forename><surname>Hurtado</surname></persName>
							<email>lhurtado@dsic.upv.es</email>
							<affiliation key="aff0">
								<orgName type="department">Valencian Research Institute for Artificial Intelligence (VRAIN)</orgName>
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<addrLine>Camino de Vera s/n</addrLine>
									<postCode>46022</postCode>
									<settlement>Valencia</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
							<affiliation key="aff0">
								<orgName type="department">Valencian Research Institute for Artificial Intelligence (VRAIN)</orgName>
								<orgName type="institution">Universitat Politècnica de València</orgName>
								<address>
									<addrLine>Camino de Vera s/n</addrLine>
									<postCode>46022</postCode>
									<settlement>Valencia</settlement>
									<country key="ES">Spain</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">ELiRF-VRAIN at eRisk 2024: Using LongFormers for Early Detection of Signs of Anorexia</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">161AE05BC2A8B6FEA53974782DC7748B</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:52+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Longformers</term>
					<term>Transformers</term>
					<term>Support Vector Machine</term>
					<term>Anorexia</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper describes the approaches taken by the ELiRF-VRAIN team in Task 2 of eRisk at CLEF 2024, focused on the early detection of signs of anorexia on English-language social media. Our work involved three distinct approaches: one using a Support Vector Machine (SVM) and two based on pre-trained Transformer models. Among the Transformer models, one approach employed BERT-like models, while the other used LongFormer models. To fine-tune our models, we implemented a data augmentation process on the dataset provided by the organization. In the validation phase, the models trained on the augmented dataset improved the F1-score results; in particular, F1 increased from 0.89 to 0.94 for the LongFormer model. During the testing phase, the SVM model and the LongFormer with data augmentation obtained the best results. The LongFormer improved on the BERT-like model's performance due to its ability to handle large contexts. Compared with the results achieved in the validation phase, however, the overall test performance was not as good as expected; a detailed analysis of the results would be necessary to determine the reasons.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Anorexia nervosa, the formal term for anorexia, is a complex, multifactorial eating disorder. It is characterized by a fear of gaining weight and by the maintenance of a distorted body image through severe food restriction and excessive weight loss. It is hazardous for both males and females, but is most common among young women. Women account for 90-95% of those affected; the age range is usually between 12 and 25 years, and it is most common between 12 and 17 years of age <ref type="bibr" target="#b0">[1]</ref>. The impacts of anorexia extend to all aspects of one's health and functioning, reaching far beyond malnutrition to nearly every organ system in the body, and it is often comorbid with other mental health issues such as depression and anxiety. Despite this, anorexia is often difficult to detect and treat due to its insidious onset and the societal stigma surrounding mental health and eating disorders.</p><p>For this reason, the analysis of social interactions has recently become one of the most important ways of detecting risks of anorexia. Anorexia detection is a complicated problem for several reasons, such as the amount and quality of the available data. CLEF eRisk created different tasks to provide quality data and promote the development of models for this early detection.</p><p>In the 2024 edition, eRisk proposed three shared tasks <ref type="bibr" target="#b1">[2,</ref><ref type="bibr" target="#b2">3]</ref>: (1) Search for symptoms of depression, (2) Early Detection of Signs of Anorexia, and (3) Measuring the severity of the signs of Eating Disorders.</p><p>We focused our participation on the second shared task, where we used three different approaches to tackle the problem posed by the task:</p><p>1. The first approach employs a traditional machine learning algorithm, Support Vector Machines (SVM). 
SVMs have shown meaningful performance in classifying lengthy texts such as these. We use this approach to evaluate the effectiveness of classical models. 2. The second approach utilizes Transformers <ref type="bibr" target="#b3">[4]</ref>, leveraging a pre-trained RoBERTa model <ref type="bibr" target="#b4">[5]</ref> as a foundation, followed by a fine-tuning process to adapt it to the downstream task. We performed fine-tuning using two distinct datasets: one provided by the organization and the other created through data augmentation. 3. The final approach is similar to the second one but aims to capture more context by using a pre-trained LongFormer model <ref type="bibr" target="#b5">[6]</ref>. This model accommodates larger input sizes, allowing it to grasp more contextual information. We fine-tuned the LongFormer model using the same datasets as in the previous approach.</p><p>We submitted four runs for Task 2: one each for approaches 1 and 2, and two for approach 3. Before selecting the best model for each approach, we put them through a validation phase, where we tested different configurations and datasets.</p><p>We have carried out this kind of experimentation before: in work on a related task, we used similar methods and achieved substantial results <ref type="bibr" target="#b6">[7]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Description of Dataset and Task</head><p>Task 2 involves the early detection of anorexia risk by sequentially analyzing pieces of evidence to identify early signs of the disorder as promptly as possible. This task primarily focuses on evaluating natural language processing solutions, particularly those that analyze texts from social media. Texts must be processed in the chronological order in which they were created. This better simulates what a deployed system would do: monitor real-time user interactions on blogs, social networks, or other online platforms.</p><p>The dataset for Task 2 consisted of a collection of writings (posts or comments) from a set of social media users, formed from the datasets of the previous editions of the task in 2018 and 2019. This collection has the same format as the one described in <ref type="bibr" target="#b7">[8]</ref>, with two different classes: users who suffer from anorexia and a control group (non-anorexia). Every user has a chronological collection of messages or writings. Table <ref type="table" target="#tab_0">1</ref> shows the distribution of the different labels in the dataset.</p><p>As mentioned, the primary goal of this competition is to predict signs of anorexia as promptly as possible. To simulate realistic conditions, the organizers set up a server that sequentially delivers data packets, each containing a message from a user. The system must predict the user's signs of anorexia, if any, by considering both the current message and all previous messages before receiving the next data packet.</p></div>
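The round-by-round protocol described above can be sketched as follows. This is an illustrative simulation, not the official eRisk client: `classify` is a hypothetical stand-in for any of our trained models, and the trigger-keyword logic is placeholder only.

```python
def classify(history):
    # Hypothetical classifier: flags a user once the accumulated
    # history contains a trigger keyword (placeholder logic only).
    return 1 if "trigger" in " ".join(history) else 0

def run_rounds(user_writings):
    """user_writings: dict user_id -> list of writings in chronological order.

    Each round, one new writing per user is released; the decision for a
    user must use the current writing plus all previous ones.
    """
    histories = {u: [] for u in user_writings}
    decisions = {u: 0 for u in user_writings}
    n_rounds = max(len(w) for w in user_writings.values())
    for r in range(n_rounds):
        for user, writings in user_writings.items():
            if r < len(writings):
                histories[user].append(writings[r])
                decisions[user] = classify(histories[user])
    return decisions
```

In the real task the decision for each round must be submitted to the server before the next packet is released; the loop above only mirrors that ordering.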
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Systems, Architecture, and Techniques</head><p>In this type of task, a relevant factor to consider is the amount of context required for accurate detection. Since each user can have numerous messages, the size of the input to the system becomes a crucial consideration. One of our team's objectives was to examine the impact of context in these tasks. Specifically, we aimed to evaluate the performance of different systems based on their ability to handle varying amounts of context. We selected three different systems to achieve this goal: the first based on Support Vector Machines (SVM), the second based on a RoBERTa model, and the third based on a LongFormer model. Each system has a different context size:</p><p>• Support Vector Machines (SVM) do not have a fixed limit on input size; they construct a vector with a length corresponding to the vocabulary size. This flexibility allows SVMs to handle a large and variable amount of data, as they can create feature vectors based on the entire vocabulary of the input text, accommodating diverse and extensive datasets. • The selected RoBERTa model has a limit of 512 tokens in the input.</p><p>• The selected LongFormer model has a limit of 4096 tokens in the input.</p><p>Additionally, we developed two distinct datasets to train and evaluate the performance of the Transformer-based systems.</p><p>Dataset 1. We created only one sample per user by aggregating all their messages, both for positively and negatively labeled users. This approach ensures that the dataset effectively captures the overall context and messaging patterns of every user, facilitating a more accurate evaluation of the models' performance in distinguishing between positive and negative cases.</p><p>Dataset 2. 
If we had a priori evidence of the message in which a user begins to present symptoms of mental illness risk, we could label the samples built from previous messages as negative, and the samples containing that message and subsequent ones as positive. In this way, we can increase the number of positive samples to obtain a more precise model. This data augmentation process is explained in the next section.</p><p>To conduct our experimentation, we split the original dataset into two partitions: training (80% of users) and development (20% of users). We ensured that both partitions maintained the same proportions of positive and negative samples to preserve the dataset's balance and integrity. Table <ref type="table" target="#tab_1">2</ref> shows the distribution of samples in Dataset 1. </p></div>
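The user-level stratified split can be sketched as follows. This is an illustrative reimplementation under stated assumptions: the helper name and the seed are our own, and each user is assumed to carry a single binary label.

```python
import random

def stratified_split(users, labels, train_frac=0.8, seed=13):
    """Split users 80/20 while keeping the positive/negative
    proportions equal in both partitions."""
    random.seed(seed)
    train, dev = [], []
    for lab in set(labels):
        # Gather all users sharing this label, then cut 80% for training.
        group = [u for u, l in zip(users, labels) if l == lab]
        random.shuffle(group)
        cut = round(len(group) * train_frac)
        train.extend(group[:cut])
        dev.extend(group[cut:])
    return train, dev
```

Splitting per label group, rather than over the whole user list, is what preserves the class balance in both partitions.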
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Data Augmentation</head><p>The data augmentation process aims to generate additional samples for each positive user. As mentioned earlier, we need evidence of when a user begins to exhibit signs of anorexia in their messages. To identify this, we relied on predictions from the SVM-based classifier, assuming that all messages preceding the SVM decision point do not express signs of anorexia. We followed these steps:</p><p>1. For positive users, we calculated how many messages the SVM needs to classify the user as positive. Each user has a different trigger value. 2. For false negatives, we used the mean of the true-positive trigger values as the trigger value. 3. For each positive user in the original dataset, let 𝑛 be the number of messages that the SVM model needs to determine this user's mental disorder risk, 𝑀 𝐴𝑋 be the maximum number of messages the model supports as input, and 𝑚 𝑖 the i-th message from the user: a) we created 𝑛 − 1 negative samples, (𝑚 1 ), (𝑚 1 𝑚 2 ), (𝑚 1 𝑚 2 𝑚 3 ), ..., (𝑚 1 ...𝑚 𝑛−1 ); b) and 𝑀 𝐴𝑋 − 𝑛 + 1 positive samples, (𝑚 1 ...𝑚 𝑛 ), (𝑚 1 ...𝑚 𝑛 𝑚 𝑛+1 ), ..., (𝑚 1 ...𝑚 𝑀 𝐴𝑋 ). 4. Note that the value of 𝑀 𝐴𝑋 depends on which model was used and on the number of tokens in the messages; that is, we discard messages once the accumulated history exceeds 512 tokens for RoBERTa or 4096 for LongFormer. So, if 𝑛 &gt; 𝑀 𝐴𝑋, only negative samples are generated. 5. For negative users, we created new samples by accumulating the history as before, stopping when 𝑀 𝐴𝑋 was reached.</p><p>The result of this technique is a new dataset with a higher number of positive samples for training. In the development partition, we kept one sample per user, as in Dataset 1. Table <ref type="table" target="#tab_3">3</ref> shows the distribution of samples in Dataset 2. </p></div>
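The per-user sample generation in steps 3a-3b can be sketched as follows. For clarity, this toy version counts the budget 𝑀 𝐴𝑋 in messages, whereas in our actual pipeline the budget is expressed in accumulated tokens (512 or 4096).

```python
def augment_positive_user(messages, n, MAX):
    """For one positive user: n is the SVM trigger index (1-based),
    MAX the maximum number of messages the model supports.
    Returns (text, label) pairs."""
    samples = []
    # a) n-1 negative samples: (m1), (m1 m2), ..., (m1 ... m_{n-1})
    for k in range(1, n):
        samples.append((" ".join(messages[:k]), 0))
    # b) MAX-n+1 positive samples: (m1..mn), ..., (m1..m_MAX)
    for k in range(n, min(MAX, len(messages)) + 1):
        samples.append((" ".join(messages[:k]), 1))
    return samples
```

With `n = 3` and `MAX = 4` on four messages, this yields two negative prefixes and two positive prefixes, matching the 𝑛 − 1 and 𝑀 𝐴𝑋 − 𝑛 + 1 counts in the steps above.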
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Classical Machine Learning Classifier Approach</head><p>To evaluate the significance of the context, we wanted a classical machine learning classifier capable of handling all the available context. One of the major issues with Transformer-based models is that their ability to handle large texts is limited by the input size. This greatly affects performance because the input cannot accommodate the full length of the sample, so crucial information may be lost.</p><p>We used a classical machine learning model, the SVM, which builds a feature vector as long as the vocabulary, to show how a model performs when it has no such restriction. First, we ran an experiment to compare different types of classical machine learning classifiers. We used the Scikit-learn library <ref type="bibr" target="#b8">[9]</ref> for this purpose, employing its default classifiers to identify the best-performing model. The results, presented in Table <ref type="table" target="#tab_4">4</ref>, indicate that the Linear SVM was the top performer among the classifiers tested. Once the classifier was chosen, we tested different approaches:</p><p>• Preprocessing of the data:</p><p>1. First approach: tokenize the text using TweetTokenizer and then eliminate stop words. 2. Second approach: same as the first approach, with additional steps to clean the text, eliminate non-alphanumeric characters, and lemmatize tokens.</p><p>• Sentiment analysis: we used the model "lxyuan/distilbert-base-multilingual-cased-sentiments-student" <ref type="bibr" target="#b9">[10]</ref> to perform sentiment analysis on every user message. This process yielded three counts: the numbers of positive, negative, and neutral messages. These counts were normalized and added as new features to the TF-IDF representation. 
This enhancement allowed us to incorporate sentiment-based insights into our analysis, potentially improving the performance of our classification models. • TF-IDF: we used the TfidfVectorizer class from Scikit-learn to vectorize the data. We experimented with different configurations for the analyzer and ngram_range parameters, while using the default values for the other parameters. This approach allowed us to identify the optimal configuration for the task.</p><p>To find the best model for each approach, we performed an exhaustive grid search over specific parameters: the regularization parameter C, the tolerance, and the loss function.</p><p>We obtained 8 different configurations. Table <ref type="table" target="#tab_5">5</ref> summarizes the configurations used in the experimentation; the TF-IDF column indicates the type of analyzer (word or char) and the n-gram range, and the last column shows the best model found in the grid search. As Table <ref type="table" target="#tab_6">6</ref> shows, the best configuration is SVM-1, which uses the first data preprocessing, no sentiment analysis, "char_wb" as the analyzer, and (4-5) as ngram_range. This model was used for Run0 in Task 2. We tested adding sentiment analysis as a feature because it has been shown to be effective in similar tasks using SVMs; in particular, we achieved significant improvements in MentalRiskES 2024 <ref type="bibr" target="#b6">[7]</ref>, a shared task for the early detection of depression symptoms.</p></div>
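A minimal sketch of an SVM-1-style setup: char_wb 4-5-gram TF-IDF features feeding a linear SVM, tuned by a small grid search over C. The toy corpus below is invented purely so the snippet runs end to end; it is not our data, and the grid is far smaller than the exhaustive search described above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

# Invented toy corpus (labels: 1 = at risk, 0 = control).
texts = ["i am fine today", "great workout and lunch",
         "i skipped meals again", "afraid of gaining weight",
         "went for a walk", "cooked dinner with friends",
         "counting every calorie", "feel fat after eating"]
labels = [0, 0, 1, 1, 0, 0, 1, 1]

pipe = Pipeline([
    # Character n-grams inside word boundaries, as in SVM-1.
    ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(4, 5))),
    ("svm", LinearSVC(tol=0.01)),
])
grid = GridSearchCV(pipe, {"svm__C": [1, 10, 100]}, cv=2, scoring="f1_macro")
grid.fit(texts, labels)
pred = grid.predict(["i am scared of food"])
```

In the full experiments the sentiment counts would be appended to the TF-IDF matrix as extra columns before the SVM; that step is omitted here.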
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">BERT-like Model Approach</head><p>It is well known that state-of-the-art models in NLP are based on Transformers. Models like BERT and RoBERTa typically offer excellent versatility for classification tasks. However, these models are often limited to handling a maximum of 512 tokens, which can be problematic for tasks requiring the processing of long contexts, such as the one at hand. To address this issue, we used one of these models as a baseline against which to compare models with a better capacity for managing large contexts. This comparison allows us to evaluate the performance trade-offs and benefits of different approaches to handling extended textual data.</p><p>We searched for a base model trained on domains related to eating disorders; however, we did not find any pre-trained model specialized in eating disorders. During this search, we found that between 50% and 75% of those who struggle with an eating disorder will also experience symptoms of depression or anxiety <ref type="bibr" target="#b10">[11]</ref>. Therefore, we used a pre-trained model related to mental disorders instead.</p><p>Research by Pourkeyvan et al. <ref type="bibr" target="#b11">[12]</ref> indicates that the state-of-the-art model in mental disorder detection is MentalRoBERTa <ref type="bibr" target="#b12">[13]</ref>, a variant of the RoBERTa model specialized for mental health applications. It is pre-trained on a specialized corpus that includes texts from mental health forums, clinical notes, and general-language corpora. This pre-training enables MentalRoBERTa to better understand and process language related to mental health, enhancing its applicability and effectiveness in this domain.</p><p>The model selected was AIMH/mental-roberta-large <ref type="bibr" target="#b13">[14]</ref>, a RoBERTa variant trained specifically on mental health-related posts from Reddit. 
This model is available on the HuggingFace <ref type="bibr" target="#b14">[15]</ref> public hub (https://huggingface.co/AIMH/mental-roberta-large) and provides specialized capabilities for understanding mental health discourse.</p><p>We obtained two models by fine-tuning the base pre-trained model on two datasets: one using Dataset 1 (RoBERTa-1) and the other using Dataset 2 (RoBERTa-2), the latter incorporating data augmentation. Table <ref type="table" target="#tab_7">7</ref> shows the configuration used in the fine-tuning process.   </p></div>
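The fine-tuning setup with the Table 7 hyperparameters could be expressed with the HuggingFace transformers Trainer API roughly as in the untested configuration sketch below; the tokenized train/dev dataset objects are assumed to be prepared elsewhere, and AdamW is the Trainer's default optimizer.

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "AIMH/mental-roberta-large"  # base model named above

def build_trainer(train_ds, dev_ds):
    """Assemble a binary-classification fine-tuning run; train_ds and
    dev_ds are assumed to be tokenized datasets prepared elsewhere."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME, num_labels=2)
    args = TrainingArguments(
        output_dir="finetuned",
        learning_rate=7e-5,              # Table 7
        lr_scheduler_type="linear",      # Table 7
        weight_decay=0.01,               # Table 7
        num_train_epochs=10,             # Table 7
        per_device_train_batch_size=16,  # Table 7
    )
    return Trainer(model=model, args=args,
                   train_dataset=train_ds, eval_dataset=dev_ds)
```

Swapping `MODEL_NAME` for the LongFormer checkpoint used in the next section would reuse the same configuration, as the text there states.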
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">LongFormer Approach</head><p>As previously mentioned, one of the major drawbacks of BERT-like or RoBERTa-like models is their limited capacity to handle large contexts. However, there is a Transformer variant called LongFormer that can process longer texts effectively <ref type="bibr" target="#b5">[6]</ref>. LongFormer, which stands for "Long-Document Transformer", is designed to process long contexts more efficiently than traditional Transformer models such as BERT or RoBERTa. The LongFormer architecture exhibits the following characteristics:</p><p>• New attention mechanism: an efficient attention mechanism that uses a sliding window, in which each token only attends to a fixed number of neighboring tokens, reducing the complexity. • Global attention selection: the architecture can select which tokens are attended globally and which are attended only locally.</p><p>The pre-trained model chosen was AIMH/mental-longformer-base-4096 <ref type="bibr" target="#b15">[16]</ref>, a pre-trained LongFormer for the mental health domain. This model can be found at https://huggingface.co/AIMH/mental-longformer-base-4096.</p><p>As with the RoBERTa model, we fine-tuned the LongFormer with the two datasets: Dataset 1 without data augmentation (LongFormer-1) and Dataset 2 with data augmentation (LongFormer-2). We used the same fine-tuning parameters as in RoBERTa's experimentation; the configuration is shown in Table <ref type="table" target="#tab_7">7</ref>.</p><p>Table <ref type="table" target="#tab_10">9</ref> shows the results of the experimentation, where LongFormer-2 (fine-tuned with data augmentation) achieves better performance than LongFormer-1 (fine-tuned without data augmentation). We used the two models in our participation as Run2 and Run3. </p></div>
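The sliding-window-plus-global attention pattern can be illustrated with a toy function that lists, for each position, the set of positions it attends to. This is a didactic sketch of the pattern only, not the real LongFormer implementation.

```python
def longformer_pattern(seq_len, window, global_idx):
    """Return, for each position i, the set of positions it attends to:
    local tokens see a window of `window` neighbours on each side plus
    all global tokens; global tokens see (and are seen by) everything."""
    attend = []
    g = set(global_idx)
    for i in range(seq_len):
        if i in g:
            cols = set(range(seq_len))  # global token attends everywhere
        else:
            cols = set(range(max(0, i - window),
                             min(seq_len, i + window + 1)))
            cols |= g                   # every token also sees global ones
        attend.append(cols)
    return attend
```

Counting the entries shows why this is efficient: the total number of attended pairs grows roughly linearly in sequence length (O(n·w) plus the global rows), instead of quadratically as in full self-attention.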
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Runs</head><p>Table <ref type="table" target="#tab_11">10</ref> summarizes the selected model for each run, along with its development performance. The rationale for selecting these models was to evaluate the significance of context in predicting anorexia. The models vary in the input length they can handle, allowing for the processing of different context sizes. By comparing models with varying context-handling capabilities, we aim to determine how the extent of context affects the accuracy and effectiveness of mental illness prediction.</p><p>The results demonstrate that the SVM model, despite being less powerful in general, achieved performance comparable to MentalRoBERTa. This can be attributed to the SVM's ability to handle large texts, leveraging the full context provided by the input data. On the other hand, the LongFormer models outperformed both the BERT-like models and the SVM in this task. The performance of LongFormer can be credited to its capability to process larger contexts while maintaining the powerful features of Transformer-based models. This combination allows LongFormer to capture more comprehensive contextual information, leading to more accurate predictions in mental illness detection tasks.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Run Configuration</head><p>Besides selecting the model for each run, we had to set additional parameters for the classification systems:</p><p>• For every round of the competition, the classifier input was a new sample created by combining the user's new message with the previous ones.</p><p>• Each system required an initial context; in other words, we made our systems wait until the initial context was sufficiently large. This context differed between systems:</p><p>-SVM: an initial context of 50 tokens after preprocessing.</p><p>-RoBERTa and LongFormer: an initial context of 100 tokens.</p><p>• The RoBERTa and LongFormer systems have a token limit; once the context was full, we simply returned the last prediction made.</p></div>
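The decision policy above (wait for a minimum initial context, then reuse the last prediction once the model's token limit is full) can be sketched as follows; `count_tokens` and `predict` are hypothetical stand-ins for the real tokenizer and classifier.

```python
def count_tokens(text):
    return len(text.split())  # placeholder tokenization

def decide(history_text, last_pred, predict,
           min_context=100, token_limit=4096):
    """Return (decision, new_last_pred) for the current round."""
    n = count_tokens(history_text)
    if n < min_context:
        return 0, last_pred          # not enough context yet: no alarm
    if n >= token_limit and last_pred is not None:
        return last_pred, last_pred  # context full: reuse last prediction
    pred = predict(history_text)     # normal case: run the classifier
    return pred, pred
```

For the SVM run the `min_context` threshold would be 50 tokens and there is no upper token limit; for RoBERTa the limit would be 512 instead of 4096.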
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Results</head><p>Table <ref type="table" target="#tab_12">11</ref> shows the results achieved by our team in Task 2. The structure of Table <ref type="table" target="#tab_12">11</ref> is as follows: each row refers to one run, and a special row reports the highest values achieved in the competition. The systems in the competition were ranked using the Macro-F1 score (last column). A total of 46 different systems (runs) participated in this task. Table <ref type="table" target="#tab_12">11</ref> shows that our best systems are Run 0 and Run 3 if we take the F1-score as the evaluation metric. Run 0 refers to SVM-1, a Support Vector Machine without sentiment analysis and with basic data preprocessing. Run 3 refers to LongFormer-2, the pre-trained LongFormer fine-tuned with data augmentation. These two runs achieved the eighth position in the global table of the competition.</p><p>Although our first thought was that LongFormer would perform better because of its power and capacity to handle large texts, the SVM proved to achieve equal results thanks to its ability to deal with long texts. This indicates that classical approaches like SVMs continue to be useful in detecting mental illnesses because of their ability to handle large contexts; therefore, SVMs remain well suited to situations with low computational resources.</p><p>On the other hand, the results show that data augmentation improved the performance of our models if we compare Run2 and Run3. Data augmentation helped our model learn more about positive samples and better fit the problem.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="6.">Conclusion</head><p>In this paper, we have presented the participation of the ELiRF-VRAIN team in Task 2 of eRisk at CLEF 2024: early detection of signs of anorexia. In addition to testing classic classification models and state-of-the-art Transformer models, we used LongFormer models to expand the context available when making the decision. We also presented a data augmentation proposal that gave successful results during the training process.</p><p>For future work, two lines of improvement are identified: on the one hand, improving early detection so that the system does not need as much context to make the right decision; on the other hand, using Explainable Artificial Intelligence (XAI) techniques to better understand the system's behavior.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Distribution of samples across the 2018 and 2019 partitions of the Task 2 dataset.</figDesc><table><row><cell></cell><cell cols="3">2018 2019 Total</cell></row><row><cell>None</cell><cell>411</cell><cell>742</cell><cell>1153</cell></row><row><cell>Anorexia</cell><cell>61</cell><cell>73</cell><cell>134</cell></row><row><cell>Total</cell><cell>472</cell><cell>815</cell><cell>1287</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 2</head><label>2</label><figDesc>Distribution of samples in Dataset 1 for training and development partitions</figDesc><table><row><cell></cell><cell cols="2">Train Development</cell></row><row><cell>None</cell><cell>920</cell><cell>233</cell></row><row><cell>Anorexia</cell><cell>109</cell><cell>25</cell></row><row><cell>Total</cell><cell>1029</cell><cell>258</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3</head><label>3</label><figDesc>Distribution of samples in Dataset 2 for training and development partitions</figDesc><table><row><cell></cell><cell cols="2">Train Development</cell></row><row><cell>None</cell><cell>18255</cell><cell>233</cell></row><row><cell cols="2">Anorexia 2272</cell><cell>25</cell></row><row><cell>Total</cell><cell>20527</cell><cell>258</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 4</head><label>4</label><figDesc>The results from different classifiers in the development partition. The scores are the Macro-precision, recall and F1-score.</figDesc><table><row><cell></cell><cell cols="3">precision recall f1-score</cell></row><row><cell>Linear SVM</cell><cell>0.83</cell><cell>0.80</cell><cell>0.81</cell></row><row><cell>Gradient Boosting</cell><cell>0.72</cell><cell>0.75</cell><cell>0.74</cell></row><row><cell>K-Neighbors</cell><cell>0.45</cell><cell>0.50</cell><cell>0.47</cell></row><row><cell>AdaBoost</cell><cell>0.74</cell><cell>0.74</cell><cell>0.74</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 5</head><label>5</label><figDesc>Summary of the different configurations of the SVM classifiers.</figDesc><table><row><cell></cell><cell>Preprocess approach</cell><cell>Sentiment analysis</cell><cell>TF-IDF</cell><cell>Best Model</cell></row><row><cell>SVM-1</cell><cell>1</cell><cell>No</cell><cell>"char_wb", 4-5 n-gram</cell><cell>'C': 100, 'loss': 'hinge', 'tol': 0.01</cell></row><row><cell>SVM-2</cell><cell>2</cell><cell>No</cell><cell>"char_wb", 4-5 n-gram</cell><cell>'C': 100, 'loss': 'hinge', 'tol': 0.01</cell></row><row><cell>SVM-3</cell><cell>1</cell><cell>Yes</cell><cell>"char_wb", 4-5 n-gram</cell><cell>'C': 10, 'loss': 'hinge', 'tol': 0.1</cell></row><row><cell>SVM-4</cell><cell>2</cell><cell>Yes</cell><cell>"char_wb", 4-5 n-gram</cell><cell>'C': 10, 'loss': 'hinge', 'tol': 0.1</cell></row><row><cell>SVM-5</cell><cell>1</cell><cell>No</cell><cell>"word", 1-2 n-gram</cell><cell>'C': 1, 'loss': 'squared_hinge', 'tol': 0.01</cell></row><row><cell>SVM-6</cell><cell>2</cell><cell>No</cell><cell>"word", 1-2 n-gram</cell><cell>'C': 1, 'loss': 'squared_hinge', 'tol': 0.01</cell></row><row><cell>SVM-7</cell><cell>1</cell><cell>Yes</cell><cell>"word", 1-2 n-gram</cell><cell>'C': 10, 'loss': 'hinge', 'tol': 0.1</cell></row><row><cell>SVM-8</cell><cell>2</cell><cell>Yes</cell><cell>"word", 1-2 n-gram</cell><cell>'C': 10, 'loss': 'hinge', 'tol': 0.1</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 6</head><label>6</label><figDesc>Results of the different configurations of the SVM classifiers on development partition. In bold, the best result for each metric.</figDesc><table><row><cell></cell><cell cols="3">Precision Recall F1-score</cell></row><row><cell>SVM-1</cell><cell>0.92</cell><cell>0.89</cell><cell>0.91</cell></row><row><cell>SVM-2</cell><cell>0.86</cell><cell>0.84</cell><cell>0.85</cell></row><row><cell>SVM-3</cell><cell>0.91</cell><cell>0.85</cell><cell>0.88</cell></row><row><cell>SVM-4</cell><cell>0.84</cell><cell>0.83</cell><cell>0.83</cell></row><row><cell>SVM-5</cell><cell>0.91</cell><cell>0.83</cell><cell>0.87</cell></row><row><cell>SVM-6</cell><cell>0.86</cell><cell>0.81</cell><cell>0.83</cell></row><row><cell>SVM-7</cell><cell>0.89</cell><cell>0.82</cell><cell>0.83</cell></row><row><cell>SVM-8</cell><cell>0.84</cell><cell>0.80</cell><cell>0.82</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 7</head><label>7</label><figDesc>Parameters for the fine-tuning process.</figDesc><table><row><cell>parameter</cell><cell>value</cell></row><row><cell>optimizer</cell><cell>AdamW</cell></row><row><cell>learning rate</cell><cell>7e-5</cell></row><row><cell>lr scheduler type</cell><cell>linear</cell></row><row><cell>weight decay</cell><cell>0.01</cell></row><row><cell>number of epochs</cell><cell>10</cell></row><row><cell>training batch size</cell><cell>16</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_8"><head>Table 8</head><label>8</label><figDesc>displays the results of each model on the development partition. The results indicate that RoBERTa-2, the model fine-tuned with data augmentation, obtained the best performance. Consequently, we used this model for Run1 in Task 2 of our participation.</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_9"><head>Table 8</head><label>8</label><figDesc>RoBERTa's results for Task 2 on the development partition.</figDesc><table><row><cell></cell><cell cols="4">Data Augmentation Precision Recall F1-score</cell></row><row><cell>RoBERTa-1</cell><cell>No</cell><cell>0.88</cell><cell>0.85</cell><cell>0.86</cell></row><row><cell>RoBERTa-2</cell><cell>Yes</cell><cell>0.92</cell><cell>0.90</cell><cell>0.91</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_10"><head>Table 9</head><label>9</label><figDesc>LongFormer's results for Task 2 on the development partition.</figDesc><table><row><cell></cell><cell cols="4">Data Augmentation Precision Recall F1-score</cell></row><row><cell>LongFormer-1</cell><cell>No</cell><cell>0.91</cell><cell>0.89</cell><cell>0.89</cell></row><row><cell>LongFormer-2</cell><cell>Yes</cell><cell>0.96</cell><cell>0.92</cell><cell>0.94</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_11"><head>Table 10</head><label>10</label><figDesc>Summary of the approaches chosen for each run, and the performance achieved by each system on the development partition.</figDesc><table><row><cell></cell><cell>Task</cell><cell>Model</cell><cell cols="3">Precision Recall F1-score</cell></row><row><cell>Run0</cell><cell>1</cell><cell>SVM-1</cell><cell>0.92</cell><cell>0.89</cell><cell>0.91</cell></row><row><cell>Run1</cell><cell>1</cell><cell>RoBERTa-2</cell><cell>0.92</cell><cell>0.90</cell><cell>0.91</cell></row><row><cell>Run2</cell><cell>1</cell><cell>LongFormer-1</cell><cell>0.91</cell><cell>0.89</cell><cell>0.89</cell></row><row><cell>Run3</cell><cell>2</cell><cell>LongFormer-2</cell><cell>0.96</cell><cell>0.92</cell><cell>0.94</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_12"><head>Table 11</head><label>11</label><figDesc>Results for the 4 runs on Task 2. Highest refers to the highest values achieved in the competition. The values inside the parentheses indicate our position in the ranking.</figDesc><table><row><cell></cell><cell>Model</cell><cell cols="2">Precision Recall</cell><cell>F1-score</cell></row><row><cell>Run0</cell><cell>SVM</cell><cell>0.43 (15)</cell><cell>0.99</cell><cell>0.60 (8)</cell></row><row><cell>Run1</cell><cell>RoBERTa</cell><cell>0.41</cell><cell>1.00 (1)</cell><cell>0.58</cell></row><row><cell>Run2</cell><cell>LongFormer-1</cell><cell>0.32</cell><cell>0.99</cell><cell>0.49</cell></row><row><cell>Run3</cell><cell>LongFormer-2</cell><cell>0.43 (15)</cell><cell>0.99</cell><cell>0.60 (8)</cell></row><row><cell>Highest</cell><cell>-</cell><cell>0.73</cell><cell>1.00</cell><cell>0.79</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work is partially supported by MCIN/AEI/10.13039/501100011033, by the "European Union" and "NextGenerationEU/MRR", and by "ERDF A way of making Europe" under grants PDC2021-120846-C44 and PID2021-126061OB-C41. Partially supported by the Vicerrectorado de Investigación de la Universitat Politècnica de València PAID-01-23. It is also partially supported by the Spanish Ministerio de Universidades under the grant FPU21/05288 for university teacher training.</p></div>
			</div>


			<div type="availability">
<div xmlns="http://www.tei-c.org/ns/1.0"><p>https://vrain.upv.es/elirf/ (A. Casamayor); https://vrain.upv.es/elirf/ (V. Ahuir); https://vrain.upv.es/elirf/ (A. Molina); https://vrain.upv.es/elirf/ (L. Hurtado)</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<title level="m" type="main">Anorexia</title>
		<author>
			<orgName>FEACAB</orgName>
		</author>
		<ptr target="https://feacab.org/anorexia/" />
		<imprint>
			<date type="published" when="2015">2015. 2024-05-28</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">Overview of eRisk 2024: Early risk prediction on the Internet</title>
		<author>
			<persName><forename type="first">J</forename><surname>Parapar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Martín Rodilla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">E</forename><surname>Losada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Crestani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction. 15th International Conference of the CLEF Association, CLEF 2024</title>
				<meeting><address><addrLine>Grenoble, France</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International</publisher>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Overview of eRisk 2024: Early risk prediction on the Internet (extended overview)</title>
		<author>
			<persName><forename type="first">J</forename><surname>Parapar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Martín Rodilla</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">E</forename><surname>Losada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Crestani</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Working Notes of the Conference and Labs of the Evaluation Forum CLEF 2024</title>
		<title level="s">CEUR Workshop Proceedings</title>
		<meeting><address><addrLine>Grenoble, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Attention is all you need</title>
		<author>
			<persName><forename type="first">A</forename><surname>Vaswani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Shazeer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Parmar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Uszkoreit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Jones</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">N</forename><surname>Gomez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Ł</forename><surname>Kaiser</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Polosukhin</surname></persName>
		</author>
		<ptr target="https://arxiv.org/abs/1706.03762" />
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<date type="published" when="2017">2017. 2024-05-15</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1907.11692</idno>
		<ptr target="https://arxiv.org/abs/1907.11692" />
		<title level="m">RoBERTa: A robustly optimized BERT pretraining approach</title>
				<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">I</forename><surname>Beltagy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">E</forename><surname>Peters</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Cohan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2004.05150</idno>
		<ptr target="https://arxiv.org/abs/2004.05150" />
		<title level="m">Longformer: The long-document transformer</title>
				<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Overview of MentalRiskES at IberLEF 2024: Early detection of mental disorders risk in Spanish</title>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Mármol-Romero</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moreno-Muñoz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">M P</forename><surname>Del Arco</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">D</forename><surname>Molina-González</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-T</forename><surname>Martín-Valdivia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">A</forename><surname>Ureña-López</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Montejo-Ráez</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procesamiento del Lenguaje Natural</title>
		<imprint>
			<biblScope unit="volume">73</biblScope>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A test collection for research on depression and language use</title>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">E</forename><surname>Losada</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Crestani</surname></persName>
		</author>
		<idno type="DOI">10.1007/978-3-319-44564-9_3</idno>
		<ptr target="https://doi.org/10.1007/978-3-319-44564-9_3" />
	</analytic>
	<monogr>
		<title level="m">Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the 7th International Conference of the CLEF Association</title>
				<meeting><address><addrLine>CLEF</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="28" to="39" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Scikit-learn: Machine learning in Python</title>
		<author>
			<persName><forename type="first">F</forename><surname>Pedregosa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Varoquaux</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gramfort</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Michel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Thirion</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Grisel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Blondel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Prettenhofer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Weiss</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Dubourg</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Vanderplas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Passos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cournapeau</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Brucher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Perrot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">É</forename><surname>Duchesnay</surname></persName>
		</author>
		<ptr target="https://jmlr.org/papers/v12/pedregosa11a.html" />
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="2825" to="2830" />
			<date type="published" when="2011">2011</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><forename type="middle">X</forename><surname>Yuan</surname></persName>
		</author>
		<idno type="DOI">10.57967/hf/1422</idno>
		<ptr target="https://huggingface.co/lxyuan/distilbert-base-multilingual-cased-sentiments-student" />
		<title level="m">distilbert-base-multilingual-cased-sentiments-student (revision 2e33845)</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<monogr>
		<author>
			<orgName>National Institute of Mental Health</orgName>
		</author>
		<ptr target="https://www.nimh.nih.gov/health/statistics/eating-disorders" />
		<title level="m">Eating disorders</title>
				<imprint>
			<date type="published" when="2024-05-30">2024-05-30</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Harnessing the power of hugging face transformers for predicting mental health disorders in social networks</title>
		<author>
			<persName><forename type="first">A</forename><surname>Pourkeyvan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Safa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sorourkhah</surname></persName>
		</author>
		<idno type="DOI">10.1109/access.2024.3366653</idno>
		<ptr target="https://doi.org/10.1109/ACCESS.2024.3366653" />
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">12</biblScope>
			<biblScope unit="page" from="28025" to="28035" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">MentalBERT: Publicly available pretrained language models for mental healthcare</title>
		<author>
			<persName><forename type="first">S</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Ansari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Tiwari</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Cambria</surname></persName>
		</author>
		<ptr target="https://aclanthology.org/2022.lrec-1.778" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the Thirteenth Language Resources and Evaluation Conference, European Language Resources Association</title>
				<editor>
			<persName><forename type="first">N</forename><surname>Calzolari</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">F</forename><surname>Béchet</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">P</forename><surname>Blache</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">K</forename><surname>Choukri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">C</forename><surname>Cieri</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">T</forename><surname>Declerck</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Goggi</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Isahara</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">B</forename><surname>Maegaard</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Mariani</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">H</forename><surname>Mazo</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">J</forename><surname>Odijk</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">S</forename><surname>Piperidis</surname></persName>
		</editor>
		<meeting>the Thirteenth Language Resources and Evaluation Conference, European Language Resources Association<address><addrLine>Marseille, France</addrLine></address></meeting>
		<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="7184" to="7190" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<monogr>
		<title level="m" type="main">MentalRoBERTa: A robustly optimized BERT pretraining approach for mental health</title>
		<author>
			<orgName>AIMH</orgName>
		</author>
		<ptr target="https://huggingface.co/AIMH/mental-roberta-large" />
		<imprint>
			<date type="published" when="2024-05-15">2024. 2024-05-15</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Transformers: State-of-the-art natural language processing</title>
		<author>
			<persName><forename type="first">T</forename><surname>Wolf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Debut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Sanh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Chaumond</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Delangue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Moi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Cistac</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Rault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Louf</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Funtowicz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Davison</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Shleifer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Von Platen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Jernite</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Plu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">L</forename><surname>Scao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Gugger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Drame</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Lhoest</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">M</forename><surname>Rush</surname></persName>
		</author>
		<ptr target="https://www.aclweb.org/anthology/2020.emnlp-demos.6" />
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations</title>
				<meeting>the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<monogr>
		<author>
			<orgName>AIMH</orgName>
		</author>
		<ptr target="https://huggingface.co/AIMH/mental-longformer-base-4096" />
		<title level="m">MentalLongFormer: A long-document transformer model for mental health</title>
				<imprint>
			<date type="published" when="2024-05-15">2024. 2024-05-15</date>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
