<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Arthur Schopenhauer at Touché 2024: Multi-Lingual Text Classification Using Ensembles of Large Language Models</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hamza Yunis</string-name>
        </contrib>
      </contrib-group>
      <abstract>
        <p>This paper describes the submitted approach of Team Arthur Schopenhauer to Task 1 of the Touché lab at CLEF 2024. The goal of this task is twofold: detecting human values in texts (Subtask 1), and recognizing whether these values are attained or constrained (Subtask 2). The approach described in this paper simplifies Subtask 1 by restricting the detected values in a text to a maximum of one value. It also simplifies Subtask 2 by handling it separately from Subtask 1; that is, human values and attainment are detected independently of each other. This simplification strategy proved successful, as the submitted approach was ranked 2nd among the participating teams' best submissions (a single team can make multiple submissions) in Subtask 1 and was ranked 1st in Subtask 2. The described simplification results in two text-classification tasks, which are handled by fine-tuning and ensembling multiple BERT-based models.</p>
      </abstract>
      <kwd-group>
        <kwd>Touché</kwd>
        <kwd>Human Value Detection</kwd>
        <kwd>BERT</kwd>
        <kwd>Large Language Models</kwd>
        <kwd>Ensembling</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>
        The dataset provided by the organizers consists of a labeled training set, a labeled validation
set, and an unlabeled test set. It stems from the ValuesML project, itself part of a
broad JRC initiative that aims to provide deep insight into values and identities [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Once the models
of our approach were developed, they were applied to the test set to predict its labels. The
predicted labels were then submitted to the organizers via the TIRA platform [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] for evaluation.
      </p>
      <p>The labeled part of the dataset contains zero-one labels for 19 human values, with two label
columns for each value corresponding to constrained and attained, totaling 38 columns. A text
may constrain or attain a specific value, but not both. However, there are texts where it is
unclear whether the referenced value is attained or constrained, in which case both columns
corresponding to the value are filled with 0.5.</p>
      <p>The dataset contains texts from 9 languages. In addition, the organizers provide automated
English translations for non-English texts. However, due to concerns regarding the accuracy
of the translations, our approach uses the original texts and relies on multi-lingual language
models.</p>
    </sec>
    <sec id="sec-3">
      <title>3. System Overview</title>
      <p>Our submitted approach tackles the two described subtasks independently. Furthermore, the
labeled datasets were divided into English and non-English texts, and a different set of models
was fine-tuned for each. Upon applying the fine-tuned models to the test set for the final
submission, the texts in the test set were split in the same manner and the appropriate models
were applied to each part.</p>
      <sec id="sec-3-1">
        <title>3.1. Task Simplification</title>
        <p>This section describes how the original subtasks were transformed in order to simplify the model
fine-tuning process.</p>
        <sec id="sec-3-1-1">
          <title>3.1.1. Simplifying Subtask 1</title>
          <p>Subtask 1, in its given form, corresponds to a multi-label classification problem, because a single
text may refer to multiple human values; that is, a single data instance may belong to multiple
classes simultaneously. However, preliminary data analysis showed that approximately 94% of
the labeled texts have either one label or no label. Therefore, for the sake of simplicity, it was
decided to restrict the fine-tuning process to these instances, which turns the problem into a
single-label classification problem. This simplification required introducing the no-label class for
texts that have no label.</p>
        </sec>
        <sec id="sec-3-1-2">
          <title>3.1.2. Simplifying Subtask 2</title>
          <p>Subtask 2 was tackled independently of Subtask 1, which means the models of Subtask 2
were fine-tuned to predict a given text’s attainment, regardless of the human value that the
text references. Accordingly, the simplified version of Subtask 2 corresponds to a single-label
classification problem with two classes, namely attained and constrained.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data Preprocessing</title>
        <p>The major steps of data preprocessing are shown in Figure 1. It begins by merging the
training and validation sets; then unuseful data is removed from the merged set, after which the
dataset is reshaped to reflect the task simplification described in Section 3.1, and finally a new
validation set is created within which the different strata of the labeled data are proportionally
represented.</p>
        <p>[Figure 1: The data preprocessing pipeline: the original training and validation sets are merged; unuseful data is filtered out; the dataset is reshaped; and a new train-validation split (new training set and new validation set) is created.]</p>
        <sec id="sec-3-2-1">
          <title>3.2.1. Filtering Out Unuseful Data</title>
          <p>
            After merging the training and validation sets, the following rows were removed from the merged
set with the help of the pandas [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ] library:
• Rows with duplicate texts (first occurrence kept).
• Rows with more than one label (in accordance with task simplification from Section 3.1.1).
• Rows with two words or less (believed to be noisy).
          </p>
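The three filtering rules above can be sketched with pandas as follows. This is a minimal illustration, not the submission's actual code: the column name "text" and the pairing of the 38 label columns into per-value (attained, constrained) tuples are assumptions about the dataset's schema.

```python
import numpy as np
import pandas as pd

def filter_unuseful_rows(df: pd.DataFrame, value_pairs: list) -> pd.DataFrame:
    """value_pairs: one (attained_col, constrained_col) tuple per human value (assumed schema)."""
    # Rule 1: drop rows with duplicate texts, keeping the first occurrence.
    df = df.drop_duplicates(subset="text", keep="first")
    # Rule 2: keep only rows referencing at most one human value; a value counts
    # as referenced if either of its two columns is nonzero (0.5 or 1).
    referenced = np.column_stack([(df[a] > 0) | (df[c] > 0) for a, c in value_pairs])
    df = df[referenced.sum(axis=1) <= 1]
    # Rule 3: drop rows with two words or less (believed to be noisy).
    df = df[df["text"].str.split().str.len() > 2]
    return df.reset_index(drop=True)
```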
        </sec>
        <sec id="sec-3-2-2">
          <title>3.2.2. Reshaping the Dataset</title>
          <p>To reflect the task simplification described in Section 3.1, the original 38 label columns were
replaced by the following 2 columns:
• hv_value: a numeric code for the human value referenced by the text (including no-label).
• attainment: a numeric code for attainment (constrained, attained, or unknown). The unknown
code is assigned to texts that do not have a human value label, or for which the attainment
was unclear in the original dataset.</p>
          <p>In addition, rows with the human value Humility were removed from the dataset. The reason for
this additional filtering is that such rows are rare in the dataset and, after initial experiments,
the fine-tuned models could not predict Humility with any accuracy.</p>
        </sec>
        <sec id="sec-3-2-3">
          <title>3.2.3. Creating a New Split</title>
          <p>
            The last step of data preprocessing was creating a new train-validation split. The validation set
was created using the proportional allocation strategy with a sampling rate of 0.1, whereby each
stratum is specified by a combination of language and label, for example all rows with language
“EN” and label “Conformity: interpersonal” form one stratum. Splitting was achieved using the
function train_test_split from scikit-learn [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ] using the fixed random state 66 (not related
to the random seed used when fine-tuning the models). This way, the validation set could be
reproduced in different Python sessions.
          </p>
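The split can be sketched as below. This is a hedged illustration of the described strategy, not the actual submission code: the column names "language" and "hv_value" are assumed, while the sampling rate of 0.1, stratification on the (language, label) combination, and the fixed random state 66 come from the description above.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def make_split(df: pd.DataFrame):
    # Each stratum is a combination of language and label,
    # e.g. all rows with language "EN" and a given hv_value code.
    strata = df["language"] + "_" + df["hv_value"].astype(str)
    # Proportional allocation: 10% of each stratum goes to the validation
    # set; the fixed random_state makes the split reproducible.
    train_df, val_df = train_test_split(
        df, test_size=0.1, stratify=strata, random_state=66
    )
    return train_df, val_df
```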
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Fine-Tuning the Models</title>
        <p>
          For both subtasks, the approach relies on the pretrained models microsoft/deberta-v2-xxlarge
[
          <xref ref-type="bibr" rid="ref10">10</xref>
          ] for English texts and FacebookAI/xlm-roberta-large [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] for non-English texts, both obtained
from the Hugging Face Hub [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ]. The process of producing the fine-tuned models is depicted in
Figure 2.
        </p>
        <p>[Figure 2: The fine-tuning pipeline: the training and validation sets are split by language; the English training and validation sets are used to fine-tune the pretrained deberta-v2-xxlarge model, and the non-English training and validation sets are used to fine-tune the pretrained xlm-roberta model, yielding an English fine-tuned model and a non-English fine-tuned model.]</p>
        <p>
          For Subtask 1, bagging [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] was applied using two four-model ensembles, one for each language
subset. For Subtask 2, only one model was fine-tuned for each language subset, because using
multiple models offered no improvement in predictive performance during experimentation.
The classification heads of the fine-tuned models had 19 outputs for Subtask 1 (the original 19
classes minus Humility, plus the no-label class) and 2 outputs for Subtask 2. It should be noted
that the models for Subtask 2 were fine-tuned only on the data with known attainment; that is,
rows with the unknown value in the attainment column were excluded.
        </p>
        <p>
          Our approach applies the commonly used cross-entropy loss function [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. However, due to
observed class imbalance in Subtask 1, the use of the weighted cross-entropy loss function [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] was
contemplated. Our experiments showed that using the weighted cross-entropy loss function (using
inverse class frequencies as weights) delivers higher performance for some low-frequency classes,
but lower performance overall; therefore, a combination of both weighted and non-weighted
cross-entropy loss functions was used in each ensemble.
        </p>
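A minimal PyTorch sketch of the weighted variant, using inverse class frequencies as weights as described above (the helper name and the exact normalization are illustrative assumptions, not the submission's actual code):

```python
import torch
import torch.nn as nn

def weighted_ce_loss(train_labels: torch.Tensor, num_classes: int) -> nn.CrossEntropyLoss:
    # Count occurrences of each class in the training labels.
    counts = torch.bincount(train_labels, minlength=num_classes).float()
    # Inverse class frequency: rare classes receive larger weights.
    weights = counts.sum() / counts.clamp(min=1)
    return nn.CrossEntropyLoss(weight=weights)
```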
        <p>All ten models of our approach were fine-tuned using the same train-validation split, but
with different hyperparameters, as described in Table 1. The remaining hyperparameters are
described in Appendix A.</p>
        <p>
          Fine-tuning was performed using PyTorch [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] directly, rather than the Hugging Face Trainer
API. During fine-tuning, checkpointing was used, so the model checkpoint with the best F1-score
(macro) was kept.
        </p>
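The checkpointing logic can be sketched as follows. This is a simplified illustration under stated assumptions: `train_one_epoch` and `predict` are hypothetical placeholders for the actual training and validation-inference steps, which are not shown in the paper.

```python
import torch
from sklearn.metrics import f1_score

def fine_tune(model, train_one_epoch, predict, val_labels, num_epochs, path="best.pt"):
    """Keep only the checkpoint with the best macro F1-score on the validation set."""
    best_f1 = -1.0
    for _ in range(num_epochs):
        train_one_epoch(model)              # one pass over the training set
        val_preds = predict(model)          # predicted labels on the validation set
        f1 = f1_score(val_labels, val_preds, average="macro")
        if f1 > best_f1:
            best_f1 = f1
            torch.save(model.state_dict(), path)  # overwrite with the best checkpoint
    return best_f1
```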
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Ensembling Strategy</title>
        <p>Ensembling is relevant only to Subtask 1, because for Subtask 2, only one model is used with
each language subset.</p>
        <p>Each of the models in Table 1 produces a predicted label, along with a probability of that
label (for details on extracting predictions from neural network outputs, see
https://www.learnpytorch.io/02_pytorch_classification/). One common way to ensemble
predictions is soft voting. Our approach adjusts the original soft voting strategy by employing
the concept of a safe prediction, for want of a better term, which denotes a prediction whose
probability exceeds a certain threshold. With this definition, ensembling was achieved using
Algorithm 1 (pruned soft voting). The rationale behind this algorithm is as follows: if one of
the predictions is safe while the others are not, then it should be chosen as the final prediction,
regardless of the remaining predictions.</p>
        <p>The threshold used in the final submission was obtained by repeatedly applying Algorithm 1
with a different threshold to the validation set and selecting the threshold that produced the best
macro F1-score. The optimal threshold was 0.44 for the English ensemble and 0.49 for the
non-English ensemble.</p>
        <p>Table 2 displays a performance comparison between pruned soft voting and ordinary soft
voting using the validation set and shows that pruned soft voting offers a marginal improvement.
However, it should be noted that, since the threshold for pruned soft voting was optimized using
the validation set itself, the evaluation scores of pruned soft voting will be at least as high as
those of soft voting, because soft voting is equivalent to pruned soft voting with threshold 0.</p>
        <p>One point to consider when evaluating the ensembling strategies is that the no-label class (see
Section 3.1.1) is included in the calculation of the macro F1-score. This class will not be used in
the final evaluation of the approach by the organizers, so for each evaluation that we performed,
a corresponding adjusted F1-score, which does not include the no-label class, was calculated.
(For details on soft voting, see https://machinelearningmastery.com/voting-ensembles-with-python/.)</p>
        <p>Algorithm 1: Pruned Soft Voting
Input: a sequence S of pairs (v_1, p_1), ..., (v_n, p_n) of predicted labels for one data instance,
coupled with their prediction probabilities, and a probability threshold t
Output: one final label prediction
if there exists at least one probability p_i in S such that p_i ≥ t then
    return the final label prediction by applying soft voting only to those pairs (v_i, p_i) in S with p_i ≥ t
else
    return the final label prediction by applying soft voting to the entire sequence S
end if</p>
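A minimal Python sketch of pruned soft voting as just described (the array layout and function name are illustrative assumptions, not the submission's actual code): each row of `probs` is one model's class-probability distribution for a single instance, and a model's prediction is safe if its top probability meets the threshold.

```python
import numpy as np

def pruned_soft_vote(probs: np.ndarray, threshold: float) -> int:
    """probs: shape (n_models, n_classes); returns the final predicted class index."""
    top = probs.max(axis=1)        # each model's prediction probability
    safe = top >= threshold        # which predictions count as "safe"
    # Vote only among the safe predictions if any exist, otherwise among all.
    voters = probs[safe] if safe.any() else probs
    # Ordinary soft voting: average the distributions and take the argmax.
    return int(voters.mean(axis=0).argmax())
```

With threshold 0 every prediction is safe, so the function reduces to ordinary soft voting, matching the equivalence noted above.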
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>Upon submitting the approach, the fine-tuned models were applied to the test set and the results
were exported to a .tsv file in the required format, which was submitted to the organizers. As
texts with two words or less are believed to be noisy, the models of Subtask 1 were not applied
to these texts; instead, the no-label class was predicted for them directly.</p>
      <p>The evaluation results that were reported by the organizers are shown in Table 3 and Table
4. The reported F1-score for Subtask 1 (0.35) is significantly lower than the adjusted F1-score
(0.3902) produced during our evaluation using the validation set (Table 2). This was expected
for three reasons:
1. The Humility label was not included when calculating the F1-scores in our evaluations.</p>
      <p>Since our submission never predicts the Humility label, the F1-score for this label was 0
when our submission was evaluated by the organizers, thus reducing the overall
macro-averaged F1-score.
2. The filtering described in Section 3.2.1 was not applied to the test set. In particular, the
test set does contain texts with multiple labels, which our approach cannot handle.
3. In the process of fine-tuning the models, the validation set was used for checkpointing;
therefore, the models are overfit to the validation set to some degree.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>This paper presented the approach of Team Arthur Schopenhauer to Task 1 of the Touché lab
at CLEF 2024. The main idea of the approach is simplifying the given subtasks. It simplifies
Subtask 1 by eliminating the possibility of detecting multiple human values in a single text,
and simplifies Subtask 2 by eliminating the possibility of detecting any dependence between
referenced human values and attainment in a text. The source code for the approach is available
at the following link: https://github.com/h-uns/clef2024-human-value-detection.</p>
      <p>For future work, there are two notable areas of experimentation for improving the submitted
approach:
• Using larger or newer model architectures than the ones used in the approach.
• Developing separate, specialized models that detect only certain subsets of human values,
rather than all 19 values. Reducing the number of detectable human values is expected
to improve the training efficiency of each model. In addition, combining such models can
facilitate detecting multiple human values in a single text.</p>
      <p>[Table: per-value scores on the English subset for the submission valueeval24-arthur-schopenhauer, compared with the valueeval24-bert-baseline-en and valueeval24-random-baseline submissions; the per-value column headers were garbled in the source and could not be recovered.]</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>
        The approaches [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] from SemEval-2023 provided a valuable kickstart for this approach.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S. H.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cieciuch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vecchione</surname>
          </string-name>
          , E. Davidov,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fischer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Beierlein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Verkasalo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-E.</given-names>
            <surname>Lönnqvist</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Demirutku</surname>
          </string-name>
          , et al.,
          <source>Refining the Theory of Basic Individual Values, Journal of personality and social psychology 103</source>
          (
          <year>2012</year>
          ). doi:
          <volume>10</volume>
          .1037/a0029393.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kiesel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alshomary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Handke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wachsmuth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <article-title>Identifying the Human Values behind Arguments</article-title>
          , in: S. Muresan,
          <string-name>
            <given-names>P.</given-names>
            <surname>Nakov</surname>
          </string-name>
          ,
          <string-name>
            <surname>A</surname>
          </string-name>
          . Villavicencio (Eds.),
          <article-title>60th Annual Meeting of the Association for Computational Linguistics (ACL</article-title>
          <year>2022</year>
          ), Association for Computational Linguistics,
          <year>2022</year>
          , pp.
          <fpage>4459</fpage>
          -
          <lpage>4471</lpage>
          . doi:
          <volume>10</volume>
          .18653/v1/
          <year>2022</year>
          .
          <article-title>acl-long</article-title>
          .
          <volume>306</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kiesel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alshomary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mirzakhmedova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Heinrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Handke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wachsmuth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          , SemEval
          <article-title>-2023 Task 4: ValueEval: Identification of Human Values behind Arguments</article-title>
          , in: R.
          <string-name>
            <surname>Kumar</surname>
            ,
            <given-names>A. K.</given-names>
          </string-name>
          <string-name>
            <surname>Ojha</surname>
            ,
            <given-names>A. S.</given-names>
          </string-name>
          <string-name>
            <surname>Doğruöz</surname>
            ,
            <given-names>G. D. S.</given-names>
          </string-name>
          <string-name>
            <surname>Martino</surname>
          </string-name>
          , H. T. Madabushi (Eds.),
          <source>17th International Workshop on Semantic Evaluation (SemEval</source>
          <year>2023</year>
          ),
          <article-title>Association for Computational Linguistics</article-title>
          , Toronto, Canada,
          <year>2023</year>
          , pp.
          <fpage>2287</fpage>
          -
          <lpage>2303</lpage>
          . doi:
          <volume>10</volume>
          .18653/v1/
          <year>2023</year>
          .semeval-
          <volume>1</volume>
          .
          <fpage>313</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Kiesel</surname>
          </string-name>
          , Ç. Çöltekin,
          <string-name>
            <given-names>M.</given-names>
            <surname>Heinrich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Alshomary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. D.</given-names>
            <surname>Longueville</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Erjavec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Handke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kopp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ljubešić</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Meden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Mirzakhmedova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Morkevičius</surname>
          </string-name>
          , T. ReitisMünstermann, M. Scharfbillig,
          <string-name>
            <given-names>N.</given-names>
            <surname>Stefanovitch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wachsmuth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          , Overview of Touché 2024:
          <article-title>Argumentation Systems</article-title>
          , in: L.
          <string-name>
            <surname>Goeuriot</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Mulhem</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>Quénot</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <string-name>
            <surname>Schwab</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Soulier</surname>
          </string-name>
          ,
          <string-name>
            <surname>G. M. D. Nunzio</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <string-name>
            <surname>Galuščáková</surname>
          </string-name>
          ,
          <string-name>
            <surname>A. G. S. de Herrera</surname>
          </string-name>
          , G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fifteenth International Conference of the CLEF Association (CLEF</source>
          <year>2024</year>
          ), Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>
          , CoRR abs/
          <year>1810</year>
          .04805 (
          <year>2018</year>
          ). URL: http://arxiv. org/abs/
          <year>1810</year>
          .04805. arXiv:
          <year>1810</year>
          .04805.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Scharfbillig</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Smillie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mair</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sienkiewicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Keimer</surname>
          </string-name>
          , R. Pinho Dos Santos,
          <string-name>
            <given-names>H. Vinagreiro</given-names>
            <surname>Alves</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Vecchione</surname>
          </string-name>
          , L. Scheunemann, Values and Identities - a
          <string-name>
            <surname>Policymaker's Guide</surname>
          </string-name>
          ,
          <source>Technical Report KJ-NA-30800-EN-N, European Commission's Joint Research Centre, Luxembourg</source>
          ,
          <year>2021</year>
          . doi:
          <volume>10</volume>
          .2760/349527.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name><given-names>M.</given-names> <surname>Fröbe</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Wiegmann</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Kolyada</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Grahm</surname></string-name>,
          <string-name><given-names>T.</given-names> <surname>Elstner</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Loebe</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Hagen</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Stein</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Potthast</surname></string-name>,
          <article-title>Continuous Integration for Reproducible Shared Tasks with TIRA.io</article-title>,
          in: <string-name><given-names>J.</given-names> <surname>Kamps</surname></string-name>,
          <string-name><given-names>L.</given-names> <surname>Goeuriot</surname></string-name>,
          <string-name><given-names>F.</given-names> <surname>Crestani</surname></string-name>,
          <string-name><given-names>M.</given-names> <surname>Maistro</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Joho</surname></string-name>,
          <string-name><given-names>B.</given-names> <surname>Davis</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Gurrin</surname></string-name>,
          <string-name><given-names>U.</given-names> <surname>Kruschwitz</surname></string-name>,
          <string-name><given-names>A.</given-names> <surname>Caputo</surname></string-name> (Eds.),
          <source>Advances in Information Retrieval. 45th European Conference on IR Research (ECIR 2023)</source>,
          Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York,
          <year>2023</year>, pp. <fpage>236</fpage>-<lpage>241</lpage>.
          doi:10.1007/978-3-031-28241-6_20.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>The pandas development team</string-name>,
          <article-title>pandas-dev/pandas: Pandas</article-title>,
          <year>2020</year>.
          URL: https://doi.org/10.5281/zenodo.3509134. doi:10.5281/zenodo.3509134.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          ,
          <string-name><given-names>E.</given-names> <surname>Duchesnay</surname></string-name>,
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>P.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <string-name><given-names>W.</given-names> <surname>Chen</surname></string-name>,
          <article-title>DeBERTa: Decoding-enhanced BERT with disentangled attention</article-title>,
          in: <source>9th International Conference on Learning Representations (ICLR 2021)</source>,
          OpenReview.net, <year>2021</year>.
          URL: https://openreview.net/forum?id=XPZIaotutsD.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>A.</given-names>
            <surname>Conneau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Khandelwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Wenzek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Guzmán</surname>
          </string-name>
          ,
          <string-name><given-names>E.</given-names> <surname>Grave</surname></string-name>,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ott</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zettlemoyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Stoyanov</surname>
          </string-name>
          ,
          <article-title>Unsupervised cross-lingual representation learning at scale</article-title>
          ,
          in: <string-name><given-names>D.</given-names> <surname>Jurafsky</surname></string-name>,
          <string-name><given-names>J.</given-names> <surname>Chai</surname></string-name>,
          <string-name><given-names>N.</given-names> <surname>Schluter</surname></string-name>,
          <string-name><given-names>J. R.</given-names> <surname>Tetreault</surname></string-name> (Eds.),
          <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020)</source>,
          ACL, <year>2020</year>, pp. <fpage>8440</fpage>-<lpage>8451</lpage>.
          doi:10.18653/v1/2020.acl-main.747.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Debut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Sanh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chaumond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Delangue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Moi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Cistac</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rault</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Louf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Funtowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Davison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shleifer</surname>
          </string-name>
          ,
          <string-name><given-names>P.</given-names> <surname>von Platen</surname></string-name>,
          <string-name><given-names>C.</given-names> <surname>Ma</surname></string-name>,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jernite</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Plu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. L.</given-names>
            <surname>Scao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gugger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Drame</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Lhoest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Rush</surname>
          </string-name>
          ,
          <article-title>HuggingFace's Transformers: State-of-the-art natural language processing</article-title>,
          <year>2020</year>. arXiv:1910.03771.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>L.</given-names>
            <surname>Breiman</surname>
          </string-name>
          ,
          <article-title>Bagging predictors</article-title>,
          <source>Machine Learning</source>
          <volume>24</volume>
          (
          <year>1996</year>
          )
          <fpage>123</fpage>
          -
          <lpage>140</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mohri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhong</surname>
          </string-name>
          ,
          <article-title>Cross-entropy loss functions: Theoretical analysis and applications</article-title>,
          <year>2023</year>. URL: https://arxiv.org/abs/2304.07288. arXiv:2304.07288.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>T. H.</given-names>
            <surname>Phan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yamamoto</surname>
          </string-name>
          ,
          <article-title>Resolving class imbalance in object detection with weighted cross entropy losses</article-title>,
          arXiv preprint arXiv:2006.01413 (<year>2020</year>).
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>Paszke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gross</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Massa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lerer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bradbury</surname>
          </string-name>
          ,
          <string-name><given-names>G.</given-names> <surname>Chanan</surname></string-name>,
          <string-name>
            <given-names>T.</given-names>
            <surname>Killeen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Gimelshein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Antiga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Desmaison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Köpf</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>DeVito</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Raison</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Tejani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chilamkurthy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Steiner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Fang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chintala</surname>
          </string-name>
          ,
          <article-title>PyTorch: An imperative style, high-performance deep learning library</article-title>,
          <year>2019</year>. arXiv:1912.01703.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>D.</given-names>
            <surname>Schroter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Dementieva</surname>
          </string-name>
          ,
          <string-name><given-names>G.</given-names> <surname>Groh</surname></string-name>,
          <article-title>Adam-Smith at SemEval-2023 task 4: Discovering human values in arguments with ensembles of transformer-based models</article-title>,
          in: <string-name><given-names>A. K.</given-names> <surname>Ojha</surname></string-name>,
          <string-name><given-names>A. S.</given-names> <surname>Doğruöz</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Da San Martino</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Tayyar Madabushi</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Kumar</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Sartori</surname></string-name> (Eds.),
          <source>Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)</source>,
          Association for Computational Linguistics, Toronto, Canada,
          <year>2023</year>, pp. <fpage>532</fpage>-<lpage>541</lpage>.
          URL: https://aclanthology.org/2023.semeval-1.74. doi:10.18653/v1/2023.semeval-1.74.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>G.</given-names>
            <surname>Balikas</surname>
          </string-name>
          ,
          <article-title>John-arthur at SemEval-2023 task 4: Fine-tuning large language models for arguments classification</article-title>,
          in: <string-name><given-names>A. K.</given-names> <surname>Ojha</surname></string-name>,
          <string-name><given-names>A. S.</given-names> <surname>Doğruöz</surname></string-name>,
          <string-name><given-names>G.</given-names> <surname>Da San Martino</surname></string-name>,
          <string-name><given-names>H.</given-names> <surname>Tayyar Madabushi</surname></string-name>,
          <string-name><given-names>R.</given-names> <surname>Kumar</surname></string-name>,
          <string-name><given-names>E.</given-names> <surname>Sartori</surname></string-name> (Eds.),
          <source>Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)</source>,
          Association for Computational Linguistics, Toronto, Canada,
          <year>2023</year>, pp. <fpage>1428</fpage>-<lpage>1432</lpage>.
          URL: https://aclanthology.org/2023.semeval-1.197. doi:10.18653/v1/2023.semeval-1.197.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>