<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Profiling Cryptocurrency Influencers with Few-shot Learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Isabel Ferri-Molla</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jaume Santamaria-Jorda</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universitat Politècnica de Valencia</institution>
          ,
          <addr-line>Camí de Vera, s/n, 46022 València, Valencia</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>In this paper, we describe our systems for participating in the “Profiling Cryptocurrency Influencers with Few-shot Learning” shared task at PAN 2023. This work focuses on profiling cryptocurrency influencers from the limited data obtainable from social networks. We employ sparse-data learning techniques to classify cryptocurrency influencers into different categories across three subtasks: in the first, influencers are classified according to their number of followers; in the second, by their interests; and in the third, by their intent. Our approach is to compare the performance of statistical models and pre-trained language models, taking into account the limitations of the data. Furthermore, we focus on an in-depth exploration of the training parameters of the selected model to obtain the best possible metrics. Experimental results show that the pre-trained models obtain better overall metrics in all tasks, even with the small amounts of data available.</p>
      </abstract>
      <kwd-group>
        <kwd>author profiling</kwd>
        <kwd>cryptocurrency influencers</kwd>
        <kwd>language models</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The interest in cryptocurrencies has experienced a significant surge in recent years [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. With
their decentralized nature and independence from any authority, various cryptocurrency projects
have gained popularity among the general public. This rise has given birth to influencers
who propagate their viewpoints on social media platforms. Consequently, the profiling of
cryptocurrency influencers has become a topic of increasing interest due to the substantial
influence they can wield over investments and the overall market. Identifying and understanding
the characteristics of these influencers can provide valuable insights for investors and companies
involved in cryptocurrency.
      </p>
      <p>Presently, a large segment of the population spends a considerable amount of time on social
networks; in particular, platforms that allow people to post short messages, such as Twitter,
have gained prominent popularity. Within this social media landscape, it is evident that some
users possess greater influence than others. Certain individuals’ popularity can be so significant
that their opinions and messages have the power to shape the views of other users. These
influential users amass a substantial following and exert a wide-ranging impact on online social
interactions. Their tweets can reach diverse audiences and stimulate discussions and debates
on the topics they address. It is therefore worth determining whether a relationship exists
between users’ level of influence and the type of tweets they post.</p>
      <p>
        In this paper, we undertake the task of profiling cryptocurrency influencers using a dataset
with limited data. Our work revolves around the shared task titled "Profiling Cryptocurrency
Influencers with Few-Shot Learning" at PAN 2023 [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which falls within the PAN 2023 lab
on digital text forensics and stylometry [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. This study addresses the challenge of classifying
cryptocurrency influencers into different categories by employing few-shot learning. This
methodology proves particularly valuable when the available dataset is limited, and we aim
to generalize knowledge to new samples. Our approach relies on applying machine learning
techniques and comparing the performance of statistical models and pre-trained language
models in this specific case. We explore various parameter combinations of the latter to identify
the ones that yield superior accuracy and generalization, considering the data constraints we
encounter.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Related work</title>
      <p>This section provides an overview of similar solutions adopted in related problems within the
field of author profiling. Author profiling focuses on analyzing and extracting key characteristics
of an author based on their linguistic usage and style in text. Techniques employed in this field
include natural language processing (NLP), deep learning (DL), and data analytics.</p>
      <p>
        With the increasing popularity of social networking, author profiling has found significant
application in these platforms. Various areas within author profiling in social networks are
dedicated to predicting attributes of authors, such as gender [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], age [
        <xref ref-type="bibr" rid="ref4 ref6">4, 6</xref>
        ], or personality
[
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ].
      </p>
      <p>
        A comprehensive review of technologies used in author profiling can be found in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        In previous years, statistical models were widely employed in author profiling, as evidenced
in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], where techniques like Decision Trees, Random Forest, and Support Vector Machines
(SVM) are utilized to discern demographic and psychometric traits based on English emails.
Another notable example can be found in [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], where statistical models were utilized to profile
gender and age from both English and Spanish texts.
      </p>
      <p>
        Nevertheless, there has been a noticeable shift towards deep learning techniques, specifically
the utilization of large language models (LLMs). These techniques have gained significant
momentum and popularity in recent times. Moreover, these approaches have exhibited promising
metrics, further reinforcing their appeal and potential, as evidenced in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. Additionally, the
utilization of multi-model ensembles, as exemplified in [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], has become a popular approach.
      </p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], a transformer-based approach is employed for author profiling, utilizing vector
representations of contextualized words and hand-crafted features. This approach incorporates
a self-attention mechanism and a novel coding technique that integrates stylistic, thematic, and
personal information of the author. Another innovative approach explored in this field is the
use of Convolutional Neural Networks (CNNs), as demonstrated in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ].
      </p>
      <p>
        When specifically considering author profiling in tweets, several studies have been conducted.
For instance, [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] applies a product-based fusion strategy to combine encoded text
representations from BERT_base and image features from EfficientNet. Similarly, [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] investigates the
authorship of tweets related to COVID-19 in Portuguese. Further examples can be found in
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], where language aggressiveness is detected in Spanish tweets using diverse approaches
such as Bag of Terms, Second Order Attributes representation, Convolutional Neural Networks,
and Ensemble of N-grams.
      </p>
    </sec>
    <sec id="sec-3">
      <title>3. Proposed approaches</title>
      <p>The shared task comprises three distinct subtasks, each focusing on low-resource influencer
profiling. The first subtask involves categorizing influencers into five different categories based
on their number of followers, namely "null," "nano," "micro," "macro," and "mega" influencers.
To achieve this objective, a training dataset consisting of 160 tweeters was utilized, with each
tweeter having a maximum of 10 English-language tweets. A corresponding truth file was
provided, containing the tag class for each tweeter.</p>
      <p>Moving on to the second subtask, the objective is to classify tweets into five possible areas of
interest: "technical information," "price update," "trading matters," "gaming," and "other." The
dataset provided for this task consisted of 64 tweets per label, with a single tweet per user,
written in English. Similar to the first subtask, a truth file accompanied the dataset, indicating
the corresponding tag class for each user.</p>
      <p>Lastly, in the third subtask, the dataset followed a similar format to the previous task, with
the same size and characteristics, but the goal was to classify the influencer into one of the
following four categories: "subjective opinion," "financial information," "advertising," or
"announcement."</p>
      <p>In relation to subtask 1, we experimented with two different approaches due to the varying
number of tweets assigned to each user. Initially, we attempted to profile each influencer by
merging all their tweets into a single string, which served as input for our model. However, this
approach yielded poor results during our tests. Consequently, we pursued a second approach,
in which we split the list of tweets corresponding to each tweeter so that each tweet was
individually associated with the tweeter’s category. We then determined the mode of the classes
assigned to all tweets from the same influencer, and this label was assigned to the user.</p>
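      <p>As a minimal sketch of this per-user aggregation (the function and label names below are illustrative, not taken from our actual code), the mode can be computed as follows:</p>
      <preformat>
```python
from collections import Counter

def aggregate_user_label(tweet_predictions):
    """Return the mode of the per-tweet predicted labels for one user.

    Function and label names are illustrative, not from the task code."""
    # most_common(1) returns the most frequent label; on ties, the label
    # seen first in the list wins (Counter preserves insertion order).
    return Counter(tweet_predictions).most_common(1)[0][0]

# Hypothetical per-tweet predictions for a single influencer
per_tweet = ["micro", "macro", "micro", "nano", "micro"]
user_label = aggregate_user_label(per_tweet)  # "micro"
```
      </preformat>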
      <p>The second approach proved to be more successful in achieving desirable results for subtask
1. As a result, we directly utilized this approach for subtask 2 and subtask 3.</p>
      <p>Throughout our experimentation, we explored various models and solutions for the
classification task. Additionally, we aimed to compare these models and evaluate their performance,
taking into consideration potential variations in accuracy based on the train-test partition. To
achieve this, we implemented a 5-fold cross-validation technique, which allowed us to obtain
F1 and accuracy scores for each model. We strove to maintain balanced partitions during
cross-validation, ensuring that samples from the same user did not appear in different partitions
and aiming for equal representation of the classes.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Experimental setup</title>
      <p>This section presents the experimentation conducted for both approaches in subtask 1, as well
as the experiments carried out for subtasks 2 and 3.</p>
      <p>Regarding subtask 1, each approach involved the utilization of two different kinds of methods:
statistical methods and language models (LMs) specifically pre-trained for the task at hand.</p>
      <p>
        For the LM approach, we used TensorFlow [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] to fine-tune and evaluate the performance of
some Hugging Face models, specifically the BERT-base-uncased model [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], BERTweet-base (a
BERT model fine-tuned for English tweets) [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], and a RoBERTa model fine-tuned specifically
for English tweets [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ].
      </p>
      <p>
        On the other hand, for the statistical approach, we employed various models from the
scikit-learn library [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ]. Specifically, we conducted experiments with Support Vector Machines (SVM)
[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], K-means clustering [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], Perceptron [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], and logistic regression [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ]. Tokenization was
performed for the statistical models, wherein special characters such as @, #, etc. were replaced
with corresponding keywords.
      </p>
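      <p>As an illustrative sketch of this kind of statistical pipeline (the keyword strings and model settings below are our assumptions; the exact ones used are not specified above), special tokens can be mapped to keywords before vectorization:</p>
      <preformat>
```python
# Sketch of the statistical setup: special characters are replaced with
# keyword tokens before a TF-IDF + logistic regression classifier.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def replace_special_tokens(text):
    text = re.sub(r"@\w+", "usermention", text)         # @handles
    text = re.sub(r"#(\w+)", r"hashtag \1", text)       # #hashtags
    text = re.sub(r"https?://\S+", "urltoken", text)    # links
    return text

tweets = ["@bob #BTC to the moon", "price update at https://example.com"]
labels = ["other", "price update"]

model = make_pipeline(TfidfVectorizer(preprocessor=replace_special_tokens),
                      LogisticRegression(max_iter=1000))
model.fit(tweets, labels)
```
      </preformat>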
      <p>Table 1 demonstrates the outcomes obtained using the first approach. It reveals that superior
results were achieved through the application of statistical methods, with logistic regression
yielding the highest macro F1 score, closely followed by Support Vector Machines (SVM). Among
the fine-tuned methods, the BERT-base-uncased classifier emerged as the top-performing model.
These findings highlight the efficacy of logistic regression and SVM in the context of subtask 1,
while also showcasing the competitive performance of the BERT-base-uncased classifier among
the fine-tuned methods.</p>
      <p>On the other hand, Table 2 presents the results obtained using the second approach. It is
evident that, overall, higher metrics were achieved for all the models compared to the first
approach. Notably, the finest outcomes were attained through the fine-tuning of the
BERT-base-uncased model. Conversely, the statistical models exhibited slightly lower F1 scores in this case.
These findings highlight the superior performance of the BERT-based approach in subtask 1 of
influencer profiling, further emphasizing the potential of fine-tuned language models in this
domain.</p>
      <p>
        After conducting multiple tests, we decided to explore a different approach inspired
by the existing literature. Our main goal was to utilize Convolutional Neural Networks (CNNs)
[
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] for author classification, as CNNs have demonstrated their effectiveness in capturing local
patterns and extracting relevant features.
      </p>
      <p>To begin, we partitioned the data using the second approach described previously; we then
preprocessed and normalized the text data. This involved removing HTML tags, normalizing
characters, converting text to lowercase, and applying other necessary transformations.
Subsequently, we performed tokenization, padding, and feature extraction to prepare the data for the
CNN.</p>
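      <p>A sketch of these cleaning steps (the exact rules and regexes are our assumptions, not the ones used in our code):</p>
      <preformat>
```python
import re
import unicodedata

# chr(60)/chr(62) build the angle brackets indirectly only so this snippet
# does not break the surrounding XML; the pattern is a naive tag stripper.
TAG_RE = re.compile(chr(60) + "[^" + chr(62) + "]*" + chr(62))

def clean_text(text):
    """Illustrative cleaning: strip HTML tags, normalize, lowercase."""
    text = TAG_RE.sub(" ", text)                # remove HTML tags
    text = unicodedata.normalize("NFKC", text)  # normalize characters
    text = text.lower()                         # convert to lowercase
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace
```
      </preformat>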
      <p>The CNN architecture incorporated Conv1D layers, which utilized filters to capture local
patterns and extract word features from the embedded representations of the tweets. To optimize
the performance of the CNN, we conducted experiments to determine the best parameters.
After thorough exploration, we selected 180 epochs, a batch size of 128, and an embed_size of
300. Upon evaluating the model using the F1 score metric, we obtained a value of 0.45, which
did not rank among the top-performing models.</p>
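      <p>To illustrate what a Conv1D layer computes over the embedded tweets, the following NumPy sketch applies a single convolution filter with global max-pooling (dimensions are toy values; bias and activation are omitted):</p>
      <preformat>
```python
import numpy as np

def conv1d_maxpool(embeddings, kernel):
    """One Conv1D filter of width k sliding over a (seq_len, embed_dim)
    matrix of word embeddings, followed by global max-pooling."""
    k = kernel.shape[0]
    n_windows = embeddings.shape[0] - k + 1
    # each response scores one k-token window against the filter
    responses = [float(np.sum(embeddings[i:i + k] * kernel))
                 for i in range(n_windows)]
    return max(responses)

rng = np.random.default_rng(0)
emb = rng.normal(size=(10, 4))   # 10 tokens; the real model used embed_size 300
filt = rng.normal(size=(3, 4))   # filter spanning 3-token windows
feature = conv1d_maxpool(emb, filt)
```
      </preformat>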
      <p>As the results with the CNN were not as expected, we reverted to the approach of using
pre-trained LLMs; in this case, we created an ensemble of LMs using the BERT-base, RoBERTa,
and BERTweet fine-tuned models explained above. To do so, we first fine-tuned the different
LMs with the subtask 1 data, previously separated following the second approach. Then, to classify a
new sample, we combined the predictions of the three models so that the class finally predicted is
the mode of the predictions of all the LLMs used in the ensemble.</p>
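      <p>The voting step of the ensemble can be sketched as follows (the label values are hypothetical; on a three-way tie, the first model's vote wins):</p>
      <preformat>
```python
from collections import Counter

def ensemble_vote(bert_preds, roberta_preds, bertweet_preds):
    """Final label per sample = mode of the three models' predictions.
    Counter preserves insertion order, so ties fall to the first model."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(bert_preds, roberta_preds, bertweet_preds)]

# Hypothetical per-model predictions for two samples
final = ensemble_vote(["micro", "nano"], ["micro", "mega"], ["macro", "mega"])
# final == ["micro", "mega"]
```
      </preformat>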
      <p>
        Although the ensemble was not a bad approach and good results were obtained, there
was quite a large difference between the individual BERT-base-uncased and BERTweet
metrics, so we wanted to test whether the results obtained with the individual BERT model,
by exploring its training parameters, could be better than those we obtained with the ensemble. We
found the DistilBERT [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] variant, which is a distilled version of the BERT model, lighter and
faster, but maintaining its understanding capabilities.
      </p>
      <p>
        In order to try to improve the results of the ensemble, we tested different parameters when
training DistilBERT. The initial parameters used in previous tests of BERT are based on those
recommended by [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], but adapted to the size of the task: a learning rate of 2e-5, 6
epochs, and a batch size of 16. After testing these parameters, an exhaustive exploration was
carried out to find the best possible combination, from the learning rate and batch size to
the number of epochs, limiting their values to specific ranges. As results were obtained, we performed
different iterations in which we explored different parameter ranges to narrow down the error
rate. After testing various combinations, we determined that the optimal parameters for this
approach were a learning rate of 5e-5, 3 epochs, and a batch size of 4; with these parameters we
obtained the best macro F1 score in our experiments for this task, 0.61.
      </p>
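      <p>For reference, the macro F1 score that guided this parameter search is the unweighted mean of per-class F1 scores; a from-scratch sketch:</p>
      <preformat>
```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    # every class counts equally, regardless of its support
    return sum(scores) / len(scores)
```
      </preformat>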
      <p>In order to achieve the objective of subtask 2, different models were also trained. To ensure
optimal training and evaluation, a test partition was created by splitting the original training
data. Cross-validation was employed by equally dividing the number of samples into folds.</p>
      <p>As in subtask 1, we compared different methods. The first involved statistical models,
including the ones used in subtask 1, with the addition of the multilayer perceptron [29], Naive
Bayes [30], Random Forest [31], and the ridge classifier [32]. These models demonstrated
competitive results even with limited data. The second approach, fine-tuning pre-trained
models, proved to be more effective, as observed in subtask 1. The best parameters found for
this task were also the same as for the previous one. The results of these different models can
be observed in Table 3.</p>
      <p>In Subtask 3, we employed a similar methodology as in Subtask 2. Initially, we conducted
experiments using different statistical models. However, these models did not attain the desired
level of accuracy. Consequently, we focused our efforts primarily on testing and fine-tuning the
DistilBERT model, as it had exhibited the most promising outcomes in previous tasks.</p>
      <p>To evaluate the performance of the models, we compared their accuracy and F1 metrics. The
results are summarized in Table 4.</p>
      <p>Table 4 illustrates the accuracy and F1 scores attained by the various models. Among
the statistical models, accuracy scores ranged from 0.56 to 0.62, while F1 scores fell between
0.58 and 0.61. However, the pre-trained language models demonstrated superior performance
compared to the statistical models, achieving accuracy scores of up to 0.83 and F1 scores of 0.84.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Results</title>
      <p>This section presents the final results obtained on TIRA.</p>
      <p>In subtask 1, two different models were tested on the platform. Firstly, the ensemble of the
three different language models (LMs) discussed in section 4 was evaluated, achieving an F1
score of 0.45.</p>
      <p>Additionally, the DistilBERT model explained in section 4 was also employed. After the
parameter exploration, we determined that the optimal parameters for this approach were a
learning rate of 5e-5, 3 epochs, and a batch size of 4. Notably, this approach outperformed the
ensemble approach in TIRA, attaining a macro F1 score of 0.57.</p>
      <p>These results highlight the effectiveness of the DistilBERT model for low-resource influencer
profiling in subtask 1, surpassing the performance of the ensemble model on the TIRA platform.</p>
      <p>Regarding the second subtask, after the experimentation explained in section 4 we concluded
that the fine-tuned DistilBERT obtained the best F1 metric, so this was the model presented in
TIRA. It achieved a final macro F1 score of 0.55 on the platform.</p>
      <p>In Subtask 3, our experimental findings revealed that the fine-tuned DistilBERT model yielded
the most favorable outcomes, which is why it was the model tested on TIRA (a platform for
reproducible participation in shared tasks from information retrieval, natural language
processing, and machine learning, where organizers can provide datasets to participants and manage
their submissions; https://www.tira.io/). It consistently achieved results in the TIRA evaluation
that surpassed the average performance, exhibiting a noteworthy macro F1 score of 0.61.</p>
      <p>Extensive experimentation was conducted for the DistilBERT models employed in both
Subtasks 2 and 3 to identify the optimal parameters. Surprisingly, the best parameters obtained
for both tasks were consistent: a learning rate of 5e-5, 3 epochs, and a batch size of 4, mirroring
the parameters used in Subtask 1. This observation suggests that the tasks may share a certain
level of similarity, leading to the convergence of optimal parameter values across them.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion and Future Works</title>
      <p>In this work we have trained several models with limited data in order to profile cryptocurrency
influencers. Throughout our experiments, we observed that fine-tuning pre-trained models
generally yielded superior results to statistical models. Specifically, for Subtask 1 we
found that splitting the tweet list of each influencer and individually associating each tweet
with the corresponding label proved to be a more effective approach. Although the performance
improvement over statistical models was not as substantial compared to other subtasks, the
fine-tuned neural models demonstrated better performance. Through an ensemble of neural
models in TIRA, we achieved an F1 score of 0.45. Furthermore, after extensive parameter testing,
we obtained the best results using a pre-trained DistilBERT model, achieving an F1 score of 0.57.</p>
      <p>In relation to Subtask 2, a more pronounced disparity was observed between statistical and
neural models. Notably, the best outcomes were achieved using a DistilBERT model, which
yielded an F1 score of 0.55. Finally, in relation to Subtask 3, we once again experimented
with both statistical and neural models. After exploring various training parameters, it was
determined that a fine-tuned DistilBERT model emerged as the superior choice, resulting in an
F1 metric of 0.61.</p>
      <p>Although the best results have been obtained with pre-trained models, there is still room
for improvement, and it would be of interest to explore other models, as well as alternative
structures and ways of assembling them. Given that the statistical techniques have given good
overall results, it would be interesting to explore them further, to test ensembles of models, and
to experiment with new ways of preprocessing the data.</p>
      <p>[29] M. W. Gardner, S. Dorling, Artificial neural networks (the multilayer perceptron)—a
review of applications in the atmospheric sciences, Atmospheric Environment 32 (1998)
2627–2636. [30] I. Rish, et al., An empirical study of the naive Bayes classifier, in: IJCAI 2001
Workshop on Empirical Methods in Artificial Intelligence, volume 3, 2001, pp. 41–46. [31]
L. Breiman, Random forests, Machine Learning 45 (2001) 5–32. [32] J. He, L. Ding, L. Jiang,
L. Ma, Kernel ridge regression classification, in: 2014 International Joint Conference on Neural
Networks (IJCNN), IEEE, 2014, pp. 2263–2267.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sawhney</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Nanda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Chava</surname>
          </string-name>
          ,
          <article-title>Cryptocurrency bubble detection: A new stock market dataset, financial task &amp; hyperbolic models, in: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Association for Computational Linguistics</article-title>
          , Seattle, United States,
          <year>2022</year>
          , pp.
          <fpage>5531</fpage>
          -
          <lpage>5545</lpage>
          . URL: https://aclanthology.org/2022.naacl-main.405. doi:10.18653/v1/2022.naacl-main.405.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>M.</given-names>
            <surname>Chinea-Rios</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Borrego-Obrador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franco-Salvador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <article-title>Profiling Cryptocurrency Influencers with Few shot Learning at PAN 2023, in: CLEF 2022 Labs and Workshops</article-title>
          , Notebook Papers,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Borrego-Obrador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chinea-Ríos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franco-Salvador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Heini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kredens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pęzik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolska</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2023:
          <article-title>Authorship Verification, Multi-Author Writing Style Analysis, Profiling Cryptocurrency Influencers, and Trigger Detection</article-title>
          , in: A. Arampatzis, E. Kanoulas, T. Tsikrika, A. Giachanou, S. Vrochidis, D. Li, M. Aliannejadi, M. Vlachos, G. Faggioli, N. Ferro (Eds.),
          <source>Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF</source>
          <year>2023</year>
          ), Lecture Notes in Computer Science, Springer,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Goswami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sarkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rustagi</surname>
          </string-name>
          ,
          <article-title>Stylometric analysis of bloggers' age and gender</article-title>
          ,
          <source>in: Proceedings of the International AAAI Conference on Web and Social Media</source>
          , volume
          <volume>3</volume>
          ,
          <year>2009</year>
          , pp.
          <fpage>214</fpage>
          -
          <lpage>217</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Flekova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Gurevych</surname>
          </string-name>
          ,
          <article-title>Can we hide in the web? Large scale simultaneous age and gender author profiling in social media</article-title>
          ,
          <source>in: CLEF 2012 Labs and Workshop</source>
          , Notebook Papers,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Schwartz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Eichstaedt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. L.</given-names>
            <surname>Kern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Dziurzynski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Ramones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Agrawal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Shah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kosinski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Stillwell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Seligman</surname>
          </string-name>
          , et al.,
          <article-title>Personality, gender, and age in the language of social media: The open-vocabulary approach</article-title>
          ,
          <source>PLoS ONE 8</source>
          (
          <year>2013</year>
          )
          e73791
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bachrach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kosinski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Graepel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Kohli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Stillwell</surname>
          </string-name>
          ,
          <article-title>Personality and patterns of Facebook usage</article-title>
          ,
          <source>in: Proceedings of the 4th annual ACM web science conference</source>
          ,
          <year>2012</year>
          , pp.
          <fpage>24</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Bevendorff</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chinea-Ríos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Franco-Salvador</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Heini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Körner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Kredens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mayerl</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Pęzik</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Potthast</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rangel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Rosso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Stamatatos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Stein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wiegmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wolska</surname>
          </string-name>
          , E. Zangerle, Overview of PAN 2023:
          <article-title>Authorship verification, multi-author writing style analysis, profiling cryptocurrency influencers, and trigger detection</article-title>
          , in:
          <string-name>
            <given-names>J.</given-names>
            <surname>Kamps</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Goeuriot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Crestani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Maistro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Joho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gurrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kruschwitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Caputo</surname>
          </string-name>
          (Eds.),
          <source>Advances in Information Retrieval</source>
          , Springer Nature Switzerland, Cham,
          <year>2023</year>
          , pp.
          <fpage>518</fpage>
          -
          <lpage>526</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>D.</given-names>
            <surname>Estival</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gaustad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. B.</given-names>
            <surname>Pham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hutchinson</surname>
          </string-name>
          ,
          <article-title>Author profiling for English emails</article-title>
          ,
          <source>in: Proceedings of the 10th Conference of the Pacific Association for Computational Linguistics</source>
          , volume
          <volume>263</volume>
          ,
          Citeseer
          ,
          <year>2007</year>
          , p.
          <fpage>272</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>De-Arteaga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jimenez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Duenas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mancera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Baquero</surname>
          </string-name>
          ,
          <article-title>Author profiling using corpus statistics, lexicons and stylistic features</article-title>
          , Online Working Notes of the 10th PAN Evaluation Lab on Uncovering Plagiarism, Authorship, and Social Misuse
          ,
          <source>CLEF</source>
          (
          <year>2013</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fabien</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Villatoro-Tello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Motlicek</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Parida</surname>
          </string-name>
          , BertAA:
          <article-title>Bert fine-tuning for authorship attribution</article-title>
          ,
          <source>in: Proceedings of the 17th International Conference on Natural Language Processing (ICON)</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>127</fpage>
          -
          <lpage>137</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>J. P.</given-names>
            <surname>Delmondes Neto</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Paraboni</surname>
          </string-name>
          ,
          <article-title>Multi-source BERT stack ensemble for cross-domain author profiling</article-title>
          ,
          <source>Expert Systems</source>
          <volume>39</volume>
          (
          <year>2022</year>
          )
          e12869
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>López-Santillán</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>González</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Montes-y Gómez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>López-Monroy</surname>
          </string-name>
          ,
          <article-title>When attention is not enough to unveil a text's author profile: Enhancing a transformer with a wide branch</article-title>
          ,
          <source>Neural Computing and Applications</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Aragón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.-P.</given-names>
            <surname>López-Monroy</surname>
          </string-name>
          ,
          <article-title>A straightforward multimodal approach for author profiling</article-title>
          ,
          <source>in: Proceedings of the Ninth International Conference of the CLEF Association (CLEF 2018)</source>
          ,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Suman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Naman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Saha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bhattacharyya</surname>
          </string-name>
          ,
          <article-title>A multimodal author profiling system for tweets</article-title>
          ,
          <source>IEEE Transactions on Computational Social Systems</source>
          <volume>8</volume>
          (
          <year>2021</year>
          )
          <fpage>1407</fpage>
          -
          <lpage>1416</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P. V.</given-names>
            <surname>Brum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. C.</given-names>
            <surname>Teixeira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Miranda</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Vimieiro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Meira Jr.</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. L.</given-names>
            <surname>Pappa</surname>
          </string-name>
          ,
          <article-title>A characterization of Portuguese tweets regarding the COVID-19 pandemic</article-title>
          ,
          <source>in: Anais do VIII Symposium on Knowledge Discovery, Mining and Learning</source>
          , SBC,
          <year>2020</year>
          , pp.
          <fpage>177</fpage>
          -
          <lpage>184</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Aragón</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. P.</given-names>
            <surname>López-Monroy</surname>
          </string-name>
          ,
          <article-title>Author profiling and aggressiveness detection in Spanish tweets: MEX-A3T</article-title>
          ,
          <source>in: IberEval@SEPLN</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>134</fpage>
          -
          <lpage>139</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abadi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Agarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Barham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Brevdo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Citro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Devin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ghemawat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Goodfellow</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harp</surname>
          </string-name>
          , G. Irving,
          <string-name>
            <given-names>M.</given-names>
            <surname>Isard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jozefowicz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Kaiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kudlur</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Levenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mané</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Monga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Moore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Murray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Olah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Shlens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Steiner</surname>
          </string-name>
          , I. Sutskever,
          <string-name>
            <given-names>K.</given-names>
            <surname>Talwar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Tucker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vanhoucke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vasudevan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Viégas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Vinyals</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Warden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wattenberg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wicke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <source>TensorFlow: Large-scale machine learning on heterogeneous systems</source>
          ,
          <year>2015</year>
          . URL: https://www.tensorflow.org/, software available from tensorflow.org.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          , BERT:
          <article-title>Pre-training of deep bidirectional transformers for language understanding</article-title>
          ,
          <source>in: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          Association for Computational Linguistics
          , Minneapolis, Minnesota,
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          . URL: https://aclanthology.org/N19-1423. doi:10.18653/v1/N19-1423.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>D. Q.</given-names>
            <surname>Nguyen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Vu</surname>
          </string-name>
          , A. T. Nguyen,
          <article-title>BERTweet: A pre-trained language model for English Tweets</article-title>
          ,
          <source>in: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>9</fpage>
          -
          <lpage>14</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ushio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Camacho-Collados</surname>
          </string-name>
          ,
          T-NER:
          <article-title>An all-round Python library for transformer-based named entity recognition</article-title>
          ,
          <source>in: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations</source>
          , Association for Computational Linguistics
          , Online,
          <year>2021</year>
          , pp.
          <fpage>53</fpage>
          -
          <lpage>62</lpage>
          . URL: https://aclanthology.org/2021.eacl-demos.7. doi:10.18653/v1/2021.eacl-demos.7.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>F.</given-names>
            <surname>Pedregosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Varoquaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gramfort</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Michel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Thirion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Grisel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Blondel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Prettenhofer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Dubourg</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Vanderplas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Passos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Cournapeau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Brucher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrot</surname>
          </string-name>
          , E. Duchesnay,
          <article-title>Scikit-learn: Machine learning in Python</article-title>
          ,
          <source>Journal of Machine Learning Research</source>
          <volume>12</volume>
          (
          <year>2011</year>
          )
          <fpage>2825</fpage>
          -
          <lpage>2830</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Hearst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Dumais</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Osuna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Platt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Scholkopf</surname>
          </string-name>
          ,
          <article-title>Support vector machines</article-title>
          ,
          <source>IEEE Intelligent Systems and their Applications 13</source>
          (
          <year>1998</year>
          )
          <fpage>18</fpage>
          -
          <lpage>28</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>J. A.</given-names>
            <surname>Hartigan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Wong</surname>
          </string-name>
          , Algorithm AS 136:
          <article-title>A k-means clustering algorithm</article-title>
          ,
          <source>Journal of the Royal Statistical Society, Series C (Applied Statistics)</source>
          <volume>28</volume>
          (
          <year>1979</year>
          )
          <fpage>100</fpage>
          -
          <lpage>108</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>F.</given-names>
            <surname>Rosenblatt</surname>
          </string-name>
          ,
          <article-title>The perceptron: a probabilistic model for information storage and organization in the brain</article-title>
          ,
          <source>Psychological Review</source>
          <volume>65</volume>
          (
          <year>1958</year>
          )
          <fpage>386</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Wright</surname>
          </string-name>
          ,
          <article-title>Logistic regression</article-title>
          (
          <year>1995</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Albawi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Mohammed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Al-Zawi</surname>
          </string-name>
          ,
          <article-title>Understanding of a convolutional neural network</article-title>
          ,
          <source>in: 2017 International Conference on Engineering and Technology (ICET)</source>
          , IEEE,
          <year>2017</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>V.</given-names>
            <surname>Sanh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Debut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chaumond</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wolf</surname>
          </string-name>
          ,
          <article-title>DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter</article-title>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>