<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Language #tweet
English</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Text and Image Synergy with Feature Cross Technique for Gender Identification</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          <institution>Fuji Xerox Co., Ltd</institution>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Takumi Takahashi</institution>
          ,
          <addr-line>Takuji Tahara, Koki Nagatani, Yasuhide Miura, Tomoki Taniguchi, and Tomoko Ohkuma</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>10</volume>
      <issue>72</issue>
      <abstract>
        <p>This paper describes a neural network model for the author profiling task of PAN@CLEF 2018. Traditional machine learning models have performed best on the author profiling task in past PAN editions. However, these models often require careful feature engineering to perform well. Neural network approaches, on the other hand, have recently achieved strong results in both natural language processing (NLP) and computer vision (CV) tasks. We tackle the author profiling task using neural networks for texts and images. To leverage the synergy between texts and images, we propose the Text Image Fusion Neural Network (TIFNN), which considers their interaction. In an in-house experiment, TIFNN achieved accuracies of 84-90% across languages for gender identification.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Author profiling technologies, which extract author traits from social media, can
be applied to applications such as advertising, recommendation, and marketing.
The PAN 2018 Author Profiling Task [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] is to identify a user’s gender from tweets that
contain texts and images in three languages (English, Spanish, and Arabic).
      </p>
      <p>
        In the PAN 2017 Author Profiling Task, various approaches based on deep neural
networks (DNNs) were presented [
        <xref ref-type="bibr" rid="ref16 ref6 ref7 ref9">6,7,9,16,18</xref>
        ]. However, these approaches could not
outperform carefully designed traditional machine learning models, such as
support vector machines. In contrast, neural network approaches have performed
strongly on various NLP tasks, e.g., machine translation, summarization, and
information retrieval. In addition, DNN approaches have achieved strong results
in various CV tasks.
      </p>
      <p>Because the PAN 2018 Author Profiling Task includes both texts and images, we expect
that using both in a neural network will improve performance. Therefore,
we tackle this task using both texts and images in a DNN-based approach.</p>
      <p>To leverage the synergy between texts and images, we propose the Text Image
Fusion Neural Network (TIFNN), which considers their interaction. This paper makes
the following contributions:
        <list list-type="order">
          <list-item><p>We propose an effective fusion strategy for a neural network to utilize texts and
images for gender identification.</p></list-item>
          <list-item><p>We show that TIFNN substantially improves accuracy (by 3-8 points) compared with
both a text-based neural network and an image-based neural network.</p></list-item>
        </list>
      </p>
      <p>The rest of this paper is organized as follows. Section 2 reviews related work.
Section 3 describes our neural network model. Section 4 details the experiments used
to confirm the model’s performance. Finally, Section 5 concludes
the paper and outlines future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2 Related Work</title>
      <p>
        In PAN editions before 2017, the Author Profiling Task was to identify both age and
gender from social media text [
        <xref ref-type="bibr" rid="ref12 ref15">12,15</xref>
        ]. In 2017, the task included
language variety identification instead of age identification [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. In the PAN 2017 Author
Profiling Task, models based not only on traditional machine learning but also on
deep neural networks were presented.
      </p>
      <p>
        Basile et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] used a linear support vector machine (SVM) with character 3- to
5-gram and word 1- to 2-gram features and showed that it outperforms other approaches.
Martinc et al. [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] explored many approaches (e.g., linear SVM, logistic regression,
random forest, XGBoost, and a voting classifier combining these models) with various
parameters for this task. For the final test, they used logistic regression because it showed the best
performance. Tellez et al. [20] used a generic framework for text classification called
MicroTC. As these studies show, carefully designed traditional machine learning
approaches performed best in this task.
      </p>
      <p>
        Approaches based on deep neural networks were also
presented [
        <xref ref-type="bibr" rid="ref16 ref6 ref7 ref9">6,7,9,16,18</xref>
        ]. Miura et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] used both a bi-directional GRU with an attention
mechanism to capture word representations and a convolutional neural network (CNN)
to capture character representations. Sierra et al. [18] applied a CNN with a set
of convolutional filters of different sizes to capture n-gram features. Although
deep neural networks are strong models for many NLP tasks, the above
approaches could not outperform traditional machine learning approaches in this task.
      </p>
      <p>Outside of PAN, there are also author profiling studies that utilize images or
multimodal inputs. Shigenaka et al. [17] utilized images to identify both the gender of users and
the objects in images with a multi-task bilinear model. In addition, Vijayaraghavan et al. [21]
presented a state-of-the-art model that utilized both texts and images to predict users’
traits such as gender, age, political orientation, and location.</p>
      <p>To summarize, approaches using traditional machine learning
performed best in past PAN editions. Although text-only approaches based
on deep neural networks could not outperform traditional machine
learning approaches, the studies in [17,21] indicate that utilizing images is effective
for predicting author profile traits. Because images are available in the PAN 2018
Author Profiling Task, we expect that utilizing both texts and images is effective for gender
identification.</p>
      <p>[Figure 1: The proposed model. Text Component: words, Word Embedding, RNNW, PoolingW, PoolingT, FC1UT. Image Component: images, CNNI, PoolingI, FCUI. Fusion Component: column-wise pooling and row-wise pooling, followed by FC1, FC2, and the label.]</p>
    </sec>
    <sec id="sec-3">
      <title>3 Proposed Model</title>
      <p>
        This section describes the text component of the model, which is the “Text Component”
division in Figure 1. The component is based on a previous model [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Figure 2 provides an overview of the text component. The component consists of
word embedding, recurrent neural network (RNN), pooling, and fully connected (FC)
layers. For the RNN, we used a Gated Recurrent Unit (GRU) [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] in a bi-directional
setting.
      </p>
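      <p>As a rough illustration of the component outlined above, the following NumPy sketch runs a bi-directional GRU over stand-in word embeddings, pools over the words of each tweet, and then pools over tweets to obtain one user vector. It is a minimal sketch under stated assumptions, not the implementation used in the experiments: the hidden size, the random weights, the helper names (gru_step, run_gru, encode_tweet, encode_user), and the particular pooling choices are illustrative, and the final fully connected layer is omitted.</p>
      <preformat>
```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gru_step(x_t, h_prev, p):
    """One GRU transition: update gate, reset gate, candidate state, new state."""
    z = sigmoid(p["Wz"] @ x_t + p["Uz"] @ h_prev + p["bz"])
    r = sigmoid(p["Wr"] @ x_t + p["Ur"] @ h_prev + p["br"])
    h_cand = np.tanh(p["Wh"] @ x_t + p["Uh"] @ (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * h_cand


def init_params(k_w, k_h, rng):
    """Random stand-in weights; a trained model would learn these."""
    shapes = {"Wz": (k_h, k_w), "Wr": (k_h, k_w), "Wh": (k_h, k_w),
              "Uz": (k_h, k_h), "Ur": (k_h, k_h), "Uh": (k_h, k_h),
              "bz": (k_h,), "br": (k_h,), "bh": (k_h,)}
    return {name: rng.normal(scale=0.1, size=shape) for name, shape in shapes.items()}


def run_gru(xs, p, k_h):
    """Run the GRU over a sequence of embeddings and return all states, shape (T, k_h)."""
    h = np.zeros(k_h)
    states = []
    for x_t in xs:
        h = gru_step(x_t, h, p)
        states.append(h)
    return np.stack(states)


def encode_tweet(xs, fwd, bwd, k_h):
    """Concatenate forward and backward states, then max-pool over time (PoolingW)."""
    h_fwd = run_gru(xs, fwd, k_h)
    h_bwd = run_gru(xs[::-1], bwd, k_h)[::-1]
    g = np.concatenate([h_fwd, h_bwd], axis=1)
    return g.max(axis=0)


def encode_user(tweets, fwd, bwd, k_h):
    """Average the tweet features of one user (PoolingT with average pooling)."""
    m_t = np.stack([encode_tweet(xs, fwd, bwd, k_h) for xs in tweets])
    return m_t.mean(axis=0)


rng = np.random.default_rng(0)
k_w, k_h = 100, 8  # embedding size and hidden size (hidden size is an assumption)
fwd, bwd = init_params(k_w, k_h, rng), init_params(k_w, k_h, rng)
# Three tweets of 5, 7, and 3 words; random vectors stand in for embedded words.
tweets = [rng.normal(size=(n, k_w)) for n in (5, 7, 3)]
m_u = encode_user(tweets, fwd, bwd, k_h)
print(m_u.shape)
```
      </preformat>
      <p>The user vector has twice the hidden size because the forward and backward states are concatenated before pooling. Swapping max for mean in either pooling function gives the other pooling variant.</p>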
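      <p>First, the input words are embedded into <inline-formula><tex-math>k_w</tex-math></inline-formula>-dimensional word embeddings with the
embedding matrix <inline-formula><tex-math>E_w</tex-math></inline-formula> to obtain <inline-formula><tex-math>x</tex-math></inline-formula> with <inline-formula><tex-math>x_t \in \mathbb{R}^{k_w}</tex-math></inline-formula>. The embeddings <inline-formula><tex-math>x</tex-math></inline-formula> are then fed to RNNW with the
following transition functions:
        <disp-formula><label>(1)</label><tex-math>z_t = \sigma(W_z x_t + U_z h_{t-1} + b_z)</tex-math></disp-formula>
        <disp-formula><label>(2)</label><tex-math>r_t = \sigma(W_r x_t + U_r h_{t-1} + b_r)</tex-math></disp-formula>
        <disp-formula><label>(3)</label><tex-math>\tilde{h}_t = \tanh(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h)</tex-math></disp-formula>
        <disp-formula><label>(4)</label><tex-math>h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t</tex-math></disp-formula>
where <inline-formula><tex-math>z_t</tex-math></inline-formula> is an update gate, <inline-formula><tex-math>r_t</tex-math></inline-formula> is a reset gate, <inline-formula><tex-math>\tilde{h}_t</tex-math></inline-formula> is a candidate state, <inline-formula><tex-math>h_t</tex-math></inline-formula> is a state, <inline-formula><tex-math>W_z, W_r, W_h, U_z, U_r, U_h</tex-math></inline-formula> are weight matrices, <inline-formula><tex-math>b_z, b_r, b_h</tex-math></inline-formula> are bias vectors, <inline-formula><tex-math>\sigma</tex-math></inline-formula> is the logistic
sigmoid function, and <inline-formula><tex-math>\odot</tex-math></inline-formula> is the element-wise multiplication operator. The forward and backward output vectors
<inline-formula><tex-math>\overrightarrow{h}</tex-math></inline-formula> and <inline-formula><tex-math>\overleftarrow{h}</tex-math></inline-formula> are concatenated to obtain <inline-formula><tex-math>g</tex-math></inline-formula> as <inline-formula><tex-math>g_t = [\overrightarrow{h}_t; \overleftarrow{h}_t]</tex-math></inline-formula>, which is then fed to PoolingW.
In PoolingW, <inline-formula><tex-math>g</tex-math></inline-formula> is processed with max pooling or average pooling over time to obtain the <inline-formula><tex-math>i</tex-math></inline-formula>-th tweet feature <inline-formula><tex-math>m^t_i</tex-math></inline-formula>,
which is fed to PoolingT. In PoolingT, the tweet features <inline-formula><tex-math>m^t_i</tex-math></inline-formula> are processed in the same way as in PoolingW to obtain the <inline-formula><tex-math>j</tex-math></inline-formula>-th
user feature <inline-formula><tex-math>m^u_j</tex-math></inline-formula>. Finally, the user representation <inline-formula><tex-math>m^u</tex-math></inline-formula> is fed
to FC1UT.</p>
      <p>3.3 Image Component</p>
      <p>This section describes the image component of the model, which is the “Image
Component” division in Figure 1. Figure 3 provides an overview of the image component.</p>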
      <p>[Figure 3: The image component. Each posted image (Image1 to Image10) passes through CNNI, i.e., Conv. Layers 1 / Pool1 to Conv. Layers 5 / Pool5 followed by FC6 and FC7. The image representations are averaged over images in PoolingI and passed to FCUI.]</p>
      <p>This component consists of a convolutional neural network architecture (CNNI),
pooling (PoolingI), and a fully connected layer (FCUI).</p>
      <p>It utilizes multiple posted images in the following three steps:
        <list list-type="simple">
          <list-item><p>Step 1: The feature representation of each image is extracted using a pre-trained CNN
architecture (CNNI).</p></list-item>
          <list-item><p>Step 2: The extracted features are fused (PoolingI).</p></list-item>
          <list-item><p>Step 3: The fused feature is processed with a fully connected layer (FCUI).</p></list-item>
        </list>
      </p>
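      <p>Steps 2 and 3 above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the per-image features of step 1 are replaced by random 4096-dimensional vectors (the output size of the FC7 layer in VGG16), ten images per user are assumed, the output size of 64 and the function name image_component are hypothetical, and any activation function after the fully connected layer is left out.</p>
      <preformat>
```python
import numpy as np


def image_component(features, W, b, pooling="average"):
    """Fuse per-image features and apply one fully connected layer.

    features: array of shape (n_images, d), one row per image (step 1 output).
    W, b: weights and bias of the fully connected layer, shapes (k, d) and (k,).
    """
    if pooling == "average":
        fused = features.mean(axis=0)  # step 2 with the average operation
    elif pooling == "max":
        fused = features.max(axis=0)   # step 2 with the max operation
    else:
        raise ValueError("pooling must be 'average' or 'max'")
    return W @ fused + b               # step 3 (affine part only)


rng = np.random.default_rng(0)
features = rng.normal(size=(10, 4096))       # stand-in for CNN features of 10 images
W = rng.normal(scale=0.01, size=(64, 4096))  # hypothetical layer size
b = np.zeros(64)
r_img = image_component(features, W, b)
print(r_img.shape)
```
      </preformat>
      <p>Because the pooling step ignores the order of the rows, shuffling the images of a user leaves the result unchanged, which matches the motivation for using a simple average or max.</p>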
      <p>CNNI in Figure 1 corresponds to the layers from Conv. Layers 1 to FC7 in Figure 3.
This architecture is based on VGG16 [19]. CNNI uses these layers
to extract the feature of each image.</p>
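      <p>PoolingI fuses the features extracted from the images. The images posted by a single
author on social media can be regarded as a kind of time series. However, we know neither
the true temporal order of the images nor the time intervals between
them. Therefore, we simply use the average or max operation over the image
features as PoolingI.</p>
      <p>3.4 Fusion Component</p>
      <p>The fusion component is expected to capture the relationship between
the texts and images in a complementary manner. The user representation <inline-formula><tex-math>r_{\mathrm{txt}} \in \mathbb{R}^{M}</tex-math></inline-formula> is obtained via the text component,
and <inline-formula><tex-math>r_{\mathrm{img}} \in \mathbb{R}^{L}</tex-math></inline-formula> is obtained via the image component. The relationship between the texts
and images is represented as a matrix <inline-formula><tex-math>G \in \mathbb{R}^{M \times L}</tex-math></inline-formula> with the following equation:
        <disp-formula><label>(5)</label><tex-math>G = r_{\mathrm{txt}} \otimes r_{\mathrm{img}}</tex-math></disp-formula>
where <inline-formula><tex-math>\otimes</tex-math></inline-formula> is the direct-product operation. We apply column-wise and row-wise
max-pooling over <inline-formula><tex-math>G</tex-math></inline-formula> to generate <inline-formula><tex-math>g_{\mathrm{txt}} \in \mathbb{R}^{M}</tex-math></inline-formula> and <inline-formula><tex-math>g_{\mathrm{img}} \in \mathbb{R}^{L}</tex-math></inline-formula>, respectively. Formally, the
<inline-formula><tex-math>j</tex-math></inline-formula>-th element of <inline-formula><tex-math>g_{\mathrm{txt}}</tex-math></inline-formula> and the <inline-formula><tex-math>j</tex-math></inline-formula>-th element of <inline-formula><tex-math>g_{\mathrm{img}}</tex-math></inline-formula> are computed
as follows:
        <disp-formula><tex-math>[g_{\mathrm{txt}}]_j = \max_{1 \leq l \leq L} [G_{j,l}]</tex-math></disp-formula>
        <disp-formula><tex-math>[g_{\mathrm{img}}]_j = \max_{1 \leq m \leq M} [G_{m,j}]</tex-math></disp-formula>
We can interpret the <inline-formula><tex-math>j</tex-math></inline-formula>-th element of <inline-formula><tex-math>g_{\mathrm{txt}}</tex-math></inline-formula> as the importance of the
<inline-formula><tex-math>j</tex-math></inline-formula>-th text feature with regard to the image features. Finally, the vectors <inline-formula><tex-math>g_{\mathrm{txt}}</tex-math></inline-formula> and <inline-formula><tex-math>g_{\mathrm{img}}</tex-math></inline-formula> are
concatenated to obtain <inline-formula><tex-math>g_{\mathrm{comb}} = [g_{\mathrm{txt}}; g_{\mathrm{img}}]</tex-math></inline-formula>, which is passed to FC1.</p>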
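      <p>The fusion step described above can be illustrated with a small NumPy sketch. It assumes that the direct product of the two user representations is the outer product of two vectors; the function name fuse and the toy input values are hypothetical and chosen only so that the result can be checked by hand.</p>
      <preformat>
```python
import numpy as np


def fuse(r_txt, r_img):
    """Outer product of the two representations, then max-pooling in both directions."""
    G = np.outer(r_txt, r_img)   # shape (M, L); G[j, l] = r_txt[j] * r_img[l]
    g_txt = G.max(axis=1)        # for each text feature j, max over image features
    g_img = G.max(axis=0)        # for each image feature, max over text features
    return np.concatenate([g_txt, g_img])


r_txt = np.array([1.0, -2.0, 0.5])   # toy text representation, M = 3
r_img = np.array([3.0, -1.0])        # toy image representation, L = 2
g_comb = fuse(r_txt, r_img)
# G is [[3, -1], [-6, 2], [1.5, -0.5]], so g_comb holds 3, 2, 1.5, 3, 2.
print(g_comb)
```
      </preformat>
      <p>The fused vector has M + L entries, the same length as a plain concatenation of the two representations, but each entry now depends on both modalities.</p>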
    </sec>
    <sec id="sec-4">
      <title>4 Experiment</title>
      <p>4.1 Data</p>
      <p>This section describes two datasets: the PAN@CLEF 2018 Author Profiling Training
Corpus and streaming tweets. The PAN@CLEF 2018 Author Profiling Training Corpus was
used to train the proposed model and the comparison models. The streaming tweets were
used to pre-train the word embedding matrix <inline-formula><tex-math>E_w</tex-math></inline-formula>.</p>
      <p>PAN@CLEF 2018 Author Profiling Training Corpus. The first dataset, which we used to
train the proposed model, was the official PAN@CLEF 2018 Author Profiling
Training Corpus. This dataset consists of users’ tweets in three languages: English,
Spanish, and Arabic. There are 3,000 English-language users, 3,000 Spanish-language
users, and 1,500 Arabic-language users, with a gender ratio of 1:1. We used random
sampling to divide this dataset into train8, dev1, and test1 at a ratio of 8:1:1 while
maintaining the 1:1 gender ratio.</p>
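      <p>A split of this kind can be sketched as follows. The sketch assumes that each user is represented as a pair of an identifier and a gender label; the function name stratified_split, the label strings, and the fixed random seed are hypothetical, and the actual sampling procedure may differ in its details.</p>
      <preformat>
```python
import random


def stratified_split(users, ratios=(8, 1, 1), seed=0):
    """Split (user_id, label) pairs into train/dev/test while keeping label proportions."""
    rng = random.Random(seed)
    by_label = {}
    for user_id, label in users:
        by_label.setdefault(label, []).append((user_id, label))
    total = sum(ratios)
    train, dev, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)                      # random sampling within each gender
        n_train = len(group) * ratios[0] // total
        n_dev = len(group) * ratios[1] // total
        train.extend(group[:n_train])
        dev.extend(group[n_train:n_train + n_dev])
        test.extend(group[n_train + n_dev:])    # remainder goes to the test part
    return train, dev, test


# 3,000 users with a 1:1 gender ratio, as in the English and Spanish subsets.
users = [(i, "female" if i % 2 == 0 else "male") for i in range(3000)]
train8, dev1, test1 = stratified_split(users)
print(len(train8), len(dev1), len(test1))  # 2400 300 300
```
      </preformat>
      <p>Splitting each gender separately and then merging the parts is what keeps the 1:1 ratio in all three subsets.</p>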
      <p>
        Streaming Tweets. The second dataset, which we used to pre-train the word embeddings, was
composed of tweets collected with the Twitter Streaming APIs (footnote 1). We used the collected tweets
to pre-train the word embedding matrix <inline-formula><tex-math>E_w</tex-math></inline-formula> of the proposed model and the comparison
models. Table 1 lists the number of resulting tweets. The process of collecting tweets
is described in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The process of pre-training the word embedding
matrix is described in Section 4.3.
      </p>
      <p>Footnote 1: https://dev.twitter.com/streaming/overview</p>
      <p>4.2 Model Initialization</p>
      <p>We pre-trained each component of the proposed model. The proposed model was then
initialized for training in the following three steps.</p>
      <p>Initialization of text component. We first pre-trained the word embedding matrix <inline-formula><tex-math>E_w</tex-math></inline-formula>
for the text component. The details of this pre-training are
described in Section 4.3. The text component was then trained using train8 and dev1.</p>
      <p>
        Initialization of image component. The image component was trained by fine-tuning
on train8 and dev1. First, the layers from Conv. Layers 1 to FC7 in Figure 3
(VGG16) were pre-trained on ImageNet [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. We then initialized CNNI, as described in
Figure 1, with the pre-trained VGG16. Finally, FCUI was randomly initialized.
      </p>
      <p>Initialization of TIFNN. We pre-trained each component on train8 and dev1 as described above
because TIFNN could be trained successfully when starting from
pre-trained text and image components. Accordingly, all TIFNN parameters except those of
FC1 and FC2 were initialized with the parameters of the pre-trained components.</p>
      <p>4.3 Model Configurations</p>
      <p>
        Text pre-processing. We applied Unicode normalization, user name normalization,
URL normalization, and HTML normalization. For tokenization, we used twokenizer [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for the
English text and WordPunctTokenizer in NLTK [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] for the other languages.
      </p>
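      <p>The normalization and tokenization steps above might look roughly like the following sketch, which uses only the Python standard library. Several details are assumptions made for illustration: the Unicode normalization form (NFKC), the placeholder tokens USER and URL, and the simple regular expressions for user names and URLs are not specified in this paper. The token pattern is the regular expression that NLTK's WordPunctTokenizer is built on, used here directly so that the example has no external dependency.</p>
      <preformat>
```python
import html
import re
import unicodedata

USER_RE = re.compile(r"@\w+")             # assumed pattern for user names
URL_RE = re.compile(r"https?://\S+")      # assumed pattern for URLs
TOKEN_RE = re.compile(r"\w+|[^\w\s]+")    # pattern used by NLTK's WordPunctTokenizer


def normalize(text):
    """Apply HTML, Unicode, URL, and user name normalization in that order."""
    text = html.unescape(text)                   # decode HTML entities
    text = unicodedata.normalize("NFKC", text)   # normalization form is an assumption
    text = URL_RE.sub("URL", text)               # placeholder token is an assumption
    text = USER_RE.sub("USER", text)             # placeholder token is an assumption
    return text


def tokenize(text):
    """Normalize a tweet and split it into word and punctuation tokens."""
    return TOKEN_RE.findall(normalize(text))


print(tokenize("@alice check https://example.com now!"))
# ['USER', 'check', 'URL', 'now', '!']
```
      </preformat>
      <p>URLs are replaced before user names so that an at-sign inside a URL is not mistaken for a mention.</p>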
      <p>Image pre-processing. We applied two resizing methods: direct resizing and
resizing-cropping.
        <list list-type="bullet">
          <list-item><p>Direct resizing: We resized images to 224 pixels × 224 pixels.</p></list-item>
          <list-item><p>Resizing-cropping: We resized images to 256 pixels × 256 pixels and then cropped
the center of each image to 224 pixels × 224 pixels.</p></list-item>
        </list>
      </p>
      <p>Direct resizing was applied for the image-based neural network and resizing-cropping for
TIFNN. After resizing, all images were normalized by subtracting the
average values of the RGB channels for each language.</p>
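      <p>The cropping and normalization steps can be sketched with NumPy as follows. The sketch assumes that the images have already been resized to 256 × 256 pixels by an image library and are stored as height × width × RGB arrays. The function names are hypothetical, and computing the channel means over the cropped images of one language is an assumption about a detail that is not spelled out above.</p>
      <preformat>
```python
import numpy as np


def center_crop(image, size=224):
    """Crop the central size x size window from an (H, W, 3) image."""
    height, width = image.shape[:2]
    top = (height - size) // 2
    left = (width - size) // 2
    return image[top:top + size, left:left + size]


def subtract_channel_mean(images):
    """Subtract the mean R, G, and B values computed over a set of images.

    images: array of shape (N, H, W, 3), e.g. all images of one language.
    Returns the normalized images and the three channel means.
    """
    mean_rgb = images.mean(axis=(0, 1, 2))
    return images - mean_rgb, mean_rgb


rng = np.random.default_rng(0)
# Four random stand-in images, already resized to 256 x 256 pixels.
resized = rng.integers(0, 256, size=(4, 256, 256, 3)).astype(np.float32)
cropped = np.stack([center_crop(img) for img in resized])
normalized, mean_rgb = subtract_channel_mean(cropped)
print(cropped.shape, mean_rgb.shape)
```
      </preformat>
      <p>With a 256-pixel input and a 224-pixel window, the crop removes a 16-pixel border on every side.</p>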
      <p>
        Initialization of word embeddings. We used fastText [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] with the skip-gram algorithm
to pre-train the word embedding matrix <inline-formula><tex-math>E_w</tex-math></inline-formula>. The pre-training parameters were as follows:
dimension = 100, learning rate = 0.025, window size = 5, negative samples = 5, and
epochs = 5.
      </p>
      <p>[Table 2: Parameters of the proposed model. Rows: word embedding dimension, RNNW units, FC1UT, FC2UT, CNNI, FCUI, FC1, FC2.]</p>
      <p>Parameters and pooling settings for the proposed model. Table 2 summarizes the
number of parameters in the proposed model. In addition, PoolingW was a max
pooling layer for every language, PoolingI was an average operation for every
language, and PoolingT was a max pooling layer for Arabic and an average
pooling layer for the other languages.</p>
      <p>Optimization strategies. We used cross-entropy loss as the objective function for the
models. The objective function of TIFNN was minimized over shuffled mini-batches
with SGD. We also used Adam for the text component and SGD for the image
component. The initial SGD learning rate for the image component was set to 1e-3. In
addition, we selected the best TIFNN learning rate for each language: 5e-3 for English
and 1e-2 for the other languages.</p>
      <p>Parameter selection. The models have an l2 regularization parameter. We selected the
best value for the text component from the candidates {1e-3, 5e-4, 1e-4, 5e-5, 1e-5},
whereas the parameter of TIFNN was fixed at 1e-5.
We searched for the best parameter for each model using dev1.</p>
      <p>4.4 Comparison Models</p>
      <p>We next describe the comparison models used in the in-house experiment.
Figure 4 illustrates these models, except the baseline.</p>
      <p>Baseline. An SVM using TF-IDF unigram features.</p>
      <p>Text NN. The text component in the figure is the same as that in Figure 2 (from
WordEmbedding to FC1UT). The l2 regularization parameter is set to 1e-3 for English, 1e-4
for Spanish, and 5e-5 for Arabic.</p>
      <p>Image NN. The image component in the figure is the same as that in Figure 3 (from
Conv. Layers 1 to FCUI). The model does not apply l2 regularization.</p>
      <p>[Figure 4: Comparison models. Panels: Text NN, Image NN, and Text NN + Image NN. Blocks shown: words, images, Text Component, Image Component, FC2UT, FC1, FC2, label.]</p>
      <p>Text NN + Image NN. This model combines the text and image components. Note that
it differs from the proposed model in Figure 1. The user representation <inline-formula><tex-math>r_{\mathrm{txt}} \in \mathbb{R}^{M}</tex-math></inline-formula> is obtained
via the text component, and <inline-formula><tex-math>r_{\mathrm{img}} \in \mathbb{R}^{L}</tex-math></inline-formula> is obtained via the image component. These
features are then concatenated to obtain <inline-formula><tex-math>r_{\mathrm{comb}} = [r_{\mathrm{txt}}; r_{\mathrm{img}}]</tex-math></inline-formula>. Finally, the
concatenated feature <inline-formula><tex-math>r_{\mathrm{comb}}</tex-math></inline-formula> is fed to FC1, and the output of FC1 is passed to FC2.
The parameter of FC1 differs from that listed in Table 2; we set it to 100.</p>
      <p>4.5 In-house Experiment</p>
      <p>We evaluated the proposed model and the comparison models using train8, dev1, and
test1. With the exception of the baseline, the models were trained using Titan X GPUs.
Table 3 summarizes the gender identification results.</p>
      <p>As listed in Table 3, Text NN and Image NN achieved accuracies of 80.0-82.3% for
each language. TIFNN substantially improved the accuracies (by 3-8 points) for each language
compared with Text NN and Image NN. Furthermore, TIFNN also
improved the accuracies for English and Spanish compared with Text NN + Image NN.
This indicates that obtaining a fusion synergy via the fusion component is an effective
approach for this task.</p>
      <p>4.6 Submission Run</p>
      <p>
        We chose the best-performing models, namely the Text NN, Image NN, and TIFNN
described in Table 3, for our submission run. The submission run was performed on
a TIRA virtual machine [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] with CPUs. Table 4 summarizes the performances of the
models in the submission run, which are published as the official PAN results (footnote 2). Although
the models have lower accuracies than in the in-house experiment,
TIFNN has better accuracies for each language than Text NN and Image
NN. Our models ranked 1st in the English ranking, 2nd in the Spanish ranking, 7th in the Arabic ranking,
and 1st in the global ranking.
      </p>
    </sec>
    <sec id="sec-5">
      <title>5 Conclusion</title>
      <p>In this paper, we proposed the Text Image Fusion Neural Network (TIFNN) for gender
identification. To leverage the synergy between texts and images, the model computes
the relationship between them using the direct product. In-house experimental results
showed that Text NN and Image NN achieved accuracies of 80.0-82.3% for each
language in gender identification. TIFNN substantially improved the accuracies (by 3-8 points)
compared with Text NN and Image NN. Furthermore, TIFNN also improved the
accuracies for English and Spanish compared with Text NN + Image NN. In addition to
the results of this in-house experiment, we confirmed in a submission run that TIFNN
improves accuracy compared with the individual models.</p>
      <p>In future work, we would like to analyze how the proposed model captures the interaction between
texts and images. We believe that understanding this interaction will make it possible to
improve TIFNN.</p>
      <p>Footnote 2: https://pan.webis.de/clef18/pan18-web/author-profiling.html</p>
      <p>17. Shigenaka, R., Tsuboshita, Y., Kato, N.: Content-aware multi-task neural networks for user
gender inference based on social media images. In: 2016 IEEE International Symposium on
Multimedia (ISM). pp. 169–172 (2016)</p>
      <p>18. Sierra, S., y Gómez, M.M., Solorio, T., González, F.A.: Convolutional neural networks for
author profiling in PAN 2017. In: CLEF (2017)</p>
      <p>19. Simonyan, K., Zisserman, A.: Very deep convolutional networks for
large-scale image recognition. In: International Conference on Learning
Representations (ICLR) (2015)</p>
      <p>20. Tellez, E.S., Miranda-Jiménez, S., Graff, M., Moctezuma, D.: Gender and language-variety
identification with MicroTC. In: CLEF (2017)</p>
      <p>21. Vijayaraghavan, P., Vosoughi, S., Roy, D.: Twitter demographic classification using deep
multi-modal multi-task learning. In: Proceedings of the 55th Annual Meeting of the
Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 -
August 4, Volume 2: Short Papers (2017)</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Basile</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dwyer</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Medvedeva</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rawee</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Haagsma</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nissim</surname>
            ,
            <given-names>M.:</given-names>
          </string-name>
          <article-title>N-gram: New groningen author-profiling model</article-title>
          .
          <source>CoRR abs/1707.03764</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bird</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Loper</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Klein</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          :
          <source>Natural Language Processing with Python</source>
          .
          <publisher-name>O'Reilly Media Inc.</publisher-name>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bojanowski</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grave</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joulin</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Enriching word vectors with subword information</article-title>
          .
          <source>arXiv preprint arXiv:1607.04606</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Cho</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>van Merriënboer</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gülçehre</surname>
            ,
            <given-names>Ç.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bahdanau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bougares</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schwenk</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bengio</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Learning phrase representations using rnn encoder-decoder for statistical machine translation</article-title>
          .
          <source>In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          . pp.
          <fpage>1724</fpage>
          -
          <lpage>1734</lpage>
          (
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Deng</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dong</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Socher</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>L.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fei-Fei</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          :
          <article-title>ImageNet: A Large-Scale Hierarchical Image Database</article-title>
          .
          <source>In: CVPR09</source>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Franco-Salvador</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Plotnikova</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pawar</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Benajiba</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          :
          <article-title>Subword-based deep averaging networks for author profiling in social media</article-title>
          .
          <source>In: CLEF</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Kodiyan</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hardegger</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neuhaus</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cieliebak</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Author profiling with bidirectional rnns using attention with grus</article-title>
          .
          <source>In: CLEF</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Martinc</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Skrjanec</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zupan</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pollak</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>PAN 2017: Author profiling - gender and language variety prediction</article-title>
          .
          <source>In: CLEF</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Miura</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taniguchi</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Taniguchi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ohkuma</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          :
          <article-title>Author profiling with word+character neural attention network</article-title>
          .
          <source>In: CLEF</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Owoputi</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>O'Connor</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dyer</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gimpel</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schneider</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Smith</surname>
            ,
            <given-names>N.A.</given-names>
          </string-name>
          :
          <article-title>Improved part-of-speech tagging for online conversational text with word clusters</article-title>
          .
          <source>In: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT)</source>
          . pp.
          <fpage>380</fpage>
          -
          <lpage>390</lpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gollub</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stamatatos</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Improving the Reproducibility of PAN's Shared Tasks: Plagiarism Detection, Author Identification, and Author Profiling</article-title>
          . In:
          <string-name>
            <surname>Kanoulas</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lupu</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clough</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sanderson</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hanbury</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toms</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (eds.)
          <article-title>Information Access Evaluation meets Multilinguality, Multimodality, and Visualization</article-title>
          .
          <source>5th International Conference of the CLEF Initiative (CLEF 14)</source>
          . pp.
          <fpage>268</fpage>
          -
          <lpage>299</lpage>
          . Springer, Berlin Heidelberg New York (Sep
          <year>2014</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>F.C.P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Overview of the 3rd author profiling task at PAN 2015</article-title>
          . In:
          <source>CLEF 2015 Labs and Workshops, Notebook Papers. CEUR Workshop Proceedings, CEUR-WS.org</source>
          (Sep
          <year>2015</year>
          ), http://www.clef-initiative.eu/publication/working-notes
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Montes-y-Gómez</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of the 6th Author Profiling Task at PAN 2018: Multimodal Gender Identification in Twitter</article-title>
          . In:
          <string-name>
            <surname>Cappellato</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ferro</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nie</surname>
            ,
            <given-names>J.Y.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Soulier</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          (eds.)
          <article-title>Working Notes Papers of the CLEF 2018 Evaluation Labs</article-title>
          .
          <source>CEUR Workshop Proceedings, CLEF and CEUR-WS.org</source>
          (Sep
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of the 5th author profiling task at PAN 2017: Gender and language variety identification in Twitter</article-title>
          .
          <source>Working Notes Papers of the CLEF</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Rangel</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rosso</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Verhoeven</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Daelemans</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Potthast</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stein</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          :
          <article-title>Overview of the 4th author profiling task at PAN 2016: Cross-genre evaluations</article-title>
          .
          <source>In: Working Notes Papers of the CLEF 2016 Evaluation Labs. CEUR Workshop Proceedings</source>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Schaetti</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          :
          <article-title>UniNE at CLEF 2017: TF-IDF and deep-learning for author profiling</article-title>
          .
          <source>In: CLEF</source>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>