<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Spoken dialect identification in Twitter using a multi-filter architecture</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mohammadreza Banaei</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rémi Lebret</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Karl Aberer</string-name>
        </contrib>
        <aff>EPFL, Switzerland</aff>
      </contrib-group>
      <abstract>
        <p>This paper presents our approach to shared task 2 of SwissText &amp; KONVENS 2020: a multi-stage neural model for Swiss German (GSW) identification on Twitter. Our model outputs either GSW or non-GSW and is not meant to be used as a generic language identifier. Our architecture consists of two independent filters, where the first one favors recall and the second one favors precision (both towards GSW). Moreover, we do not use binary models (GSW vs. non-GSW) in our filters but rather multi-class classifiers with GSW being one of the possible labels. Our model reaches an F1-score of 0.982 on the test set of the shared task.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Out of over 8000 languages in the world
        <xref ref-type="bibr" rid="ref4">(Hammarström et al., 2020)</xref>
        , Twitter's language identifier (LID) supports only
around 30 of the most used languages, which
is not enough for the needs of the NLP community. Furthermore,
it has been shown that even for these frequently used
languages, Twitter's LID is not highly accurate, especially
when the tweet is relatively short
        <xref ref-type="bibr" rid="ref15">(Zubiaga et al., 2016)</xref>
        .
      </p>
      <p>
        However, Twitter data is linguistically diverse
and, in particular, includes tweets in many low-resource
languages and dialects. A better-performing Twitter
LID can help us gather large amounts of (unlabeled)
text in these low-resource languages, which can be used to
enrich models in many downstream NLP tasks, such
as sentiment analysis
        <xref ref-type="bibr" rid="ref11">(Volkova et al., 2013)</xref>
        and named
entity recognition
        <xref ref-type="bibr" rid="ref9">(Ritter et al., 2011)</xref>
        .
      </p>
      <p>However, generalizing state-of-the-art NLP
models to low-resource languages is generally hard
due to the lack of corpora with good coverage of these
languages. Spoken dialects are the extreme case,
as there might be no standard spelling at all. In
this paper, we focus on Swiss German as
our low-resource dialect. As Swiss German is a spoken
dialect, people might spell a given word differently,
and even a single author might spell the same word
differently in two sentences. There also exists
a dialect continuum across the German-speaking part
of Switzerland, which makes NLP for Swiss German
even more challenging. Swiss German has its own
pronunciation and grammar, and much of its vocabulary
differs from German.</p>
      <p>
        There exist some previous efforts to discriminate
similar languages with the help of tweet metadata
such as geo-location
        <xref ref-type="bibr" rid="ref12">(Williams and Dagli, 2017)</xref>
        , but in
this paper, we do not use tweet metadata and restrict
our model to the tweet content only. Therefore, this
model can also be used for language identification in
sources other than Twitter.
      </p>
      <p>
        LIDs that support GSW, such as the fastText
        <xref ref-type="bibr" rid="ref6">(Joulin et al., 2016)</xref>
        LID model, are often trained on the Alemannic
Wikipedia, which also contains other German dialects
such as Swabian, Walser German, and Alsatian
German; hence, these models cannot discriminate between
GSW and the dialects that are close to it. Moreover, the fastText LID
also has a low recall (0.362) for Swiss German
tweets, as it identifies many of them as German.
      </p>
      <p>In this paper, we use two independently trained
filters to remove non-GSW tweets. The first filter
is a classifier that favors recall (towards GSW), and
the second one favors precision. The same
idea can be extended to N consecutive filters (with
N ≥ 2), with the first N − 1 favoring recall and the
last filter favoring precision. In this way, we make
sure that GSW samples are not filtered out (with high
probability) in the first N − 1 stages, and the GSW
precision of the whole pipeline can be improved by having a
filter that favors precision at the end (the N-th filter). We
use only two filters because adding more
filters improved the performance (measured by GSW
F1-score) only negligibly on our validation set.</p>
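      <p>The cascade described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name cascade and the two keyword-based toy filters are hypothetical stand-ins for trained classifiers, not the models used in this paper.</p>
      <preformat><![CDATA[
```python
from typing import Callable, Iterable, List

# A filter takes the tweet text and returns True to keep it as a GSW candidate.
Filter = Callable[[str], bool]


def cascade(tweets: Iterable[str], filters: List[Filter]) -> List[str]:
    """Keep only the tweets that every filter, applied in order, accepts."""
    kept = list(tweets)
    for accept in filters:
        kept = [tweet for tweet in kept if accept(tweet)]
    return kept


# Hypothetical toy filters: simple keyword rules stand in for trained models.
def recall_oriented(tweet: str) -> bool:
    # Permissive rule: also fires on "und", which German shares with GSW.
    cues = {"isch", "gsi", "und"}
    return any(word in cues for word in tweet.lower().split())


def precision_oriented(tweet: str) -> bool:
    # Strict rule: only cue words assumed (for this toy example) to be GSW-specific.
    cues = {"isch", "gsi"}
    return any(word in cues for word in tweet.lower().split())


tweets = ["Das isch guet gsi", "Ich und du", "Hello world"]
gsw_tweets = cascade(tweets, [recall_oriented, precision_oriented])
print(gsw_tweets)
```
]]></preformat>
      <p>In this toy run, the permissive first stage lets the German sentence through and the stricter last stage removes it, which mirrors the division of labour between the recall-oriented and precision-oriented filters.</p>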
      <p>We demonstrate that by using this architecture, we
can achieve an F1-score of 0.982 on the test set, even with
a small amount of available data in the target domain
(Twitter data). Section 2 presents the architecture of
each of our filters and the rationale behind the training
data chosen for each of them. In section 3, we discuss
our LID implementation details and describe the
datasets used. Section 4 presents
the performance of our filters on the held-out test
dataset. Moreover, we show the contribution
of each filter to removing non-GSW tweets to
see their individual importance in the whole pipeline
(for this specific test dataset).</p>
    </sec>
    <sec id="sec-2">
      <title>Multi-filter language identification</title>
      <p>In this paper, we follow the combination of N − 1
filters favoring recall, followed by a final filter that favors
precision. We choose N = 2 in this paper to
demonstrate the effectiveness of the approach. As
discussed before, adding more filters improved the
performance of the pipeline only negligibly for this specific dataset.
However, for more challenging datasets, N &gt; 2 might be
needed to improve the LID precision.</p>
      <p>Both of our filters are multi-class classifiers with
GSW being one of the possible labels. We found it
empirically better to train a multi-class classifier on
roughly balanced classes than to recast
the same training data as a highly imbalanced GSW vs.
non-GSW problem for a binary classifier, especially
for the first filter (section 2.1), which has many more
parameters than the second filter (section 2.2).</p>
      <sec id="sec-2-1">
        <title>First filter: fine-tuned BERT model</title>
        <p>The first filter should be designed in a way to favor
GSW recall, either by tuning inference thresholds or
by using training data that implicitly enforces this bias
towards GSW. Here we follow the second approach
for this filter by using different domains for training
different labels, which is further discussed below.
Moreover, we use a more complex (in terms of the
number of parameters) model for the first filter, so
that it does the main job of removing non-GSW inputs
while having reasonable GSW precision (further detail
in section 4). The second filter will be later used to
improve the pipeline precision by removing a relatively
smaller number of non-GSW tweets.</p>
        <p>
          Our first filter is a BERT
          <xref ref-type="bibr" rid="ref2">(Devlin et al., 2018)</xref>
          model fine-tuned for the LID downstream task. As we
do not have a large amount of unlabeled GSW
data, it would be hard to train the BERT language model
(LM) from scratch on GSW itself. Hence, we use an
LM pre-trained on German, the closest high-resource
language to GSW: the BERT-base-cased model, whose training
details are available at https://huggingface.co/bert-base-german-cased
          <xref ref-type="bibr" rid="ref13">(Wolf et al., 2019)</xref>
          .
        </p>
        <p>
          However, this LM has been trained on sentences
(e.g., German Wikipedia) that are quite different
from the Twitter domain. Moreover, the lack of standard
spelling in GSW introduces many new words (unseen
in the German LM training data) whose
subword embeddings should be updated in order to
improve the downstream task performance. In addition,
there are even syntactic differences between German
and GSW (and even among the variations of GSW
in different regions
          <xref ref-type="bibr" rid="ref5">(Honnet et al., 2017)</xref>
          ). For these
three reasons, we conclude that freezing the BERT
body (and training just the classifier layer) might not
be optimal for this transfer learning between German
and our target language. Hence, we also let the whole
BERT body be trained during the downstream task,
which of course requires a large amount of labeled
data to avoid quick overfitting in the fine-tuning phase.
        </p>
        <p>
          For this filter, we choose the same eight classes for
training LID as Linder et al. (2019) (the dataset classes
and their respective sizes can be found in section 3.1).
These languages are similar in structure to GSW (such
as German, Dutch, etc.), and we try to train a model
that can distinguish GSW from similar languages
to decrease GSW false positives. For all classes
except GSW, we use sentences (mostly Wikipedia
and Newscrawl) from Leipzig text corpora
          <xref ref-type="bibr" rid="ref3">(Goldhahn
et al., 2012)</xref>
          . We also use the SwissCrawl
          <xref ref-type="bibr" rid="ref8">(Linder et al.,
2019)</xref>
          dataset for GSW sentences.
        </p>
        <p>Most GSW training samples (SwissCrawl data)
come from forums and social media, which are less
formal (in structure and in the phrases used) than the
samples of the other (non-GSW) classes (mostly from Wikipedia
and NewsCrawl). Moreover, as our target dataset
consists of tweets (mostly informal sentences), this
could give the filter a high GSW recall during
the inference phase. Additionally, our main reason for
using a cased tokenizer for this filter is to let the model
also use irregularities in writing, such as improper
capitalization. As these irregularities mostly occur in
informal writing, they again bias the model towards
GSW (improving GSW recall) when tweets are passed
to it, as most of the GSW training samples are informal.</p>
      </sec>
      <sec id="sec-2-2">
        <title>Second filter: fastText classifier</title>
        <p>
          For this filter, we also train a multi-class classifier with
GSW being one of the labels. The other classes are
again languages close (in structure) to GSW, such
as German, Dutch and Spanish (further detail in
section 3.1). Additionally, as mentioned before, our
second filter should have a reasonably high precision
to enhance the full pipeline precision. Hence, unlike
for the first filter, we sample the whole training data
from a domain similar to that of the target
test set. Non-GSW samples are tweets from the SEPLN
2014
          <xref ref-type="bibr" rid="ref14">(Zubiaga et al., 2014)</xref>
          and Carter et al. (2013)
datasets. GSW samples consist of the GSW tweets
provided by this shared task and a part of the GSW samples
of the Swiss SMS corpus
          <xref ref-type="bibr" rid="ref10">(Stark et al., 2015)</xref>
          .
        </p>
        <p>
          As the described training data is rather small
compared to the training data of the first filter, we should also train a
simpler architecture with significantly fewer parameters.
We use fastText
          <xref ref-type="bibr" rid="ref6">(Joulin et al., 2016)</xref>
          to
train this model, which is based on a bag of character
n-grams in our case. Moreover, unlike the first filter,
this model is not cased, and we lower-case the input
sentences to reduce the vocabulary size. The
hyperparameters used for this model can be found in section 3.
        </p>
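        <p>To make the bag-of-character-n-grams idea concrete, the following sketch counts the n-grams of each lower-cased word for n between 2 and 5, the range reported in section 3. It is a simplified illustration, not the fastText implementation: the function name char_ngrams is hypothetical, the angle-bracket word-boundary markers are an assumption modelled on how fastText marks word boundaries, and fastText additionally hashes n-grams into a fixed number of buckets, which is omitted here.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def char_ngrams(sentence, n_min=2, n_max=5):
    """Count character n-grams of each lower-cased, boundary-marked word."""
    counts = Counter()
    for word in sentence.lower().split():
        marked = "<" + word + ">"  # mark the start and end of the word
        for n in range(n_min, n_max + 1):
            for start in range(len(marked) - n + 1):
                counts[marked[start:start + n]] += 1
    return counts


# Example: the features a short greeting would contribute to the bag.
features = char_ngrams("Hoi zaeme")
print(sorted(features))
```
]]></preformat>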
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Experimental Setup</title>
      <p>In this section, we describe the datasets and the
hyperparameters for both filters in the pipeline. We also
describe our preprocessing method that is specifically
designed to handle inputs from social media.
</p>
      <sec id="sec-3-1">
        <title>Datasets</title>
        <p>For both filters, we use 80% of the data for training, 5%
for validation, and 15% for testing.</p>
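        <p>A split with these proportions could be produced as sketched below. The text does not say whether the split was random or stratified by class, so the seeded random shuffle used here is an assumption, and the function name split_dataset is hypothetical.</p>
        <preformat><![CDATA[
```python
import random


def split_dataset(samples, train_frac=0.80, valid_frac=0.05, seed=0):
    """Shuffle and split samples into train, validation, and test lists."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    n_train = round(len(items) * train_frac)
    n_valid = round(len(items) * valid_frac)
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]  # the remainder, about 15% by default
    return train, valid, test


train, valid, test = split_dataset(range(1000))
print(len(train), len(valid), len(test))
```
]]></preformat>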
      </sec>
      <sec id="sec-3-2">
        <title>First filter</title>
        <p>
          The sentences are from the Leipzig corpora
          <xref ref-type="bibr" rid="ref3">(Goldhahn et al., 2012)</xref>
          and the SwissCrawl
          <xref ref-type="bibr" rid="ref8">(Linder et al., 2019)</xref>
          dataset. The classes and the number of samples in
each class are shown in Table 1. We pick the classes
proposed by Linder et al. (2019) for training a GSW LID.
The main differences between our first filter and their LID
are the GSW sentences and the fact that our fine-tuning
dataset is about three times larger than theirs. The
“other” and “GSW-like” classes are each a group of
languages whose members cannot be
represented as separate classes because they have too few
samples. The GSW-like class is included to make sure
that the model can distinguish other German dialects
from GSW (hence reducing GSW false positives).
        </p>
      </sec>
      <sec id="sec-3-3">
        <title>Second filter</title>
        <p>
          The sentences are mostly from Twitter (except for
some GSW samples from the Swiss SMS corpus
          <xref ref-type="bibr" rid="ref10">(Stark et al., 2015)</xref>
          ). Table 2 shows the distribution
of the different classes. The GSW samples consist of 1971
tweets (provided by the shared task organizers) and 3000
GSW samples from the Swiss SMS corpus.
        </p>
        <p>As the dataset sentences are mostly from social media,
we use a custom tokenizer that removes common
social media tokens (emoticons, emojis, URLs, hashtags, and
Twitter mentions) that are not useful for LID. We also
normalize word elongation, as it might be misleading
for LID. For the second filter, we also lower-case the input
sentences before passing them to the model.</p>
        <p>
          We train the first filter by fine-tuning a German pre-trained
BERT-cased model on our LID task. As mentioned
before, we do not freeze the BERT body in the
fine-tuning phase. We train it for two epochs, with
a batch size of 64 and a max-seq-length of 64. We
use the Adam optimizer
          <xref ref-type="bibr" rid="ref7">(Kingma and Ba, 2014)</xref>
          with a
learning rate of 2e-5.
        </p>
        <p>
          We train the second filter using the fastText
          <xref ref-type="bibr" rid="ref6">(Joulin et al., 2016)</xref>
          classifier for 30 epochs, using character n-grams
as features (where 2 ≤ n ≤ 5) and an embedding
dimension of 50. To favor precision during
inference, we label a tweet as GSW only if the model
probability for GSW is greater than 64% (this threshold
is treated as a hyperparameter and was optimized
on the validation set).
        </p>
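        <p>The preprocessing and the thresholded decision described in this section might look roughly like the sketch below. The regular expressions are our own approximations and not the tokenizer used for the experiments, collapsing an elongated letter run to two letters is an assumed normalization, and the names normalize_tweet and is_gsw are hypothetical. Only the 0.64 threshold is taken from the text.</p>
        <preformat><![CDATA[
```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Western-style emoticons such as :) ;-) =D, only when they stand alone.
EMOTICON_RE = re.compile(r"(?<!\S)[:;=][-']?[()DPp](?!\S)")
# A rough, incomplete set of emoji and pictograph code point ranges.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")
# The same letter repeated three or more times, e.g. "sooooo".
ELONGATION_RE = re.compile(r"([^\W\d_])\1{2,}")

GSW_THRESHOLD = 0.64  # value reported in the text for the second filter


def normalize_tweet(text, lowercase=False):
    """Strip social media tokens and shorten elongated words."""
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE, EMOTICON_RE, EMOJI_RE):
        text = pattern.sub(" ", text)
    text = ELONGATION_RE.sub(r"\1\1", text)  # assumed: keep two letters
    text = " ".join(text.split())  # squeeze the whitespace left behind
    return text.lower() if lowercase else text


def is_gsw(gsw_probability, threshold=GSW_THRESHOLD):
    """Label a tweet as GSW only above the precision-oriented threshold."""
    return gsw_probability > threshold


raw = "@anna Das isch sooooo guet :) https://t.co/abc #zueri"
print(normalize_tweet(raw, lowercase=True))
```
]]></preformat>
        <p>The lowercase flag reflects that only the second filter receives lower-cased input, while the cased BERT filter would be given the text with its original capitalization.</p>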
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Results</title>
      <p>In this section, we evaluate the performance of our two filters
(either in isolation or when combined in the full pipeline)
on the held-out test dataset of the shared task. We also
evaluate the BERT filter on its own test data (Leipzig and
SwissCrawl samples).</p>
      <sec id="sec-4-1">
        <title>BERT filter performance on Leipzig + SwissCrawl corpora</title>
        <p>We first evaluate our BERT filter on the test set of the
first filter (Leipzig corpora + SwissCrawl). Table
3 shows the filter performance on the different
labels. The filter has an F1-score of 99.8% on the
GSW test set. However, when this model is applied
to Twitter data, we expect a decrease in performance
because tweets are short and informal.</p>
        <p>Table 4 shows the performance of both filters, either in
isolation or when they are used together. As shown in
this table, the improvement from adding the second
filter is rather small. The main reason can be seen in
Table 5: the majority of non-GSW filtering is done
by the first filter for the shared-task test set (Table 6).</p>
        <table-wrap>
          <label>Table 4</label>
          <table>
            <thead>
              <tr>
                <th>Model</th>
                <th>Precision</th>
                <th>Recall</th>
                <th>F1-score</th>
              </tr>
            </thead>
            <tbody>
              <tr>
                <td>BERT filter</td>
                <td>0.9742</td>
                <td>0.9896</td>
                <td>0.9817</td>
              </tr>
              <tr>
                <td>fastText filter</td>
                <td>0.9076</td>
                <td>0.9892</td>
                <td>0.9466</td>
              </tr>
              <tr>
                <td>BERT + fastText</td>
                <td>0.9811</td>
                <td>0.9834</td>
                <td>0.9823</td>
              </tr>
              <tr>
                <td>fastText baseline</td>
                <td>0.9915</td>
                <td>0.3619</td>
                <td>0.5303</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>Our LID significantly outperforms the baseline
(Table 4), which underlines the importance of
having a domain-specific LID. Additionally, although
the positive effect of the second filter is quite small on
the test set, when we applied the same architecture to
randomly sampled tweets (German tweets according to
the Twitter API), we observed that the second filter
could significantly reduce the number of GSW false positives.
Hence, the appropriate number of filters
depends on the complexity of the target dataset.</p>
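        <p>As a sanity check on the reported numbers, F1 can be recomputed as the harmonic mean of precision and recall. The sketch below does this for the four precision and recall pairs listed above. The function name f1_score is our own, and small differences in the last digit are expected because the published precision and recall are themselves rounded.</p>
        <preformat><![CDATA[
```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for each row of the results.
reported = {
    "BERT filter": (0.9742, 0.9896, 0.9817),
    "fastText filter": (0.9076, 0.9892, 0.9466),
    "BERT + fastText": (0.9811, 0.9834, 0.9823),
    "fastText baseline": (0.9915, 0.3619, 0.5303),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed {f1_score(p, r):.4f}, reported {f1:.4f}")
```
]]></preformat>
        <p>The baseline row shows why F1 is the relevant metric here: its precision is the highest of the four, but its low recall pulls the harmonic mean far below that of the other models.</p>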
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>In this work, we propose a multi-filter architecture for
spoken dialect (Swiss German) identification that
effectively filters out
non-GSW tweets during the inference phase.
We evaluated our model on the test set of the GSW LID
shared task and reached an F1-score of 0.982.</p>
      <p>
        However, there are other useful features that could be
used during training, such as orthographic conventions
in GSW writing, as observed by Honnet et al. (2017),
whose presence might not be easily captured even
by a complex model like BERT. Moreover, in this paper,
we did not use tweet metadata as a feature and
focused only on tweet content, although metadata can improve
LID classification for dialects considerably
        <xref ref-type="bibr" rid="ref12">(Williams and Dagli, 2017)</xref>
        . These two directions, among others, are left for future
work to determine their
usefulness for low-resource language identification.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Simon</given-names>
            <surname>Carter</surname>
          </string-name>
          , Wouter Weerkamp, and
          <string-name>
            <given-names>Manos</given-names>
            <surname>Tsagkias</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Microblog language identification: Overcoming the limitations of short, unedited and idiomatic text</article-title>
          .
          <source>Language Resources and Evaluation</source>
          ,
          <volume>47</volume>
          (
          <issue>1</issue>
          ):
          <fpage>195</fpage>
          -
          <lpage>215</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Jacob</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Ming-Wei</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Kenton</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Kristina</given-names>
            <surname>Toutanova</surname>
          </string-name>
          .
          <year>2018</year>
          .
          <article-title>BERT: Pre-training of deep bidirectional transformers for language understanding</article-title>
          .
          <source>arXiv preprint arXiv:1810.04805</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Dirk</given-names>
            <surname>Goldhahn</surname>
          </string-name>
          , Thomas Eckart, and
          <string-name>
            <given-names>Uwe</given-names>
            <surname>Quasthoff</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Building large monolingual dictionaries at the Leipzig Corpora Collection: From 100 to 200 languages</article-title>
          . In LREC, volume
          <volume>29</volume>
          , pages
          <fpage>31</fpage>
          -
          <lpage>43</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          Harald Hammarström, Sebastian Bank, Robert Forkel, and
          <string-name>
            <given-names>Martin</given-names>
            <surname>Haspelmath</surname>
          </string-name>
          .
          <year>2020</year>
          .
          <article-title>Glottolog 4.2.1</article-title>
          . Max Planck Institute for the Science of Human History, Jena. Available online at http://glottolog.org, accessed on 2020-04-18.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Pierre-Edouard</given-names>
            <surname>Honnet</surname>
          </string-name>
          , Andrei Popescu-Belis,
          <string-name>
            <given-names>Claudiu</given-names>
            <surname>Musat</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Michael</given-names>
            <surname>Baeriswyl</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Machine translation of low-resource spoken dialects: Strategies for normalizing Swiss German</article-title>
          .
          <source>arXiv preprint arXiv:1710.11035</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Armand</given-names>
            <surname>Joulin</surname>
          </string-name>
          , Edouard Grave, Piotr Bojanowski, and
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>Bag of tricks for efficient text classification</article-title>
          .
          <source>arXiv preprint arXiv:1607.01759</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Diederik P.</given-names>
            <surname>Kingma</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jimmy</given-names>
            <surname>Ba</surname>
          </string-name>
          .
          <year>2014</year>
          .
          <article-title>Adam: A method for stochastic optimization</article-title>
          .
          <source>arXiv preprint arXiv:1412.6980</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>Lucy</given-names>
            <surname>Linder</surname>
          </string-name>
          , Michael Jungo, Jean Hennebert, Claudiu Musat, and
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Fischer</surname>
          </string-name>
          .
          <year>2019</year>
          .
          <article-title>Automatic creation of text corpora for low-resource languages from the internet: The case of Swiss German</article-title>
          .
          <source>arXiv preprint arXiv:1912.00159</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Alan</given-names>
            <surname>Ritter</surname>
          </string-name>
          , Sam Clark,
          <string-name>
            <given-names>Oren</given-names>
            <surname>Etzioni</surname>
          </string-name>
          , et al.
          <year>2011</year>
          .
          <article-title>Named entity recognition in tweets: an experimental study</article-title>
          .
          <source>In Proceedings of the conference on empirical methods in natural language processing</source>
          , pages
          <fpage>1524</fpage>
          -
          <lpage>1534</lpage>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Elisabeth</given-names>
            <surname>Stark</surname>
          </string-name>
          , Simon Ueberwasser, and
          <string-name>
            <given-names>Beni</given-names>
            <surname>Ruef</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Swiss SMS corpus</article-title>
          .
          <source>www.sms4science.ch.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Svitlana</given-names>
            <surname>Volkova</surname>
          </string-name>
          , Theresa Wilson, and
          <string-name>
            <given-names>David</given-names>
            <surname>Yarowsky</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>Exploring demographic language variations to improve multilingual sentiment analysis in social media</article-title>
          .
          <source>In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing</source>
          , pages
          <fpage>1815</fpage>
          -
          <lpage>1827</lpage>
          , Seattle, Washington, USA. Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Jennifer</given-names>
            <surname>Williams</surname>
          </string-name>
          and
          <string-name>
            <given-names>Charlie</given-names>
            <surname>Dagli</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>Twitter language identification of similar languages and dialects without ground truth</article-title>
          .
          <source>In Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial)</source>
          , pages
          <fpage>73</fpage>
          -
          <lpage>83</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Thomas</given-names>
            <surname>Wolf</surname>
          </string-name>
          , Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Rémi Louf, Morgan Funtowicz, et al.
          <year>2019</year>
          .
          <article-title>Transformers: State-of-the-art natural language processing</article-title>
          .
          <source>arXiv preprint arXiv:1910.03771</source>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Arkaitz</given-names>
            <surname>Zubiaga</surname>
          </string-name>
          , Inaki San Vicente, Pablo Gamallo, José Ramom Pichel Campos, Iñaki Alegría Loinaz, Nora Aranberri, Aitzol Ezeiza, and Víctor Fresno-Fernández
          .
          <year>2014</year>
          .
          <article-title>Overview of TweetLID: Tweet language identification at SEPLN 2014</article-title>
          . In TweetLID@SEPLN, pages
          <fpage>1</fpage>
          -
          <lpage>11</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Arkaitz</given-names>
            <surname>Zubiaga</surname>
          </string-name>
          , Inaki San Vicente, Pablo Gamallo, José Ramom Pichel, Inaki Alegria, Nora Aranberri, Aitzol Ezeiza, and Víctor Fresno.
          <year>2016</year>
          .
          <article-title>TweetLID: a benchmark for tweet language identification</article-title>
          .
          <source>Language Resources and Evaluation</source>
          ,
          <volume>50</volume>
          (
          <issue>4</issue>
          ):
          <fpage>729</fpage>
          -
          <lpage>766</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>