<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Vicomtech at MEDDOCAN: Medical Document Anonymization</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Naiara Perez</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Laura García-Sardiña</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Manex Serras</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arantza Del Pozo</string-name>
          <email>adelpozog@vicomtech.org</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Vicomtech</institution>
          ,
          <addr-line>Mikeletegi Pasealekua, 57, 20009 - Donostia/San Sebastian</addr-line>
          ,
          <country country="ES">Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2019</year>
      </pub-date>
      <fpage>696</fpage>
      <lpage>703</lpage>
      <abstract>
        <p>This paper describes the participation of Vicomtech's team in the MEDDOCAN: Medical Document Anonymization challenge, which consisted in the recognition and classification of protected health information (PHI) in medical documents in Spanish. We tested different state-of-the-art classification algorithms, both deep and shallow, and rich sets of features, obtaining an F1-score of 0.960 in the strictest evaluation. The models submitted and scripts for decoding will be available at https://snlt.vicomtech.org/meddocan2019.</p>
      </abstract>
      <kwd-group>
        <kwd>PHI De-identification</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Spanish Corpus</kwd>
        <kwd>Textual Anonymisation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The major bottleneck for the advancement of Natural Language Processing
(NLP) in the medical field is the difficulty of accessing real clinical texts, mainly
due to data privacy protection issues. MEDDOCAN: Medical Document
Anonymization [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is the first challenge devoted to the recognition and classification of
protected health information (PHI) in medical documents in Spanish. The
challenge has two sub-tasks: NER offset and entity type classification, and sensitive
span detection.
      </p>
      <p>
        This paper describes the participation of Vicomtech's team in the
MEDDOCAN challenge. Our aim has been to test a variety of state-of-the-art
approaches, both neural and shallow, as well as their combinations. Specifically,
Conditional Random Fields (CRF) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] are prominently featured, having been
extensively used for tasks of a sequential nature such as named entity recognition
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] and for textual sensitive data identification and anonymisation [
        <xref ref-type="bibr" rid="ref3 ref4">4,3</xref>
        ]; other
techniques used include XGBoost [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], Convolutional Neural Networks (CNN)
[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and Long Short-Term Memories (LSTM) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. The models submitted and
auxiliary scripts for feature extraction and decoding will be freely available at
https://snlt.vicomtech.org/meddocan2019.
      </p>
      <p>The paper is structured as follows: Section 2 describes the task's data
and the set of features extracted; then, the systems are presented, with a greater focus on
the practicalities of the implementations than on theoretical explanations. The
results obtained are reported in Section 3 and discussed in Section 4. Finally, the
paper ends by presenting the conclusions and hints for future work in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>Materials and Methods</title>
      <sec id="sec-2-1">
        <title>Data</title>
        <p>The MEDDOCAN corpus consists of clinical cases written in Spanish and
manually enriched with PHI expressions. A total of 22 PHI categories are
considered, which show high frequency variability1. The pre-processing and
formatting applied to the corpus consisted of the following steps:
1. Paragraph splitting. Documents were split into paragraphs using line
breaks in the original texts. We decided to work with paragraphs instead
of sentences because the provided sentence-splitting tool occasionally split
parts of target entities into different sentences.
2. Tokenisation. Each paragraph was tokenised using the SPACCC
Part-of-Speech Tagger2 and some extra custom tokenisation rules, mainly to split
punctuation symbols if not inside a URL, email address or date, and to split
camel-cased words in order to account for spacing errors in the original text
(e.g., `DominguezCorreo' into `Dominguez Correo').
3. Label formatting. The Brat-formatted annotations of the training and
development datasets were converted to token-level tags following the BILOU
(Beginning, Inner, Last, Outside, Unique) scheme. Combining this tag scheme
with the original 22 granular PHI classes (e.g., for the granular class FECHA we
would have the tags B-FECHA, I-FECHA, L-FECHA, U-FECHA, plus the generic
O class) gives a final tag set of 89 possible unique labels.</p>
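        <p>As a rough sketch of the label formatting step (not the authors' code), the 89-label BILOU tag set can be generated as follows; only FECHA, CALLE and PROFESION are class names taken from the paper, the rest are placeholders for the real MEDDOCAN categories:
```python
# Build the BILOU tag set used for label formatting: 22 granular PHI
# classes x 4 positional tags (B, I, L, U) + the generic O class = 89.
# Only FECHA, CALLE and PROFESION appear in the paper; the remaining
# class names are placeholders, not the real MEDDOCAN categories.
PHI_CLASSES = ["FECHA", "CALLE", "PROFESION"] + [f"CLASS{i}" for i in range(19)]

def bilou_tagset(classes):
    """Return every token-level label a sequence tagger can emit."""
    tags = [f"{pos}-{cls}" for cls in classes for pos in "BILU"]
    return tags + ["O"]

tags = bilou_tagset(PHI_CLASSES)
print(len(tags))  # 22 * 4 + 1 = 89
```
</p>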
        <p>The final statistics, including the number of documents, paragraphs, tokens,
vocabulary size, and PHI entities for each of the datasets in the pre-processed
corpus, can be consulted in Table 1.
1 The MEDDOCAN annotation scheme defines 29 PHI entity types, but only 22 of
them actually appear in the annotated sets.
2 https://github.com/PlanTL-SANIDAD/SPACCC_POS-TAGGER</p>
      </sec>
      <sec id="sec-2-2">
        <title>Features</title>
        <p>The complete set of features extracted to train the classifiers is listed succinctly
in Table 2. Note, however, that the final submitted results were obtained drawing
upon a different set of features in each case (detailed in Section 2.3).</p>
      </sec>
      <sec id="sec-2-3">
        <title>Systems</title>
        <p>
          Our team submitted 5 systems' results to the MEDDOCAN task. The same
systems were used for both sub-tasks: i) NER offset and entity type classification,
and ii) sensitive span detection. All the systems were complemented with a
small set of rules that annotated numerical expressions, such as dates and phone
numbers, via regular expressions. However, these rules had little impact on the
final results. Next, predicted labels were post-processed to ensure that the result
followed the BILOU scheme, having the BILOU tag prevail over the PHI category
tag (e.g., the sequence B-CALLE &gt; L-PROFESION would be converted to B-CALLE
&gt; L-CALLE instead of U-CALLE &gt; U-PROFESION). Finally, the predictions had to
be converted back to Brat's format.
spaCy. As a first approach to the task, we experimented with spaCy's3 Named
Entity Recogniser (NER), built on Bloom Embeddings [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] and residual
Convolutional Neural Networks [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. We followed the given recipe4 with default settings
and applied the recommended tweaks: compounding batch size, dropout decay,
and parameter averaging.
        </p>
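        <p>The BILOU-coherence post-processing described earlier in this section could be sketched as follows (an illustrative assumption, not the authors' implementation): the category of a B- tag is propagated over the I-/L- tags that follow it.
```python
# Post-processing sketch: make a predicted label sequence BILOU-coherent
# by letting the positional tag prevail over the PHI category, as in the
# paper's example (B-CALLE > L-PROFESION becomes B-CALLE > L-CALLE).
def enforce_bilou(labels):
    fixed = list(labels)
    cat = None  # category of the currently open entity, if any
    for i, lab in enumerate(fixed):
        if lab.startswith("B-"):
            cat = lab[2:]
        elif lab.startswith(("I-", "L-")) and cat:
            fixed[i] = lab[:2] + cat  # keep position, override category
            if lab.startswith("L-"):
                cat = None  # L- closes the entity
        else:
            cat = None
    return fixed

print(enforce_bilou(["B-CALLE", "L-PROFESION"]))  # ['B-CALLE', 'L-CALLE']
```
</p>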
        <p>spaCy supports a closed set of features, which overlaps only partially with
our own. Interestingly, training an empty model yielded better results on the
development set than using the accepted features. Likewise, training embeddings
from scratch also gave better results than using those presented in Section 2.2 as
pre-trained embeddings. Thus, the results submitted to the task were obtained
with a NER model trained from scratch, with no extra information provided but
the training data.</p>
        <p>CRF. The second run corresponded to a system based on Conditional
Random Fields, implemented using the python sklearn-crfsuite library. The final
CRF model did not include word embeddings or date-time expressions as
features, because they provided slightly worse results in previous feature selection
trials explored to reduce dimensionality. Features with float values were rounded
to one decimal. The final system was trained using the configuration presented
in Table 3.
3 https://spacy.io
4 https://spacy.io/usage/training#ner</p>
        <p>Token: the token itself.</p>
        <p>Length: the length in characters of the token.</p>
        <p>Casing: features related to the token's casing, i.e., whether the token is uppercase,
lowercase or titlecase, and the ratio of uppercase characters to the token's length.
Digits and punctuation: features related to the token's character types, e.g.,
whether the token is alphanumeric or a punctuation mark, the ratio of the number of
punctuation marks to the token's length, and so on.</p>
        <p>Affixes: the token's first and last character bigrams and trigrams.</p>
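        <p>A minimal sketch of the token-characterisation features above (length, casing, character-type ratios, and affixes); the exact feature names are illustrative, not the authors':
```python
# Token-level features: length, casing flags and ratio, character-type
# ratios, and prefix/suffix character bigrams and trigrams.
import string

def token_features(tok):
    n = len(tok)
    return {
        "token": tok,
        "length": n,
        "is_upper": tok.isupper(),
        "is_lower": tok.islower(),
        "is_title": tok.istitle(),
        "upper_ratio": sum(c.isupper() for c in tok) / n,
        "is_alnum": tok.isalnum(),
        "punct_ratio": sum(c in string.punctuation for c in tok) / n,
        "prefix2": tok[:2], "prefix3": tok[:3],
        "suffix2": tok[-2:], "suffix3": tok[-3:],
    }

feats = token_features("Dominguez")
print(feats["prefix3"], feats["suffix3"])  # Dom uez
```
</p>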
        <p>Term characterisation
Linguistic information: the lemma and part-of-speech tag given by the SPACCC
tagger at the data pre-processing step.</p>
        <p>NERC: the named entity tag given by spaCy's model es_core_news_md 2.1.0. If a
detected named entity was multi-word, we gave the same tag to all the tokens involved.
Date-time expressions: whether the token is part of a date and/or time expression
according to a left-to-right parser designed beforehand.</p>
        <p>Gazetteers: the maximum similarity score obtained when matching text n-grams
with gazetteer entries. We used a total of 10 gazetteers: the ones provided by the
organisers*, plus country names, kinship relations, months, and sexes. The string
similarity was computed with the python-Levenshtein library and was only added as
a feature if it was greater than 0.75. If a match was multi-word, we gave the same score
to all the tokens involved.</p>
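        <p>The gazetteer feature can be sketched as below. The paper uses the python-Levenshtein library; here the standard library's difflib.SequenceMatcher stands in for the similarity score, and the gazetteer entries are illustrative. The 0.75 threshold is from the paper.
```python
# Fuzzy gazetteer matching: keep the best similarity score against the
# gazetteer, but only when it exceeds the 0.75 threshold.
from difflib import SequenceMatcher

GAZETTEER = ["enero", "febrero", "marzo"]  # e.g., a months gazetteer

def gazetteer_score(ngram, entries, threshold=0.75):
    best = max(SequenceMatcher(None, ngram.lower(), e).ratio() for e in entries)
    return best if best > threshold else None  # feature only added above threshold

print(gazetteer_score("Enero", GAZETTEER))   # 1.0 (exact match after lowercasing)
print(gazetteer_score("cardio", GAZETTEER))  # None (below threshold)
```
</p>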
        <p>
          Brown clusters: complete paths and paths pruned at lengths 8, 16, 32, and 64. The
clusters [
          <xref ref-type="bibr" rid="ref1">1</xref>
          ] were computed on the training set's vocabulary with tan-clustering**, using
the default settings of the tool.
        </p>
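        <p>The Brown-cluster features can be sketched as follows: each token's cluster is a bit-string path, used both complete and pruned at several prefix lengths (a shorter prefix means a coarser cluster). The example path is made up; real paths come from clustering the training vocabulary.
```python
# Brown-cluster features: the complete bit-string path plus prefixes
# pruned at lengths 8, 16, 32, and 64, as described in the text.
def brown_features(path, lengths=(8, 16, 32, 64)):
    feats = {"brown_full": path}
    for n in lengths:
        feats[f"brown_{n}"] = path[:n]  # prefix = coarser cluster
    return feats

f = brown_features("0110100111010")
print(f["brown_8"])  # '01101001'
```
</p>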
        <p>
          Word vectors: each dimension in the word vectors provided by the task organisers
[
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Specifically, we used the Word2Vec embeddings [
          <xref ref-type="bibr" rid="ref11 ref12">11,12</xref>
          ] of 300 dimensions trained
on SciELO and Wikipedia.
        </p>
        <p>Context characterisation
Boundaries: whether the token is rst or last in the paragraph.</p>
        <p>Length: the length in tokens of the paragraph the token belongs to.
Position: the position of the paragraph in the document.</p>
        <p>Header: the nearest expression to the left of each token that is followed by a colon,
lowercased (e.g., `email:', `antecedentes familiares:', and so on).</p>
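        <p>The Header feature can be sketched as follows (an assumption about the implementation): scan the text to the left of the token and keep the nearest expression that is followed by a colon, lowercased.
```python
# Header feature: nearest expression to the token's left ending in a
# colon, e.g. 'email:' or 'antecedentes familiares:'.
import re

def nearest_header(paragraph, char_pos):
    left = paragraph[:char_pos]
    # short runs of non-colon characters immediately followed by a colon
    headers = re.findall(r"([^:\n]{1,40}):", left)
    return headers[-1].strip().lower() if headers else None

p = "Antecedentes familiares: madre con diabetes"
print(nearest_header(p, len(p)))  # 'antecedentes familiares'
```
</p>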
        <p>Window: all the features of the neighbouring tokens in a 3-token context window with
respect to the current token, except for the paragraph length and position.
* http://temu.bsc.es/meddocan/index.php/resources/
** https://github.com/mheilman/tan-clustering</p>
        <p>CRF-XGBoost ensemble. The third run corresponds to a system that
combines the non-sequential output of multiple XGBoost classifiers. The
XGBoost algorithm has achieved top-tier results in multiple Kaggle competitions
of different characteristics.</p>
        <p>The first layer of the system is built with multiple XGBoost models trained
using different tagging schemes in addition to BILOU: i) 1/0, considering whether the
token is PHI or not; ii) the token's PHI granular class, without considering any
sequence tagging scheme (i.e., CALLE instead of B-CALLE); and iii) BIO, a more
relaxed version of BILOU considering Beginning, Inner, Outside positions.</p>
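        <p>The three relaxed schemes can be derived mechanically from the BILOU labels, for example as follows (a sketch, not the authors' code):
```python
# Map BILOU labels down to the three relaxed schemes used by the
# first-layer XGBoost models: BIO, granular class, and binary PHI.
def bilou_to_bio(label):
    if label == "O":
        return "O"
    pos, cat = label.split("-", 1)
    return {"B": "B", "I": "I", "L": "I", "U": "B"}[pos] + "-" + cat

def bilou_to_class(label):
    return "O" if label == "O" else label.split("-", 1)[1]

def bilou_to_binary(label):
    return 0 if label == "O" else 1

seq = ["B-CALLE", "I-CALLE", "L-CALLE", "U-FECHA", "O"]
print([bilou_to_bio(t) for t in seq])
# ['B-CALLE', 'I-CALLE', 'I-CALLE', 'B-FECHA', 'O']
```
</p>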
        <p>
          To train these models, the full set of features described in Section 2.2 was
reduced to the top 200k features according to the univariate ANOVA statistical test
[
[
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Finally, a CRF classifier with a 5-token window was trained to tackle the lack of
context of the XGBoost algorithm, using the different models' predictions and
the token, obtaining a more sequentially coherent result.
        </p>
        <p>
          NCRF++. NCRF++ [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] is an open-source toolkit built on PyTorch to train
neural sequence labelling models. We kept the default network configuration5: 4
CNN layers for character sequence representations, a bidirectional LSTM layer
for word sequence representations, and an output CRF layer. The hyperparameter
settings are shown in Table 4. Regarding the features, in this case we used all
the available ones except for those derived from the word vectors, as these were
given as pre-trained embeddings to the network.
5 The maximum sentence length was hard-coded to 250 tokens at training; this
threshold had to be removed in prediction to accommodate a few longer sentences.
        </p>
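        <p>For reference, NCRF++ is driven by a plain key=value configuration file; the fragment below follows the toolkit's demo configuration format, with illustrative paths and values rather than the settings of Table 4:
```ini
# NCRF++ configuration sketch (paths and values illustrative)
train_dir=data/train.bilou
dev_dir=data/dev.bilou
model_dir=models/meddocan
word_emb_dir=embeddings/scielo_wiki_300.txt
use_crf=True            # output CRF layer
use_char=True           # character-level representations
char_seq_feature=CNN    # CNN over character sequences
word_seq_feature=LSTM   # LSTM over word sequences
bilstm=True             # bidirectional word LSTM
word_emb_dim=300
```
</p>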
        <p>Weighted Voting. This run was an ensemble of the previous four taggers,
where each tagger's vote was given a weight equal to the F1-score obtained on
the development set, in order to resolve potential ties.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Results</title>
      <p>All the systems achieved F1-scores over 0.950 on the test set, the best
F1-scores being 0.960 and 0.968 for the first and second sub-tasks, respectively.
All systems favour precision over recall. Among individual systems, NCRF++
has the best scores; in particular, it has a markedly better recall than the rest.
On the other hand, CRF outperforms the other systems in terms of precision,
but its lower recall relegates it to the last position in the rank. Weighted
Voting notably improves every individual score on the development set, but
does not prove to be as helpful on the test set. On the contrary, individual
systems show slightly better results on the test set than on the development set.
This improvement is more pronounced for recall. As for the CRF-XGBoost
ensemble and spaCy, they perform quite similarly and remain ranked third and
fourth in both sub-tasks.</p>
    </sec>
    <sec id="sec-4">
      <title>Discussion</title>
      <p>Although our focus is set on the final systems, it is worth mentioning that several
non-final versions were trained on data labelled using different tagging schemes.
Results of these models on the development set showed that the BILOU
tagging scheme outperformed the other labelling practices.</p>
      <p>We also ran some trials using models trained specifically for the second
sub-task, i.e., without using the PHI granular class labels. However, results on the
development set usually showed no significant improvement over models trained
using PHI class information. For this reason, the same final classifiers were used
for both sub-tasks.</p>
      <p>Despite the vast number of features extracted for the task, different feature
selection algorithms determined that the most relevant ones were Brown
clusters, followed by affixes. Context characterisation features were also denoted as
relevant, partly influenced by the semi-structured nature of the documents.</p>
      <p>A tentative error analysis showed that the systems made very similar errors,
although with varying frequencies. Most of the false negatives involved entities
located in the least structured parts of the documents and usually consisted of
mentions of the patients' relatives, professions, and other less frequent
PHI categories. Another PHI class difficult to predict correctly was
addresses, because the systems segmented them into spans different from those in the
gold annotations. Finally, phone, fax, and identification numbers were correctly
recognised but incorrectly categorised on a few occasions. Regarding false
positives, most of them corresponded to improperly segmented addresses and the
misclassification of numeric expressions. The rest of the falsely predicted PHI items
were most frequently entities seemingly missed by the human annotators.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and Future Work</title>
      <p>In this paper we described Vicomtech's approach to the MEDDOCAN challenge,
which consisted in trying different state-of-the-art Machine Learning
classification algorithms, both deep and shallow, and rich sets of features. This approach
proved to be effective, since all of the five final submitted systems achieved
F1-scores over 0.95 and 0.96 on the test set for the first and second sub-tasks,
respectively. The final models submitted and auxiliary scripts for decoding will
be freely available at https://snlt.vicomtech.org/meddocan2019.</p>
      <p>The best results were achieved by a neural sequence classifier, followed by a
weighted voting ensemble system. Still, the results of other participants are
unknown to us at the time of writing and, thus, no conclusions can be reached as
to the competitiveness of the presented systems.</p>
      <p>Future work includes a deeper error analysis, in order to elucidate the
differences between the results obtained on the development and test sets, and the proposal
of solutions to recurrent classification errors. For example, the spotted
errors regarding family kinship or race could probably be solved with simple
post-processing dictionary look-up heuristics. An analysis of which features turn out
to be more and less important for the classifiers' learning could also provide relevant
information for building new systems to tackle similar tasks.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Brown</surname>
            ,
            <given-names>P.F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Desouza</surname>
            ,
            <given-names>P.V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mercer</surname>
            ,
            <given-names>R.L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pietra</surname>
            ,
            <given-names>V.J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lai</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          :
          <article-title>Class-based n-gram models of natural language</article-title>
          .
          <source>Computational linguistics 18(4)</source>
          ,
          <volume>467</volume>
          –
          <fpage>479</fpage>
          (
          <year>1992</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guestrin</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          :
          <article-title>XGBoost: A Scalable Tree Boosting System</article-title>
          .
          <source>In: Proc. of ACM SIGKDD</source>
          <year>2016</year>
          . pp.
          <volume>785</volume>
          –
          <issue>794</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>García-Sardiña</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Serras</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>del Pozo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Knowledge transfer for active learning in textual anonymisation</article-title>
          .
          <source>In: Proc. of SLSP 2018</source>
          . pp.
          <volume>155</volume>
          –
          <issue>166</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>He</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Guan</surname>
          </string-name>
          , Y., Cheng, J.,
          <string-name>
            <surname>Cen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hua</surname>
          </string-name>
          , W.:
          <article-title>CRFs based de-identification of medical records</article-title>
          .
          <source>Journal of Biomedical Informatics</source>
          <volume>58</volume>
          ,
          <issue>S39</issue>
          –
          <fpage>S46</fpage>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>He</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Ren</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sun</surname>
          </string-name>
          , J.:
          <article-title>Deep Residual Learning for Image Recognition</article-title>
          .
          <source>In: Proc. of IEEE CVPR</source>
          <year>2016</year>
          . pp.
          <volume>770</volume>
          –
          <issue>778</issue>
          (
          <year>2016</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Hochreiter</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schmidhuber</surname>
            ,
            <given-names>J.:</given-names>
          </string-name>
          <article-title>Long short-term memory</article-title>
          .
          <source>Neural Computation</source>
          <volume>9</volume>
          (
          <issue>8</issue>
          ),
          <volume>1735</volume>
          –
          <fpage>1780</fpage>
          (
          <year>1997</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Lafferty</surname>
            ,
            <given-names>J.D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pereira</surname>
            ,
            <given-names>F.C.N.</given-names>
          </string-name>
          :
          <article-title>Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data</article-title>
          .
          <source>In: Proc. of ICML 2001</source>
          . pp.
          <volume>282</volume>
          –
          <issue>289</issue>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>LeCun</surname>
          </string-name>
          , Y.,
          <string-name>
            <surname>Boser</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Denker</surname>
            ,
            <given-names>J.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henderson</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Howard</surname>
            ,
            <given-names>R.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hubbard</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Jackel</surname>
          </string-name>
          , L.D.:
          <article-title>Backpropagation applied to handwritten zip code recognition</article-title>
          .
          <source>Neural Computation</source>
          <volume>1</volume>
          (
          <issue>4</issue>
          ),
          <volume>541</volume>
          –
          <fpage>551</fpage>
          (
          <year>1989</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Marimon</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalez-Agirre</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Intxaurrondo</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rodríguez</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lopez Martin</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villegas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krallinger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Automatic de-identification of medical texts in Spanish: the MEDDOCAN track, corpus, guidelines, methods and evaluation of results</article-title>
          .
          <source>In: Proc. of IberLEF 2019</source>
          . p.
          <source>TBA</source>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          :
          <article-title>Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons</article-title>
          .
          <source>In: Proc. of HLT NAACL</source>
          <year>2003</year>
          . pp.
          <volume>188</volume>
          –
          <issue>191</issue>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sutskever</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Corrado</surname>
            ,
            <given-names>G.S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dean</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          .
          <source>In: Proc. of NIPS 2013</source>
          . pp.
          <volume>3111</volume>
          –
          <issue>3119</issue>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Mikolov</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yih</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zweig</surname>
          </string-name>
          , G.:
          <article-title>Linguistic regularities in continuous space word representations</article-title>
          .
          <source>In: Proc. of NAACL HLT</source>
          <year>2013</year>
          . pp.
          <volume>746</volume>
          –
          <issue>751</issue>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Serrà</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Karatzoglou</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          :
          <article-title>Getting Deep Recommenders Fit: Bloom Embeddings for Sparse Binary Input/Output Networks</article-title>
          .
          <source>In: Proc. of RecSys 2017</source>
          . pp.
          <volume>279</volume>
          –
          <issue>287</issue>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Sheikhan</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bejani</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gharavian</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          :
          <article-title>Modular neural-SVM scheme for speech emotion recognition using ANOVA feature selection method</article-title>
          .
          <source>Neural Computing and Applications</source>
          <volume>23</volume>
          (
          <issue>1</issue>
          ),
          <volume>215</volume>
          –
          <fpage>227</fpage>
          (
          <year>2013</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Soares</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Villegas</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gonzalez-Agirre</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Krallinger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Armengol-Estape</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          :
          <article-title>Medical word embeddings for Spanish: Development and evaluation</article-title>
          .
          <source>In: Proc. of Clinical NLP Workshop 2019</source>
          . pp.
          <volume>124</volume>
          –
          <issue>133</issue>
          (
          <year>2019</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Yang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , Zhang, Y.:
          <article-title>NCRF++: An Open-source Neural Sequence Labeling Toolkit</article-title>
          .
          <source>In: Proc. of ACL 2018 (System Demonstrations)</source>
          . pp.
          <volume>74</volume>
          –
          <issue>79</issue>
          (
          <year>2018</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>