<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ADOP FERT: Automatic Detection of Occupations and Professions in Medical Texts using Flair and BERT</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Fazlourrahman Balouchzahi</string-name>
          <email>b@yahoo</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Grigori Sidorov</string-name>
          <email>bsidorov@cic</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hosahalli Lakshmaiah Shashirekha</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Center for Computing Research, Instituto Politecnico Nacional</institution>
          ,
          <addr-line>CDMX</addr-line>
          ,
          <country country="MX">Mexico</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, Mangalore University</institution>
          ,
          <addr-line>Mangalore - 574199</addr-line>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Technological developments in the healthcare industry are generating large volumes of electronic health records as well as text data, usually referred to as medical text data. Processing medical text data in unstructured form is not only challenging but also has many applications. Named Entity Recognition (NER), the task of extracting named entities and classifying them into predefined categories, is an important preprocessing step in the NLP pipeline. Extracting named entities from medical text is very useful for many applications and, at the same time, very challenging because of the characteristics of medical text data. Considering the importance of medical text processing, in this paper we (Team MUCIC) describe the models submitted to "MEDical DOcuments PROFessions recognition" (MEDDOPROF), a first shared task on Spanish medical documents consisting of three Tracks, namely Track 1: MEDDOPROF-NER, Track 2: MEDDOPROF-CLASS, and Track 3: MEDDOPROF-NORM. We participated in Tracks 1 and 2 and propose two models based on fine-tuning BERT embeddings using i) BertForTokenClassification from the transformers library and ii) the Flair framework, for the automatic detection of occupations and professions in medical text. The model using BertForTokenClassification obtained micro F1 scores of 0.629 and 0.598 for Tracks 1 and 2 respectively, while the Flair framework model obtained micro F1 scores of 0.8 and 0.764. Further, the Flair framework model became one of the best models for the MEDDOPROF-NER track.</p>
      </abstract>
      <kwd-group>
        <kwd>Profession</kwd>
        <kwd>Medical Documents</kwd>
        <kwd>NER</kwd>
        <kwd>BERT</kwd>
        <kwd>Flair</kwd>
        <kwd>Embeddings</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Recent advances in medical and healthcare information systems are generating large amounts of Electronic Health Records (EHRs) [<xref ref-type="bibr" rid="ref1">1</xref>] as well as text data in the medical domain. Despite the popularity of existing systems to manage EHRs, there is a massive amount of unstructured medical text data that needs to be transformed into a more structured format for further processing [<xref ref-type="bibr" rid="ref2">2</xref>]. Medical text processing, or text analytics, is one of the exciting areas of research in NLP that deals with various applications such as Text Classification (TC) (classification of medical records, classification of medical news articles), Text Summarization (automatic generation of summaries from medical news articles, summarization of clinical information), Hypothesis Generation, Knowledge Discovery, and so on.
      </p>
      <p>
        One of the most popular text processing applications is Named Entity Recognition (NER), which is used to automatically recognize and classify Named Entities (NEs) [<xref ref-type="bibr" rid="ref3">3</xref>] representing names of persons, organizations, locations, and so on in a given natural language text. NER is a crucial step in the NLP pipeline, as the performance of the NER module determines the performance of subsequent modules [<xref ref-type="bibr" rid="ref4">4</xref>], and NER systems also act as a preprocessing step for tasks such as Relation Extraction [<xref ref-type="bibr" rid="ref5">5</xref>]. Medical NER, which deals with extracting medical NEs such as disease names, symptoms, medical conditions, medications, medical professions, employment status, etc., from medical texts, is challenging due to specialized terminology, a huge number of alternate spellings, and multi-word NEs. Even though a variety of works have explored processing medical texts in diverse aspects, very few works are reported in the literature on processing texts related to medical professions and employment status in general, and in particular on identifying and classifying the NEs describing medical occupations in medical documents.
      </p>
      <p>
        To address the challenges of identifying and classifying the NEs describing medical occupations and employment status in Spanish medical documents, in this paper we (Team MUCIC) describe the models submitted to two Tracks of the MEDical Documents PROFessions recognition (MEDDOPROF) [<xref ref-type="bibr" rid="ref6">6</xref>] task. MEDDOPROF is the first shared task of its kind and consists of three sub-tracks; the descriptions of Tracks 1 and 2 (the ones in which we participated) are briefly given below:
      </p>
      <p>
        - Track 1 - MEDDOPROF-NER: identifying the portions of text that mention an occupation and classifying them into one of three predefined categories, namely PROFESION (PROFESSION), SITUACION LABORAL (WORKING STATUS) or ACTIVIDAD (ACTIVITY).
      </p>
      <p>
        - Track 2 - MEDDOPROF-CLASS: automatically finding the beginning and end of occupation mentions and classifying them into one of the following categories, namely PACIENTE (Patient), FAMILIAR (Family member), SANITARIO (Health professional) or OTRO (Other).
      </p>
      <p>
        Based on the descriptions of the Tracks and categories, Tracks 1 and 2 can be modeled as an NER task of identifying the NEs (tokens), which could be either single words or multi-word, and then classifying/labeling them into one of the pre-defined categories according to the Track. Of late, transformer-based models have been achieving state-of-the-art results for many NLP tasks compared to various Machine Learning (ML) and Deep Learning (DL) models. To explore transformers [<xref ref-type="bibr" rid="ref18">18</xref>], we propose two models based on fine-tuning Bidirectional Encoder Representations from Transformers (BERT) [<xref ref-type="bibr" rid="ref7">7</xref>] embeddings using i) the BertForTokenClassification class from the transformers library and ii) the Flair framework, for the task of automatic detection of occupations and professions in Spanish medical texts for Tracks 1 and 2 of MEDDOPROF.
      </p>
      <p>
        As a language representation model, BERT employs bidirectional representations from text, pre-training on both left and right context. It can also be fine-tuned for downstream tasks such as NER and TC simply by adding a task-specific output layer [<xref ref-type="bibr" rid="ref8">8</xref>]. The difference between BERT and Embeddings from Language Models (ELMo) [<xref ref-type="bibr" rid="ref9">9</xref>], which also builds on pre-trained language models, is that ELMo uses the language model as additional features, whereas BERT fine-tunes all parameters of the pre-trained language model to make it task-specific for the downstream task [<xref ref-type="bibr" rid="ref7">7</xref>].
      </p>
      <p>
        The Flair framework provides a standard model for training along with hyperparameter selection and a unified interface that reduces the complexity of using various embeddings and enables researchers to mix embeddings effectively. It also offers various embeddings that are publicly available on HuggingFace [<xref ref-type="bibr" rid="ref19">19</xref>]. In the current work, Flair is used with BERT embeddings [<xref ref-type="bibr" rid="ref10">10</xref>]. The Generative Pre-trained Transformer (OpenAI GPT) is another architecture that allows fine-tuning. However, it is limited to unidirectional representations, whereas BERT utilizes bidirectional representations, which effectively overcomes this restriction of the OpenAI GPT architecture [<xref ref-type="bibr" rid="ref7">7</xref>].
      </p>
      <p>The rest of the paper is organized as follows: Section 2 gives an overview of the works carried out in the related area and Section 3 describes the proposed methodology. Section 4 presents the experiments and results, and Section 5 gives the conclusion and throws light on future work.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>ML classifiers have reported reasonable and competitive performance for various TC applications such as NER, Sentiment Analysis, Opinion Mining, etc. However, these days Neural Network (NN) based systems are commonly used for TC applications in various domains, including the medical domain. Some recent works in medical text processing are described below:</p>
      <p>
        Yepes et al. [<xref ref-type="bibr" rid="ref11">11</xref>] developed an NN-based system for the identification of medical NEs in Twitter posts. The authors used 148 million collected tweets to generate CBOW word embeddings that are used as weights in model construction. Two LSTMs are used to construct a sequence-to-sequence model, where the first LSTM acts as an encoder to encode the texts into vectors and the second LSTM is the main classification model that labels the tokens. On the Micromed [<xref ref-type="bibr" rid="ref20">20</xref>] dataset containing 1,300 tweets, the proposed model obtained F1 scores of 0.665, 0.682, and 0.718 on disease, pharmacological substance, and symptom entities respectively.
      </p>
      <p>
        Li et al. [<xref ref-type="bibr" rid="ref12">12</xref>] presented an NN-based model for medical NER in Chinese texts. The authors used character-level and word-level embeddings to capture orthographic and lexico-semantic features, along with POS tags as word information features. A Chinese medical corpus containing 12,498 records is used, and 1,739 of these records were manually annotated into two categories, namely subject and lesion, where symptoms related to the body are considered subjects and lesions refer to the pathological changes of the subjects. The dataset is transformed into the BIESO NE representation, where B, I, E, and O indicate the beginning, inside, end, and outside of an entity respectively and S indicates that the entity consists of only a single word. RNN, LSTM, GRU, BLSTM, and BGRU are experimented with in various configurations and feature combinations. Among all, BGRU without any embeddings and with only POS tag features had the best performance, with F1 scores of 90.36% and 90.48% for the subject and lesion detection tasks respectively.
      </p>
      </p>
      <p>
        The feature engineering step is one of the important steps in any NLP task, as it aims to improve the performance of the system. Weegar et al. [<xref ref-type="bibr" rid="ref13">13</xref>] explored the impact of simple feature engineering in NER systems for medical texts in three languages, namely English, Swedish, and Spanish. The authors examined some basic features including POS and semantic tags along with prefixes, window size, and capitalization. An averaged structured perceptron algorithm is used with the SemEval-2014 Task 7 Analysis of Clinical Text shared task dataset containing 9,694 disease NEs for English, the EHRs consisting of patient records developed by Oronoz et al. [<xref ref-type="bibr" rid="ref14">14</xref>] containing 3,362 disease and 1,406 drug entities as the Spanish dataset, and a Swedish dataset released by Dalianis et al. [<xref ref-type="bibr" rid="ref15">15</xref>] containing 4,000 entities corresponding to body parts, disorders, and findings from over 500 different clinical units at Karolinska University Hospital. The results illustrate that in many cases simple but neglected features can significantly enhance the performance of the systems. Their best performing systems, which obtained F1 scores of 66.40, 68.41, and 68.22 for English, Swedish, and Spanish respectively, used specialized medical dictionaries.
      </p>
      <p>
        Sometimes, instead of working on features and model construction, proposing a new representation for the data might be more efficient. In one such study, Nayel et al. [<xref ref-type="bibr" rid="ref4">4</xref>] proposed the FROBES Segment Representation (SR) model, an extension of the IOBES model for NEs that are multi-word in nature. In the proposed FROBES model, F, R, O, B, and E represent front, rear, outside, begin, and end respectively, and S represents a single word. FROBES extends IOBES by replacing the tag I with F and R when an entity has more than two words: considering both halves of an entity, the first half is annotated with B and F and the second half with R and E. The proposed SR scheme is evaluated using BiLSTM as the baseline model on two datasets, namely the i2b2/VA 2010 challenge dataset and the JNLPBA 2004 shared task dataset, and the results reported by the authors illustrate that using FROBES improved the performance slightly. However, ensembling the baseline models with different SR models, namely IOB2, IOBES, and FROBES, outperformed the baseline models with F1 scores of 71.99 and 83.62 on the same datasets.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Methodology</title>
      <p>The two proposed models based on fine-tuning BERT embeddings using i) BertForTokenClassification from transformers and ii) the Flair framework, designed and evaluated for Tracks 1 and 2 of MEDDOPROF, are described in this section.</p>
      <sec id="sec-3-1">
        <title>Data Transformation</title>
        <p>
          The datasets provided by the organizers of the MEDDOPROF shared task for the sub-tracks are in Brat standoff annotation format. As per this format, for each text file there is a corresponding annotation file consisting of an annotation ID, a label, and the beginning and ending offsets of each NE, which could be a single word or multi-word. More details of the Brat standoff format can be found on its website [<xref ref-type="bibr" rid="ref21">21</xref>]. As data in CONLL IOB [<xref ref-type="bibr" rid="ref22">22</xref>] format is easy to handle, the given data in Brat standoff annotation format is transformed to CONLL format with IOB representation using the brat_to_conll.py [<xref ref-type="bibr" rid="ref23">23</xref>] module. The IOB representation assigns the tags I and O to tokens that are inside and outside an NE respectively, and assigns the tag B to the first word of an NE [<xref ref-type="bibr" rid="ref2 ref4">2, 4</xref>]. A snapshot of data in Brat format and CONLL (IOB) format is shown in Figure 1.
        </p>
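        <p>
          For illustration (the actual corpus snapshot is shown in Figure 1), a hypothetical occupation mention would appear roughly as follows in the two formats; the sentence, offsets, and label here are invented for the example:
        </p>
        <preformat>
# Brat standoff (.ann): ID, label, start/end character offsets, entity text
T1	PROFESION 13 32	medico de urgencias

# Same sentence in CONLL IOB: one token per line, entity tokens tagged B-/I-
Trabaja       O
como          O
medico        B-PROFESION
de            I-PROFESION
urgencias     I-PROFESION
.             O
        </preformat>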
        <p>
          As the data transformed into CONLL IOB format is used to train the classifier models, the predictions of the models are also in CONLL IOB format. This requires a post-processing step to re-transform the predictions from CONLL IOB format back to Brat standoff annotation format in order to generate the .ann output files required by the organizers.
        </p>
        <p>
          The main component of the proposed models is BETO [<xref ref-type="bibr" rid="ref16 ref17 ref24">16, 17, 24</xref>], a Spanish BERT language model trained on a large amount of unannotated Spanish corpora [<xref ref-type="bibr" rid="ref25">25</xref>]. In this work, we have used the bert-base-spanish-wwm-cased [<xref ref-type="bibr" rid="ref26">26</xref>] model, which is more effective for NER tasks as capitalization plays a major role in identifying NEs.
        </p>
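        <p>
          A minimal sketch of this post-processing is given below, assuming the predicted IOB tags are aligned with the tokens and their character offsets in the original text file; the helper function and its input structure are illustrative assumptions, not the exact submission code:
        </p>
        <preformat>
# Re-transform sentence-wise IOB predictions into Brat standoff (.ann) lines.
# `tokens` are (text, start_offset, end_offset) triples from the source file,
# `tags` are the predicted IOB labels for those tokens (illustrative format).
def iob_to_ann(tokens, tags):
    ann_lines, entity, idx = [], None, 0
    for (text, start, end), tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if entity:
                ann_lines.append(entity)
            idx += 1
            entity = [f"T{idx}", tag[2:], start, end, text]
        elif tag.startswith("I-") and entity:
            entity[3] = end                     # extend the span end offset
            entity[4] = f"{entity[4]} {text}"   # extend the surface form
        else:
            if entity:
                ann_lines.append(entity)
            entity = None
    if entity:
        ann_lines.append(entity)
    return [f"{t}\t{label} {s} {e}\t{txt}" for t, label, s, e, txt in ann_lines]

# Example:
# iob_to_ann([("medico", 13, 19), ("de", 20, 22)], ["B-PROFESION", "I-PROFESION"])
# -> ["T1\tPROFESION 13 22\tmedico de"]
        </preformat>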
      </sec>
      <sec id="sec-3-2">
        <title>BertForTokenClassification using Transformers</title>
        <p>The first step of this model is to fine-tune the BERT model on the downstream task using the transformers library. Using the data in CONLL IOB format, the fine-tuned models are trained separately for Tracks 1 and 2 of the shared task. For each test dataset, the models generate tagged sequences sentence-wise in IOB annotation format, which are then converted back to Brat standoff annotation format.</p>
        <p>As BERT-based models need to be fed with sequences of the same length, the maximum sequence length is set to 510 and shorter sequences are padded to this length. An attention mask is employed to avoid distracting the model with the padded elements. Similar to Keras [<xref ref-type="bibr" rid="ref28">28</xref>], BERT supports attention masks that allow the model to focus on the main part of the sequence and ignore padded elements. In other words, the mask is typically used for attention when a batch contains sentences of varying lengths: it marks the real tokens for training by assigning 1 to in-sequence tokens and 0 to out-of-sequence (padding) tokens. After assembling the training data and the corresponding masks using PyTorch [<xref ref-type="bibr" rid="ref27">27</xref>], BETO is initialized using the BertForTokenClassification class from the transformers library, which adds a token-level predictor on top of the BERT model.</p>
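        <p>
          A minimal sketch of this set-up is given below; the label inventory, the example sentence, and the use of the fast tokenizer class are illustrative assumptions rather than the exact submission code:
        </p>
        <preformat>
# Minimal sketch: pad/truncate CONLL-IOB sentences to a fixed length, build
# attention masks, and initialize BETO with a token-level classification head.
from transformers import BertTokenizerFast, BertForTokenClassification

MODEL_NAME = "dccuchile/bert-base-spanish-wwm-cased"   # BETO checkpoint
MAX_LEN = 510                                          # maximum sequence length

# hypothetical IOB label inventory for the NER track
labels = ["O", "B-PROFESION", "I-PROFESION",
          "B-SITUACION_LABORAL", "I-SITUACION_LABORAL",
          "B-ACTIVIDAD", "I-ACTIVIDAD"]

tokenizer = BertTokenizerFast.from_pretrained(MODEL_NAME)
model = BertForTokenClassification.from_pretrained(MODEL_NAME,
                                                   num_labels=len(labels))

# sentences: token lists read from the CONLL file (a toy example here)
sentences = [["Trabaja", "como", "medico", "de", "urgencias", "."]]
encoding = tokenizer(sentences,
                     is_split_into_words=True,
                     padding="max_length",   # pad shorter sequences
                     truncation=True,
                     max_length=MAX_LEN,
                     return_tensors="pt")
# encoding["attention_mask"] is 1 for real tokens and 0 for padding,
# so the model attends only to the actual input.
        </preformat>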
        <p>Setting the optimizer to AdamW, the models are trained for 50 epochs. Figure 2 shows the training and validation loss, where the validation set is 10% of the training set. An overview of the model based on BERT using the transformers library is shown in Figure 3.</p>
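        <p>
          A corresponding sketch of the training loop is given below, assuming the model from the previous sketch and a PyTorch DataLoader (train_dataloader) yielding input IDs, attention masks, and label IDs; the learning rate is an illustrative assumption:
        </p>
        <preformat>
# Sketch of fine-tuning with AdamW for 50 epochs (as described above);
# `model` and `train_dataloader` are assumed to be built as in the previous sketch.
import torch

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)  # lr is an assumption
model.train()
for epoch in range(50):
    for batch in train_dataloader:
        optimizer.zero_grad()
        outputs = model(input_ids=batch["input_ids"],
                        attention_mask=batch["attention_mask"],
                        labels=batch["labels"])
        outputs.loss.backward()   # token-level cross-entropy loss
        optimizer.step()
        </preformat>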
      </sec>
      <sec id="sec-3-3">
        <title>Flair with BERT Embeddings</title>
        <p>
          Flair is a PyTorch-based NLP tool that provides a model training framework in which various embeddings and language models can be used individually or in combination and fine-tuned for downstream tasks, with special support for medical domain data [<xref ref-type="bibr" rid="ref10">10</xref>]. To compare the performance of this model with that of the BertForTokenClassification model, the bert-base-spanish-wwm-cased model is used and fine-tuned using the SequenceTagger from Flair, which has a BiLSTM-based backend. It is also possible to use a CRF on top of the model, but it is not used in this work. As Flair requires the training data in CONLL format, the data in Brat standoff annotation format is transformed to CONLL IOB format as described in Section 3.1 and loaded using the ColumnCorpus class from the Flair library. A summary of the layers used in this model is given on our GitHub page [<xref ref-type="bibr" rid="ref29">29</xref>]. The parameters of the proposed model are set as given in Table 1 and an overview of the proposed Flair model is presented in Figure 4.
        </p>
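        <p>
          A minimal sketch of this Flair set-up is given below, assuming a Flair release from around the time of the shared task; the directory layout, column mapping, and training parameters are illustrative assumptions, while the parameters actually used are those in Table 1:
        </p>
        <preformat>
# Minimal sketch: load CONLL-IOB data, wrap BETO as Flair transformer
# embeddings, and fine-tune a BiLSTM SequenceTagger without a CRF layer.
from flair.datasets import ColumnCorpus
from flair.embeddings import TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

columns = {0: "text", 1: "ner"}                 # token and IOB tag columns
corpus = ColumnCorpus("data/", columns,
                      train_file="train.conll", dev_file="dev.conll")
tag_dictionary = corpus.make_tag_dictionary(tag_type="ner")

embeddings = TransformerWordEmbeddings("dccuchile/bert-base-spanish-wwm-cased",
                                       fine_tune=True)    # BETO embeddings

tagger = SequenceTagger(hidden_size=256,        # BiLSTM backend
                        embeddings=embeddings,
                        tag_dictionary=tag_dictionary,
                        tag_type="ner",
                        use_crf=False)          # no CRF on top, as stated

trainer = ModelTrainer(tagger, corpus)
trainer.train("output/", max_epochs=50)         # epochs are an assumption
        </preformat>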
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experiments and Results</title>
      <p>The main requirement of any task is an annotated dataset for training the models. The MEDDOPROF corpus provided by the organizers contains 1,844 clinical cases covering more than 20 specialties, annotated manually by clinical and linguistic experts following strict guidelines. Each clinical case is stored as a single text file along with a corresponding Brat standoff annotation file. A description of the dataset is available on the task website [<xref ref-type="bibr" rid="ref30">30</xref>] and the descriptions of the labels for both Tracks are given in Table 2.</p>
      <p>
        Evaluating the models' performance is the most important task. As per the submission guidelines [<xref ref-type="bibr" rid="ref6">6</xref>], the predictions for each test file should be in Brat standoff annotation format, i.e., the annotation file should have the extension .ann and should consist of an annotation ID, a label, and the correct beginning and ending offsets of each predicted NE, one per line, similar to the annotation files given in the training set. The value of the annotation ID is generated at random, as it does not have any influence on the prediction. Annotation files are generated for each file in the test set and submitted to the task organizers for evaluation. The performance of the models is evaluated in terms of micro-averaged Precision, Recall, and F1 score.
      </p>
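      <p>
        For reference, each line of a submitted .ann file therefore looks roughly like the following; the ID, label, offsets, and entity text here are invented for illustration:
      </p>
      <preformat>
T7	SANITARIO 105 124	enfermera de planta
      </preformat>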
      <p>
        The organizers provided, as a baseline, the results obtained by a simple lookup system over the annotations from the training data. The baseline results and the performances of the proposed models reported by the organizers for both Tracks, in terms of micro-averaged scores, are shown in Table 3. The results illustrate that the proposed models obtained quite good performances for both Tracks, and that they performed better on the MEDDOPROF-NER task. In addition, the model using the Flair framework with BERT embeddings outperformed the other proposed model and became one of the best performing models in the shared task.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion and Future Work</title>
      <p>
        Medical text processing is one of the most exciting as well as vital tasks in NLP. Considering its importance, MEDDOPROF organized a shared task with three Tracks and we participated in two of them, namely MEDDOPROF-NER and MEDDOPROF-CLASS, for the automatic detection of occupations and professions in Spanish medical texts. We (Team MUCIC) proposed two models using BERT embeddings, namely BertForTokenClassification from transformers and the Flair framework. The results illustrate that the models performed better on the NER track, and that the Flair model outperformed the other model on both Tracks, obtaining very good results with micro F1 scores of 0.8 and 0.764 for MEDDOPROF-NER and MEDDOPROF-CLASS respectively. Further, the Flair model became one of the best performing models in the MEDDOPROF shared task. As future work, we plan to explore the Language Understanding with Knowledge-based Embeddings (LUKE) model, a new pre-trained contextualized representation of words and entities based on the transformer. Improving the performance of the system with modifications to the NE representations and exploring various learning approaches for the task of NER in medical texts are other plans for future work.
      </p>
    </sec>
    <sec id="sec-5">
      <title>Acknowledgements</title>
      <p>Team MUCIC deeply appreciates the organizers of the MEDDOPROF shared task for their efforts, guidance, and support during the task, and the anonymous reviewers for their valuable comments.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Nayel</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shashirekha</surname>
            <given-names>HL</given-names>
          </string-name>
          .
          <article-title>Improving NER for Clinical Texts by Ensemble Approach using Segment Representations</article-title>
          .
          <source>In Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017) 2017</source>
          Dec (pp.
          <fpage>197</fpage>
          -
          <lpage>204</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cui</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bai</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lu</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Li</surname>
            <given-names>X</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Aickelin</surname>
            <given-names>U</given-names>
          </string-name>
          , Ge P.
          <article-title>Regular Expression Based Medical Text Classification using Constructive Heuristic Approach</article-title>
          . IEEE Access.
          <source>2019 Oct</source>
          <volume>11</volume>
          ;
          <fpage>7</fpage>
          :
          <fpage>147892</fpage>
          -
          <lpage>904</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Balouchzahi</surname>
            <given-names>Fazlourrahman</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>H. L.</given-names>
            <surname>Shashirekha</surname>
          </string-name>
          .
          <article-title>PUNER - Parsi ULMFiT for Named Entity Recognition in Persian Texts</article-title>
          .
          <source>No. 4224. EasyChair</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Nayel</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shashirekha</surname>
            <given-names>HL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Shindo</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Matsumoto</surname>
            <given-names>Y.</given-names>
          </string-name>
          <article-title>Improving Multi-Word Entity Recognition for Biomedical Texts</article-title>
          . arXiv preprint arXiv:
          <year>1908</year>
          .05691.
          <year>2019</year>
          Aug 15.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Shashirekha</surname>
            <given-names>HL</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nayel</surname>
            <given-names>HA</given-names>
          </string-name>
          .
          <article-title>A Comparative Study of Segment Representation for Biomedical Named Entity Recognition</article-title>
          .
          <source>In 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI) 2016 Sep</source>
          <volume>21</volume>
          (pp.
          <fpage>1046</fpage>
          -
          <lpage>1052</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Salvador</given-names>
            <surname>Lima-Lopez</surname>
          </string-name>
          , Eulalia Farre-Maduell, Antonio Miranda-Escalada, Vicent Briva-Iglesias,
          <string-name>
            <given-names>Martin</given-names>
            <surname>Krallinger</surname>
          </string-name>
          .
          <article-title>"NLP applied to occupational health: MEDDOPROF shared task at IberLEF 2021 on automatic recognition, classi cation and normalization of professions and occupations from medical texts"</article-title>
          .
          <source>Procesamiento del Lenguaje Natural</source>
          <volume>67</volume>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Devlin</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chang</surname>
            <given-names>MW</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toutanova</surname>
            <given-names>K</given-names>
          </string-name>
          . Bert:
          <article-title>Pre-Training of Deep Bidirectional Transformers for Language Understanding</article-title>
          . arXiv preprint arXiv:
          <year>1810</year>
          .04805. 2018 Oct 11.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Campillos-Llanos</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Valverde-Mateos</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Capllonch-Carrion</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Moreno-Sandoval</surname>
            <given-names>A</given-names>
          </string-name>
          .
          <article-title>A Clinical Trials Corpus Annotated with UMLS Entities to Enhance the Access to Evidence-Based Medicine</article-title>
          .
          <source>BMC Medical Informatics and Decision Making. 2021 Dec; 21</source>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          -
          <lpage>9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Peters</surname>
            <given-names>ME</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Neumann</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Iyyer</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gardner</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Clark</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zettlemoyer</surname>
            <given-names>L</given-names>
          </string-name>
          .
          <article-title>Deep Contextualized Word Representations</article-title>
          .
          <source>arXiv preprint arXiv:1802.05365. 2018 Feb</source>
          <volume>15</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Akbik</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bergmann</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Blythe</surname>
            <given-names>D</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Rasul</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Schweter</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vollgraf</surname>
            <given-names>R</given-names>
          </string-name>
          .
          <article-title>FLAIR: An Easy-To-Use Framework for State-Of-The-Art NLP</article-title>
          .
          <source>In Proceedings of the</source>
          <year>2019</year>
          <article-title>Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations) 2019 Jun</article-title>
          (pp.
          <fpage>54</fpage>
          -
          <lpage>59</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Yepes</surname>
            <given-names>AJ</given-names>
          </string-name>
          ,
          <string-name>
            <surname>MacKinlay</surname>
            <given-names>A</given-names>
          </string-name>
          .
          <article-title>NER for Medical Entities in Twitter using Sequence to Sequence Neural Networks</article-title>
          .
          <source>In Proceedings of the Australasian Language Technology Association Workshop</source>
          <year>2016</year>
          2016 Dec (pp.
          <fpage>138</fpage>
          -
          <lpage>142</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Li</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zhao</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yang</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Huang</surname>
            <given-names>Z</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            <given-names>B</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chen</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pan</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wang</surname>
            <given-names>Q.</given-names>
          </string-name>
          <article-title>WCP-RNN: a Novel RNN-based Approach for Bio-NER in Chinese EMRs</article-title>
          .
          <source>The journal of supercomputing</source>
          . 2020 Mar;
          <volume>76</volume>
          (
          <issue>3</issue>
          ):
          <fpage>1450</fpage>
          -
          <lpage>67</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Weegar</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casillas</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>de Ilarraza</surname>
            <given-names>AD</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Oronoz</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gojenola</surname>
            <given-names>K.</given-names>
          </string-name>
          <article-title>The Impact of Simple Feature Engineering in Multilingual Medical NER</article-title>
          .
          <source>In Proceedings of the Clinical Natural Language Processing Workshop</source>
          (ClinicalNLP) 2016 Dec (pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Oronoz</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Casillas</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gojenola</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez</surname>
            <given-names>A</given-names>
          </string-name>
          .
          <article-title>Automatic Annotation of Medical Records in Spanish with Disease, Drug and Substance Names</article-title>
          .
          <source>In Iberoamerican Congress on Pattern Recognition 2013 Nov</source>
          <volume>20</volume>
          (pp.
          <fpage>536</fpage>
          -
          <lpage>543</lpage>
          ). Springer, Berlin, Heidelberg.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Dalianis</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Henriksson</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Kvist</surname>
            <given-names>M</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Velupillai</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weegar</surname>
            <given-names>R</given-names>
          </string-name>
          .
          <article-title>HEALTH BANK - A Workbench for Data Science Applications in Healthcare</article-title>
          .
          <source>CAiSE Industry Track</source>
          .
          <source>2015 Jun</source>
          <volume>11</volume>
          ;
          <issue>1381</issue>
          :
          <fpage>1</fpage>
          -
          <lpage>8</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Canete</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Chaperon</surname>
            <given-names>G</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fuentes</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Perez</surname>
            <given-names>J</given-names>
          </string-name>
          .
          <article-title>Spanish Pre-trained BERT Model and Evaluation data</article-title>
          .
          <source>PML4DC at ICLR</source>
          .
          <year>2020</year>
          ;
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <surname>Wu</surname>
            <given-names>S</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dredze</surname>
            <given-names>M</given-names>
          </string-name>
          .
          <article-title>Beto, bentz, becas: The Surprising Cross-lingual Effectiveness of BERT</article-title>
          . arXiv preprint arXiv:
          <year>1904</year>
          .09077.
          <year>2019</year>
          Apr 19.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>18. BertForTokenClassification, https://huggingface.co/transformers/model_doc/bert.html#bertfortokenclassification</mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>19. Hugging Face homepage, https://huggingface.co/</mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>20. MedInfo 2015 Dataset, https://github.com/IBMMRL/medinfo2015</mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <article-title>brat stando format homepage</article-title>
          , https://brat.nlplab.org/stando .html
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>