<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Polimi at CLinkaRT: a Conditional Random Field vs a BERT-based approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Vittorio Torri</string-name>
          <email>vittorio.torri@polimi.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Ieva</string-name>
          <email>francesca.ieva@polimi.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Natural Language Processing, Named Entity Recognition, Clinical documents,</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>HDS - Health Data Science Centre, Human Technopole</institution>
          ,
          <addr-line>Viale Rita Levi-Montalcini 1, Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>MOX - Modelling and Scientific Computing Lab, Department of Mathematics</institution>
          ,
          <addr-line>Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Processing and Speech Tools for Italian</institution>
          ,
          <addr-line>Sep 7 - 8, Parma, IT</addr-line>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Workshop Proceedings</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the context of the EVALITA 2023 challenge, we present the models we have developed for the CLinkaRT task, which aims to identify medical examinations and their corresponding results in Italian clinical documents. We propose two distinct approaches: one utilising a Conditional Random Field (CRF), a probabilistic graphical model traditionally used for Named Entity Recognition, and the other based on BERT, the transformer-based model that is currently state-of-the-art for many Natural Language Processing tasks. Both models incorporate external knowledge from publicly available medical resources and are enhanced with heuristic rules to establish associations between exams and results. Our comparative analysis favours the CRF-based model, which secured the third position in the competition ranking, although the BERT-based model demonstrated competitive performance.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The widespread adoption of Electronic Health Records (EHR) has led to a significant transformation in healthcare data collection, allowing for the accumulation of extensive patient information. However, a considerable portion of this data remains unstructured, posing challenges to its utilisation in statistical analyses. Within EHR systems, vast amounts of textual data, such as clinical notes, reports, and discharge summaries, are stored, containing valuable patient history that is often missing from traditional databases. In recent years, Natural Language Processing (NLP) advancements have opened up possibilities for extracting structured data from text. However, numerous challenges persist, particularly in specialised domains like medicine [<xref ref-type="bibr" rid="ref1">1</xref>] and when dealing with languages other than English [<xref ref-type="bibr" rid="ref2">2</xref>].</p>
      <p>This paper presents the models we have developed for the CLinkaRT task [<xref ref-type="bibr" rid="ref3">3</xref>] as part of the EVALITA 2023 challenge [<xref ref-type="bibr" rid="ref4">4</xref>]. The task entails identifying pairs of medical examinations and their corresponding results within Italian clinical documents. To accomplish this, a subset of the Italian section of the E3C corpus [<xref ref-type="bibr" rid="ref5">5</xref>], annotated by the task organisers, was provided as the training set.</p>
      <p>Our first system is based on a Conditional Random Field (CRF), a probabilistic graphical model that has been widely used for Named Entity Recognition (NER) [<xref ref-type="bibr" rid="ref6">6</xref>]. NER is an NLP task that involves identifying and categorizing specific types of entities within a given text. In our case, we apply NER to recognize examination names and their corresponding results. This model is enhanced by incorporating external knowledge from additional resources and employing rules to associate each examination with its result. We compare it with an approach based on BERT, the more recent transformer-based neural network that is currently state-of-the-art for many NLP tasks [<xref ref-type="bibr" rid="ref7">7</xref>]. In this case, we fine-tune the latest Italian version of BERT, Umberto [<xref ref-type="bibr" rid="ref8">8</xref>], using the E3C corpus. To exploit the entire corpus, we automatically translated documents that are in languages other than Italian. Subsequently, this fine-tuned BERT model undergoes training for token classification using the annotated training set provided for the challenge, incorporating a linear classification layer.</p>
      <p>Both models demonstrated reasonable performance in the NER tasks of identifying examinations and results, while the figures were lower for the actual CLinkaRT task, which involves associating examinations with their corresponding results. The CRF-based model achieved the best results, particularly due to higher recall on examination names and higher precision on examination results, achieving the third position in the final ranking.</p>
      <p>The code of our models is available on GitHub (https://github.com/vittot/CLinkaRT-2023-Polimi).</p>
      <p>The rest of this paper is structured as follows: Section 2 provides an analysis of related works, Section 3 discusses the dataset used for the task, Section 4 presents a detailed description of our system, Section 5 reports the obtained results, and Section 6 provides a comprehensive discussion of our findings.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>The application of Natural Language Processing (NLP) techniques to Italian medical documents has been relatively limited. However, a few studies have addressed tasks relevant to this challenge. Viani et al. [<xref ref-type="bibr" rid="ref9">9</xref>] focused on identifying various entities within Italian cardiology reports, including specific examination results and drug prescriptions. Their approach involved a pipeline utilising dictionary lookup and an ontology with regular expressions linked to concepts. They developed and evaluated their methodology using a dataset of 5400 reports. In a subsequent study [<xref ref-type="bibr" rid="ref10">10</xref>], a supervised learning approach based on recurrent neural networks was employed to extract events from a smaller dataset of 75 cardiology reports, encompassing 4300 event occurrences.</p>
      <p>Chiaramello et al. [<xref ref-type="bibr" rid="ref11">11</xref>] explored the mapping between relevant terms in Italian clinical notes and concepts in the Italian version of the Unified Medical Language System (UMLS) [<xref ref-type="bibr" rid="ref12">12</xref>], including the use of the MetaMap tool [<xref ref-type="bibr" rid="ref13">13</xref>] on Italian documents.</p>
      <p>Another example of NER on Italian clinical data, based on a recurrent neural network architecture, is [<xref ref-type="bibr" rid="ref14">14</xref>], even if the goal, in this case, was the de-identification of clinical notes and not information extraction.</p>
      <p>While the number of works specifically focusing on Italian documents remains limited, a more extensive body of literature exists concerning English documents. These studies predominantly employ rule-based and dictionary lookup approaches [<xref ref-type="bibr" rid="ref15">15</xref>], conditional random fields [<xref ref-type="bibr" rid="ref16">16</xref>], recurrent neural networks [<xref ref-type="bibr" rid="ref17">17</xref>] and, more recently, transformer-based neural networks [<xref ref-type="bibr" rid="ref18">18</xref>].</p>
    </sec>
    <sec id="sec-3">
      <title>3. Data</title>
      <p>The training set provided by the task organisers consists of 83 documents extracted from the Italian subset of the E3C corpus. These documents have been annotated with pairs of examination mentions and corresponding results. In particular, there are 658 pairs in the dataset, among which there are 367 unique examination names and 395 unique examination values.</p>
      <p>The challenge ranking is based on the performance of the models on a test set consisting of 80 documents. The test set was initially released to participants without annotations.</p>
      <p>The documents in the E3C corpus are clinical narratives originating from different sources: journal papers, admission tests for specialities in medicine, patient information leaflets for medicines, and abstracts of theses in medical science.</p>
      <p>The CLinkaRT task poses several difficulties due to the heterogeneity of the documents and the small size of the training set. Previous works in related areas often had access to datasets comprising thousands of documents or annotations, typically from a single source and within a specific medical domain (e.g., cardiology). In our case, the documents can cover any medical area, and the concept of examinations and their results has to be understood in a broad sense.</p>
      <p>Table 1 provides examples of annotated sentences from the training set. Sentence #1 has been annotated with two examinations: “fluenza” (“fluency”) and “memoria” (“memory”), both with the value “valori ai limiti della norma” (“values at normal limits”). This example demonstrates that the task involves not only identifying laboratory examinations with precise numerical results but also encompasses various types of examinations where results can be expressed qualitatively. Sentence #2 has been annotated as containing an examination “calo” (“loss”) with a result of “4 kg circa” (“about 4 kg”), although it can be debated whether this qualifies as an examination.</p>
      <p>Another element of uncertainty relates to the annotation boundaries, particularly for examination results. For instance, Sentence #3 has been annotated as having the result “della positività” (“of the positivity”) for the examination “asCa”, while the preposition “della” (“of the”) could have been excluded from the result.</p>
      <p>It is important to note that no specific annotation guidelines have been released, at least at the present time.</p>
      <p>The complexities arising from document heterogeneity, different possible interpretations of examination results, and the absence of comprehensive annotation guidelines highlight the challenges involved in the CLinkaRT task.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Description of the system</title>
      <p>We decomposed the task into three subproblems:</p>
      <p>1. NER of examination names</p>
      <p>2. NER of examination results</p>
      <p>3. Linking between examination names and results</p>
      <p>For the NER subproblems, we propose two alternative approaches, CRF and BERT, in Subsections 4.1 and 4.2, respectively, while for the linking we propose an approach based on heuristic rules in Subsection 4.3.</p>
      <table-wrap>
        <label>Table 1</label>
        <caption>
          <p>Examples of annotated sentences from the training set.</p>
        </caption>
        <table>
          <thead>
            <tr><th>#</th><th>Sentence (ITA)</th><th>Sentence (ENG)</th><th>Annotations (ITA)</th><th>Annotations (ENG)</th></tr>
          </thead>
          <tbody>
            <tr><td>1</td><td/><td/><td>(fluenza, valori ai limiti nella norma); (memoria, valori ai limiti nella norma)</td><td>(fluency, values at normal limits); (memory, values at normal limits)</td></tr>
            <tr><td>2</td><td/><td/><td>(calo, 4 kg circa)</td><td>(loss, about 4 kg)</td></tr>
            <tr><td>3</td><td>Alla luce della positività</td><td>In light of the positivity of</td><td>(asCa, della positività)</td><td>(asCa, of the positivity)</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-4-1">
        <title>4.1. CRF model</title>
        <p>The primary model we developed and used for the results submission is a Conditional Random Field (CRF). A Conditional Random Field is an undirected probabilistic graphical model widely used for Named Entity Recognition (NER). The model’s random variables are divided between the observed variables X and the output variables Y, and the graph models the conditional probability</p>
        <p>
          <disp-formula><tex-math>P(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z(\mathbf{x})} \exp\left( \sum_{i=1}^{n} \sum_{j} \lambda_j f_j(y_{i-1}, y_i, \mathbf{x}, i) \right)</tex-math></disp-formula>
        </p>
        <p>where <inline-formula><tex-math>Z(\mathbf{x})</tex-math></inline-formula> is the normalisation factor, <inline-formula><tex-math>n</tex-math></inline-formula> is the length of the sequence, <inline-formula><tex-math>j</tex-math></inline-formula> indexes the feature functions <inline-formula><tex-math>f_j</tex-math></inline-formula> and <inline-formula><tex-math>\lambda_j</tex-math></inline-formula> are the parameters to be learnt. Multiple feature functions <inline-formula><tex-math>f_j</tex-math></inline-formula> can be defined, either as state feature functions or as transition feature functions. While the former depend on the current label <inline-formula><tex-math>y_i</tex-math></inline-formula> and on the observed sequence <inline-formula><tex-math>\mathbf{x}</tex-math></inline-formula>, the latter also depend on the previous label <inline-formula><tex-math>y_{i-1}</tex-math></inline-formula>.</p>
        <p>This task has two types of entities: examination names and examination results. It is possible to use two distinct CRFs for the two types of entities or a single one, which might be preferable as it can leverage the information obtained from predicting an examination name label to predict an examination result label, and vice versa. We considered an extensive set of internal features for the CRF model, as listed in Table 2. Different combinations of them have been tested, but the best results have been achieved with the complete set of features.</p>
        <p>All these features are computed on the current token, the previous, and the next token.</p>
        <p>Additionally, we incorporated features related to external knowledge sources. The first source is the UMLS vocabulary. We translated each token in the training set to English and queried the English UMLS vocabulary to obtain the list of concepts corresponding to the token, with their associated semantic types. We considered a set of binary features for the presence of the 50 most relevant semantic types and a more restricted set of features only for the presence of three specific semantic types (Laboratory or Test Result; Laboratory Procedure; Amino Acid, Peptide, or Protein) that are most likely associated with examination names, particularly for laboratory examinations.</p>
      </sec>
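      <sec>
        <p>To illustrate how the surface features of Table 2 can be computed on the current, previous, and next token, the following Python sketch builds one feature dictionary per token. It is a minimal illustration under stated assumptions, not the released implementation: the affix length, the set of math symbols, the reading of the digit flag as "contains a digit", and all function and key names are hypothetical. The lemma, part-of-speech, and exam abbreviation features are omitted because they require external resources.</p>
        <preformat><![CDATA[
```python
MATH_SYMBOLS = set("+-*/=<>%±")  # assumed symbol set, not taken from the paper
AFFIX_LEN = 3                    # assumed prefix/suffix length


def base_features(token, prefix=""):
    """Surface features of one token; keys carry a positional prefix."""
    return {
        prefix + "lower": token.lower(),
        prefix + "prefix": token[:AFFIX_LEN].lower(),
        prefix + "suffix": token[-AFFIX_LEN:].lower(),
        prefix + "is_upper": token.isupper(),
        prefix + "is_title": token.istitle(),
        # Assumption: the digit flag fires if the token contains any digit.
        prefix + "is_digit": any(ch.isdigit() for ch in token),
        prefix + "has_math": any(ch in MATH_SYMBOLS for ch in token),
    }


def token_features(tokens, i):
    """Features for tokens[i] plus its previous and next neighbours."""
    feats = base_features(tokens[i])
    if i > 0:
        feats.update(base_features(tokens[i - 1], prefix="prev_"))
    else:
        feats["BOS"] = True  # beginning of sentence, no previous token
    if i < len(tokens) - 1:
        feats.update(base_features(tokens[i + 1], prefix="next_"))
    else:
        feats["EOS"] = True  # end of sentence, no next token
    return feats


def sentence_features(tokens):
    """One feature dictionary per token of a tokenised sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```
]]></preformat>
        <p>Calling <monospace>sentence_features(["bilirubina", "diretta", "1,8", "mg/dL"])</monospace> returns four dictionaries; a list of such dictionaries per sentence is the input format expected by common CRF toolkits.</p>
      </sec>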
      <sec id="sec-4-8">
        <p>The second external knowledge base we used is the official medical procedures nomenclature in Lombardy Region<sup>2</sup>. It is a list containing the names of all medical procedures provided by the Regional Health System in Lombardy. We considered only the categories primarily related to examinations: Anatomy-Pathological Histology-Genetics, Immunohematology-Transfusion, Clinical Chemistry, Laboratory in general, Microbiology-Virology. We extracted a binary feature indicating if a token is present in a processed version of this list, where we removed the most frequent words (frequency &gt; 5).</p>
      </sec>
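      <sec>
        <p>The list-based feature described above can be sketched in Python as follows. This is a hedged illustration, not the released code: it assumes that word frequency is counted over the procedure names themselves, that names are split on whitespace, and that matching is case-insensitive; the function names and the example procedure names are hypothetical.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def build_procedure_vocabulary(procedure_names, max_frequency=5):
    """Collect lowercase words from the procedure names and drop the
    words that occur more than max_frequency times across the list."""
    counts = Counter(
        word
        for name in procedure_names
        for word in name.lower().split()
    )
    return {word for word, n in counts.items() if n <= max_frequency}


def in_procedure_list(token, vocabulary):
    """Binary feature: is the token present in the filtered vocabulary?"""
    return token.lower() in vocabulary


# Toy usage with made-up procedure names and a lower threshold.
names = ["esame emocromo", "esame glicemia", "esame urine"]
vocab = build_procedure_vocabulary(names, max_frequency=2)
flag = in_procedure_list("Glicemia", vocab)
```
]]></preformat>
        <p>Removing the most frequent words keeps generic terms such as "esame" in the toy list from firing the feature on almost every sentence, so that the flag stays informative for specific examination names.</p>
      </sec>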
      <sec id="sec-4-10">
        <p>The CRF was trained with the L-BFGS gradient descent algorithm, 200 maximum iterations and regularisation coefficients <inline-formula><tex-math>c_1 = 0.03</tex-math></inline-formula> and <inline-formula><tex-math>c_2 = 0.02</tex-math></inline-formula>.</p>
        <p><sup>2</sup> https://www.dati.lombardia.it/Sanit-/Transcodifica-Codici-prestazioni/7ugz-vcug</p>
        <table-wrap>
          <label>Table 2</label>
          <caption>
            <p>Internal features considered for the CRF model.</p>
          </caption>
          <table>
            <thead>
              <tr><th>Feature</th></tr>
            </thead>
            <tbody>
              <tr><td>Lowercased value of the current token</td></tr>
              <tr><td>Lemmatized lowered value of the current token</td></tr>
              <tr><td>Prefix of the current token</td></tr>
              <tr><td>Suffix of the current token</td></tr>
              <tr><td>Upper token flag</td></tr>
              <tr><td>Title token flag</td></tr>
              <tr><td>Digit flag</td></tr>
              <tr><td>Math symbol flag</td></tr>
              <tr><td>Part of speech tag</td></tr>
              <tr><td>Exam abbreviation flag</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-4-11">
        <title>4.2. BERT-based model</title>
        <p>BERT is a transformer-based neural network that has achieved state-of-the-art performance in many NLP tasks. Although there are no domain-specific versions of BERT for the medical domain in Italian, there are general-domain versions, the most recent of which is Umberto. We fine-tuned Umberto using the entire E3C corpus, including labelled and unlabelled documents in all E3C languages. Non-Italian documents were automatically translated into Italian using Google Translate’s APIs. A linear token-level classification layer was added to this BERT version and trained on the annotated dataset provided for the challenge while keeping the other layers frozen.</p>
        <p>Fine-tuning of the Umberto model over the E3C corpus involved 3 epochs of training with a learning rate of <inline-formula><tex-math>2 \cdot 10^{-5}</tex-math></inline-formula> and weight decay of 0.01. The last layer was trained for 50 epochs with a learning rate of <inline-formula><tex-math>10^{-3}</tex-math></inline-formula> and weight decay of 0.01.</p>
      </sec>
      <sec id="sec-4-12">
        <title>4.3. Linking between exams and results</title>
        <p>We employed the following heuristic rule to link pairs of examinations and results: each exam/result is paired with the nearest result/exam within the same sentence. If there are no available elements to pair with it, it is discarded.</p>
      </sec>
    </sec>
    <sec>
      <title>5. Results</title>
      <p>The NER results are reported for the B-Exam, I-Exam, B-Value, and I-Value tags, even if there is no distinction between B and I tags in the annotations, to verify if longer entities show different performances. The NER results of the two models are comparable. Both show higher precision than recall, in particular for the examination names, for which the recall is very low. The CRF has higher precision than BERT on examination results and higher recall on examination names. These results are not surprising, given the limited amount of training data and the large number of possible examinations that can exist in this type of data. NER results on the test set are comparable to those obtained via cross-validation over the training set (we do not report them here due to space constraints).</p>
      <p>The CLinkaRT task evaluation is based only on recognising pairs of examinations and results. Only the pairs that precisely matched the gold standard annotations were considered for ranking and evaluation. Precision, recall, and F1-score were computed based on this precise matching. The results on the final test set for both systems, computed with the official evaluation script, are shown in Table 4. Both are aligned with the NER results in terms of precision and lower in terms of recall. The CRF model results are the best, for both precision and recall.</p>
      <p>Table 4: Results of the two systems (CRF, BERT) on the test set for the (exam, result) pairs recognition.</p>
      <p>A manual analysis of the results highlighted that in some cases the BERT model captures only part of the value, while the CRF model typically captures it entirely or does not capture it at all. Some examples of values captured by BERT vs the gold standard: “39” vs “39%”, “10 ng/L” vs “inferiori a 10 ng/L”, “3.6” vs “3.6-0.9mg/dL”.</p>
      <p>Another observed aspect is that the BERT model seems to be based more on the position of the words in the sentence than on the words themselves. While this is positive, it sometimes leads to recognizing as examination names words that are nearer to the value but are not the actual name (e.g., in “bilirubina diretta 1,8 mg/dL” (“direct bilirubin 1,8 mg/dL”) it takes “diretta” (“direct”) as the name instead of “bilirubina” (“bilirubin”)). On the contrary, the CRF model is based more on the words themselves, at least for exam names, as shown by the fact that the features with the highest weight include many specific exam names (7 out of the first 10 features).</p>
    </sec>
    <sec id="sec-5">
      <title>6. Discussion</title>
      <p>Our two systems performed similarly on the Named Entity Recognition (NER) task. They demonstrated reasonable results for identifying examination results, although there is room for improvement. However, the identification of examination names proved to be more challenging for both systems. This can be attributed to the limited size of the training data, which made it difficult for the models to generalise to a larger set of previously unseen examination names. Despite incorporating external knowledge resources directly into the CRF model and indirectly into the BERT-based model, they were insufficient to enhance the performance in recognising a broader range of examinations. Further investigation is necessary to explore how other data sources can be utilised for this purpose. While we were unable to find an Italian dataset specifically annotated for examination names, it may be worthwhile to investigate the use of existing annotations for clinical entities in the E3C corpus, selecting a subset that closely aligns with the concept of examinations. Another element worth further investigation is the tokenizer: for the BERT-based model, we utilised the Umberto tokenizer, but its limitations in dealing with medical terminology might have negatively affected performance.</p>
      <p>Regarding linking examinations to results, our naive approach based on heuristics did not appear to be a limiting factor, considering the F1-score achieved on the (exam, result) pairs compared to the F1-score for the NER of exam names. However, it is possible to explore data-driven models for this subtask, even though the scarcity of available data presents challenges.</p>
      <p>We strongly believe in the application of Natural Language Processing techniques to the medical domain and recognise the huge need for developing models that can effectively process Italian healthcare data. Simultaneously, it is crucial to improve the quantity and quality of annotated datasets to drive the development of such models. Challenges like this are a valuable tool for motivating the academic community to contribute to this field.</p>
    </sec>
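    <sec>
      <p>As an illustration of the nearest-neighbour linking rule of Subsection 4.3 and of the exact-match scoring applied to the (exam, result) pairs, the following Python sketch pairs each entity with the closest entity of the other type in the same sentence and then computes precision, recall, and F1-score. The entity representation (text, type, token position), the tie-breaking (first candidate wins), the token positions in the usage lines, and all function names are assumptions made for illustration; this is neither the released code nor the official evaluation script.</p>
      <preformat><![CDATA[
```python
def link_entities(entities):
    """Pair every entity with the nearest entity of the other type.

    entities: list of (text, kind, position) tuples for ONE sentence,
    where kind is "EXAM" or "VALUE" and position is a token index.
    Returns a set of (exam_text, value_text) pairs.
    """
    exams = [e for e in entities if e[1] == "EXAM"]
    values = [e for e in entities if e[1] == "VALUE"]
    pairs = set()
    if not exams or not values:
        return pairs  # nothing to pair with: the entities are discarded
    for exam in exams:
        nearest = min(values, key=lambda v: abs(v[2] - exam[2]))
        pairs.add((exam[0], nearest[0]))
    for value in values:
        nearest = min(exams, key=lambda e: abs(e[2] - value[2]))
        pairs.add((nearest[0], value[0]))
    return pairs


def exact_match_scores(predicted, gold):
    """Precision, recall and F1 over exactly matching pairs."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Usage on Sentence #1 of Table 1 (token positions are made up).
sentence = [
    ("fluenza", "EXAM", 3),
    ("memoria", "EXAM", 5),
    ("valori ai limiti della norma", "VALUE", 7),
]
predicted_pairs = link_entities(sentence)
```
]]></preformat>
      <p>Because pairs are collected in a set, an exam and a value that select each other are counted once, and two exams that share the same nearest value (as in Sentence #1) both receive it.</p>
    </sec>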
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O. G.</given-names>
            <surname>Iroju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. O.</given-names>
            <surname>Olaleke</surname>
          </string-name>
          ,
          <article-title>A systematic review of natural language processing in healthcare</article-title>
          ,
          <source>International Journal of Information Technology and Computer Science</source>
          <volume>8</volume>
          (
          <year>2015</year>
          )
          <fpage>44</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Névéol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Dalianis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Velupillai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Savova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Zweigenbaum</surname>
          </string-name>
          ,
          <article-title>Clinical natural language processing in languages other than English: opportunities and challenges</article-title>
          ,
          <source>Journal of Biomedical Semantics</source>
          <volume>9</volume>
          (
          <year>2018</year>
          )
          <fpage>1</fpage>
          -
          <lpage>13</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>B.</given-names>
            <surname>Altuna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Karunakaran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lavelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Magnini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Speranza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zanoli</surname>
          </string-name>
          ,
          <article-title>CLinkaRT at EVALITA 2023: Overview of the Task on Linking a Lab Result to its Test Event in the Clinical Domain</article-title>
          , in:
          <source>Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023)</source>
          , CEUR.org, Parma, Italy,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Lai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Menini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Polignano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Russo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Sprugnoli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Venturi</surname>
          </string-name>
          ,
          <article-title>EVALITA 2023: Overview of the 8th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian</article-title>
          , in:
          <source>Proceedings of the Eighth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2023)</source>
          , CEUR.org, Parma, Italy,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>B.</given-names>
            <surname>Magnini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Altuna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Lavelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Speranza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zanoli</surname>
          </string-name>
          ,
          <article-title>The E3C Project: Collection and Annotation of a Multilingual Corpus of Clinical Cases</article-title>
          ,
          <source>in: Proceedings of the Seventh Italian Conference on Computational Linguistics</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>H. M.</given-names>
            <surname>Wallach</surname>
          </string-name>
          ,
          <article-title>Conditional random fields: An introduction</article-title>
          ,
          <source>Technical Report</source>
          , Department of CIS, University of Pennsylvania (
          <year>2004</year>
          )
          <fpage>22</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Devlin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.-W.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Toutanova</surname>
          </string-name>
          ,
          <article-title>BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding</article-title>
          , in:
          <source>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies</source>
          , Volume
          <volume>1</volume>
          (Long and Short Papers),
          <year>2019</year>
          , pp.
          <fpage>4171</fpage>
          -
          <lpage>4186</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>F.</given-names>
            <surname>Tamburini</surname>
          </string-name>
          ,
          <article-title>How “BERTology” changed the state-of-the-art also for Italian NLP</article-title>
          ,
          <source>Computational Linguistics CLiC-it 2020</source>
          (
          <year>2020</year>
          )
          <fpage>415</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>N.</given-names>
            <surname>Viani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Larizza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Tibollo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napolitano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. G.</given-names>
            <surname>Priori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bellazzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sacchi</surname>
          </string-name>
          ,
          <article-title>Information extraction from Italian medical reports: An ontology-driven approach</article-title>
          ,
          <source>International journal of medical informatics</source>
          <volume>111</volume>
          (
          <year>2018</year>
          )
          <fpage>140</fpage>
          -
          <lpage>148</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>N.</given-names>
            <surname>Viani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. A.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Napolitano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. G.</given-names>
            <surname>Priori</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Savova</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Bellazzi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sacchi</surname>
          </string-name>
          ,
          <article-title>Supervised methods to extract clinical events from cardiology reports in Italian</article-title>
          ,
          <source>Journal of biomedical informatics</source>
          <volume>95</volume>
          (
          <year>2019</year>
          )
          <fpage>103219</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>E.</given-names>
            <surname>Chiaramello</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Pinciroli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bonalumi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Caroli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Tognola</surname>
          </string-name>
          ,
          <article-title>Use of “off-the-shelf” information extraction algorithms in clinical informatics: A feasibility study of MetaMap annotation of Italian medical notes</article-title>
          ,
          <source>Journal of biomedical informatics</source>
          <volume>63</volume>
          (
          <year>2016</year>
          )
          <fpage>22</fpage>
          -
          <lpage>32</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>O.</given-names>
            <surname>Bodenreider</surname>
          </string-name>
          ,
          <article-title>The unified medical language system (UMLS): integrating biomedical terminology</article-title>
          ,
          <source>Nucleic acids research</source>
          <volume>32</volume>
          (
          <year>2004</year>
          )
          <fpage>D267</fpage>
          -
          <lpage>D270</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Aronson</surname>
          </string-name>
          ,
          <article-title>Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program</article-title>
          ,
          <source>in: Proceedings of the AMIA Symposium</source>
          , American Medical Informatics Association,
          <year>2001</year>
          , p.
          <fpage>17</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>R.</given-names>
            <surname>Catelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Gargiulo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Casola</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>De Pietro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Fujita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Esposito</surname>
          </string-name>
          ,
          <article-title>A novel Covid-19 data set and an effective deep learning approach for the deidentification of Italian medical records</article-title>
          ,
          <source>IEEE Access</source>
          <volume>9</volume>
          (
          <year>2021</year>
          )
          <fpage>19097</fpage>
          -
          <lpage>19110</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Tanenblatt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Coden</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. L.</given-names>
            <surname>Sominsky</surname>
          </string-name>
          ,
          <article-title>The ConceptMapper Approach to Named Entity Recognition</article-title>
          ,
          <source>in: LREC</source>
          ,
          <year>2010</year>
          , pp.
          <fpage>546</fpage>
          -
          <lpage>551</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>H. U.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Chowk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hahn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Segall</surname>
          </string-name>
          ,
          <article-title>Disease named entity recognition using conditional random fields</article-title>
          ,
          <source>in: Proceedings of the 7th International Symposium on Semantic Mining in Biomedicine</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Magge</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Scotch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Gonzalez-Hernandez</surname>
          </string-name>
          ,
          <article-title>Clinical NER and relation extraction using bi-charLSTMs and random forest classifiers</article-title>
          ,
          <source>in: International workshop on medication and adverse drug event detection</source>
          ,
          <source>PMLR</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>25</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>M.</given-names>
            <surname>Abadeer</surname>
          </string-name>
          ,
          <article-title>Assessment of DistilBERT performance on named entity recognition task for the detection of protected health information and medical concepts</article-title>
          ,
          <source>in: Proceedings of the 3rd clinical natural language processing workshop</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>158</fpage>
          -
          <lpage>167</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>