<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Acronym Identification and Disambiguation Shared Tasks for Scientific Document Understanding</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Amir Pouran Ben Veyseh</string-name>
          <email>apouranbg@cs.uoregon.edu</email>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Franck Dernoncourt</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Thien Huu Nguyen</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Walter Chang</string-name>
          <email>wachangg@adobe.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Leo Anthony Celi</string-name>
          <email>lceli@bidmc.harvard.edu</email>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Adobe Research</institution>
          ,
          <addr-line>San Jose, CA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Harvard University</institution>
          ,
          <addr-line>Cambridge, MA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Massachusetts Institute of Technology</institution>
          ,
          <addr-line>Cambridge, MA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>University of Oregon</institution>
          ,
          <addr-line>Eugene, OR</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <issue>7</issue>
      <abstract>
<p>Acronyms are the short forms of longer phrases, and they are frequently used in writing, especially scholarly writing, to save space and facilitate the communication of information. As such, every text understanding tool should be capable of recognizing acronyms in text (i.e., acronym identification) and also finding their correct meaning (i.e., acronym disambiguation). As most prior work on these tasks is restricted to the biomedical domain and uses unsupervised methods or models trained on limited datasets, it fails to perform well for scientific document understanding. To push forward research in this direction, we have organized two shared tasks for acronym identification and acronym disambiguation in scientific documents, named AI@SDU and AD@SDU, respectively. The two shared tasks have attracted 52 and 43 participants, respectively. While the submitted systems make substantial improvements compared to the existing baselines, they are still far from human-level performance. This paper reviews the two shared tasks and the prominent participating systems for each of them.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        A common practice in writing, intended to save space and
make the flow of information smoother, is to avoid repeating
long phrases that waste space and the reader's
time. To this end, acronyms, which are shortened forms
of long phrases, are often used in various types of
writing, especially in scientific documents. However, this
prevalence introduces challenges for text
understanding tools. More specifically, as acronyms might not be
defined in dictionaries, especially locally-defined acronyms
whose long-form is only provided in the document that
introduces them, a text processing model should be able to
identify the acronyms and their long forms in the text (i.e.,
acronym identification). For instance, in the sentence “The
main key performance indicator, herein referred to as KPI,
is the E2E throughput”, the text processing system must
recognize KPI and E2E as acronyms and the phrase key
performance indicator as the long-form. Another issue related to
the acronym that text understanding tools encounter is that
the correct meaning (i.e., long-form) of the acronym might
not be provided in the document itself (e.g., the acronym
E2E in the running example). In these cases, the correct
meaning can be obtained by looking up the meaning in
an acronym dictionary. However, as different long forms
could share the same acronym (e.g., two long forms
Cable News Network and Convolution Neural Network share
the acronym CNN), this meaning look-up is not
straightforward and the system must disambiguate the acronym (i.e.,
acronym disambiguation). Both acronym identification (AI)
and acronym disambiguation (AD) models could be
used in downstream applications, including definition
extraction
        <xref ref-type="bibr" rid="ref10 ref13 ref2 ref6 ref8">(Veyseh et al. 2020a; Spala et al. 2020, 2019;
Espinosa-Anke and Schockaert 2018; Jin et al. 2013)</xref>
        , various
information extraction tasks
        <xref ref-type="bibr" rid="ref20 ref29 ref7 ref9">(Liu et al. 2019; Pouran Ben
Veyseh, Nguyen, and Dou 2019)</xref>
        and question answering
        <xref ref-type="bibr" rid="ref1">(Ackermann et al. 2020; Veyseh 2016)</xref>
        .
      </p>
      <p>
        Due to the importance of the two aforementioned tasks,
i.e. acronym identification (AI) and acronym
disambiguation (AD), there is a wealth of prior work on AI and AD
        <xref ref-type="bibr" rid="ref11 ref14 ref17 ref20 ref21 ref25 ref4 ref7 ref7 ref9">(Park and Byrd 2001; Schwartz and Hearst 2002; Nadeau
and Turney 2005; Kuo et al. 2009; Taneva et al. 2013;
Kirchhoff and Turner 2016; Li et al. 2018; Ciosici, Sommer, and
Assent 2019; Jin, Liu, and Lu 2019; Veyseh et al. 2021)</xref>
        .
However, there are two major limitations in the existing
systems. First, for AD tasks, the existing models are mainly
limited to the biomedical domain, ignoring the challenges in
other domains. Second, for the AI task, the existing models
employ either unsupervised methods or models trained using
a limited manually annotated AI dataset. The unsupervised
methods or the small size of the AI datasets result in errors in
acronym identification, which can also propagate to the
acronym disambiguation task.
      </p>
      <p>To address the above issues in the prior works, we recently
released the largest manually annotated acronym
identification dataset for scientific documents (viz., SciAI) (Veyseh
et al. 2020b). This dataset consists of 17,506 sentences from
6,786 English papers published in arXiv. The annotation of
each sentence involves the acronyms and long forms
mentioned in the sentence. Using this manually annotated
AI dataset, we also created a dictionary of 732 acronyms
with multiple corresponding long forms (i.e., ambiguous
acronyms) which is the largest available acronym dictionary
for scientific documents. Moreover, using the prepared
dictionary and 2,031,592 sentences extracted from arXiv
papers, we created a dataset for the acronym disambiguation
task (viz., SciAD)
        <xref ref-type="bibr" rid="ref23">(Veyseh et al. 2020b)</xref>
        . This dataset
consists of 62,441 sentences, which is larger than the prior AD
dataset for the scientific domain.
      </p>
      <p>Using the two datasets SciAI and SciAD, we organize
two shared tasks for acronym identification and acronym
disambiguation for scientific document understanding (i.e.,
AI@SDU and AD@SDU, respectively). The AI@SDU
shared task has attracted 52 participant teams with 19
submissions during the evaluation phase. The AD@SDU has
also attracted 43 participant teams with 10 submissions
during the evaluation phase. The participant teams made
considerable progress on both shared tasks compared to the
provided baselines. However, the top-performing models (viz.,
AT-BERT-E for AI@SDU with a 93.3% F1 score and
DeepBlueAI for AD@SDU with a 94.0% F1 score) underperform
humans (with 96.0% and 96.1% F1 scores for the AI@SDU and
AD@SDU shared tasks, respectively), leaving room for
future research. In this paper, we review the dataset creation
process, the details of the shared task, and the prominent
submitted systems.</p>
    </sec>
    <sec id="sec-2">
      <title>Dataset &amp; Task Description</title>
      <sec id="sec-2-1">
        <title>Acronym Identification</title>
        <p>The acronym identification (AI) task aims to recognize all
acronyms and long forms mentioned in a sentence.
Formally, given the sentence S = [w1, w2, ..., wN], the goal
is to predict the sequence L = [l1, l2, ..., lN], where li ∈
{Ba, Ia, Bl, Il, O}. Note that Ba and Ia indicate the
beginning and inside of an acronym, respectively, while Bl and Il
indicate the beginning and inside of a long form, respectively, and O
is the label for other words.</p>
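      <p>
        For concreteness, this labeling scheme can be illustrated on the running example from the introduction. The following sketch assumes simple whitespace tokenization and is our illustration, not part of the official task materials:
      </p>

```python
# BIO-style labels from the AI task applied to the running example:
# Ba/Ia mark acronym tokens, Bl/Il long-form tokens, O everything else.
tokens = ("The main key performance indicator , herein referred "
          "to as KPI , is the E2E throughput").split()
labels = ["O", "O", "Bl", "Il", "Il", "O", "O", "O",
          "O", "O", "Ba", "O", "O", "O", "Ba", "O"]
# One label per token, as in sequence labeling.
assert len(tokens) == len(labels)
```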
        <p>
          As mentioned in the introduction, the existing AI datasets
are either created using unsupervised methods (e.g.,
by character-matching the acronym with its surrounding
words in the text) or they are small-sized and thus
inappropriate for data-hungry deep learning models. To address these
limitations, we aimed to create the largest manually labeled
acronym identification dataset. To this end, we
first collect 6,786 English papers from arXiv. This
collection contains 2,031,592 sentences. As not all of these
sentences contain acronyms and their long forms, we
first filter out the sentences without any candidate acronym
and long-form. To identify the candidate acronyms, we use
the rule that a word wt is a candidate acronym if half
of its characters are upper-cased. To identify the candidate
long forms, we employ the rule that the subsequent words
[wj, wj+1, ..., wj+k] are a candidate long-form if the
concatenation of their first one, two, or three characters can
form a candidate acronym, i.e., wt, in the sentence. After
filtering sentences without any candidate acronym and
long-form, 17,506 sentences are selected and annotated by
three annotators from Amazon Mechanical Turk (MTurk).
More specifically, MTurk workers annotated the acronyms,
long forms, and the mapping between identified acronyms
and long forms. In case of disagreement, if two out of
three workers agree on an annotation, we use majority
voting to decide the correct annotation. Otherwise, a fourth
annotator is hired to resolve the conflict. The inter-annotator
agreement (IAA) using Krippendorff's alpha
          <xref ref-type="bibr" rid="ref12">(Krippendorff
2011)</xref>
          with the MASI distance metric
          <xref ref-type="bibr" rid="ref26">(Passonneau 2006)</xref>
          for
short-forms (i.e., acronyms) is 0.80 and for long-forms (i.e.,
phrases) is 0.86. This dataset is called SciAI. A comparison
of the SciAI dataset with other existing manually annotated
AI datasets is provided in Table 1.
        </p>
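      <p>
        A literal reading of the candidate-acronym filter above can be sketched as follows; the function name and the inclusive treatment of the exactly-half case are our assumptions:
      </p>

```python
def is_candidate_acronym(word):
    # Filtering rule from the dataset construction: a token is a
    # candidate acronym if at least half of its characters are
    # upper-cased.
    upper = sum(1 for c in word if c.isupper())
    return len(word) > 0 and 2 * upper >= len(word)
```

<p>Under this rule, tokens such as "KPI" and "E2E" pass the filter, while ordinary lowercase words do not.</p>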
      </sec>
      <sec id="sec-2-2">
        <title>Acronym Disambiguation</title>
        <p>The goal of the acronym disambiguation (AD) task is to find
the correct meaning of a given acronym in a sentence. More
specifically, given the sentence S = [w1, w2, ..., wN] and
the index t where wt is an acronym with multiple long forms
L = {l1, l2, ..., lm}, the goal is to predict the long form li
from L as the correct meaning of wt.</p>
        <p>As discussed earlier, one of the issues with the
existing AD datasets is that they mainly focus on the
biomedical domain, ignoring the challenges in other domains. This
domain shift is important as some of the existing models
for biomedical AD rely on domain-specific resources (e.g.,
BioBERT) which might not be suitable for other domains.
Another issue of the existing AD datasets, especially the
ones proposed for the scientific domain, is that they are based
on unsupervised AI datasets. That is, acronyms and long
forms in a corpus are identified using some rules, and the
resulting AI dataset is employed to find acronyms with
multiple long forms to create the AD dataset. This
unsupervised method of creating an AD dataset could introduce noise
and miss some challenging cases. To address these
limitations, we created a new AD dataset using the manually
labeled SciAI dataset. More specifically, first, using the
mappings between annotated acronyms and long forms in SciAI,
we create a dictionary of acronyms that have multiple long
forms (i.e., ambiguous acronyms). This dictionary contains
732 acronyms with an average of 3.1 meanings (i.e.,
long-forms) per acronym. Afterward, to create samples for the AD
dataset, we look up all sentences in the collected corpus in
which one of the ambiguous acronyms is locally defined
(i.e., its long-form is provided in the same sentence). Next,
in the documents hosting these sentences, we automatically
annotate every occurrence of the acronym with its locally
defined long-form. Using this process, a dataset consisting
of 62,441 sentences is created. We call this dataset SciAD.
A comparison of the SciAD dataset with other existing
scientific AD datasets is provided in Table 2.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Participating Systems &amp; Results</title>
      <sec id="sec-3-1">
        <title>Acronym Identification</title>
        <p>For the AI task, we provide a rule-based baseline. In
particular, inspired by (Schwartz and Hearst 2002), the baseline
identifies the acronyms and their long-forms if they match
one of the patterns of long form (acronym) or acronym (long
form). More specifically, if there is a word with more than
60% upper-cased characters which is inside parentheses or
right before parentheses, it is predicted as an acronym.
Afterward, we assess the words before or after the acronym
(depending on which pattern the predicted acronym belongs
to) that fall into a pre-defined window of size min(|A| +
5, 2|A|), where |A| is the number of characters in the
acronym. In particular, if there is a sequence of characters
in these words which can form the upper-cased characters in
the acronym, then the words after or before the acronym are
selected as its meaning (i.e., long-form). Moreover, as the SciAI
dataset annotates acronyms even if they do not have any
locally defined long-form, we extend the rule for identifying
acronyms by relaxing the requirement of being inside or
right before parentheses.</p>
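      <p>
        The window-size formula used by the baseline can be written directly; this is a minimal sketch of that one formula (with an assumed function name), not the full baseline:
      </p>

```python
def window_size(acronym):
    # Number of words inspected before or after a predicted acronym:
    # min(|A| + 5, 2 * |A|), where |A| is the number of characters
    # in the acronym.
    n = len(acronym)
    return min(n + 5, 2 * n)
```

<p>For acronyms shorter than five characters the 2|A| term dominates; beyond five characters the window grows only linearly via |A| + 5.</p>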
        <p>
          In the AI@SDU task, 54 teams participated and 18 of
them submitted their system results in the evaluation phase.
In total, the teams made 254 submissions covering different
versions of their models. The submitted systems employ
various methods including: (1) Rule-based Methods: Similar
to our baseline, some participants exploited manually
designed rules which could have high precision, but low recall
          <xref ref-type="bibr" rid="ref13 ref16 ref5 ref8">(Rogers, Rae, and Demner-Fushman 2020; Li et al. 2020)</xref>
; (2) Feature-based Models: These models extract various
features from the texts to be used by a statistical model
to predict the acronyms and long forms
          <xref ref-type="bibr" rid="ref16">(Li et al. 2020)</xref>
; (3) Transformer-based models: In these systems, the
sentence is encoded with a pre-trained transformer-based
language model and the labels are predicted using the obtained
word embeddings
          <xref ref-type="bibr" rid="ref13 ref13 ref16 ref5 ref5 ref8 ref8">(Kubal and Nagvenkar 2020; Li et al.
2020; Egan and Bohannon 2020)</xref>
          . Some of these models
also leverage adversarial training to make the model
more robust to noise (Zhu et al. 2020), or they might
employ an ensemble model
          <xref ref-type="bibr" rid="ref13 ref5 ref8">(Singh and Kumar 2020)</xref>
          . Among
all submitted models, the method proposed by (Zhu et al.
2020), i.e., AT-BERT-E, achieves the highest performance.
This model employs an adversarial training approach to
increase the model's robustness to noise. More
specifically, they augment the training data with adversarially
perturbed samples and fine-tune a BERT model followed by a
feed-forward neural net on this task. For the adversarial
perturbation, they leverage a gradient-based approach in which
the sample representations are altered in the direction in which
the gradient of the loss function rises.
        </p>
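      <p>
        The gradient-based perturbation can be sketched as follows; this is an FGSM-style approximation under our reading of the description, with a hypothetical step size epsilon, not the authors' exact implementation:
      </p>

```python
def perturb(embedding, grad, epsilon=0.1):
    # Shift each dimension of the sample representation a small step
    # in the direction in which the loss gradient rises (the sign of
    # the gradient), as in FGSM-style adversarial training.
    sign = lambda g: (g > 0) - (g < 0)
    return [e + epsilon * sign(g) for e, g in zip(embedding, grad)]
```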
        <p>We evaluate the systems based on macro-averaged
precision, recall, and F1 score of the acronym and long-form
prediction. The results are shown in Table 3. This table shows
that the participants have made considerable improvement
over the provided baseline. However, there is still a gap
between the performance of the task winner (i.e., AT-BERT-E
(Zhu et al. 2020)) and human-level performance, suggesting
more improvement is required.</p>
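      <p>
        The macro-averaged metric can be sketched as follows; a minimal illustration assuming the per-class precision/recall pairs (one for acronyms, one for long forms) are already computed:
      </p>

```python
def f1(precision, recall):
    # Harmonic mean of precision and recall.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def macro_f1(per_class_scores):
    # Average the per-class F1 scores, e.g. over the acronym class
    # and the long-form class in AI@SDU.
    return sum(f1(p, r) for p, r in per_class_scores) / len(per_class_scores)
```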
      </sec>
      <sec id="sec-3-2">
        <title>Acronym Disambiguation</title>
        <p>For the AD task, we propose to employ the frequency of the
acronym's long forms to disambiguate them. More
specifically, for the acronym a with the long forms L =
[l1, l2, ..., lm], we compute the number of occurrences of
each of its long forms in the training data, i.e., F =
[f1, f2, ..., fm], where fi = |Ai,a| and Ai,a is the set of
sentences in the training data with the acronym a and the long
form li. At inference time, the acronym a is expanded to its
long form with the highest frequency, i.e., i* = argmax_i fi.</p>
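        <p>
          The frequency baseline can be sketched in a few lines; the (acronym, long-form) pair format is a hypothetical input representation for illustration:
        </p>

```python
from collections import Counter

def train_counts(samples):
    # Count, for each acronym, how often each long form occurs in
    # the training data (the fi = |Ai,a| values in the notation above).
    counts = {}
    for acronym, long_form in samples:
        counts.setdefault(acronym, Counter())[long_form] += 1
    return counts

def expand(counts, acronym):
    # Expand the acronym to its most frequent long form (argmax_i fi).
    return counts[acronym].most_common(1)[0][0]
```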
        <p>
          The AD@SDU task attracted 44 participants, 12
submissions at the evaluation phase, and 187 total submissions
for different versions of the participating systems. This task
has been approached with a variety of methods, including
(1) Feature-based models: Some systems extract features
from the input sentence (e.g., word stems, part-of-speech
tags, or special characters in the acronym). Next, a
statistical model, such as Support Vector Machine, Naive Bayes,
and K-nearest neighbors, is employed to predict the correct
long form of the acronym
          <xref ref-type="bibr" rid="ref13 ref13 ref27 ref5 ref5 ref8 ref8">(Jaber and Martinez 2020; Pereira,
Galhardas, and Shasha 2020)</xref>
          ; (2) Neural Networks: A
few of the participating systems employ deep architectures,
e.g., convolutional neural networks (CNNs) or long
short-term memory (LSTM) networks
          <xref ref-type="bibr" rid="ref13 ref5 ref8">(Rogers, Rae, and Demner-Fushman
2020)</xref>
          ; (3) Transformer-based Models: The majority of
the participants resort to transformer-based language
models, e.g., BERT, SciBERT or RoBERTa, to encode the
input sentence. However, they differ in how they leverage the
outputs of these language models for prediction and also
how they formulate the task. Whereas most of the existing
works formulate the task as a classification problem
          <xref ref-type="bibr" rid="ref23">(Pan
et al. 2020; Zhong et al. 2020)</xref>
          , authors in
          <xref ref-type="bibr" rid="ref13 ref5 ref8">(Egan and
Bohannon 2020)</xref>
          use an information retrieval approach. More
specifically, the cosine similarity between the embeddings
of the candidates and the input is employed to compute the
score of each candidate and then to rank them based on their
scores. Moreover, authors in
          <xref ref-type="bibr" rid="ref13 ref5 ref8">(Singh and Kumar 2020)</xref>
          model
this task as a span prediction problem. Specifically, the
concatenation of the different candidate long forms with the
acronym and the input sentence is encoded by a
transformer-based language model. Afterward, a sequence labeling
component predicts the sub-sequence with the highest
probability of being the correct long form. Among all
submitted systems, the DeepBlueAI model proposed by
          <xref ref-type="bibr" rid="ref23">(Pan et al.
2020)</xref>
          obtained the highest performance for acronym
disambiguation on the SciAD test set. This model formulates the
task as a binary classification problem in which each
candidate long-form is assigned a score by a binary classifier
and the candidate with the highest score is selected as the
final model prediction. For the classifier, the authors employ a
pre-trained BERT model that takes the input in the form of
Li [SEP] w1, w2, ..., start, wa, end, ..., wn, where Li is
the long-form candidate, wi are the words of the input
sentence, wa is the ambiguous acronym in the input sentence,
and start and end are two special tokens that provide the
position of the acronym to the model.
        </p>
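        <p>
          The classifier input described above can be assembled as follows; a sketch under the assumption that "start" and "end" are plain marker tokens added to the model vocabulary:
        </p>

```python
def build_input(candidate, tokens, acronym_index):
    # Lay out the input as: long-form candidate, [SEP], then the
    # sentence with the ambiguous acronym wrapped in the special
    # position tokens "start" and "end".
    marked = (tokens[:acronym_index] + ["start", tokens[acronym_index], "end"]
              + tokens[acronym_index + 1:])
    return " ".join([candidate, "[SEP]"] + marked)
```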
        <p>
          We evaluate the systems using their macro-averaged
precision, recall, and F1 score for predicting the correct long
form. The results are shown in Table 4. Again, this table
shows that the participating systems considerably improved
the performance over the provided baseline. However, the
remaining gap between the best-performing model, i.e.,
DeepBlueAI
          <xref ref-type="bibr" rid="ref23">(Pan et al. 2020)</xref>
          , and human-level performance
shows that more research is required.
        </p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion</title>
      <p>In this paper, we summarized the acronym
identification and acronym disambiguation shared tasks at the
scientific document understanding workshop (AI@SDU and AD@SDU).
For these tasks, we provided two novel datasets that
address the limitations of prior work. Both tasks attracted a
substantial number of participants, with considerable performance
improvement over the provided baselines. However, the lower
performance of the best performing models compared to
human-level performance shows that more research should
be conducted on both tasks.</p>
      <p>Schwartz, A. S.; and Hearst, M. A. 2002. A simple algorithm
for identifying abbreviation definitions in biomedical text. In
Biocomputing 2003, 451–462. World Scientific.</p>
      <p>Singh, A.; and Kumar, P. 2020. SciDr at SDU-2020 : IDEAS
- Identifying and Disambiguating Everyday Acronyms for
Scientific Domain. In SDU@AAAI-21.</p>
      <p>Spala, S.; Miller, N.; Dernoncourt, F.; and Dockhorn, C.
2020. SemEval-2020 Task 6: Definition Extraction from
Free Text with the DEFT Corpus. In Proceedings of the
Fourteenth Workshop on Semantic Evaluation.</p>
      <p>Spala, S.; Miller, N. A.; Yang, Y.; Dernoncourt, F.; and
Dockhorn, C. 2019. DEFT: A corpus for definition
extraction in free- and semi-structured text. In Proceedings of the
13th Linguistic Annotation Workshop.</p>
      <p>Taneva, B.; Cheng, T.; Chakrabarti, K.; and He, Y. 2013.
Mining acronym expansions and their meanings using query
click log. In Proceedings of the 22nd international
conference on World Wide Web, 1261–1272.</p>
      <p>Veyseh, A. P. B. 2016. Cross-lingual question answering
using common semantic space. In Proceedings of
TextGraphs10: the workshop on graph-based methods for natural
language processing, 15–19.</p>
      <p>Veyseh, A. P. B.; Dernoncourt, F.; Chang, W.; and Nguyen,
T. H. 2021. MadDog: A Web-based System for Acronym
Identification and Disambiguation. In EACL.</p>
      <p>Veyseh, A. P. B.; Dernoncourt, F.; Dou, D.; and Nguyen,
T. H. 2020a. A Joint Model for Definition Extraction with
Syntactic Connection and Semantic Consistency. In AAAI,
9098–9105.</p>
      <p>Veyseh, A. P. B.; Dernoncourt, F.; Tran, Q. H.; and Nguyen,
T. H. 2020b. What Does This Acronym Mean? Introducing
a New Dataset for Acronym Identification and
Disambiguation. In Proceedings of COLING.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Ackermann</surname>
            ,
            <given-names>C. F.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Beller</surname>
            ,
            <given-names>C. E.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Boxwell</surname>
            ,
            <given-names>S. A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Katz</surname>
            ,
            <given-names>E. G.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Summers</surname>
            ,
            <given-names>K. M.</given-names>
          </string-name>
          <year>2020</year>
          .
          <article-title>Resolution of acronyms in question answering systems</article-title>
          .
          <source>US Patent</source>
          <volume>10</volume>
          ,
          <issue>572</issue>
          ,
          <fpage>597</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Charbonnier</surname>
          </string-name>
          , J.; and
          <string-name>
            <surname>Wartena</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Using Word Embeddings for Unsupervised Acronym Disambiguation</article-title>
          .
          <source>In Proceedings of the 27th International Conference on Computational Linguistics. Santa Fe</source>
          , New Mexico, USA:
          <article-title>Association for Computational Linguistics</article-title>
          . URL https://www.aclweb.org/anthology/C18-1221.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Ciosici</surname>
            ,
            <given-names>M. R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sommer</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Assent</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>Unsupervised abbreviation disambiguation</article-title>
          . arXiv preprint arXiv:1904.00929.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Egan</surname>
          </string-name>
          , N.; and
          <string-name>
            <surname>Bohannon</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2020</year>
          .
          <article-title>Primer AI's Systems for Acronym Identification and Disambiguation</article-title>
          .
          <source>In SDU@AAAI-21.</source>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <surname>Espinosa-Anke</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Schockaert</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Syntactically Aware Neural Architectures for Definition Extraction</article-title>
          . In NAACL-HLT.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <surname>Harris</surname>
            ,
            <given-names>C. G.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Srinivasan</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>My Word! Machine versus Human Computation Methods for Identifying and Resolving Acronyms</article-title>
          .
          <source>Computación y Sistemas</source>
          <volume>23</volume>
          (
          <issue>3</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Jaber</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ; and Martinez,
          <string-name>
            <surname>P.</surname>
          </string-name>
          <year>2020</year>
          .
          <article-title>Participation of UC3M in SDU@AAAI-21: A Hybrid Approach to Disambiguate Scientific Acronyms</article-title>
          .
          <source>In SDU@AAAI-21.</source>
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <surname>Jin</surname>
            ,
            <given-names>Q.</given-names>
          </string-name>
          ; Liu, J.; and
          <string-name>
            <surname>Lu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>Deep Contextualized Biomedical Abbreviation Expansion</article-title>
          . arXiv preprint arXiv:1906.03360.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <surname>Jin</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Kan</surname>
          </string-name>
          , M.-Y.;
          <string-name>
            <surname>Ng</surname>
            ,
            <given-names>J.-P.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>He</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Mining Scientific Terms and their Definitions: A Study of the ACL Anthology</article-title>
          .
          <source>In EMNLP.</source>
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <surname>Kirchhoff</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Turner</surname>
            ,
            <given-names>A. M.</given-names>
          </string-name>
          <year>2016</year>
          .
          <article-title>Unsupervised resolution of acronyms and abbreviations in nursing notes using document-level context models</article-title>
          .
          <source>In Proceedings of the Seventh International Workshop on Health Text Mining and Information Analysis</source>
          ,
          <fpage>52</fpage>
          -
          <lpage>60</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <surname>Krippendorff</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Computing Krippendorff's alpha-reliability.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <surname>Kubal</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Nagvenkar</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <year>2020</year>
          .
          <article-title>Effective Ensembling of Transformer based Language Models for Acronyms Identification</article-title>
          .
          <source>In SDU@AAAI-21.</source>
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <surname>Kuo</surname>
            ,
            <given-names>C.-J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ling</surname>
            ,
            <given-names>M. H.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>K.-T.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Hsu</surname>
            ,
            <given-names>C.-N.</given-names>
          </string-name>
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <article-title>BIOADI: a machine learning approach to identifying abbreviations and definitions in biological literature</article-title>
          .
          <source>In BMC bioinformatics</source>
          , volume
          <volume>10</volume>
          , S7. Springer.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Mai</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zou</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ou</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Qin</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          <year>2020</year>
          .
          <article-title>Systems at SDU-2021 Task 1: Transformers for Sentence Level Sequence Label</article-title>
          .
          <source>In SDU@AAAI-21.</source>
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          <string-name>
            <surname>Li</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhao</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Fuxman</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          <year>2018</year>
          .
          <article-title>Guess Me if You Can: Acronym Disambiguation for Enterprises.</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          <source>In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)</source>
          ,
          <fpage>1308</fpage>
          -
          <lpage>1317</lpage>
          . Melbourne, Australia: Association for Computational Linguistics. doi:10.18653/v1/P18-1121. URL https://www.aclweb.org/anthology/P18-1121.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Huang</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <year>2011</year>
          .
          <article-title>Learning conditional random fields with latent sparse features for acronym expansion finding</article-title>
          .
          <source>In Proceedings of the 20th ACM international conference on Information and knowledge management</source>
          ,
          <fpage>867</fpage>
          -
          <lpage>872</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Meng</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Chen</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Zhou</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>GCDT: A global context enhanced deep transition architecture for sequence labeling</article-title>
          .
          <source>In ACL.</source>
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          <string-name>
            <surname>Nadeau</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Turney</surname>
            ,
            <given-names>P. D.</given-names>
          </string-name>
          <year>2005</year>
          .
          <article-title>A supervised learning approach to acronym identification</article-title>
          .
          <source>In Conference of the Canadian Society for Computational Studies of Intelligence</source>
          ,
          <fpage>319</fpage>
          -
          <lpage>329</lpage>
          . Springer.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          <string-name>
            <surname>Nautial</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Sristy</surname>
            ,
            <given-names>N. B.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Somayajulu</surname>
            ,
            <given-names>D. V.</given-names>
          </string-name>
          <year>2014</year>
          .
          <article-title>Finding acronym expansion using semi-Markov conditional random fields</article-title>
          .
          <source>In Proceedings of the 7th ACM India Computing Conference</source>
          ,
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          <string-name>
            <surname>Pan</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Song</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Luo</surname>
            ,
            <given-names>Z.</given-names>
          </string-name>
          <year>2020</year>
          .
          <article-title>BERT-based Acronym Disambiguation with Multiple Training Strategies</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          <source>In SDU@AAAI-21.</source>
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Byrd</surname>
            ,
            <given-names>R. J.</given-names>
          </string-name>
          <year>2001</year>
          .
          <article-title>Hybrid text mining for finding abbreviations and their definitions</article-title>
          .
          <source>In Proceedings of the 2001 conference on empirical methods in natural language processing.</source>
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          <string-name>
            <surname>Passonneau</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Measuring agreement on set-valued items (MASI) for semantic and pragmatic annotation</article-title>
          .
          <source>In LREC.</source>
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          <string-name>
            <surname>Pereira</surname>
            ,
            <given-names>J. L. M.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Galhardas</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Shasha</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          <article-title>Acronym Expander at SDU@AAAI-21: an Acronym Disambiguation Module</article-title>
          .
          <source>In SDU@AAAI-21.</source>
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          <string-name>
            <surname>Pouran Ben Veyseh</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Nguyen</surname>
            ,
            <given-names>T. H.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Dou</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          <string-name>
            <surname>Prokofyev</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Demartini</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Boyarsky</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ruchayskiy</surname>
            ,
            <given-names>O.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Cudré-Mauroux</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          <year>2013</year>
          .
          <article-title>Ontology-based word sense disambiguation for scientific literature</article-title>
          .
          <source>In European conference on information retrieval</source>
          ,
          <fpage>594</fpage>
          -
          <lpage>605</lpage>
          . Springer.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>