<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Helping term sense disambiguation with active learning</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pierre André Ménard</string-name>
          <email>pamenard@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Caroline Barrière</string-name>
          <email>caroline.barriere@crim.ca</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jean Quirion</string-name>
          <email>jquirion@uottawa.ca</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Centre de recherche informatique</institution>
          ,
          <addr-line>de Montréal</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Centre de recherche informatique</institution>
          ,
          <addr-line>de Montréal</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>École de traduction, Université d'Ottawa</institution>
          ,
          <country country="CA">Canada</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <fpage>89</fpage>
      <lpage>98</lpage>
      <abstract>
        <p>Our research highlights the problem of term polysemy within terminometrics studies. Terminometrics is the measure of term usage in specialized communication. Polysemy, especially within single-word terms as we will show, prevents using term corpus frequencies as appropriate statistics for terminometrics. Automatic term sense disambiguation, as a possible solution, requires human annotation to feed a supervised learning algorithm. Within our experiments, we show that although being polysemous, terms have a strong in-domain sense bias, making random sampling of annotation data less than optimal. We suggest the use of active learning and implement it within an annotation platform as a way of reducing annotation time.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>Results also show that terms, although
polysemous, have a very strong bias toward their
in-domain sense. In such a biased case, random
sampling of annotation data is far from optimal
and wastes much human effort. We therefore
introduce active learning (Section 5) and implement it
within an annotation platform (Section 6), to
obtain a sense-annotated dataset in less time.</p>
    </sec>
    <sec id="sec-2">
      <title>Terminometrics</title>
      <p>
        Terminometrics is the measure of term usage
in different types of communications
        <xref ref-type="bibr" rid="ref11">(Quirion,
2006)</xref>
        . Its purpose is to determine, for a
particular concept, the relative corpus frequencies of its
competing terms.
      </p>
      <p>The protocol of terminometrics, as defined in
Quirion (2003), consists of first deciding on a
domain of interest and selecting its set of concepts
(most often all) from a term bank. Then, for each
concept, the occurrences of each of its competing
terms are counted within different corpora from
the same domain, gathered by terminologists to
represent different communicative settings.
Acknowledging the possible polysemy of competing
terms, the protocol has a human expert disambiguate
a randomly selected subset of occurrences in order
to obtain better estimates of real frequencies.</p>
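      <p>The counting step of this protocol can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the invented counts, and the rule of scaling each raw count by the share of sampled occurrences judged in-sense are ours, and are not taken from Quirion (2003).</p>
      <preformat preformat-type="code">
```python
def relative_frequencies(counts):
    """Share of each competing term among all occurrences of one concept."""
    total = sum(counts.values())
    if total == 0:
        return {term: 0.0 for term in counts}
    return {term: n / total for term, n in counts.items()}


def adjust_counts(raw_counts, sample_judgments):
    """Scale each raw count by the share of sampled occurrences judged in-sense.

    sample_judgments maps a term to a list of booleans given by the expert.
    A term without a sample is left unadjusted (an assumption of this sketch).
    """
    adjusted = {}
    for term, n in raw_counts.items():
        judged = sample_judgments.get(term, [])
        share = sum(judged) / len(judged) if judged else 1.0
        adjusted[term] = n * share
    return adjusted


# Invented counts for two competing terms of one concept.
raw = {"atomic cluster": 40, "cluster": 200}
# Hypothetical expert judgments: 2 of 5 sampled "cluster" occurrences in-sense.
sample = {"cluster": [True, False, False, False, True]}

adjusted = adjust_counts(raw, sample)
shares = relative_frequencies(adjusted)
print(adjusted)
print(shares)
```
      </preformat>
      <p>With these invented numbers, the raw counts would suggest that cluster dominates the concept, whereas the adjusted counts give it a much smaller lead, which is the kind of distortion the disambiguation step is meant to remove.</p>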
      <p>A good example is the concept of an atomic
cluster within the nanotechnology domain. According
to the term bank used, this notion can be expressed
by the following six terms: atomic cluster, atom
cluster, atomic aggregate, atom aggregate, cluster
and aggregate. In terminometrics, comparative
studies of term use in specialized communications,
government literature, specialized media, and general
media are of interest, as they might reveal how some
terms are used by the general public, while others are
used in more official government documents.</p>
      <p>Studying the occurrence in text of different
synonyms of concepts would not be problematic if
each one was monosemous. But unfortunately,
that is not the case. For example, referring to
Table 1, the term cluster is a competing term for
multiple concepts, and simply counting its
occurrences in text, without disambiguation, would not
be indicative of its usage for any of them.</p>
      <p>
        Obviously, human annotation is costly, and the
possibility of performing automatic term sense
disambiguation is quite appealing. In
terminometrics, concepts are evaluated one at a time, reducing
the disambiguation task to a binary decision. The
annotation is not a selection among N senses, but
rather a yes/no decision on whether the current
instance represents the current concept or not.
Furthermore, term disambiguation within
terminometry cannot be dealt with similarly to more
typical word-sense disambiguation or even term-sense
disambiguation relying on knowledge contained
in an external resource
        <xref ref-type="bibr" rid="ref2">(Barrière, 2010)</xref>
        since the
annotator, or the algorithm, is likely to only have
access to the context of occurrences to perform
term disambiguation.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Polysemy of specialized terms</title>
      <p>Terms for the terminometrics studies are provided
by term banks. Such repositories of terms are
not often investigated for the study of polysemy.</p>
      <p>
        In Natural Language Processing, a typical task
of word sense disambiguation requires a
lexicographic resource, such as WordNet
        <xref ref-type="bibr" rid="ref7">(Miller, 1995)</xref>
        ,
to provide a repository of possible word senses in
order to disambiguate words in texts
        <xref ref-type="bibr" rid="ref9">(Pantel and
Lin, 2002)</xref>
        . No doubt that words are polysemous,
even in specific domains
        <xref ref-type="bibr" rid="ref14 ref5">(Chroma, 2011; Vogel,
2007)</xref>
        , but fewer studies show and discuss the
polysemy of terms.
      </p>
      <p>Terms are single-word or multi-word
expressions denoting particular concepts within
particular domains. A term bank is organized by
domains (e.g. biology, automotive, etc.) and contains
records corresponding to concepts. Each record
contains at least one term, and often competing
terms (synonyms) denoting that concept, possibly
in more than one language. Examples of records
for the term cluster, as found in the Grand
Dictionnaire Terminologique (GDT)<sup>1</sup>, are shown in
Table 1.</p>
      <p>There might be a misconception that
specialized language is less ambiguous, and would
therefore not provide a proper challenge for word-sense
disambiguation. A study by Barrière (2007) shows
the contrary: WordNet and Termium<sup>2</sup> (the actual
resource used in this experiment) were compared
along different criteria. One criterion of comparison
was coverage, and another, of more interest to this
research, was the degree of polysemy in
relation to word specificity. Word specificity was
approximated by “hit counts”, as found in a very
large corpus (the Waterloo Terabyte Corpus, used by
Terra and Clarke (2003)), with words occurring
from 1 to millions of times. Figure 1 shows their
results. We see how for common words (hit counts
with log10(freq) &gt; 3), the degree of polysemy
in the term bank is even larger than in WordNet.</p>
      <p>In our study, we wished to further
characterize this degree of polysemy in terminological
resources. We used a small set of 164 terms from the
current experiment (presented in Section 4.1), and
looked at the number of senses in two term banks:
Termium and GDT. Figure 2 shows that
specialized terms, especially short ones (1 to 3 words)
can have many senses (records) and span many
domains. This trend generally diminishes as the
term length increases.</p>
      <p><sup>1</sup> The GDT can only be accessed via a web interface at
http://www.granddictionnaire.com.</p>
      <p><sup>2</sup> The Termium term bank can be accessed online at
http://www.btb.termiumplus.gc.ca or downloaded at
http://open.canada.ca/data/en/dataset/94fc74d6-9b9a4c2e-9c6c-45a5092453aa</p>
      <table-wrap id="tab-1">
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Domain</th><th>Terms</th></tr>
          </thead>
          <tbody>
            <tr><td rowspan="3">nanotechnology</td><td>atomic aggregate, cluster, aggregate, atom aggregate, atom cluster, atomic cluster</td></tr>
            <tr><td>molecular aggregate, cluster, aggregate, molecule aggregate, molecule cluster</td></tr>
            <tr><td>nanoaggregate, cluster, aggregate, nanocluster, nanometer-size cluster, nanoscale aggregate, nanoscale cluster</td></tr>
            <tr><td>seafood</td><td>crab section, section, crab cluster, cluster</td></tr>
            <tr><td>software</td><td>cluster, document cluster</td></tr>
            <tr><td>mining</td><td>vein system, vein set, cluster of veins, mining cluster, cluster</td></tr>
            <tr><td>internet</td><td>service cluster, cluster of service, cluster</td></tr>
            <tr><td rowspan="4">nanotechnology</td><td>scanning tunneling electron microscope, microscope, scanning tunneling microscope, STM</td></tr>
            <tr><td>atomic force microscope, microscope, AFM, SFM, scanning force microscope</td></tr>
            <tr><td>magnetic force microscope, microscope, MFM, SMM, scanning magnetic microscope</td></tr>
            <tr><td>scanning probe microscope, microscope, SPM, scanned-probe microscope</td></tr>
          </tbody>
        </table>
      </table-wrap>
    </sec>
    <sec id="sec-4">
      <title>Experiment - Terminometrics in nanotechnology domain</title>
      <p>Our current terminometrics study focuses on term
usage in the nanotechnology domain within
Canadian French. This domain, within the GDT term
bank, contains 1,035 records (concepts)<sup>3</sup>, each
with its competing terms. This set of terms is
what we call our nanotechnology term base,
covering “the science of working with atoms and
molecules to build devices that are extremely
small” (Merriam-Webster dictionary).</p>
      <p>To study the competing terms for the
nanotechnology concepts, a corpus was built using
documents from corporate, educational, news
media and government websites. These documents
were retrieved by first selecting most of the
organizations that originate from the province of Québec,
Canada, and whose core activities deal with
nanotechnology. This list was then vetted by an
expert. Next, the websites of these organizations
were downloaded. After this process, the corpus
might still be noisy, but it does contain a majority
of nanotechnology-related documents.</p>
      <p><sup>3</sup> As the GDT expands every day, this number might not
represent its current status.</p>
      <p>All terms in the nanotechnology term base are
searched for in the corpus. For each of their
occurrences, a window spanning 90 characters on each
side of the term is extracted. This text span
becomes a contextualized instance to be annotated.
Table 2 shows examples of these instances.</p>
      <sec id="sec-4-1">
        <title>Human annotation process</title>
        <p>For our current annotation experiment, a total of
164 terms taken from 29 records (among the 1,035
mentioned earlier) were selected along with the
complete set of instances found in the
nanotechnology corpus. Each term occurred between 75 and
2,100 times in the corpus, for a total of 17,227
instances for the whole term sample. This dataset
was divided into two parts distributed between two
PhD students in terminology. As shown in
Table 2, annotators were presented with text samples
containing a targeted term and were asked to indicate
“yes” if the term was used in the correct
nanotechnology sense and “no” otherwise. Prior to the
annotation effort, the dataset was sorted by term, as
this was considered easier to annotate than document
order, which would require the annotator to constantly
switch between term definitions. The annotators took
a total of 82 hours (41 hours each) to annotate all the
instances of the selected dataset. Each text sample
was composed of the 90 characters preceding a term
occurrence, the term occurrence as is, and the 90
characters following it. The 90-character window
was adjusted to avoid word truncation.</p>
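        <p>The extraction of such text samples can be sketched as follows. This is a minimal illustration, not the code used in the experiment: the function names are ours, and shrinking each boundary inward to the nearest whitespace is only one possible reading of how the window was adjusted to avoid word truncation.</p>
        <preformat preformat-type="code">
```python
import re


def context_window(text, start, end, width=90):
    """Return the text around text[start:end], about `width` characters per side.

    Each boundary is moved inward to the nearest whitespace so that no word
    is cut in half (an assumption about how truncation is avoided).
    """
    left = max(0, start - width)
    right = min(len(text), end + width)
    # Move the left edge forward past a partially included word.
    if left > 0 and not text[left - 1].isspace():
        while start > left and not text[left].isspace():
            left += 1
    # Move the right edge backward past a partially included word.
    if len(text) > right and not text[right].isspace():
        while right > end and not text[right - 1].isspace():
            right -= 1
    return text[left:right].strip()


def instances(text, term, width=90):
    """Yield one contextualized instance per occurrence of `term`."""
    for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        yield context_window(text, match.start(), match.end(), width)


sample_text = "alpha beta gamma substrat delta epsilon zeta"
print(list(instances(sample_text, "substrat", width=8)))
```
        </preformat>
        <p>With a width of 8 characters, the raw window would start inside beta and end inside epsilon; the adjustment drops both partial words and keeps gamma substrat delta.</p>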
        <p>The annotators were also asked to indicate the
difficulty level of the provided answer: standard,
hard, hardest. Results showed that 626 instances
(3.6%) needed a little more analysis while 222
instances (1.3%) were much harder to annotate with
only the presented context. All the other instances
were judged of standard difficulty, meaning that
the textual contexts of the term occurrences were
sufficient for the disambiguation task. In
anticipation of an automatic disambiguation algorithm
which would only have access to the immediate
context of the term, this confirmed that for most
cases, it should be possible to disambiguate with a
±90-character window<sup>4</sup>.</p>
        <table-wrap id="tab-2">
          <label>Table 2</label>
          <table>
            <thead>
              <tr><th>Annotation</th><th>Instance</th></tr>
            </thead>
            <tbody>
              <tr><td>Yes</td><td>... une technologie d’intégration par laquelle plusieurs nanostructures sont intégrées sur un même substrat. L’interface entre les dispositifs et d’autres systèmes (oxyde, verre) sera aussi étudiée. (... an integration technology for which many nanostructures are integrated on a substrate. The interface between the components and other systems (oxide, glass) will also be studied.)</td></tr>
              <tr><td>Yes</td><td>... dollars à Bromont dans une petite usine qui allait employer 200 personnes pour la production de substrats, que le dictionnaire définit comme un matériau sur lequel sont réalisés les éléments d’un ... (... dollars at Bromont in a small factory which was going to employ 200 people for the production of substrates, which the dictionary defines as a material on which are realized elements of ...)</td></tr>
              <tr><td>No</td><td>... et valoriser les boues de station d’épuration. L’investigation des possibilités d’acquérir ces substrats requiert l’inventaire des industries de la région, les quantités et les caractéristiques des ... (... and valorize the epuration station’s muds. Investigating the possibility of acquiring these substrates requires to inventoriate the region’s industries, the quantity and features of ...)</td></tr>
              <tr><td>Yes</td><td>... MNT Définition : Fabrication mécanique et contrôlée de structures moléculaires, par une approche ascendante qui consiste à les assembler, étape par étape, molécule par molécule, en se servant d’appareil ... (... MNT Definition: Mechanical and controlled fabrication of molecular structures by a bottom-up approach which consists of assembling, step by step, molecule by molecule, by using tool ...)</td></tr>
              <tr><td>No</td><td>... Quand il est possible de le faire, l’analyse de la demande d’énergie est fondée sur une approche ascendante agrégeant les demandes par usage, par secteur d’activités économiques, par région et par ... (... When it is possible to do it, the energy request analysis is founded on a bottom-up approach aggregating the requests by use, by economic activity sector, by regions and by ...)</td></tr>
              <tr><td>No</td><td>... que beaucoup de problèmes rencontrés en pratique ne sont pas adressés par ces processus. L’approche ascendante de l’amélioration du processus consiste donc, selon ces mêmes auteurs, à implanter une équipe ... (... that many issues encountered in practice are not addressed by these processes. The bottom-up approach of process improvement consists of, for these same authors, implanting a team ...)</td></tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
      <sec id="sec-4-2">
        <title>Observations and results on polysemy</title>
        <p>Analysis of the annotated instances reveals that
84.31% (14,524) of them occur in the correct
nanotechnology sense of the term, and the
remaining 15.69% (2,703 instances) are used with other
meanings. To measure the overall polysemy in our
dataset, we use the notion of entropy. Entropy is
defined as the negative sum, over all possible events,
of each event’s probability multiplied by the log of
that probability. In our current experiment, there are
only two possible events: first, the occurrence of a
term in a correct sense, let us call that x, and second,
the occurrence of a term in a different sense. If P(x),
written P<sub>x</sub> in Equation 1, is the probability of the
correct sense, then 1 − P(x) is the probability of
another sense. Then, we have the entropy, shown in
Equation 1, as a sum over two possible events.</p>
        <p><sup>4</sup> This claim disregards the fact that humans certainly have
much a priori knowledge which they use during the
disambiguation task. Nevertheless, the trigger for this a priori
knowledge would still come from the limited context window.</p>
        <disp-formula id="eq-1">
          <label>(1)</label>
          <tex-math>E(x) = -\left( P_x \log_2 P_x + (1 - P_x) \log_2 (1 - P_x) \right)</tex-math>
        </disp-formula>
        <p>The resulting function is at its maximum, a
value of 1, with a probability of 50% and is equal
to 0 with probabilities of either 0% or 100%. In
our case, P(x) is the rate of occurrence of an
anticipated term sense in a corpus. A term with an
entropy of 0 is not ambiguous: either all or none of
the term’s instances use the correct sense. A term
with an entropy of 1 has 50% of its instances used in
the correct sense, the remaining 50% of the instances
using other meanings.</p>
        <p>For example, the term STM (acronym of
scanning tunnelling microscope) counts as a
single-word term occurring a total of 341 times. Among
those, 104 instances (104/341=0.30499) have the
nanotechnology sense, which gives an entropy of
0.8873 as shown in Equation 2. This is a relatively
high entropy level, as it nears the maximum of 1
reached at a probability of 50%. Had the case been
less ambiguous, for example 5 out of 341 instances,
the entropy would have been 0.1103.</p>
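        <p>The two entropy values quoted for STM can be checked with a short Python sketch of the two-event entropy. The function name binary_entropy is ours, and the values at probabilities 0 and 1 are set to 0 by the usual convention that p log p tends to 0; this is an illustration, not the code used in the study.</p>
        <preformat preformat-type="code">
```python
import math


def binary_entropy(p):
    """Entropy, in bits, of a yes/no event whose 'yes' probability is p."""
    if p == 0.0 or p == 1.0:
        return 0.0  # convention: p * log2(p) tends to 0 as p tends to 0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


stm = binary_entropy(104 / 341)   # STM: 104 in-domain instances out of 341
rare = binary_entropy(5 / 341)    # the less ambiguous case from the text
print(round(stm, 4), round(rare, 4))  # prints 0.8873 0.1103
```
        </preformat>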
        <p>The bottom dashed line (Figure 3) shows the
average entropy over all terms having a particular
word count. The top solid line shows the average
entropy for the 5 terms with the highest entropy
(and thus the highest degree of ambiguity) of each
length, emphasizing how a few terms account for
many of the polysemous instances in the corpus.
Examples of these very polysemous terms are
tunnelling, substrat, or top-down.</p>
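        <p>The two curves described above can be computed from per-term entropies as sketched below. The entropy values in the example are invented for illustration, and counting words by splitting on whitespace (so that a hyphenated term such as top-down counts as one word) is our assumption.</p>
        <preformat preformat-type="code">
```python
from collections import defaultdict


def entropy_by_length(term_entropy, top_n=5):
    """Map word count -> (mean entropy of all terms, mean of the top_n terms)."""
    groups = defaultdict(list)
    for term, value in term_entropy.items():
        groups[len(term.split())].append(value)
    summary = {}
    for length, values in sorted(groups.items()):
        top = sorted(values, reverse=True)[:top_n]
        summary[length] = (sum(values) / len(values), sum(top) / len(top))
    return summary


# Invented per-term entropies, for illustration only.
example = {
    "substrat": 0.9,
    "cluster": 0.5,
    "top-down": 0.1,
    "approche ascendante": 0.6,
    "atomic cluster": 0.0,
}
summary = entropy_by_length(example, top_n=2)
print(summary)
```
        </preformat>
        <p>When a length group holds fewer than top_n terms, the second value is simply the mean of the whole group, so the two curves coincide there.</p>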
        <p>These corpus results, showing an overall
tendency for entropy to decrease with term length, are
in line with our previous results presented in
Figure 2 relating term length to the polysemy level
within term banks. Nevertheless, these corpus
results also show that the in-domain sense is much
more likely than all other senses. This leads us
to think that we should take advantage of the
particularity of our task in selecting the annotation
dataset, as we further describe in the next section.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Active learning for term sense annotation</title>
      <p>The strong in-domain sense bias shown in
the previous section indicates that random
sampling, as suggested by the terminometrics
methodology, could lead to collecting a biased sample
and provide a distorted analysis. Traditional machine
learning algorithms trained on these unbalanced
samples would suffer from the same bias, as less
information would be available to classify the minority
class. This type of algorithm would likely produce
a prediction model which would only target the
majority class, overlooking instances potentially
useful for terminometrics experts.</p>
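      <p>To make this risk concrete, the sketch below scores a degenerate model that always answers with the majority class, using the label distribution reported in Section 4.2. The degenerate model is a deliberately extreme illustration of ours, not a classifier that was trained in this study.</p>
      <preformat preformat-type="code">
```python
# Label distribution reported in Section 4.2.
labels = ["yes"] * 14524 + ["no"] * 2703

# A degenerate model that always answers with the majority class.
predictions = ["yes"] * len(labels)

pairs = list(zip(predictions, labels))
accuracy = sum(p == t for p, t in pairs) / len(labels)
found_no = sum(p == "no" and t == "no" for p, t in pairs)
recall_no = found_no / labels.count("no")

print(round(100 * accuracy, 2), recall_no)  # prints 84.31 0.0
```
      </preformat>
      <p>Such a model looks accurate overall, yet it never finds a single out-of-domain instance, which are precisely the occurrences that distort frequency counts.</p>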
      <p>To sidestep this risk, we lean toward a learning
approach called active learning, which defines an
iterative annotation process in order to reduce the
risk of producing a biased prediction model. As
shown in Figure 4, this four-step process involves
interaction with an oracle, typically a human
annotator who needs to be familiar with the
domain’s terminology and the concepts being studied.</p>
      <p>The active learning process starts with a set of
unlabeled data (UD) containing, in the current
context, individual occurrences of a term in a
corpus, each described by a group of features (e.g. a
bag-of-words made of its co-occurring words in
context). At this point, the labeled dataset (LD) is
empty and there is no prediction model available.
The active learning algorithm starts by selecting a
group of instances, called the seed S, from UD.
For each instance of S, the oracle is queried to
specify a label, and the labeled example is then
stored in LD. The oracle annotates the instance
using one value of a predefined class label set,
in this case {yes, no}, yes meaning the instance
is used in the targeted sense, no meaning another
sense is used. When all instances in S are labeled,
the active learning algorithm uses them to create a
prediction model. It is important to note that there
is no ideal size for the seed, but it should be
sufficient to enable the algorithm to train a relevant
prediction model.</p>
      <p>Once a prediction model is available, the
process follows the same order, with one variation.
Instead of selecting a seed, the algorithm provisionally
applies the prediction model to the instances in UD
(without labeling them or moving them to the
labeled set) and picks an instance for which the
model does not provide a sufficient level of
confidence in its classification. It then submits this
instance to the oracle, who applies a label. The
newly labeled example is added to LD, the
prediction model is retrained, and the process
continues until the algorithm reaches an overall
level of confidence for all instances in UD.</p>
      <p>When this stopping criterion is reached, the
active learning process is complete and the
prediction model can be used to annotate the
remaining instances in UD, if needed, or another
similar dataset. Again, the level of confidence used as
the stopping criterion must be empirically defined,
as there is no ideal value. Of course, a higher
confidence level might increase the annotation
effort needed to produce the final prediction model,
while a lower value might produce a less
effective prediction model using fewer instances.
Fine-tuning the confidence level helps to reduce the risk
of training a biased prediction model on a
predominant class in a dataset.</p>
      <p>
        In our current implementation of active
learning, we select a seed of 20 instances by
random sampling, which is then processed with
RandomForest
        <xref ref-type="bibr" rid="ref13">(Tin Kam Ho, 1995)</xref>
        as the prediction
model. The oracle is then asked to annotate further
blocks of 20 instances until the algorithm reaches
its configured confidence level. If this level is not
reached after a total of 200 instances (including
the seed), a final prediction model is trained and
applied to UD in order to limit the effort needed to
annotate each expression. The features for the
classification process are extracted from the 90-character
window, which was judged sufficient during the
experiments (Section 4.1).
      </p>
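      <p>The loop described in this section can be sketched in plain Python as follows. Several choices here are assumptions made only to keep the example self-contained: a small naive Bayes classifier stands in for RandomForest, the confidence threshold of 0.8 is a hypothetical value, the stopping rule is read as "no instance left in UD below the threshold", and the least confident instances are queried first. The seed size, block size and cap of 200 follow the text.</p>
      <preformat preformat-type="code">
```python
import math
import random
from collections import Counter


def bag_of_words(context):
    """Lowercased tokens of a context window, as a multiset."""
    return Counter(context.lower().split())


class NaiveBayes:
    """Tiny stand-in classifier (NOT the RandomForest used in the paper)."""

    def fit(self, bags, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.words = {c: Counter() for c in self.classes}
        for bag, label in zip(bags, labels):
            self.words[label].update(bag)
        self.vocab = set().union(*bags)
        return self

    def confidence(self, bag):
        """Return (best label, its posterior probability)."""
        scores = {}
        for c in self.classes:
            total = sum(self.words[c].values()) + len(self.vocab) + 1
            score = math.log(self.prior[c])
            for word, n in bag.items():
                score += n * math.log((self.words[c][word] + 1) / total)
            scores[c] = score
        top = max(scores.values())
        norm = sum(math.exp(s - top) for s in scores.values())
        return max(scores, key=scores.get), 1.0 / norm


def active_learning(unlabeled, oracle, seed_size=20, batch_size=20,
                    max_labeled=200, threshold=0.8, rng=None):
    """Label blocks of instances until the model is confident or the cap is hit."""
    rng = rng or random.Random(0)
    pool = list(unlabeled)
    rng.shuffle(pool)
    labeled = [(x, oracle(x)) for x in pool[:seed_size]]  # random seed S
    pool = pool[seed_size:]
    while True:
        bags = [bag_of_words(x) for x, _ in labeled]
        model = NaiveBayes().fit(bags, [y for _, y in labeled])
        scored = sorted((model.confidence(bag_of_words(x))[1], x) for x in pool)
        # With a single class seen so far, every instance counts as uncertain.
        one_class = len(model.classes) == 1
        uncertain = [x for conf, x in scored if one_class or threshold > conf]
        if not uncertain or len(labeled) >= max_labeled:
            return model, labeled, pool
        batch = uncertain[:batch_size][:max_labeled - len(labeled)]
        labeled += [(x, oracle(x)) for x in batch]
        chosen = set(batch)
        pool = [x for x in pool if x not in chosen]
```
      </preformat>
      <p>Here oracle is any function returning yes or no for a context; in the platform this role is played by the human annotator. The returned pool holds the instances of UD that were never shown to the oracle and that the final model would label.</p>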
      <p>
        At this stage in our research, the current
implementation provides a baseline on which we
can later improve using different alternative
models presented in the literature. Certainly, other
research in word sense disambiguation has
explored the empirical behaviour of active learning
(e.g.
        <xref ref-type="bibr" rid="ref4">(Chen et al., 2006)</xref>
        ). Specific issues
associated with active learning range from feature
selection for particular disambiguation tasks
        <xref ref-type="bibr" rid="ref8">(Palmer
and Chen, 2005)</xref>
        , model adaptation when
changing domain between the training and application
of the model
        <xref ref-type="bibr" rid="ref14 ref15 ref3">(Chan and Ng, 2007)</xref>
        , class
imbalance problem
        <xref ref-type="bibr" rid="ref14 ref15 ref3">(Zhu and Hovy, 2007)</xref>
        or deciding
when the prediction algorithm stops asking for
additional annotation
        <xref ref-type="bibr" rid="ref16">(Zhu et al., 2008)</xref>
        .
      </p>
    </sec>
    <sec id="sec-6">
      <title>Terminometrics active-learning platform</title>
      <p>We developed an annotation platform, shown in
Figure 5, to facilitate terminometrics studies with
an active learning component for term
disambiguation. The platform implements the
interactive active learning process described above to
control and optimize the active learning between
the prediction module and the human annotator.
The platform will also enable future experiments
within the field of terminometrics in which both
the active learning algorithms and the human
interaction can be further explored.</p>
      <p>The user of this platform (typically the oracle in
the active learning process) can create a corpus of
documents, use this corpus to create an annotation
project by defining a set of concepts, related terms
and variations (plural, gender) and participate in
the active learning process. At the end of the
active learning process, the platform annotates the
remaining instances in UD (see Figure 4) in
order to estimate the distribution of occurrences of
competing terms of a concept. This is used for the
terminometrics analysis.</p>
      <p>Aside from the {yes, no} classification, the
interface offers two other choices: undecided and
reject. The first choice allows the user to skip an
instance and go to the next, while being able to later
return to provide an answer. This could happen
when the user wishes to see a larger context to
perform the disambiguation. In fact, to help this
process, the platform also provides an option to view
an instance within its original document. The
second choice, reject, removes the instance entirely
from the unlabeled and labeled datasets. This is
used typically when the user considers that the
instance should not be used for the terminometrics
final analysis.</p>
      <p>In order to further reduce the annotation effort
needed to perform a terminometrics study, other
features, unrelated to active learning, were added
to the platform. The first is a language-based
document filter which can be applied during the
corpus creation to try to remove documents which are
not suited for the targeted analysis. Each
document is analysed with a language detection
algorithm to extract a confidence level associated with
its deduced language. The user can then
keep only the documents which are above a
specific threshold and exclude the rest from the
corpus to be annotated. Of course, documents
with no text, such as files containing only images,
are also removed.</p>
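      <p>The filtering step can be sketched as follows. The platform's language detection algorithm is not specified in this article, so the detector is passed in as a function; toy_detect, its small word list and the threshold value are hypothetical stand-ins used only to make the example runnable.</p>
      <preformat preformat-type="code">
```python
def filter_documents(documents, detect, language="fr", threshold=0.9):
    """Keep non-empty documents detected as `language` with enough confidence."""
    kept = []
    for doc in documents:
        if not doc.strip():
            continue  # documents with no text are dropped
        lang, confidence = detect(doc)
        if lang == language and confidence >= threshold:
            kept.append(doc)
    return kept


# Hypothetical stand-in for a real language detector.
FRENCH_HINTS = {"le", "la", "les", "des", "et", "est", "une", "un", "dans"}


def toy_detect(text):
    """Return ("fr", share of words that are common French function words)."""
    words = text.lower().split()
    share = sum(w in FRENCH_HINTS for w in words) / len(words)
    return "fr", share


docs = ["la structure est dans le substrat",
        "the cluster is in the sample",
        "   "]
kept = filter_documents(docs, toy_detect, threshold=0.3)
print(kept)
```
      </preformat>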
      <p>Another effort reduction feature is duplicate
context detection, which takes place at the creation
of an annotation project. The underlying issue is that
a sentence or a whole paragraph (or sometimes
a complete document) can be found in several
locations within a corpus created from web sites.
While each occurrence of a term (or its variations)
is stored and kept for an accurate assessment of its
rate of occurrence in a corpus, only unique
contexts (the term occurrence and a ±90-character
window) are used for the active learning process.
For example, if the first context of “substrat”
shown in Table 2 was found with the same preceding
and following context in five documents in a corpus, the
oracle would be asked at most once to annotate
this instance (if it is selected for annotation by the
algorithm), but it would count as five occurrences
in the terminometrics analysis.</p>
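      <p>This bookkeeping can be sketched with a counter over contexts. The function names are ours and the sketch assumes that two contexts are duplicates only when their text is strictly identical; the platform may apply a different comparison.</p>
      <preformat preformat-type="code">
```python
from collections import Counter


def deduplicate_contexts(contexts):
    """Return unique contexts (first-seen order) and how often each occurred."""
    occurrences = Counter(contexts)
    unique = list(dict.fromkeys(contexts))
    return unique, occurrences


def propagate(labels, occurrences):
    """Weight each labeled unique context by its number of occurrences."""
    totals = Counter()
    for context, label in labels.items():
        totals[label] += occurrences[context]
    return totals


# One context found five times, another twice, a third once.
contexts = ["ctx A"] * 5 + ["ctx B", "ctx C", "ctx B"]
unique, occurrences = deduplicate_contexts(contexts)

# The oracle labels each unique context at most once.
labels = {"ctx A": "yes", "ctx B": "no", "ctx C": "yes"}
totals = propagate(labels, occurrences)
print(unique, dict(totals))
```
      </preformat>
      <p>Three annotations are requested from the oracle, yet the analysis counts six in-sense and two out-of-sense occurrences.</p>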
      <p>The platform also facilitates the management
of terminometrics studies by providing many
features: integrated storage and search
capabilities for domain-specific corpora, a user interface
specifically designed to facilitate annotation by
displaying each term to validate in context,
access to a term list with the possibility of
adding and removing terms, and so on. This is an
improvement over the manual handling
of documents and term lists and of instance generation
and annotation, traditionally done with folders and
spreadsheets. While the upper limits of the
platform have not been tested explicitly, the current
experiment was done with a term list of 1,036
entries on a corpus of over 220,000 documents. As
far as the sizes of the corpora and vocabulary are
concerned, the platform is mainly limited by the
speed and capacity of the computer that runs it.</p>
    </sec>
    <sec id="sec-7">
      <title>Conclusion and future work</title>
      <p>In this article, we introduced term sense
disambiguation, a close cousin to word sense
disambiguation, but much less studied within the NLP
community. We showed how terms, especially
single-word terms, are polysemous, both in term
banks and in specialized corpora.</p>
      <p>We presented the idea of using active learning
within our terminometrics application, in which
the in-domain sense bias is quite strong. So far,
we have implemented a simple active learning
algorithm, and will move toward more complex
ones in the near future. The annotation platform,
ready for experimentation, will allow
terminologists to further complete, in less time, the
annotation process of the nanotechnology domain and
other domains. This will provide test data, on
which we can measure the different gains in terms
of time and accuracy of our current and future
active learning approaches.</p>
      <p>
        Furthermore, we plan to extend our
exploration of term disambiguation. Although
lexicographic and terminological resources are
organized differently, the distinction between terms
and words is not always clear-cut. Many
single-word terms also exist as common words.
Some specialized terms also migrate from specific
domains to the general language
        <xref ref-type="bibr" rid="ref6">(Meyer, 2000)</xref>
        when a specialized domain becomes more a part of
the day-to-day life of people (e.g. the computer
domain). We believe there is much room to further
study term polysemy in term banks, in specialized
corpora and also in more general corpora where both
specialized and common senses might be present.
      </p>
      <p>One of the envisioned experiments is to
annotate semi-automatically a whole corpus to be able
to compare the current approach to a supervised
learning method. This will enable us to evaluate
the contribution of active learning on the raw
performance of disambiguation and time reduction of
the annotation task. A new dataset related to a
domain different than nanotechnology will also be
defined for this experiment to avoid evaluating the
approach on the dataset used for development.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>We thank the annotators Julián Zapata and Barış
Bilgen.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          Caroline Barrière.
          <year>2007</year>
          .
          <article-title>La désambiguïsation du sens en traitement automatique des langues (TAL): l'apport de resources terminologiques et lexicographiques</article-title>
          . In
          <string-name>
            <surname>Marie-Claude L'Homme</surname>
          </string-name>
          and Sylvie Vandaele, editors, Lexicographie et Terminologie:
          <article-title>compatibilité des modèles et méthodes</article-title>
          , pages
          <fpage>113</fpage>
          -
          <lpage>140</lpage>
          . Presses de l'Université d'Ottawa.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          Caroline Barrière.
          <year>2010</year>
          .
          <article-title>Recherche contextuelle d'équivalents en banque de terminologie</article-title>
          .
          <source>In Traitement Automatique des Langues Naturelles</source>
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Yee Sang</given-names>
            <surname>Chan</surname>
          </string-name>
          and
          <string-name>
            <given-names>Ht</given-names>
            <surname>Ng</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Domain adaptation with active learning for word sense disambiguation</article-title>
          .
          <source>ACL.</source>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Jinying</given-names>
            <surname>Chen</surname>
          </string-name>
          , Andrew Schein, Lyle Ungar, and
          <string-name>
            <given-names>Martha</given-names>
            <surname>Palmer</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>An empirical study of the behavior of active learning for word sense disambiguation</article-title>
          .
          <source>Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics</source>
          , pages
          <fpage>120</fpage>
          -
          <lpage>127</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Marta</given-names>
            <surname>Chroma</surname>
          </string-name>
          .
          <year>2011</year>
          .
          <article-title>Synonymy and Polysemy in Legal Terminology and Their Applications to Bilingual and Bijural Translation</article-title>
          . Research in Language,
          <volume>9</volume>
          :
          <fpage>31</fpage>
          -
          <lpage>50</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Ingrid</given-names>
            <surname>Meyer</surname>
          </string-name>
          .
          <year>2000</year>
          . Computer Words in Our Everyday Lives:
          <article-title>How are they interesting for terminography and lexicography</article-title>
          ? In Euralex'2000, International Congress on Lexicography, pages
          <fpage>39</fpage>
          -
          <lpage>58</lpage>
          , Stuttgart, Germany.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>George A.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>WordNet: a lexical database for English</article-title>
          .
          <source>Communications of the ACM</source>
          ,
          <volume>38</volume>
          (
          <issue>11</issue>
          ):
          <fpage>39</fpage>
          -
          <lpage>41</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <given-names>M</given-names>
            <surname>Palmer</surname>
          </string-name>
          and
          <string-name>
            <given-names>Jy</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Towards robust high performance word sense disambiguation of English verbs using rich linguistic features</article-title>
          .
          <source>Natural Language Processing - IJCNLP</source>
          <year>2005</year>
          , Proceedings,
          <volume>3651</volume>
          :
          <fpage>933</fpage>
          -
          <lpage>944</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Patrick</given-names>
            <surname>Pantel</surname>
          </string-name>
          and
          <string-name>
            <given-names>Dekang</given-names>
            <surname>Lin</surname>
          </string-name>
          .
          <year>2002</year>
          .
          <article-title>Discovering word senses from text</article-title>
          .
          <source>In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '02</source>
          , pages
          <fpage>613</fpage>
          -
          <lpage>619</lpage>
          , New York, NY, USA. ACM.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Jean</given-names>
            <surname>Quirion</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Methodology for the design of a standard research protocol for measuring terminology usage</article-title>
          .
          <source>Terminology</source>
          ,
          <volume>9</volume>
          (c):
          <fpage>29</fpage>
          -
          <lpage>49</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          <string-name>
            <given-names>Jean</given-names>
            <surname>Quirion</surname>
          </string-name>
          .
          <year>2006</year>
          .
          <article-title>Terminometrics - an Evaluation Tool of/for Term Standardization</article-title>
          . In TSTT'2006 - International Conference on Terminology,
          <source>Standardization and Technology Transfer</source>
          , pages
          <fpage>19</fpage>
          -
          <lpage>24</lpage>
          , Beijing, China.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          <string-name>
            <given-names>Egidio</given-names>
            <surname>Terra</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.L.A.</given-names>
            <surname>Clarke</surname>
          </string-name>
          .
          <year>2003</year>
          .
          <article-title>Frequency Estimates for Statistical Word Similarity Measures</article-title>
          .
          <source>In Proceedings of the NAACL</source>
          <year>2003</year>
          , page 165.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          <string-name>
            <given-names>Tin Kam</given-names>
            <surname>Ho</surname>
          </string-name>
          .
          <year>1995</year>
          .
          <article-title>Random decision forests</article-title>
          .
          <source>Proceedings of 3rd International Conference on Document Analysis and Recognition</source>
          ,
          <volume>1</volume>
          :
          <fpage>278</fpage>
          -
          <lpage>282</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          <string-name>
            <given-names>Radek</given-names>
            <surname>Vogel</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Synonymy and polysemy in accounting terminology: fighting to avoid inaccuracy</article-title>
          .
          <source>In Proceedings of the English for Specific Purposes Terminology and Translation Workshop</source>
          , Košice 13-
          <issue>14</issue>
          <year>September 2007</year>
          . Univerzita P.J. Šafárika.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          <string-name>
            <given-names>Jingbo</given-names>
            <surname>Zhu</surname>
          </string-name>
          and
          <string-name>
            <given-names>EH</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>Active Learning for Word Sense Disambiguation with Methods for Addressing the Class Imbalance Problem</article-title>
          .
          <source>EMNLP-CoNLL.</source>
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          <string-name>
            <given-names>Jingbo</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Huizhen</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Eduard</given-names>
            <surname>Hovy</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Learning a Stopping Criterion for Active Learning for Word Sense Disambiguation and Text Classification</article-title>
          .
          <source>International Joint Conference on Natural Language Processing</source>
          , pages
          <fpage>366</fpage>
          -
          <lpage>372</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>