=Paper= {{Paper |id=Vol-1495/paper_2 |storemode=property |title=Helping Term Sense Disambiguation with Active Learning |pdfUrl=https://ceur-ws.org/Vol-1495/paper_2.pdf |volume=Vol-1495 |dblpUrl=https://dblp.org/rec/conf/tia/MenardBQ15 }} ==Helping Term Sense Disambiguation with Active Learning== https://ceur-ws.org/Vol-1495/paper_2.pdf
                  Proceedings of the conference Terminology and Artificial Intelligence 2015 (Granada, Spain)





              Helping term sense disambiguation with active learning


      Pierre André Ménard, Centre de recherche informatique de Montréal, Canada, pamenard@gmail.com
      Caroline Barrière, Centre de recherche informatique de Montréal, Canada, caroline.barriere@crim.ca
      Jean Quirion, École de traduction, Université d’Ottawa, Canada, jquirion@uottawa.ca




Abstract

Our research highlights the problem of term polysemy within terminometrics studies. Terminometrics is the measure of term usage in specialized communication. Polysemy, especially within single-word terms as we will show, prevents using term corpus frequencies as appropriate statistics for terminometrics. Automatic term sense disambiguation, as a possible solution, requires human annotation to feed a supervised learning algorithm. Within our experiments, we show that although being polysemous, terms have a strong in-domain sense bias, making random sampling of annotation data less than optimal. We suggest the use of active learning and implement it within an annotation platform as a way of reducing annotation time.

1   Introduction

In our research, we investigate the measure of term usage in specialized communication, called Terminometrics (Section 2). The studied terms are provided by a term bank, and we are interested in the fact that many of these terms are polysemous, creating difficulties for our terminometrics study. We therefore investigate the presence of polysemy in term banks (Section 3). Polysemy found in term banks is problematic since it leads to term occurrences in corpora being possibly polysemous, preventing simple corpus frequencies from providing proper statistics. We confirm such polysemous occurrences within our specific terminometrics experiment in the nanotechnology domain (Section 4), as we analyse term sense human annotation results for a set of nanotechnology terms. Results also show that terms, although polysemous, have a very strong bias toward their in-domain sense. In such a biased case, a random sampling of annotation data is far from optimal, wasting much human effort. We therefore introduce active learning (Section 5) and implement it within an annotation platform (Section 6), to obtain a sense-annotated dataset in less time.

2   Terminometrics

Terminometrics is the measure of term usage in different types of communications (Quirion, 2006). Its purpose is to determine, for a particular concept, the relative corpus frequencies of its competing terms.

The protocol of terminometrics, as defined in Quirion (2003), consists in first deciding on a domain of interest and selecting its set of concepts (most often all) from a term bank. Then, for each particular concept, the individual number of occurrences of all its competing terms is counted within different corpora from the same domain gathered by terminologists to represent different communicative settings. Acknowledging the possible polysemy of competing terms, the protocol includes a human expert who disambiguates a randomly selected subset of occurrences, in order to obtain better estimates of real frequencies.

A good example of this would be the concept of an atomic cluster within the nanotechnology domain. According to the term bank used, this notion can be expressed by the following six terms: atomic cluster, atom cluster, atomic aggregate, atom aggregate, cluster and aggregate. In terminometrics, comparative studies of the use of terms in specialized communications, government literature, specialized media, and general media are of interest, as they might reveal how some terms are used by the general public, while others are used in more official government documents.
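The counting step of the protocol can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `relative_frequencies`, the longest-match-first strategy and the sample sentence are hypothetical and not part of the published protocol; matching is case-insensitive on exact surface forms (no plural or gender variants), and no sense disambiguation is performed.

```python
import re
from collections import Counter


def relative_frequencies(competing_terms, corpus_text):
    """Return (counts, shares) of raw occurrences for the competing terms of one concept."""
    terms = [t.lower() for t in competing_terms]
    # Try longer terms first so "atomic cluster" is not also counted as "cluster".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(t) for t in ordered) + r")(?!\w)",
        re.IGNORECASE,
    )
    counts = Counter({t: 0 for t in terms})
    for match in pattern.finditer(corpus_text):
        counts[match.group(1).lower()] += 1
    total = sum(counts.values())
    shares = {t: (counts[t] / total if total else 0.0) for t in terms}
    return counts, shares


# Hypothetical mini-corpus, for illustration only.
TERMS = ["atomic cluster", "atom cluster", "atomic aggregate",
         "atom aggregate", "cluster", "aggregate"]
SAMPLE = ("An atomic cluster forms first. The cluster then grows. "
          "Each atom cluster differs from the next atomic cluster.")
counts, shares = relative_frequencies(TERMS, SAMPLE)
```

Such raw shares count an ambiguous term like cluster in all of its senses, so they are only a starting point for the disambiguated estimates the protocol aims at.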
   Studying the occurrence in text of different syn-
onyms of concepts would not be problematic if
each one was monosemous. But unfortunately,
that is not the case. For example, referring to Ta-
ble 1, the term cluster is a competing term for
multiple concepts, and simply counting its occur-
rences in text, without disambiguation, would not
be indicative of its usage for any of them.
   Obviously, human annotation is costly, and the possibility of performing automatic term sense disambiguation is quite appealing. In terminometrics, concepts are evaluated one at a time, reducing the disambiguation task to a binary decision. The annotation is not a selection among N senses, but rather a yes/no decision on whether the current instance represents the current concept or not. Furthermore, term disambiguation within terminometrics cannot be dealt with similarly to more typical word-sense disambiguation or even term-sense disambiguation relying on knowledge contained in an external resource (Barrière, 2010), since the annotator, or the algorithm, is likely to only have access to the context of occurrences to perform term disambiguation.

3   Polysemy of specialized terms

Terms for the terminometrics studies are provided by term banks. Such repositories of terms are not often investigated for the study of polysemy. In Natural Language Processing, a typical task of word sense disambiguation requires a lexicographic resource, such as WordNet (Miller, 1995), to provide a repository of possible word senses in order to disambiguate words in texts (Pantel and Lin, 2002). There is no doubt that words are polysemous, even in specific domains (Chroma, 2011; Vogel, 2007), but fewer studies show and discuss the polysemy of terms.

Terms are single-word or multi-word expressions denoting particular concepts within particular domains. A term bank is organized by domains (e.g. biology, automotive, etc.) and contains records corresponding to concepts. Each record contains at least one term, and often competing terms (synonyms) denoting that concept, possibly in more than one language. Examples of records for the term cluster, as found in the Grand Dictionnaire Terminologique (GDT)¹, are shown in Table 1.

There might be a misconception that specialized language is less ambiguous, and would then not provide a proper challenge for word-sense disambiguation. A study by Barrière (2007) shows the contrary, as WordNet and Termium² (the actual resource used in this experiment) were compared along different criteria. One criterion of comparison was coverage, and another one, of more interest to us in this research, is the degree of polysemy in relation to word specificity. Word specificity was approximated by "hit counts", as found in a very large corpus (Waterloo Terabyte Corpus, used by Terra and Clarke (2003)), with words occurring from 1 to millions of times. Figure 1 shows their results. We see how for common words (hit counts with log10(freq) > 3), the degree of polysemy in the term bank is even larger than in WordNet.

Figure 1: Degree of polysemy in WordNet and Termium per term length

In our study, we wished to further characterize this degree of polysemy in terminological resources. We used a small set of 164 terms from the current experiment (presented in Section 4.1), and looked at the number of senses in two term banks: Termium and GDT. Figure 2 shows that specialized terms, especially short ones (1 to 3 words), can have many senses (records) and span many domains. This trend generally diminishes as the term length increases.

   ¹ The GDT can only be accessed via a web interface at http://www.granddictionnaire.com .
   ² Termium term bank can be accessed online at http://www.btb.termiumplus.gc.ca or downloaded at http://open.canada.ca/data/en/dataset/94fc74d6-9b9a-4c2e-9c6c-45a5092453aa



           Domain              Terms (one record per row)
           nanotechnology      atomic aggregate, cluster, aggregate, atom aggregate, atom cluster, atomic cluster
           nanotechnology      molecular aggregate, cluster, aggregate, molecule aggregate, molecule cluster
           nanotechnology      nanoaggregate, cluster, aggregate, nanocluster, nanometer-size cluster, nanoscale aggregate, nanoscale cluster
           seafood             crab section, section, crab cluster, cluster
           software            cluster, document cluster
           mining              vein system, vein set, cluster of veins, mining cluster, cluster
           internet            service cluster, cluster of service, cluster

           nanotechnology      scanning tunneling electron microscope, microscope, scanning tunneling microscope, STM
           nanotechnology      atomic force microscope, microscope, AFM, SFM, scanning force microscope
           nanotechnology      magnetic force microscope, microscope, MFM, SMM, scanning magnetic microscope
           nanotechnology      scanning probe microscope, microscope, SPM, scanned-probe microscope


                              Table 1: Different records for "cluster" and "microscope".


Figure 2: Degree of polysemy in Termium and GDT per term length

4   Experiment - Terminometrics in nanotechnology domain

Our current terminometrics study focuses on term usage in the nanotechnology domain within Canadian French. This domain, within the GDT term bank, contains 1,035 records (concepts)³, each with its competing terms. This set of terms is what we call our nanotechnology term base, covering "the science of working with atoms and molecules to build devices that are extremely small" (Merriam-Webster dictionary).

To study the competing terms for the nanotechnology concepts, a corpus was built using documents from corporate, educational, news media and government websites. These documents were retrieved first by selecting most of the organizations originating from the province of Québec, Canada, and whose core activities dealt with nanotechnology. This list was then vetted by an expert. Next, the websites of these organizations were downloaded. After this process, the corpus might still be noisy, but it does contain a majority of nanotechnology-related documents.

All terms in the nanotechnology term base are searched for in the corpus. For each of their occurrences, a window spanning 90 characters on each side of the term is extracted. This text span becomes a contextualized instance to be annotated. Table 2 shows examples of these instances.

   ³ As the GDT expands every day, this number might not represent its current status.

4.1   Human annotation process

For our current annotation experiment, a total of 164 terms taken from 29 records (among the 1,035 mentioned earlier) were selected along with the complete set of instances found in the nanotechnology corpus. Each term occurred between 75 and 2,100 times in the corpus, for a total of 17,227 instances for the whole term sample. This dataset was divided into two parts distributed between 2 PhD students in terminology. As shown in Table 2, annotators were presented with a text sample containing a targeted term and were asked to indicate "yes" if the term was used in the correct nanotechnology sense and "no" otherwise. Prior to the annotation effort, the dataset was sorted by terms, as this was considered easier to annotate compared to an annotation by document order, which would ask the annotator to constantly switch between term definitions. They took a total of 82 hours (41 hours each) to annotate all the instances of the selected dataset. Each text sample was composed of the 90 characters prior to a term occurrence, the term occurrence as is, and another 90 characters following the term occurrence. The 90-character window was adjusted to avoid word truncation.

The annotators were also asked to indicate the difficulty level of the provided answer: standard,



 Annotation    Instance
    Yes        ... une technologie d’intégration par laquelle plusieurs nanostructures sont intégrées sur un même substrat. L’interface
               entre les dispositifs et d’autres systèmes (oxyde, verre) sera aussi étudiée. (... an integration technology by which several
               nanostructures are integrated on a single substrate. The interface between the devices and other systems (oxide, glass)
               will also be studied.)
    Yes        ... dollars à Bromont dans une petite usine qui allait employer 200 personnes pour la production de substrats, que le
               dictionnaire définit comme un matériau sur lequel sont réalisés les éléments d’un ... (... dollars in Bromont in a small
               factory which was going to employ 200 people for the production of substrates, which the dictionary defines as a material
               on which the elements of a ... are made)
    No         ... et valoriser les boues de station d’épuration. L’investigation des possibilités d’acquérir ces substrats requiert
               l’inventaire des industries de la région, les quantités et les caractéristiques des ... (... and recover the sludge from
               wastewater treatment plants. Investigating the possibilities of acquiring these substrates requires an inventory of the
               region’s industries, the quantities and characteristics of ...)

    Yes        ... MNT Définition : Fabrication mécanique et contrôlée de structures moléculaires, par une approche ascendante qui
               consiste à les assembler, étape par étape, molécule par molécule, en se servant d’appareil ... (... MNT Definition:
               Mechanical and controlled fabrication of molecular structures by a bottom-up approach which consists of assembling them,
               step by step, molecule by molecule, by using a device ...)
    No         ... Quand il est possible de le faire, l’analyse de la demande d’énergie est fondée sur une approche ascendante agrégeant
               les demandes par usage, par secteur d’activités économiques, par région et par ... (... When it is possible to do so, the
               energy demand analysis is based on a bottom-up approach aggregating the demands by use, by economic activity sector, by
               region and by ...)
    No         ... que beaucoup de problèmes rencontrés en pratique ne sont pas adressés par ces processus. L’approche ascendante de
               l’amélioration du processus consiste donc, selon ces mêmes auteurs, à implanter une équipe ... (... that many issues
               encountered in practice are not addressed by these processes. The bottom-up approach to process improvement therefore
               consists, according to these same authors, of setting up a team ...)

        Table 2: Instances for the terms substrat (substrate) and approche ascendante (bottom-up approach)
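Instances such as those in Table 2 are built from a window of 90 characters on each side of a term occurrence, adjusted to avoid word truncation (Section 4.1). The following Python sketch shows one way this could be done; it is not the authors' implementation. The function name `extract_instances` is hypothetical, and widening the window to the nearest whitespace (rather than shrinking it) is an assumption, since the paper only states that the window was adjusted.

```python
import re


def extract_instances(text, term, width=90):
    """Return (left context, term occurrence, right context) for each occurrence of `term`."""
    instances = []
    pattern = re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)", re.IGNORECASE)
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        # Assumption: if a boundary falls inside a word, widen the window
        # to the nearest whitespace so that the word is kept whole.
        while start > 0 and not text[start - 1].isspace() and not text[start].isspace():
            start -= 1
        while end < len(text) and not text[end].isspace() and not text[end - 1].isspace():
            end += 1
        instances.append((text[start:match.start()], match.group(0), text[match.end():end]))
    return instances


# Small width used here only to keep the illustration readable.
demo = extract_instances("alpha beta gamma substrat delta epsilon zeta", "substrat", width=8)
```

With `width=8`, both raw boundaries fall inside a word ("beta" on the left, "epsilon" on the right), so the window is widened and `demo` contains the single instance `("beta gamma ", "substrat", " delta epsilon")`.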


hard, hardest. Results showed that 626 instances (3.6%) needed a little more analysis while 222 instances (1.3%) were much harder to annotate with only the presented context. All the other instances were judged of standard difficulty, meaning that the textual contexts of the term occurrences were sufficient for the disambiguation task. In anticipation of an automatic disambiguation algorithm which would only have access to the immediate context of the term, this confirmed that for most cases, it should be possible to disambiguate with a ±90 characters window⁴.

   ⁴ This claim disregards the fact that humans certainly have much a priori knowledge which they use during the disambiguation task. Nevertheless, the trigger for this a priori knowledge would still come from the limited context window.

4.2   Observations and results on polysemy

Analysis of the annotated instances reveals that 84.31% (14,524) of them occur in the correct nanotechnology sense of the term, and the remaining 15.69% (2,703 instances) are used with other meanings. To measure the overall polysemy in our dataset, we use the notion of entropy. Entropy is defined as the negated summation, over all possible events, of each event probability multiplied by the log of that probability. In our current experiment, there are only two possible events: first, the occurrence of a term in the correct sense, let us call that x, and second, the occurrence of a term in a different sense. If P(x) is the probability of the correct sense, then 1 − P(x) is the probability of another sense. Then, we have the entropy, shown in Equation 1, as a sum over two possible events.

$$E(x) = -\left[ P(x) \log_2 P(x) + (1 - P(x)) \log_2 (1 - P(x)) \right] \qquad (1)$$

The resulting function is at its maximum, a value of 1, with a probability of 50% and is equal to 0 with probabilities of either 0% or 100%. In our case, P(x) is the rate of occurrence of an anticipated term sense in a corpus. A term with an entropy of 0 would mean it is not ambiguous, as either all or none of the term’s instances use the correct sense, and a term with an entropy of 1 would mean 50% of its instances are used in the correct sense, the remaining 50% of the instances using other meanings.

For example, the term STM (acronym of scanning tunnelling microscope) counts as a single-word term occurring a total of 341 times. Among those, 104 instances (104/341 = 0.30499) have the nanotechnology sense, which gives an entropy of 0.8873 as shown in Equation 2. This is a relatively high entropy level, as it nears the maximum of 1 reached at 50%. If the case had been less ambiguous, for example 5 out of 341 instances, the entropy would have been 0.1103.
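The entropy values quoted for STM can be checked with a short Python sketch of the two-event entropy. The function name `binary_entropy` is an illustrative choice, and returning 0 at the boundary probabilities follows the convention that p·log2(p) tends to 0 as p tends to 0.

```python
import math


def binary_entropy(p):
    """Entropy (in bits) of a yes/no event whose 'yes' probability is p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # limit value: log2(0) is undefined, but p * log2(p) tends to 0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


stm = binary_entropy(104 / 341)           # STM: 104 of 341 instances in the in-domain sense
less_ambiguous = binary_entropy(5 / 341)  # hypothetical case of 5 of 341 instances
```

Rounded to four decimals, `stm` is 0.8873 and `less_ambiguous` is 0.1103, matching the figures given in the text.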




 Figure 3: Average entropy per length on gold corpus




$$E(STM) = -\left[ 0.30499 \times \log_2(0.30499) + (1 - 0.30499) \times \log_2(1 - 0.30499) \right] = 0.8873 \qquad (2)$$

Figure 4: High level view of the active learning process.

   The bottom dashed line (Figure 3) shows the average entropy over all terms having a particular word count. The top full line shows the average entropy for the 5 terms with the highest entropy (and thus the highest degree of ambiguity) of each length, emphasizing how a few terms account for much of the polysemous instances in the corpus. Examples of these very polysemous terms are tunnelling, substrat, or top-down.

These corpus results, showing an overall tendency for entropy to decrease with term length, are in line with our previous results presented in Figure 2 relating term length to the polysemy level within term banks. Nevertheless, these corpus results also show that the in-domain sense is much more likely than all other senses. This leads us to think that we should take advantage of the particularity of our task in selecting the annotation dataset, as we further describe in the next section.

5    Active learning for term sense annotation

The strong in-domain sense bias shown in the previous section indicates that random sampling, suggested by the terminometrics methodology, could lead to collecting a biased sample and provide a distorted analysis. Traditional machine learning algorithms trained on these unbalanced samples would suffer the same bias, as less information would be available to classify the minority class. This type of algorithm would likely produce a prediction model which would only target the majority class, overlooking instances potentially useful for terminometrics experts.

To sidestep this risk, we lean toward a learning approach called active learning, which defines an iterative annotation process in order to reduce the risk of producing a biased prediction model. As shown in Figure 4, this four-step process implies the interaction with an oracle, typically a human annotator who needs to be familiar with the domain’s terminology and the concepts being studied.

The active learning process starts with a set of unlabeled data (UD) containing, in the current context, individual occurrences of a term in a corpus, described by a group of features (e.g. a bag-of-words made of its co-occurring words in context). At this point, the labeled dataset (LD) is empty and there is no prediction model available. The active learning algorithm starts by selecting a group of instances, called the seed S, from UD. For each instance of S, the oracle is queried to specify a label, and the labeled example is then stored in LD. The oracle annotates the instance using one value of a predefined class label set, in this case {yes, no}, yes meaning the instance is used in the targeted sense, no meaning another sense is used. When all instances in S are labeled, the active learning algorithm uses them to create a prediction model. It is important to note that there is no ideal size for the seed, but it should be sufficient to enable the algorithm to train a relevant prediction model.

Once a prediction model is available, the process takes place in the same order, but with a variant. Instead of a seed, the algorithm tentatively applies the prediction model to instances in UD (without labeling them or moving them to the labeled set) and picks an instance for which the model does not provide a sufficient level of confidence for its classification. It then submits this instance to the oracle who applies a label. Then, the newly labeled example is added to LD. The prediction model is then retrained and the process continues until the algorithm reaches an overall level of confidence for all instances in UD.

When this stopping criterion is reached, the active learning process is complete and the prediction model can be used to annotate the remaining instances in UD, if needed, or another similar dataset. Again, the level of confidence used as the stopping criterion must be empirically defined, as there is no ideal value. Of course, a higher confidence level might increase the annotation effort needed to produce the final prediction model, while a lower value might produce a less effective prediction model using fewer instances. Fine-tuning the confidence level helps to reduce the risk of training a biased prediction model on a predominant class in a dataset.

In our current implementation of active learning, we select a seed of 20 instances with random sampling, which is then processed with RandomForest (Tin Kam Ho, 1995) as the prediction model. The oracle is then asked to annotate other blocks of 20 instances until the algorithm reaches its configured confidence level. If this level is not reached after a total of 200 instances (including the seed), a final prediction model is trained and applied on UD in order to limit the effort to annotate each expression. The features for the classification process are extracted from the 90-character window, which was judged as sufficient during the experiments (Section 4.1).

At this stage in our research, the current implementation provides a baseline on which we can later improve using different alternative models presented in the literature. Certainly, other research in word sense disambiguation has explored the empirical behaviour of active learning (e.g. Chen et al., 2006). Specific issues associated with active learning range from feature selection for particular disambiguation tasks (Palmer and Chen, 2005), model adaptation when changing domain between the training and application of the model (Chan and Ng, 2007), and the class imbalance problem (Zhu and Hovy, 2007), to deciding when the prediction algorithm stops asking for additional annotation (Zhu et al., 2008).

6    Terminometrics active-learning platform

We developed an annotation platform, shown in Figure 5, to facilitate terminometrics studies with an active learning component for term disambiguation. The platform implements the interactive active learning process described above to control and optimize the active learning between the prediction module and the human annotator. The platform will also enable future experiments within the field of terminometrics in which both the active learning algorithms and the human interaction can be further explored.

The user of this platform (typically the oracle in the active learning process) can create a corpus of documents, use this corpus to create an annotation project by defining a set of concepts, related terms and variations (plural, gender), and participate in the active learning process. At the end of the active learning process, the platform annotates the remaining instances in UD (see Figure 4) in order to estimate the distribution of occurrences of competing terms of a concept. This is used for the terminometrics analysis.

Aside from the {yes, no} classification, the interface offers two other choices: undecided and reject. The first choice allows the user to skip an instance and go to the next, while being able to later return to provide an answer. This could happen when the user wishes to see a larger context to perform the disambiguation. In fact, to help this process, the platform also provides an option to view an instance within its original document. The second choice, reject, removes the instance entirely from the unlabeled and labeled datasets. This is used typically when the user considers that the instance should not be used for the final terminometrics analysis.

In order to further reduce the annotation effort needed to perform a terminometrics study, other features, unrelated to active learning, were added

                               Figure 5: Annotation interface for terminometrics.
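
As an illustration of the active learning procedure described in Section 5
(a random seed of 20 instances, further blocks of 20, a cap of 200 annotated
instances, and a final model applied to the remaining unlabeled instances),
the loop can be sketched as below. This is a minimal sketch under stated
assumptions, not the platform's actual code: the function names, the choice
of querying the least confident instances, the stopping test on the lowest
prediction confidence, and the toy threshold classifier standing in for
RandomForest are all hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

# A model maps an instance to (predicted label, confidence in [0, 1]).
Model = Callable[[float], Tuple[bool, float]]
Pair = Tuple[float, bool]


def active_learning(pool: Sequence[float],
                    oracle: Callable[[float], bool],
                    train: Callable[[List[Pair]], Model],
                    seed_size: int = 20, block_size: int = 20,
                    max_labels: int = 200, target_confidence: float = 0.9,
                    rng_seed: int = 0) -> Tuple[List[Pair], List[Pair]]:
    """Return (oracle-labeled pairs, model-predicted pairs for the rest)."""
    unlabeled = list(pool)
    random.Random(rng_seed).shuffle(unlabeled)
    # Seed: a random sample annotated by the oracle.
    labeled = [(x, oracle(x)) for x in unlabeled[:seed_size]]
    unlabeled = unlabeled[seed_size:]
    model = train(labeled)
    while unlabeled and len(labeled) < max_labels:
        # ASSUMPTION: query the instances the model is least sure about.
        unlabeled.sort(key=lambda x: model(x)[1])
        # ASSUMPTION: stop once even the least confident prediction
        # reaches the configured confidence level.
        if model(unlabeled[0])[1] >= target_confidence:
            break
        budget = min(block_size, max_labels - len(labeled))
        block, unlabeled = unlabeled[:budget], unlabeled[budget:]
        labeled += [(x, oracle(x)) for x in block]
        model = train(labeled)
    # The last trained model annotates whatever is still unlabeled.
    predicted = [(x, model(x)[0]) for x in unlabeled]
    return labeled, predicted


def train_threshold(labeled: List[Pair]) -> Model:
    """Toy stand-in for RandomForest: a one-dimensional threshold rule."""
    pos = [x for x, y in labeled if y]
    neg = [x for x, y in labeled if not y]
    if not pos or not neg:
        only = bool(pos)  # a single class seen so far: low confidence
        return lambda x: (only, 0.5)
    cut = (max(neg) + min(pos)) / 2
    return lambda x: (x > cut, min(1.0, 0.5 + 5 * abs(x - cut)))


# Hypothetical skewed data: 80% of instances share one sense.
pool = [i / 1000 for i in range(1000)]
labeled, predicted = active_learning(pool, lambda x: x > 0.8, train_threshold)
print(len(labeled), "annotated by the oracle;", len(predicted), "predicted")
```

In a real setting the instances would be feature vectors extracted from the
context window and `train` would fit the actual prediction model; only the
control flow (seed, blocks, confidence test, cap) is meant to carry over.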


to the platform. The first is a language-based document filter which can be
applied during corpus creation to remove documents which are not suited for
the targeted analysis. Each document is analysed with a language detection
algorithm to extract a confidence level associated with its deduced
language. The user can then keep only the documents which are above a
specific threshold and exclude the rest from the corpus to be annotated. Of
course, documents with no text, such as files containing only images, are
also removed.
   Another effort reduction feature is the duplicate context detection
which takes place at the creation of an annotation project. The underlying
issue is that a sentence or a whole paragraph (or sometimes complete
documents) can be found in several locations within a corpus created from
web sites. While each occurrence of a term (or its variations) is stored
and kept for an accurate assessment of its rate of occurrence in a corpus,
only unique contexts (the term occurrence and a ±90-character window) are
used for the active learning process. For example, if the first context of
“substrat” shown in Table 2 was found with the same prior and post context
in five documents in a corpus, the oracle would be asked at most once to
annotate this instance (if it is selected for annotation by the
algorithm), but it would count as five occurrences in the terminometrics
analysis.
   The platform also facilitates the management of terminometrics studies
by providing many features: integrated storage and search capability on
domain-specific corpora, a user interface specifically designed to
facilitate annotation by providing in-context display of a term to
validate, access to a term list with the possibility of adding and removing
terms, and so on. This is an improvement over the manual handling of
documents and term lists, instance generation and annotation, traditionally
done with folders and spreadsheets. While the upper limits of the platform
have not been tested explicitly, the current experiment was done with a
term list of 1,036 entries on a corpus of over 220,000 documents. As far as
the sizes of the corpora and vocabulary are concerned, the platform is
mainly limited by the speed and capacity of the computer that runs it.

7   Conclusion and future work

In this article, we introduced term sense disambiguation, a close cousin to
word sense disambiguation, but much less studied within the NLP community.
We showed how terms, especially single-word terms, are polysemous, both in
term banks and in specialized corpora.
   We presented the idea of using active learning within our terminometrics
application, in which the in-domain sense bias is quite strong. So far, we
have implemented a simple active learning algorithm, and will move toward
more complex ones in the near future. The annotation platform, ready for
experimentation, will allow terminologists to complete, in less time, the
annotation process of the nanotechnology domain and other domains. This
will provide test data on which we can measure the gains in time and
accuracy of our current and future active learning approaches.
   Furthermore, we plan to further our exploration of term disambiguation.
In fact, although lexicographic and terminological resources are organized
differently, the distinction between terms and words is not always that
"clear-cut". Many single-word terms also exist as common words. Some
specialized terms also migrate from specific domains to the general
language (Meyer, 2000) when a specialized domain becomes a larger part of
the day-to-day life of people (e.g. the computer domain). We believe there
is much room to further study term polysemy in term banks, in specialized
corpora and also in more general corpora where both specialized and common
senses might be present.
   One of the envisioned experiments is to semi-automatically annotate a
whole corpus to be able to compare the current approach to a supervised
learning method. This will enable us to evaluate the contribution of active
learning to the raw performance of disambiguation and to the time reduction
of the annotation task. A new dataset related to a domain other than
nanotechnology will also be defined for this experiment to avoid evaluating
the approach on the dataset used for development.

Acknowledgments

We thank the annotators Julián Zapata and Barış Bilgen.

References

Caroline Barrière. 2007. La désambiguïsation du sens en traitement
   automatique des langues (TAL): l’apport de resources terminologiques et
   lexicographiques. In Marie-Claude L’Homme and Sylvie Vandaele, editors,
   Lexicographie et Terminologie: compatibilité des modèles et méthodes,
   pages 113–140. Presses de l’Université d’Ottawa.
Caroline Barrière. 2010. Recherche contextuelle d’équivalents en banque de
   terminologie. In Traitement Automatique des Langues Naturelles 2010.
Yee Sang Chan and Ht Ng. 2007. Domain adaptation with active learning for
   word sense disambiguation. ACL.
Jinying Chen, Andrew Schein, Lyle Ungar, and Martha Palmer. 2006. An
   empirical study of the behavior of active learning for word sense
   disambiguation. Proceedings of the main conference on Human Language
   Technology Conference of the North American Chapter of the Association
   of Computational Linguistics, pages 120–127.
Marta Chroma. 2011. Synonymy and Polysemy in Legal Terminology and Their
   Applications to Bilingual and Bijural Translation. Research in Language,
   9:31–50.
Ingrid Meyer. 2000. Computer Words in Our Everyday Lives: How are they
   interesting for terminography and lexicography? In Euralex’2000,
   International Congress on Lexicography, pages 39–58, Stuttgart, Germany.
George A. Miller. 1995. WordNet: a lexical database for English.
   Communications of the ACM, 38(11):39–41.
M Palmer and Jy Chen. 2005. Towards robust high performance word sense
   disambiguation of English verbs using rich linguistic features. Natural
   Language Processing - IJCNLP 2005, Proceedings, 3651:933–944.
Patrick Pantel and Dekang Lin. 2002. Discovering word senses from text. In
   Proceedings of the Eighth ACM SIGKDD International Conference on
   Knowledge Discovery and Data Mining, KDD ’02, pages 613–619, New York,
   NY, USA. ACM.
Jean Quirion. 2003. Methodology for the design of a standard research
   protocol for measuring terminology usage. Terminology, 9(c):29–49.
Jean Quirion. 2006. Terminometrics - an Evaluation Tool of/for Term
   Standardization. In TSTT’2006 - International Conference on Terminology,
   Standardization and Technology Transfer, pages 19–24, Beijing, China.
Egidio Terra and C.L.A. Clarke. 2003. Frequency Estimates for Statistical
   Word Similarity Measures. In Proceedings of the NAACL 2003, page 165.
Tin Kam Ho. 1995. Random decision forests. Proceedings of 3rd International
   Conference on Document Analysis and Recognition, 1:278–282.
Radek Vogel. 2007. Synonymy and polysemy in accounting terminology:
   fighting to avoid inaccuracy. In Proceedings of the English for Specific
   Purposes Terminology and Translation Workshop, Košice 13-14 September
   2007. Univerzita P.J. Šafárika.
Jingbo Zhu and EH Hovy. 2007. Active Learning for Word Sense Disambiguation
   with Methods for Addressing the Class Imbalance Problem. EMNLP-CoNLL.
Jingbo Zhu, Huizhen Wang, and Eduard Hovy. 2008. Learning a Stopping
   Criterion for Active Learning for Word Sense Disambiguation and Text
   Classification. International Joint Conference on Natural Language
   Processing, pages 366–372.