<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On the use of antonyms and synonyms from a domain perspective</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Debela Tesfaye</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carita Paradis</string-name>
          <email>carita.paradis@englund.lu.se</email>
          <xref ref-type="aff" rid="aff0">2</xref>
        </contrib>
        <aff id="aff0">
          <label>2</label>
          <institution>Centre for Languages and Literature, Lund University</institution>
          ,
          <addr-line>Lund</addr-line>
          ,
          <country country="SE">Sweden</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>IT PhD Program, Addis Ababa University</institution>
          ,
          <addr-line>Addis Ababa</addr-line>
          ,
          <country country="ET">Ethiopia</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>In Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final</institution>
          ,
          <addr-line>Conference, Pisa, March 30-April 1, 2015, published at http://ceur-ws.org</addr-line>
        </aff>
      </contrib-group>
      <fpage>150</fpage>
      <lpage>154</lpage>
      <permissions>
        <copyright-statement>Copyright © by the paper's authors. Copying permitted for private and academic purposes.</copyright-statement>
      </permissions>
      <abstract>
        <p>This corpus study addresses the question of the nature and the structure of antonymy and synonymy in language use, using automatic methods to identify their behavioral patterns in texts. We examine the conceptual closeness/distance of synonyms and antonyms through the lens of their DOMAIN instantiations.</p>
      </abstract>
      <conference>
        <conf-date>March 30-April 1, 2015</conf-date>
        <conf-name>NetWordS Final Conference: Word Structure and Word Usage</conf-name>
        <conf-loc>Pisa</conf-loc>
      </conference>
      <!-- Published in Vito Pirrelli, Claudia Marzi, Marcello Ferro (eds.): Word Structure and Word Usage. Proceedings of the NetWordS Final Conference, at http://ceur-ws.org -->
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Using data from Wikipedia, this corpus study addresses the question of the nature and the structure of antonymy and synonymy in language use. While quite a lot of empirical research using different observational techniques has been carried out on antonymy
        <xref ref-type="bibr" rid="ref3 ref4 ref6 ref8">(e.g. Roehm et al. 2007, Lobanova 2012, Paradis et al. 2009, Jones et al. 2012)</xref>
        , not as much has been devoted to synonymy
        <xref ref-type="bibr" rid="ref1">(e.g. Divjak 2010)</xref>
        , and very little has been carried out on both of them using the same methodologies
        <xref ref-type="bibr" rid="ref2">(Gries &amp; Otani 2010)</xref>
        . The goal of this study is to bring antonyms and synonyms together, using the same automatic methods to identify their behavioral patterns in texts. We examine the conceptual closeness/distance of synonyms and antonyms through the lens of their domain instantiations: for instance, strong used in the context of wind or taste (of tea) as compared to light and weak respectively, and light as compared to heavy when talking about rain or weight.
      </p>
      <p>The basic assumption underlying this study is that the strength of co-occurrence of antonyms and synonyms is dependent on the domain in which they are instantiated and co-occur. In order to test this hypothesis, we mine the co-occurrence information of the antonyms and the synonyms relative to the domains using a dependency grammar method based on the Stanford parser (<ext-link ext-link-type="uri" xlink:href="http://nlp.stanford.edu/software/lexparser.shtml">http://nlp.stanford.edu/software/lexparser.shtml</ext-link>). The rationale is that dependency parsing produces the relational information among the constituent words of a given sentence, which allows us to (i) extract co-occurrences specific to a given domain/context, and (ii) capture long-distance co-occurrences between the word pairs. Consider (1).</p>
      <p>1. Winters are cold and dry, summers are cool in the hills and quite hot in the plains.</p>
      <p>In (1), the antonyms cold: hot modify winters and summers respectively. Those forms express the lexical concepts winter and summer in the domain temperature. The antonyms cold: hot co-occur, but at a distance in the sentence. Thanks to the dependency information, it is possible to extract such long-distance co-occurrences together with the concepts modified.</p>
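      <p>To make this concrete, the following is a minimal sketch of how adjective-concept pairs such as cold: winter and hot: summer can be read off a dependency parse. It uses the freely available spaCy parser purely for illustration (the study itself used the Stanford dependency parser), and the function and its conjunct handling are our own simplifications rather than the paper's implementation.</p>
      <preformat>
# Minimal illustrative sketch: read adjective-concept pairs off a
# dependency parse. The study used the Stanford parser; spaCy is used
# here only because it is easy to run.
import spacy

nlp = spacy.load("en_core_web_sm")

def adjective_concept_pairs(doc):
    """Collect (adjective, concept) pairs, covering attributive use
    ('cold winters', dep=amod) and predicative use via a copula
    ('winters are cold', dep=acomp), following coordinated adjectives
    ('cold and dry') to the same noun."""
    pairs = set()
    for tok in doc:
        nouns = []
        if tok.dep_ == "amod" and tok.head.pos_ == "NOUN":
            nouns = [tok.head]
        elif tok.dep_ == "acomp":
            nouns = [c for c in tok.head.children if c.dep_ == "nsubj"]
        for noun in nouns:
            for adj in (tok,) + tok.conjuncts:
                pairs.add((adj.lemma_, noun.lemma_))
    return sorted(pairs)

sent = ("Winters are cold and dry, summers are cool in the hills "
        "and quite hot in the plains.")
print(adjective_concept_pairs(nlp(sent)))
# expected output along the lines of:
# [('cold', 'winter'), ('cool', 'summer'), ('dry', 'winter'), ('hot', 'summer')]
      </preformat>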
      <p>The article is organized as follows. In section 2, we describe the procedure and the two methods used: co-occurrence extraction of lexical items in the same sentence and a variant domain-dependent co-occurrence extraction method. The latter method extracts patterns of co-occurrence information of the synonyms and antonyms in different sentences. In section 3 we present the results and discussion, followed by a comparison of our results with related previous work in section 4. The conclusions are presented in section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Procedure</title>
      <p>Using an algorithm similar to the one proposed
by Tesfaye &amp; Zock (2012) and Zock &amp; Tesfaye
(2012), we extracted the co-occurrence
information of the pairs in different domains separately,
measuring the strength of their relation in the
different domains with the aim of (i) making
principled comparisons between antonyms and
synonyms from a domain perspective, and (ii)
determining the structure of antonymy and
synonymy as categories in language and cognition.</p>
      <p>Our algorithm is similar to standard n-gram co-occurrence extraction algorithms but, instead of using the linear ordering of the words in the text, it generates co-occurrence frequencies along paths in the dependency tree of the sentence, as presented in sections 2.2–2.5.</p>
    </sec>
    <sec id="sec-3">
      <title>2.1 Training and testing data</title>
      <p>
        The antonyms and synonyms employed for training and testing were extracted from the data used by Paradis et al. (2009), where the antonyms are presented according to their underlying dimensions and synonyms are provided for all the individual antonyms
        <xref ref-type="bibr" rid="ref6">(for a description of the principles see Paradis et al. 2009)</xref>
        . That set of antonyms and synonyms, listed in Table 1, was used to extract their co-occurrence patterns from the Wikipedia texts in this study.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Dimensions</title>
      <sec id="sec-4-1">
        <title>Size</title>
      </sec>
      <sec id="sec-4-2">
        <title>Speed</title>
      </sec>
      <sec id="sec-4-3">
        <title>Strength</title>
      </sec>
      <sec id="sec-4-4">
        <title>Merit</title>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Antonyms</title>
      <sec id="sec-5-1">
        <title>Large</title>
      </sec>
      <sec id="sec-5-2">
        <title>Small</title>
      </sec>
      <sec id="sec-5-3">
        <title>Fast</title>
      </sec>
      <sec id="sec-5-4">
        <title>Slow</title>
        <p>Bad</p>
      </sec>
      <sec id="sec-5-5">
        <title>Good</title>
        <p>The associated
synonyms of the antonyms
huge, vast, massive ,big
,bulky, giant ,gross,
heavy, significant ,wide
little, low, minor, minute,
petite, slim, tiny
quick, hurried, prompt,
accelerating, rapid
sudden, dull, gradual, lazy</p>
      </sec>
      <sec id="sec-5-6">
        <title>Strong forceful, hard, heavy,</title>
        <p>muscular, powerful,
substantial, tough
Weak light, soft, thin, wimpy
crappy, defective, evil
,harmful, poor ,shitty
,spoiled ,unhappy
awful ,genuine ,great,
honorable ,hot, neat, nice,
reputable, right ,safe ,well</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>2.2 Extracting the co-occurrences of the antonyms and synonyms in the respective domains</title>
      <p>In order to extract the co-occurrences of the antonyms/synonyms in the respective domains, we produced the relational information among the constituent words of a given sentence. To this end, we extracted the patterns linking the synonyms/antonyms and the concepts they modify and used these same patterns to extract more lexical concepts. The procedure was as follows; a sketch of the pattern-learning step is given after the list.</p>
      <list list-type="bullet">
        <list-item><p>Start with the selected set of synonym/antonym pairs</p></list-item>
        <list-item><p>Extract sentences containing the pairs</p></list-item>
        <list-item><p>Identify the dependency information of the sentences</p></list-item>
        <list-item><p>Mine the dependency patterns linking the pairs with the concepts they modify</p></list-item>
        <list-item><p>Use these learned patterns to extract further relations (synonym/antonym pairs and the associated concepts)</p></list-item>
      </list>
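      <p>The sketch below illustrates the pattern-learning loop under our own simplifying assumption that a pattern can be represented by the dependency configuration linking the adjective to the concept it modifies; the paper does not spell out the exact pattern representation, so this is an approximation, again using spaCy for convenience, with invented seed pairs and sentences.</p>
      <preformat>
# Illustrative sketch of the pattern-learning loop: for seed adjectives,
# record the dependency configurations that link them to the concepts
# they modify; frequent configurations are then reused to harvest new
# (adjective, concept) pairs. Seeds and patterns are assumptions.
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")

SEED_ADJECTIVES = {"cold", "hot", "slow", "fast"}

def mined_patterns(doc):
    """Yield (pattern, adjective, concept) triples for seed adjectives."""
    for tok in doc:
        if tok.lemma_ in SEED_ADJECTIVES and tok.pos_ == "ADJ":
            if tok.dep_ == "amod":                        # 'the slow train'
                yield ("amod", tok.lemma_, tok.head.lemma_)
            elif tok.dep_ == "acomp":                     # 'winters are cold'
                for subj in (c for c in tok.head.children if c.dep_ == "nsubj"):
                    yield ("acomp+nsubj", tok.lemma_, subj.lemma_)

pattern_counts = Counter()
for sentence in ["Winters are cold and summers are hot.",
                 "The slow train was overtaken by the fast one."]:
    for pattern, adjective, concept in mined_patterns(nlp(sentence)):
        pattern_counts[pattern] += 1

# The frequent patterns are then applied to unseen sentences to extract
# further adjective-concept relations.
print(pattern_counts.most_common())
      </preformat>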
    </sec>
    <sec id="sec-7">
      <title>2.3 Extracting the domains</title>
      <p>We created a matrix of antonym and synonym pairs, matching every antonym and synonym from the list in Table 1. Using the patterns learned in section 2.2, we identified as many domains as possible for the pairs of synonyms and antonyms and calculated their frequency of co-occurrence in the respective domains.</p>
      <p>When the lexical concepts were considered
too specific, we referred them to more inclusive,
superordinate domains. Frequency of occurrence
was used as a criterion for conflation of concepts
into superordinate ones as follows.</p>
      <list list-type="bullet">
        <list-item>
          <p>Extract term co-occurrence frequencies within a window of sentences containing both the antonyms/synonyms and the potential domain concepts. For instance:</p>
          <list list-type="simple">
            <list-item><p>Antonyms: cold: hot; domain concepts: winter, summer</p></list-item>
            <list-item><p>Synonyms: strong: heavy; domain concepts: wind, rain</p></list-item>
          </list>
        </list-item>
        <list-item><p>Create a matrix of the potential domain concepts and the co-occurring terms with their frequencies</p></list-item>
        <list-item><p>Cluster them using the k-means algorithm (a sketch of this step follows Table 2)</p></list-item>
        <list-item><p>Take the term with the maximal frequency (the centroid) in each cluster and consider it the domain term</p></list-item>
        <list-item><p>Test the result using expert judgment, running the algorithm on the test set</p></list-item>
      </list>
      <table-wrap id="tab2">
        <label>Table 2</label>
        <caption>
          <p>Antonym/synonym pairs, the potential domain concepts co-occurring with them, and the resulting domains. The table is partially reconstructed; the alignment of some cells of the original (e.g. the frequencies 50, 43, 30) could not be recovered from the damaged layout.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Antonym/Synonym</th>
              <th>Words co-occurring with possible domain concepts (with frequencies)</th>
              <th>Domain</th>
            </tr>
          </thead>
          <tbody>
            <tr><td>hot: cold</td><td>summer, winter</td><td>temperature, climate</td></tr>
            <tr><td>strong: heavy</td><td>wind rain (86), winds snowfall (3), winds rainfall (34), waves rainfall (4)</td><td>wind</td></tr>
          </tbody>
        </table>
      </table-wrap>
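      <p>As a concrete illustration of the clustering step, the sketch below builds a small frequency matrix over potential domain concepts, clusters it with k-means, and picks a representative term per cluster. The counts are invented, and taking the cluster member with the highest total frequency as the domain term is our reading of the "maximal frequency (centroid)" criterion in the list above.</p>
      <preformat>
# Illustrative sketch of the clustering step with invented counts:
# cluster potential domain concepts by the frequencies of their
# co-occurring terms, then take the most frequent member of each
# cluster as the domain term.
import numpy as np
from sklearn.cluster import KMeans

concepts = ["winter", "summer", "wind", "rain"]
terms = ["cold", "hot", "strong", "heavy"]
# rows: potential domain concepts; columns: co-occurring terms (toy data)
freq = np.array([[50.0,  3.0,  1.0,  2.0],   # winter
                 [ 2.0, 43.0,  0.0,  1.0],   # summer
                 [ 1.0,  4.0, 86.0, 30.0],   # wind
                 [ 0.0,  2.0, 34.0, 40.0]])  # rain

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(freq)
totals = freq.sum(axis=1)
for cluster in range(2):
    members = [i for i, lab in enumerate(labels) if lab == cluster]
    domain_term = concepts[max(members, key=lambda i: totals[i])]
    print([concepts[i] for i in members], "->", domain_term)
      </preformat>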
    </sec>
    <sec id="sec-8">
      <title>2.4 Extracting co-occurrence frequency specific to a given domain/context</title>
      <p>The algorithm calculated the co-occurrence frequency of the antonyms/synonyms with the different concepts they refer to (or modify), as presented in Table 3, by combining the information obtained in sections 2.2 and 2.3.</p>
      <table-wrap id="tab3">
        <label>Table 3</label>
        <caption>
          <p>Co-occurrence frequencies of the antonyms/synonyms (hot, cold, strong, heavy) with the concepts they refer to or modify. Only the row labels could be recovered from the damaged layout of the original table.</p>
        </caption>
      </table-wrap>
    </sec>
    <sec id="sec-8b">
      <title>2.5 The variant domain-dependent co-occurrence extraction method</title>
      <p>
        In the previous algorithm, the co-occurrence information was extracted from the same sentence. However, unlike the antonyms, the synonyms rarely occurred together in the same context (the same sentence and domain). It is natural to assume that in most cases synonyms are used in different contexts, since they evoke similar but not identical meanings. This is not the case for antonyms, which were always used to evoke properties of the same meanings when these antonymic words were used to express opposition (Paradis &amp; Willners 2011), and in fact also when they were not used to express opposition
        <xref ref-type="bibr" rid="ref7">(Paradis et al. 2015)</xref>
        . Because of this, we devised a variant domain-dependent co-occurrence algorithm for the synonyms and antonyms, which instead extracts patterns of co-occurrence information of the synonyms and antonyms in different sentences, because we expected synonyms to be applicable to different, rather than the same, contexts, since complete overlap of the meanings of words is rare or even non-existent. In this way we were able to gain information indirectly about their use by extracting their co-occurrences when they appear separately in different sentences while still being instantiated in the same domain. We mined the co-occurrence information of the synonym/antonym pairs separately in all possible domains and checked whether they co-occurred in the same sorts of domains:
      </p>
      <p>X(y, f)</p>
      <p>Z(y, f)</p>
      <p>where X and Z are a pair of a given antonym/synonym, Y is the domain within which the pairs of the antonym/synonym co-occur, and f is the frequency of the X-Y or Z-Y co-occurrence.</p>
      <p>The frequency of a pair of the antonyms/synonyms in the Y domain was counted, and the same applies to the other pair. This made it possible to measure the degree of co-occurrence of the antonym/synonym pairs from the domain perspective indirectly, as in the sketch below.</p>
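      <p>The following is a minimal sketch of this indirect, domain-mediated measure: each member of a pair is counted per domain on its own (X(y, f) and Z(y, f) in the notation above), and the pair is then compared through the domains both occur in. The observation list is invented toy data.</p>
      <preformat>
# Illustrative sketch of the variant domain-dependent co-occurrence
# measure: count each word of a pair per domain separately, then
# compare the pair through shared domains. Toy data only.
from collections import Counter

# (word, domain) observations produced by the extraction step
observations = [("strong", "wind"), ("strong", "wind"), ("strong", "evidence"),
                ("heavy", "wind"), ("heavy", "rain"), ("heavy", "traffic")]
freq = Counter(observations)

def shared_domain_frequencies(x, z):
    """Return {domain: (f of x in domain, f of z in domain)} for the
    domains in which both members of the pair occur."""
    domains_x = {d for (w, d) in freq if w == x}
    domains_z = {d for (w, d) in freq if w == z}
    return {d: (freq[(x, d)], freq[(z, d)]) for d in domains_x &amp; domains_z}

print(shared_domain_frequencies("strong", "heavy"))  # {'wind': (2, 1)}
      </preformat>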
    </sec>
    <sec id="sec-10">
      <title>3. Results and discussion</title>
    </sec>
    <sec id="sec-11">
      <title>3.1 Co-occurrences in the same sentence</title>
      <p>Based on the results of the experiment, the strength of the antonyms/synonyms varies in relation to the domains of instantiation. Hence, the strength of the co-occurrence of antonyms and synonyms is a function of the domains. For instance, the antonyms slow: fast, slow: quick and slow: rapid were used in completely different domains with little or no overlap. Slow: fast is used in the domains of motion, movement and speed; slow: quick is used in the time, march and steps domains. The synonyms powerful: strong are used in the domains of voices, links and meaning; strong: muscular in the domains of legs and neck; strong: heavy in the domains of wind rain, waves rainfall and winds snow respectively; intense: strong in the domains of battle resistance, radiation gravity and updrafts clouds respectively.</p>
      <p>We observed some unique patterns among the antonyms and synonyms, as described below. The antonyms:</p>
      <list list-type="bullet">
        <list-item><p>Co-occurred frequently in the same domain in the same sentence.</p></list-item>
        <list-item><p>Showed a strength of co-occurrence that depends on the domain: slow: fast in the domains of growth, lines, motion, movement, speed, trains, music, pitch; slow: quick in the domains of time, march, steps; slow: gradual in the domains of process, change, transition; small: big in the domains of screen, band; small: large in the domains of intestine, companies, businesses; weak: strong in the domains of force, interaction, team, ties, points, sides, wind.</p></list-item>
      </list>
      <p>The synonyms:</p>
      <list list-type="bullet">
        <list-item><p>Co-occurred in the same sentence but mainly in different domains, for instance fast: quick and strong: heavy. There were few co-occurrences in the same sentence in the same domain, as exhibited by the pair gradual: slow in the domains of process, change, development.</p></list-item>
        <list-item><p>Showed a strength of co-occurrence that depends on the domains. For instance, the synonyms strong: heavy in the wind and rain domains respectively to express intensity; the synonyms large: wide in the population and distribution domains respectively; gradual: slow in the domains of process, change, development; small: low in the domains of size cost, range, size weight, area, size price, amount density; micro: small in the domains of enterprises, businesses, entrepreneurs.</p></list-item>
      </list>
    </sec>
    <sec id="sec-12">
      <title>3.2 The variant domain-dependent co-occurrence method</title>
      <p>As mentioned before, the variant domain-dependent co-occurrence extraction algorithm mines the patterns of co-occurrence information of the synonyms and antonyms in different sentences. The results from the variant co-occurrence experiment showed hardly any differences in the domains with which the synonyms and antonyms are associated: strong in the domains of influence, force, wind, interactions, evidence, ties; heavy in the domains of loss, rain, industry, traffic; gradual: slow in the domains of process, change, transition. However, we observed that the frequency of co-occurrence differed significantly. For instance, the frequency of the pair gradual: slow was 76 in the same-sentence experiment but 1436 in the variant co-occurrence experiment.</p>
    </sec>
    <sec id="sec-13">
      <title>4. Comparison with related works</title>
      <p>
        Previous research has shown that there are antonyms that are strongly opposing (canonical antonyms)
        <xref ref-type="bibr" rid="ref3 ref6">(Paradis et al. 2009, Jones et al. 2012)</xref>
        . Such antonyms are very frequent in terms of co-occurrence as compared to other antonyms: small: large as compared with small: big. In this experiment we found that the canonical antonyms are those antonyms for which the domains in which they function are numerous and productive. For instance, the number of domains for small: large (11704) is by far greater than for small: big (120). However, this does not make the antonym pair small: large more felicitous in all the domains. Small: big is the most felicitous antonym pair for domains such as screen and band, as compared to small: large.
      </p>
      <p>Measuring the strength of antonyms without taking domains into account provided higher values for the canonical antonyms, as they tended to be used in several domains. If domains were taken into account, as we did in this experiment, all the antonyms were strong in their specific domains. The antonym pair small: large had a higher value when domains were not considered, yet had a value of 0.29 in the domain of screen, where small: big has a much higher value (0.71). The values were calculated by taking the frequency of co-occurrence of the domain term (screen in this case) with each antonym pair and dividing it by the sum of the frequencies of co-occurrence of the domain term (again screen in this case) with both antonym pairs (small: big and small: large), as in the sketch below.</p>
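      <p>The calculation can be written out as a short sketch. The raw screen frequencies below are invented so that the proportions come out at the reported values; only the ratio is taken from the paper.</p>
      <preformat>
# Worked sketch of the felicity value: the share of a domain term's
# co-occurrences that goes to one antonym pair. The raw counts are
# invented so the proportions match the reported 0.71 and 0.29.
def felicity(freq_pair, freq_other_pair):
    return freq_pair / (freq_pair + freq_other_pair)

screen_with_small_big = 71    # co-occurrences of 'screen' with small: big (illustrative)
screen_with_small_large = 29  # co-occurrences of 'screen' with small: large (illustrative)

print(felicity(screen_with_small_big, screen_with_small_large))   # 0.71
print(felicity(screen_with_small_large, screen_with_small_big))   # 0.29
      </preformat>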
    </sec>
    <sec id="sec-14">
      <title>5. Conclusion</title>
      <p>The strength of the antonyms/synonyms varied in relation to the domains of instantiation. The use of antonyms and synonyms was very consistent, with few overlaps across the domains. Similar results were observed in both experiments from the domain perspective, although with significant differences in frequency. Antonyms frequently co-occurred in the same domains in the same sentences, whereas synonyms co-occurred in different domains in the same sentences (with lower frequency) and, more frequently, in different sentences in the same domains.</p>
    </sec>
    <sec id="sec-15">
      <title>Acknowledgments</title>
      <p>We thank the European Science Foundation (ESF) for providing the funding to undertake this work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <given-names>Dagmar</given-names>
            <surname>Divjak</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <source>Structuring the lexicon: a clustered model for near-synonymy</source>
          . Berlin: de Gruyter.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <given-names>Stefan Th.</given-names>
            <surname>Gries</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>N.</given-names>
            <surname>Otani</surname>
          </string-name>
          .
          <year>2010</year>
          .
          <article-title>Behavioral profiles: a corpus-based perspective on synonymy and antonymy</article-title>
          .
          <source>ICAME Journal</source>
          ,
          <volume>34</volume>
          :
          <fpage>121</fpage>
          -
          <lpage>150</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <given-names>Steven</given-names>
            <surname>Jones</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.L.</given-names>
            <surname>Murphy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Carita</given-names>
            <surname>Paradis</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>Caroline</given-names>
            <surname>Willners</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <source>Antonyms in English: Construals, constructions and canonicity</source>
          . Cambridge: Cambridge University Press.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <given-names>Anna</given-names>
            <surname>Lobanova</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <source>The Anatomy of Antonymy: A Corpus-Driven Approach</source>
          . Dissertation, University of Groningen.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <given-names>Carita</given-names>
            <surname>Paradis</surname>
          </string-name>
          .
          <year>2005</year>
          .
          <article-title>Ontologies and construals in lexical semantics</article-title>
          .
          <source>Axiomathes</source>
          ,
          <volume>15</volume>
          :
          <fpage>541</fpage>
          -
          <lpage>573</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          <string-name>
            <given-names>Carita</given-names>
            <surname>Paradis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Caroline</given-names>
            <surname>Willners</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>Steven</given-names>
            <surname>Jones</surname>
          </string-name>
          .
          <year>2009</year>
          .
          <article-title>Good and bad opposites: using textual and psycholinguistic techniques to measure antonym canonicity</article-title>
          .
          <source>The Mental Lexicon</source>
          ,
          <volume>4</volume>
          (
          <issue>3</issue>
          ):
          <fpage>380</fpage>
          -
          <lpage>429</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          <string-name>
            <given-names>Carita</given-names>
            <surname>Paradis</surname>
          </string-name>
          , Simon Löhndorf, Joost van de Weijer &amp; Caroline
          <string-name>
            <surname>Willners</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Semantic profiles of antonymic adjectives in discourse</article-title>
          .
          <source>Linguistics</source>
          ,
          <volume>53</volume>
          .1:
          <fpage>153</fpage>
          -
          <lpage>191</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          <string-name>
            <surname>Roehm</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Bornkessel-Schlesewsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Rösler</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>M.</given-names>
            <surname>Schlesewsky</surname>
          </string-name>
          .
          <year>2007</year>
          .
          <article-title>To predict or not to predict: Influences of task and strategy on the processing of semantic relations</article-title>
          .
          <source>Journal of Cognitive Neuroscience</source>
          ,
          <volume>19</volume>
          (
          <issue>8</issue>
          ):
          <fpage>1259</fpage>
          -
          <lpage>1274</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          <string-name>
            <given-names>Debela</given-names>
            <surname>Tesfaye</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>Michael</given-names>
            <surname>Zock</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Automatic Extraction of Part-whole Relations</article-title>
          .
          <source>In Proceedings of the 9th International Workshop on Natural Language Processing and Cognitive Science.</source>
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          <string-name>
            <given-names>Michael</given-names>
            <surname>Zock</surname>
          </string-name>
          &amp;
          <string-name>
            <given-names>Debela</given-names>
            <surname>Tesfaye</surname>
          </string-name>
          .
          <year>2012</year>
          .
          <article-title>Automatic index creation to support navigation in lexical graphs encoding part of relations</article-title>
          .
          <source>Proceedings of the 3rd Workshop on Cognitive Aspects of the Lexicon (CogALex-III), COLING 2012</source>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>