<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Using Fuzzy Logic For Multi-Domain Sentiment Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mauro Dragoni</string-name>
          <email>dragoni@fbk.eu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andrea G.B. Tettamanzi</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Célia da Costa Pereira</string-name>
          <email>celia.pereira@unice.fr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>FBK-IRST</institution>
          ,
          <addr-line>Trento</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Université Nice Sophia Antipolis</institution>
          ,
          <addr-line>I3S, UMR 7271, Sophia Antipolis</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Recent advances in Sentiment Analysis focus on how the polarity of concepts describing the same sentiment changes when they are used in different domains. In this paper, we investigate the use of fuzzy logic for modeling knowledge about the relationships between sentiment concepts and different domains. The system is built on top of a knowledge base that integrates WordNet and SenticNet, and it implements an algorithm that learns the use of sentiment concepts from multi-domain datasets and propagates this information to each concept of the knowledge base. The system has been validated on the Blitzer dataset, a multi-domain sentiment dataset built from reviews of Amazon products, and the results demonstrate the effectiveness of the proposed approach.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Sentiment Analysis is a kind of text categorization task that aims to classify documents
according to their opinion (polarity) on a given subject [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This task has attracted
considerable interest due to its wide range of applications. However, in classic Sentiment
Analysis the polarity of each term of a document is computed independently of the
domain to which the document belongs. Recently, the idea of adapting term polarity
to different domains has emerged [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The rationale behind this line of investigation is
simple. Let’s consider the following example concerning the adjective “small”:
      </p>
      <sec id="sec-1-1">
        <title>1. The sideboard is small and it is not able to contain a lot of stuff. 2. The small dimensions of this decoder allow to move it easily.</title>
        <p>In the first text, which belongs to the Furnishings domain, the polarity
of the adjective “small” is clearly “negative” because it highlights an issue with the
described item. In the second text, which belongs to the Electronics
domain, the polarity of the same adjective can instead be considered “positive”.</p>
        <p>
          In the literature, different approaches to Multi-Domain Sentiment
Analysis have been proposed. Briefly, two main categories may be identified: (i) the transfer
of learned classifiers across different domains [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ], and (ii) the propagation
of labels through graph structures [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]. Independently of the kind of approach,
works that use concepts rather than terms to represent different sentiments have also been
proposed.
        </p>
        <p>Unlike the approaches already discussed in the literature, we address the
multi-domain sentiment analysis problem by applying fuzzy logic theory to
model membership functions representing the relationships between concepts and
domains. Moreover, the proposed system exploits semantic background
knowledge to propagate the information represented by the learned fuzzy membership
functions to each element of the network.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>System</title>
      <p>The main aim of the implemented system is to learn fuzzy membership
functions representing the degree to which a concept belongs to a domain, in terms of both
sentiment polarity and aboutness. The two pillars on which the system has been
designed are: (i) the use of fuzzy logic for modeling the polarity of a concept with respect
to a domain as well as its aboutness, and (ii) the creation of a two-level graph where
the top level represents the semantic relationships between concepts, while the bottom
level contains the links between all concept membership functions and the domains.</p>
      <p>Figure 1 shows the conceptualization of the two-level graph. Relationships
between the concepts of Level 1 (the Semantic Level) are described by the
background knowledge exploited by the system. The types of relationships are the same
as those generally used in linguistic resources: for example, concepts C1 and C3 may be connected
through an Is-A relationship rather than an Antonym one. Each connection of
Level 2 (the Sentiment Level), instead, describes the membership of each concept with respect
to the different domains taken into account.</p>
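      <p>The two-level graph described above can be sketched as a small data structure. This is only an illustrative sketch: the class name <monospace>TwoLevelGraph</monospace>, its methods, the choice of undirected semantic edges, and the polarity values in the usage lines are assumptions made for the example and are not taken from the implemented system.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field


@dataclass
class TwoLevelGraph:
    """Hypothetical container for the two-level graph (illustration only)."""

    # Semantic Level: concept -> list of (relation, neighbouring concept).
    semantic: dict = field(default_factory=dict)
    # Sentiment Level: (concept, domain) -> polarity estimate in [-1, 1].
    sentiment: dict = field(default_factory=dict)

    def relate(self, a, relation, b):
        """Add a semantic edge (stored in both directions for simplicity)."""
        self.semantic.setdefault(a, []).append((relation, b))
        self.semantic.setdefault(b, []).append((relation, a))

    def set_polarity(self, concept, domain, value):
        """Attach a concept to a domain with a polarity estimate."""
        self.sentiment[(concept, domain)] = value

    def neighbours(self, concept):
        """Concepts directly connected to `concept` in the Semantic Level."""
        return [other for _, other in self.semantic.get(concept, [])]

    def domains_of(self, concept):
        """Domains for which `concept` has a Sentiment Level connection."""
        return sorted(d for (c, d) in self.sentiment if c == concept)


# Usage with made-up values: C1 and C3 are linked by an Is-A relationship,
# and C1 is connected to two domains in the Sentiment Level.
graph = TwoLevelGraph()
graph.relate("C1", "is-a", "C3")
graph.set_polarity("C1", "D1", 0.4)
graph.set_polarity("C1", "D2", -0.2)
```
]]></preformat>
      <p>Keeping the two levels in separate mappings mirrors the separation between semantic relationships and concept-domain connections, so that the Sentiment Level can be updated without touching the Semantic Level.</p>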
      <p>The system has been trained by using the Blitzer dataset<sup>3</sup> in two steps: (i) the
fuzzy membership functions are initially estimated by analyzing only the explicit
information present within the dataset (Section 2.1); then, (ii) the explicit information
is propagated through the Sentiment Level graph by exploiting the connections
defined in the Semantic Level (Section 2.2).</p>
      <sec id="sec-2-1">
        <title>Preliminary Learning Phase</title>
        <p>The Preliminary Learning (PL) phase aims to estimate the starting polarity of each
concept with respect to a domain. This value is estimated by analyzing
only the explicit information provided by the training set. This phase defines the
preliminary fuzzy membership functions between the concepts defined in the Semantic
Level of the graph and the domains defined in the Sentiment Level. The value
is computed by Equation 1:
<disp-formula id="eq-1"><label>(1)</label><tex-math><![CDATA[\mathrm{polarity}_i(C) = \frac{k_C^i}{T_C^i} \in [-1, 1] \qquad \forall i = 1, \ldots, n,]]></tex-math></disp-formula>
where C is the concept taken into account, index i refers to the domain D<sub>i</sub> to which the
concept belongs, n is the number of domains available in the training set, <inline-formula><tex-math>k_C^i</tex-math></inline-formula> is the
arithmetic sum of the polarities observed for concept C in the training set restricted to
domain D<sub>i</sub>, and <inline-formula><tex-math>T_C^i</tex-math></inline-formula> is the number of instances of the training set, restricted to domain
D<sub>i</sub>, in which concept C occurs.</p>
        <sec id="sec-2-1-1">
          <title>3 http://www.cs.jhu.edu/~mdredze/datasets/sentiment/</title>
          <p>Fig. 1: The two-layer graph initialized during the Preliminary Learning Phase (a) and
its evolution after the execution of the Information Propagation Phase (b).</p>
          <p>The shape of the fuzzy membership function
generated during this phase is a triangle with the top vertex at the coordinates (x, 1), where
x = polarity<sub>i</sub>(C), and with the two bottom vertices at the coordinates (−1, 0) and
(1, 0), respectively. The rationale is that, while there is one point (x) in which we have
full confidence, our uncertainty covers the entire space because we do not have any
information concerning the remaining polarity values.</p>
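          <p>The Preliminary Learning computation can be illustrated with a short sketch: the polarity of a concept in a domain is the sum of the observed polarities divided by the number of instances in which the concept occurs, and the resulting value becomes the peak of a triangular membership function over [−1, 1]. The function names, the input format (one tuple per observation), and the sample observations are assumptions made for this example, as is the use of polarities +1 and −1 for positive and negative instances.</p>
          <preformat><![CDATA[
```python
from collections import defaultdict


def preliminary_polarity(observations):
    """Estimate polarity per (concept, domain) as sum of polarities / count.

    `observations` is an iterable of (concept, domain, polarity) tuples,
    where polarity is assumed to be +1 (positive) or -1 (negative).
    """
    sums = defaultdict(float)   # plays the role of k in Equation 1
    counts = defaultdict(int)   # plays the role of T in Equation 1
    for concept, domain, polarity in observations:
        sums[(concept, domain)] += polarity
        counts[(concept, domain)] += 1
    return {key: sums[key] / counts[key] for key in sums}


def triangular_membership(peak):
    """Triangle with vertices (-1, 0), (peak, 1) and (1, 0)."""
    def mu(v):
        if v < -1.0 or v > 1.0:
            return 0.0
        if v == peak:
            return 1.0
        if v < peak:
            return (v + 1.0) / (peak + 1.0)   # rising edge
        return (1.0 - v) / (1.0 - peak)       # falling edge
    return mu


# Made-up observations: "small" seen three times positively and once
# negatively in a hypothetical "electronics" domain.
sample = [
    ("small", "electronics", +1),
    ("small", "electronics", +1),
    ("small", "electronics", +1),
    ("small", "electronics", -1),
]
polarities = preliminary_polarity(sample)
mu_small = triangular_membership(polarities[("small", "electronics")])
```
]]></preformat>
          <p>With these observations the estimated polarity is 2 / 4 = 0.5, so the membership degree is 1 at 0.5 and decreases linearly to 0 at both ends of the interval.</p>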
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>Information Propagation Phase</title>
        <p>The Information Propagation (IP) phase exploits the explicit information learned
in the PL phase in order to (i) refine the fuzzy membership functions of the known
concepts and (ii) model such functions for concepts that do not occur in
the training set but are semantically related to those that do. Figure 1 shows
how the two-level graph evolves from before to after the execution of the IP phase. After
the PL phase, only four membership functions are modeled: C1 and C2 for the domain
D1, and C1 and C5 for the domain D2 (Figure 1a). However, as we may observe, the
Semantic Level contains concepts that are semantically related to the ones that were
explicitly defined in the training set, namely C3 and C4; there are also concepts
for which a fuzzy membership function has not been modeled for some domains (i.e.,
C2 for the domain D2 and C5 for the domain D1).</p>
        <p>Such fuzzy membership functions may be inferred by propagating the information
modeled in the PL phase. Similarly, existing fuzzy membership functions are refined by
the influence of the other ones. Let’s consider the polarity between the concept C3 and
the domain D2. The fuzzy membership function representing this polarity is strongly
influenced by the ones representing the polarities of concepts C1 and C5 with respect
to the domain D2.</p>
        <p>The propagation of the learned information through the graph is done iteratively:
in each iteration, the estimated polarity value of the concept x learned during
the PL phase is updated based on the learned values of the adjoining concepts. At each
iteration, the updated value is saved so that it can be exploited for re-shaping the fuzzy
membership function associating the concept x to the domain i.</p>
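        <p>The iterative propagation can be sketched as follows for a single domain. The text does not specify the update rule, so this sketch assumes one: each concept moves toward the mean of its neighbours' current values with a blending weight <monospace>alpha</monospace>, a concept without a learned value simply takes that mean, and iteration stops when the largest change falls below <monospace>tol</monospace>. The function name, its parameters, the toy graph, and the starting values are all hypothetical.</p>
        <preformat><![CDATA[
```python
def propagate(values, neighbours, alpha=0.5, tol=1e-4, max_iter=100):
    """Propagate polarity values over the Semantic Level for one domain.

    values:     concept -> polarity learned in the PL phase (None if unknown)
    neighbours: concept -> list of adjoining concepts (every concept is a key)
    Returns (final values, per-concept history of the values saved
    after each iteration).
    """
    current = dict(values)
    history = {concept: [] for concept in neighbours}
    for _ in range(max_iter):
        updated = {}
        for concept, adjacent in neighbours.items():
            known = [current[a] for a in adjacent if current.get(a) is not None]
            own = current.get(concept)
            if not known:
                updated[concept] = own                    # nothing to learn from
            elif own is None:
                updated[concept] = sum(known) / len(known)
            else:
                mean = sum(known) / len(known)
                updated[concept] = (1 - alpha) * own + alpha * mean
        largest_change = 0.0
        for concept, new in updated.items():
            if new is None:
                continue
            old = current.get(concept)
            history[concept].append(new)                  # saved for re-shaping
            change = 1.0 if old is None else abs(new - old)
            largest_change = max(largest_change, change)
        current.update(updated)
        if largest_change < tol:
            break
    return current, history


# Toy example with made-up values: C3 has no learned value for the domain
# and is connected to C1 and C5, which do.
start = {"C1": 0.6, "C5": 0.2, "C3": None}
edges = {"C1": ["C3"], "C3": ["C1", "C5"], "C5": ["C3"]}
final, history = propagate(start, edges)
```
]]></preformat>
        <p>The per-concept history returned by this sketch records the values saved at each iteration, which is the information the text says is later used to re-shape the membership functions.</p>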
        <p>The resulting shapes of the inferred fuzzy membership functions are trapezoids
in which the extension of the upper base is proportional to the difference between the
value learned during the PL phase (V<sub>pl</sub>) and the value obtained at the end of the IP
phase (V<sub>ip</sub>), while the support is proportional to both the number of iterations needed
by the concept x to converge to V<sub>ip</sub> and the variance with respect to the average of
the values computed after each iteration of the IP phase.</p>
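        <p>A trapezoidal membership function of the kind described above can be sketched as follows. The proportionality constants are not given in the text, so this sketch makes explicit assumptions: the upper base spans exactly the interval between the PL value and the IP value, and the support extends beyond it on each side by a margin equal to <monospace>k</monospace> times the number of iterations times the variance of the per-iteration values. The function names, the constant <monospace>k</monospace>, and the sample numbers are hypothetical.</p>
        <preformat><![CDATA[
```python
from statistics import pvariance


def trapezoid_params(v_pl, history, k=1.0):
    """Return the four corners (a, b, c, d) of a trapezoid on [-1, 1].

    v_pl:    value learned in the PL phase
    history: values saved after each IP iteration (last one is the IP value)
    k:       assumed proportionality constant for the support margin
    """
    v_ip = history[-1]
    b, c = min(v_pl, v_ip), max(v_pl, v_ip)         # upper base
    margin = k * len(history) * pvariance(history)  # iterations x variance
    a = max(-1.0, b - margin)                       # support, clipped
    d = min(1.0, c + margin)
    return a, b, c, d


def trapezoid_membership(a, b, c, d):
    """Membership function that is 1 on [b, c] and 0 outside [a, d]."""
    def mu(v):
        if v < a or v > d:
            return 0.0
        if b <= v <= c:
            return 1.0
        if v < b:
            return (v - a) / (b - a)   # rising edge
        return (d - v) / (d - c)       # falling edge
    return mu


# Made-up example: PL value 0.6 and four saved IP iterations ending at 0.4.
a, b, c, d = trapezoid_params(0.6, [0.5, 0.45, 0.425, 0.4], k=10.0)
mu = trapezoid_membership(a, b, c, d)
```
]]></preformat>
        <p>Under these assumptions, a concept whose value barely moves during propagation keeps a narrow upper base, while a concept that needs many iterations or oscillates widely receives a wider support.</p>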
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Concluding Remarks</title>
      <p>The system has been validated on the full version of the Blitzer dataset<sup>4</sup> and the results,
compared with the precision obtained by three baselines, are shown in Table 1.
0.8068
0.8227</p>
      <p>The results demonstrate that the modeled fuzzy membership functions can be
exploited effectively for computing the polarities of concepts used in different domains.</p>
      <p>4 Detailed results and tool demo are available at http://dkmtools.fbk.eu/moki/demo/mdfsa/mdfsa demo.html</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Pang</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Vaithyanathan</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Thumbs up? sentiment classification using machine learning techniques</article-title>
          .
          <source>In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP)</source>
          , Philadelphia, Association for Computational Linguistics (
          <year>July 2002</year>
          )
          <fpage>79</fpage>
          -
          <lpage>86</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Blitzer</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dredze</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pereira</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification</article-title>
          . In Carroll, J.A., van den Bosch, A.,
          <string-name>
            <surname>Zaenen</surname>
          </string-name>
          , A., eds.: ACL, The Association for Computational Linguistics (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bollegala</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weir</surname>
            ,
            <given-names>D.J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Carroll</surname>
            ,
            <given-names>J.A.</given-names>
          </string-name>
          :
          <article-title>Cross-domain sentiment classification using a sentiment sensitive thesaurus</article-title>
          .
          <source>IEEE Trans. Knowl. Data Eng</source>
          .
          <volume>25</volume>
          (
          <issue>8</issue>
          ) (
          <year>2013</year>
          )
          <fpage>1719</fpage>
          -
          <lpage>1731</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Xia</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Zong</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hu</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cambria</surname>
          </string-name>
          , E.:
          <article-title>Feature ensemble plus sample selection: Domain adaptation for sentiment classification</article-title>
          .
          <source>IEEE Int. Systems</source>
          <volume>28</volume>
          (
          <issue>3</issue>
          ) (
          <year>2013</year>
          )
          <fpage>10</fpage>
          -
          <lpage>18</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Ponomareva</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Thelwall</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Semi-supervised vs. cross-domain graphs for sentiment analysis</article-title>
          . In Angelova, G.,
          <string-name>
            <surname>Bontcheva</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mitkov</surname>
          </string-name>
          , R., eds.: RANLP, RANLP 2011 Organising Committee / ACL (
          <year>2013</year>
          )
          <fpage>571</fpage>
          -
          <lpage>578</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Tsai</surname>
            ,
            <given-names>A.C.R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Wu</surname>
            ,
            <given-names>C.E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Tsai</surname>
          </string-name>
          , R.T.H., jen Hsu, J.Y.:
          <article-title>Building a concept-level sentiment dictionary based on commonsense knowledge</article-title>
          .
          <source>IEEE Int. Systems</source>
          <volume>28</volume>
          (
          <issue>2</issue>
          ) (
          <year>2013</year>
          )
          <fpage>22</fpage>
          -
          <lpage>30</lpage>
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>C.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lin</surname>
            ,
            <given-names>C.J.</given-names>
          </string-name>
          :
          <article-title>LIBSVM: A library for support vector machines</article-title>
          .
          <source>ACM TIST</source>
          <volume>2</volume>
          (
          <issue>3</issue>
          ) (
          <year>2011</year>
          )
          <fpage>27</fpage>
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>McCallum</surname>
            ,
            <given-names>A.K.</given-names>
          </string-name>
          :
          <article-title>Mallet: A machine learning for language toolkit</article-title>
          . http://mallet.cs.umass.edu (
          <year>2002</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>