<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>BroDyn</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Contradiction in Reviews: is it Strong or Low?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ismail Badache</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sebastien Fournier</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Adrian-Gabriel Chifu</string-name>
          <email>adrian.chifu@lis-lab.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>LIS UMR 7020 CNRS, Aix-Marseille University</institution>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2018</year>
      </pub-date>
      <volume>1</volume>
      <fpage>11</fpage>
      <lpage>22</lpage>
      <abstract>
        <p>The analysis of user-generated opinions (reviews) is increasingly exploited by a variety of applications. It makes it possible to follow the evolution of opinions or to carry out investigations on web resources (e.g. courses, movies, products). Detecting contradictory opinions is an important task for evaluating such resources. This paper focuses on the problem of detecting contradictions and estimating their intensity, based on sentiment analysis around specific aspects of a resource. Firstly, certain aspects are identified according to the distribution of emotional terms in the vicinity of the most frequent nouns in the whole set of reviews. Secondly, the polarity of each review segment containing an aspect is estimated using the state-of-the-art approach SentiNeuron. Then, only the resources containing these aspects with opposite polarities (positive, negative) are considered. Thirdly, a measure of contradiction intensity is introduced. It is based on the joint dispersion of the polarity and the rating of the reviews containing the aspects within each resource. The proposed approach is evaluated on a Massive Open Online Courses collection containing 2244 courses and their 73,873 reviews, collected from Coursera. The results reveal the effectiveness of the proposed approach in detecting and quantifying contradictions.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Nowadays, web 2.0 has become a participatory platform where people can
express their opinions by leaving traces (e.g. review, rating, like) on web resources.
The social web (e.g. social networks) allows the generation of these traces. They
represent a rich source of social information, which can be analysed and exploited
in various applications [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. For example, opinion mining or sentiment
analysis [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] can reveal a customer's attitude towards a product or its characteristics,
or people's reaction to an event. Such problems require rigorous
analysis of the aspects covered by the sentiment to produce a representative
and targeted result. Another issue concerns the diversity of opinions on a given
topic. For example, Wang and Cardie [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] aim to identify the sentiments of a
sentence expressed during a discussion and use them as features in a
classifier that predicts dispute in discussions. Qiu et al. [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] automatically identify
debates between users from textual content (interactions) in forums, based on
latent variable models. There are other studies in the analysis of user
interactions, for example, extracting the agreement and disagreement expressions [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]
and deducing the user relations by looking at their textual exchanges [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>This paper investigates the entities (e.g. aspects, topics) for which
contradictions can occur in the reviews associated with a web resource (e.g. movies,
courses) and how to estimate their intensity. The interest of estimating
contradiction intensity depends on the application framework. Consider, for example,
controversial political events or crises such as the United States' recognition of Jerusalem as
the capital of Israel. This generated contradictory (diverse) opinions (reviews)
in social networks between different communities around the world. Estimating
the intensity of this conflict may be useful for better analyzing the trend and
the consequences of this political decision. In social information retrieval, for
some users' information needs, measuring contradiction intensity can be useful
to retrieve and rank the most controversial documents (e.g. news, events, etc.).
In our case, knowing the intensity of conflicting opinions on a specific aspect
(e.g. speaker, slide, quiz) of an online course can help determine whether there are
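certain elements of this course that need to be improved. Table 1 presents an
instance of contradictory reviews about the "speaker" of a given Coursera course.</p>
      <table-wrap>
        <label>Table 1</label>
        <caption>
          <p>Example of contradictory opinions about the "speaker" of a Coursera course</p>
        </caption>
        <table>
          <thead>
            <tr><th>Resource</th><th>Review (left)</th><th>Aspect</th><th>Review (right)</th><th>Polarity</th><th>Rating</th></tr>
          </thead>
          <tbody>
            <tr><td>Course<sup>1</sup></td><td>The lecturer was an annoying</td><td>speaker</td><td>and very repetitive.</td><td>-0.9</td><td>1</td></tr>
            <tr><td></td><td>Passionate</td><td>speaker</td><td>and truly amazing things to learn</td><td>+0.7</td><td>4</td></tr>
          </tbody>
        </table>
      </table-wrap>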
      <p>Therefore, measuring the intensity of contradiction provides a more nuanced
understanding of the diversity (dispersion) of opinions around a specific aspect.
Our approach relies on several fundamental tasks. First, the aspects
characterising the reviews are automatically identified. Second, opposing
opinions around each of these aspects are captured through a sentiment analysis
model. Third, the intensity of contradiction in the reviews is estimated using
a measure of dispersion based on the ratings and polarities of the reviews containing an
aspect. Finally, user study experiments were conducted to evaluate the
effectiveness of our approach, using a dataset collected from coursera.org. The main
contributions of this paper are twofold:
(C1). A contradiction in the reviews related to a web resource means contradictory
opinions expressed about a specific aspect, which is a form of diversity of
sentiments around the aspect for the same resource. In addition to detecting
the contradiction, it is desirable to estimate its intensity. Therefore, we try to
answer the following research questions in this paper:
- RQ1: How to estimate the intensity of contradiction?
- RQ2: What is the impact of jointly considering the polarity and the
rating of the reviews on the measurement of contradiction intensity?
(C2). The construction of a data collection from coursera.org, which is
useful for evaluating contradiction intensity measurement systems. Our
experimental evaluation is based on a user study.</p>
      <p>The rest of this paper is structured as follows: Section 2 presents related work
and background. Section 3 details our approach for detecting contradiction and
estimating the intensity. Section 4 reports the results of our experiments. Section
5 concludes this paper and outlines future work.</p>
      <sec id="sec-1-1">
        <title>1 https://www.coursera.org/learn/dog-emotion-and-cognition</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Background and Related Work</title>
      <p>Contradiction detection is a complex process that requires the use of several
state-of-the-art methods (aspect detection, sentiment analysis). Moreover, to the
best of our knowledge, very few studies address the detection and the measurement
of the intensity of contradiction. This section briefly presents some approaches
to detecting controversies that are close to our work, and then presents the approaches
related to aspect detection and sentiment analysis, which are useful
for introducing our approach.</p>
      <sec id="sec-2-1">
        <title>Contradiction and Controversy Detection</title>
        <p>
          The studies that are most related to our approach include [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ], [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ], [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ] and [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ],
which attempt to detect contradiction in text. There are two main approaches,
where contradictions are defined as a form of textual inference (e.g. entailment
identification) and analyzed using linguistic technologies. Harabagiu et al. [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]
proposed an approach for contradiction analysis that exploits linguistic features
(e.g. types of verbs), as well as semantic information, such as negation (e.g. "I
love you - I do not love you") or antonymy (words that have opposite meanings,
e.g. "hot-cold" or "light-dark"). Their work defined contradictions as textual
entailment, when two sentences express mutually exclusive information on the
same topic. Further improving the work in this direction, De Marneffe et al.
[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] introduced a classification of contradictions consisting of 7 types that are
distinguished by the features that contribute to a contradiction, e.g. antonymy,
negation, or numeric mismatches, which may be caused by erroneous data: "there
are 7 wonders of the world - the number of wonders of the world are 9". They
defined a contradiction as a situation where two sentences are extremely unlikely
to be true when considered together. Tsytsarau et al. [
          <xref ref-type="bibr" rid="ref28">28</xref>
          ], [
          <xref ref-type="bibr" rid="ref29">29</xref>
          ] proposed an
automatic and scalable solution for the contradiction detection problem. They
studied the contradiction problem using sentiment analysis. The intuition of
their approach is that when the aggregated value for sentiments (on
a specific topic and time interval) is close to zero while the sentiment diversity
is high, the contradiction should be high.
        </p>
        <p>
          Another theme related to our work concerns the detection of controversies and
disputes. In the literature, the detection of controversies has been addressed
by supervised methods as in [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ], [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] and [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ] or by unsupervised methods as in
[
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] and [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. To detect controversial events on Twitter (e.g., David
Copperfield's charge of rape between 2007 and 2010)<sup>2</sup>, Popescu and Pennacchiotti
[
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] proposed a decision-tree classifier and a set of features such as discourse
parts, the presence of words from opinion or controversial lexicons, and user
interactions (retweet and reply). Balasubramanyan et al. [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] extended the
supervised LDA model to predict how members of different political communities
will emotionally respond to the same news story.
2 http://www.foxnews.com/story/2009/08/20/magician-david-copperfield-accusedraping-woman-on-private-island.html
Support vector classifiers and
logistic regression classifiers have also been proposed in [
          <xref ref-type="bibr" rid="ref32">32</xref>
          ] and [
          <xref ref-type="bibr" rid="ref31">31</xref>
          ] to detect
disputes in Wikipedia page discussions, for example in the comments
that surround the modifications of Wikipedia pages.
        </p>
        <p>
          Other works have also exploited Wikipedia to detect and to identify
controversial topics on the web [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ], [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] and [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. Dori-Hacohen and Allan in [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ]
and Jang and Allan in [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ] proposed to align web pages to Wikipedia pages on
the assumption that a page deals with a controversial topic if the Wikipedia page
describing this topic is itself controversial. The controversial or non-controversial
nature of a Wikipedia page is automatically detected based on the metadata and
discussions associated with the page. Jang et al. [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ] constructed a language model of controversial
topics learned from Wikipedia articles, which is then used to identify
whether a web page is controversial.
        </p>
        <p>
          Detection of controversies in social networks has also been addressed without
supervision, based on interactions between different users [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. Garimella et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]
proposed alternative measurement approaches based on the network, such as
random walks, betweenness centrality and low-dimensional
embeddings. The authors tested simple content-based methods and noted their
inefficiency compared to user graph-based methods. Other studies try to detect
controversies in specific domains, for example in news [
          <xref ref-type="bibr" rid="ref27">27</xref>
          ] or in debate analysis
[
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]. However, to the best of our knowledge, none of the state-of-the-art works
attempt to estimate, explicitly and concretely, the intensity of the contradiction
or controversy. In this paper, unlike previous work, rather than only identifying
controversy in a single hand-picked topic (e.g., an aspect related to political news),
we also focus on estimating the intensity of contradictory opinions around
specific topics. We propose to measure the intensity of contradiction using some
characteristics of the opinion (e.g. rating, polarity).
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>Aspect Detection</title>
        <p>
          The first attempts to detect aspects were based on the classical information
extraction approach using frequent noun phrases [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ]. Such approaches
work well for the detection of aspects that are in the form of a single noun, but
are less useful when the aspects have low frequency. Other studies use
Conditional Random Fields (CRF) or Hidden Markov Models (HMM) [
          <xref ref-type="bibr" rid="ref9">9</xref>
          ]. Other
methods are unsupervised and have proven their effectiveness, such as [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] that
built a Multi-Grain Topic Model and [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] that proposed HASM (unsupervised
Hierarchical Aspect Sentiment Model), which discovers a hierarchical
structure of aspect-based sentiments in unlabelled online reviews.
In our work, the explicit aspects are extracted using the unsupervised method
presented in [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ]. This method, based on the use of extraction rules for product
reviews, suits our experimental data (Coursera).
        </p>
      </sec>
      <sec id="sec-2-3">
        <title>Sentiment Analysis</title>
        <p>
          Sentiment analysis has been the subject of much previous research. As in the case
of aspect detection, both supervised and unsupervised approaches have been
proposed. Some unsupervised approaches are based on lexicons, such
as the approach developed by [
          <xref ref-type="bibr" rid="ref30">30</xref>
          ], or corpus-based methods, such as in [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. Pang
et al. [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ] proposed supervised approaches that treat sentiment
analysis as a classification task and therefore use methods such as SVM (Support
Vector Machines) or Bayesian networks. Other recent studies are based on RNN
(Recursive Neural Network), such as in [
          <xref ref-type="bibr" rid="ref24">24</xref>
          ]. In our work, sentiment analysis is
only one part of the contradiction detection process; we were inspired by [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ], using a
Bayesian classifier as baseline. Naive Bayes is a probabilistic model that gives
good results in the classification of sentiments and generally takes less time for
training compared to models like SVM or RNN.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Intensity of Contradiction</title>
      <p>Our approach is based on both the automatic detection of aspects within reviews
and the sentiment analysis of these aspects. In addition to
contradiction detection, our goal is also to estimate the intensity of these contradictions.
To measure the intensity of contradictory opinions, two dimensions are jointly
exploited: the polarity around the aspect and the rating associated with the
review. The dimensions associated with the contradictory opinions (called
reviews-aspect in this paper) are represented using a dispersion function (see figure 1).
3. Selection of terms having a nominal category (NN, NNS)<sup>4</sup>,
4. Selection of nouns with emotional terms in their five-word neighborhoods (using
the SentiWordNet<sup>5</sup> dictionary),
5. Extraction of the most frequent (used) terms in the corpus among those
selected in the previous step. These terms will be considered as aspects.</p>
      <table-wrap>
        <table>
          <thead>
            <tr><th>Step</th><th>Description</th></tr>
          </thead>
          <tbody>
            <tr><td>(1)</td><td>course : 44219, material : 3286, assignments : 3118, content : 2947, speaker : 2705, ..., term<sub>i</sub></td></tr>
            <tr><td>(2)</td><td>re = The/DT lecturer/NN was/VBD an/DT annoying/VBG speaker/NN and/CC very/RB repetitive/JJ ./. I/PRP found/VBD the/DT formatting/NN so/RB different/JJ from/IN other/JJ courses/NNS I/PRP 've/VBP taken/VBN ,/, that/IN it/PRP was/VBD hard/JJ to/TO get/VB started/VBN and/CC figure/VB things/NNS out/RP ./.</td></tr>
            <tr><td>(3)</td><td>lecturer, speaker, formatting, things</td></tr>
            <tr><td>(4)</td><td>lecturer, speaker</td></tr>
            <tr><td>(5)</td><td>speaker</td></tr>
          </tbody>
        </table>
      </table-wrap>
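      <p>Steps 3 to 5 above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: it assumes the review is already POS-tagged with Penn Treebank tags, it replaces SentiWordNet with a tiny hand-made lexicon, and the function names nouns_near_emotion and frequent_aspects as well as the top_k cut-off are hypothetical.</p>
      <preformat><![CDATA[
```python
from collections import Counter


def nouns_near_emotion(tagged, lexicon, window=5):
    """Steps 3-4: keep nouns (NN, NNS) that have an emotional term
    within `window` tokens before or after them."""
    hits = []
    for i, (word, tag) in enumerate(tagged):
        if tag not in ("NN", "NNS"):
            continue
        start = max(0, i - window)
        context = tagged[start:i] + tagged[i + 1:i + 1 + window]
        if any(w.lower() in lexicon for w, _ in context):
            hits.append(word.lower())
    return hits


def frequent_aspects(candidates, corpus_freq, top_k):
    """Step 5: keep candidates that are among the top_k most frequent
    corpus terms (top_k is an assumed cut-off, not given in the text)."""
    top_terms = {term for term, _ in Counter(corpus_freq).most_common(top_k)}
    return [c for c in candidates if c in top_terms]


# Toy inputs: first sentence of the example review, already tagged.
tagged_review = [
    ("The", "DT"), ("lecturer", "NN"), ("was", "VBD"), ("an", "DT"),
    ("annoying", "VBG"), ("speaker", "NN"), ("and", "CC"),
    ("very", "RB"), ("repetitive", "JJ"), (".", "."),
]
toy_lexicon = {"annoying", "repetitive"}  # stand-in for SentiWordNet
corpus_freq = {"course": 44219, "material": 3286, "assignments": 3118,
               "content": 2947, "speaker": 2705}

candidates = nouns_near_emotion(tagged_review, toy_lexicon)
aspects = frequent_aspects(candidates, corpus_freq, top_k=5)
```
]]></preformat>
      <p>On this toy input the candidate nouns are "lecturer" and "speaker", and only "speaker" survives the frequency filter, in line with rows (4) and (5) of the table above.</p>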
      <p>
        Once the list of aspects is defined, the sentiment polarity around these aspects
must be estimated. The following section presents sentiment analysis models.
Sentiment Analysis. The sentiment of the review on an aspect (review-aspect) is
estimated using two approaches. The first is the Naive Bayes algorithm [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], which handles:
a) Negation (a word preceded by no, not, n't). The negative forms are balanced against
the normal forms of the same words during training. This
ensures that the number of "not" forms is sufficient for the classification;
b) Combinations (bigrams and trigrams) of adjectives with other words such
as adverbs, e.g. "very bad" and "absolutely recommended". The second is the unsupervised
SentiNeuron<sup>6</sup> model proposed by Radford et al. [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] to detect sentiment signals
in reviews. The model consists of a single-layer multiplicative long short-term
memory (mLSTM) cell; when trained for sentiment analysis, it achieved state-of-the-art
results on the movie review dataset<sup>7</sup>. The authors also found a unit in the mLSTM
that directly corresponds to the sentiment of the output. SentiNeuron provides
very good results compared to several state-of-the-art models, especially
on IMDb reviews as well as in our case (Coursera reviews).
Definition. There is a contradiction between two review-aspect portions ra<sub>1</sub>
and ra<sub>2</sub> containing an aspect, where ra<sub>1</sub>, ra<sub>2</sub> ∈ D (Document), when the
opinions (polarities) around the aspect are opposite (i.e. pol(ra<sub>1</sub>) ∩ pol(ra<sub>2</sub>) = ∅).
We note that, after several empirical experiments, the review-aspect ra is defined
as an excerpt of 5 words before and after the aspect in review re.
      </p>
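      <p>The definition above can be illustrated with a short sketch. It assumes whitespace-tokenised reviews and signed polarities in [-1, 1] (as in Table 1); the helper names review_aspect and is_contradictory are hypothetical and the polarity values would come from a sentiment model that is not shown here.</p>
      <preformat><![CDATA[
```python
def review_aspect(tokens, aspect, window=5):
    """Return the excerpt of `window` tokens before and after the first
    occurrence of `aspect`, or None when the aspect is absent."""
    lowered = [t.lower() for t in tokens]
    if aspect not in lowered:
        return None
    i = lowered.index(aspect)
    return tokens[max(0, i - window):i + window + 1]


def is_contradictory(polarities):
    """True when the review-aspects of one resource carry both a
    positive and a negative polarity (opposite opinions)."""
    has_positive = any(p > 0 for p in polarities)
    has_negative = any(p < 0 for p in polarities)
    return has_positive and has_negative


tokens = "The lecturer was an annoying speaker and very repetitive .".split()
excerpt = review_aspect(tokens, "speaker", window=2)

# Polarities of the two review-aspects of Table 1.
contradiction = is_contradictory([-0.9, 0.7])
```
]]></preformat>
      <p>Only resources for which is_contradictory returns True would be passed on to the intensity estimation.</p>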
      <p>Contradiction intensity is estimated using 2 dimensions: the polarity pol<sub>i</sub> and the
rating rat<sub>i</sub> of the review-aspect ra<sub>i</sub>. Let each ra<sub>i</sub> be a point on the plane with
coordinates (pol<sub>i</sub>, rat<sub>i</sub>). We assume that the greater the distance (i.e. dispersion)
between these values for the review-aspects ra<sub>i</sub> of the same document</p>
      <sec id="sec-3-1">
        <title>4 https://cs.nyu.edu/grishman/jet/guide/PennPOS.html</title>
      </sec>
      <sec id="sec-3-2">
        <title>5 http://sentiwordnet.isti.cnr.it/</title>
      </sec>
      <sec id="sec-3-3">
        <title>6 https://github.com/openai/generating-reviews-discovering-sentiment</title>
      </sec>
      <sec id="sec-3-4">
        <title>7 https://www.cs.cornell.edu/people/pabo/movie-review-data/</title>
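        <p>D, the higher the contradiction intensity. The dispersion indicator with
respect to the centroid ra<sub>centroid</sub> with coordinates <inline-formula><tex-math><![CDATA[(\overline{pol}, \overline{rat})]]></tex-math></inline-formula> is as follows:</p>
        <disp-formula>
          <label>(1)</label>
          <tex-math><![CDATA[\mathrm{Disp}(ra_{pol_i}^{rat_i}, D) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{Distance}(pol_i, rat_i)]]></tex-math>
        </disp-formula>
        <disp-formula>
          <label>(2)</label>
          <tex-math><![CDATA[\mathrm{Distance}(pol_i, rat_i) = \sqrt{(pol_i - \overline{pol})^2 + (rat_i - \overline{rat})^2}]]></tex-math>
        </disp-formula>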
        <p>Distance(pol<sub>i</sub>, rat<sub>i</sub>) represents the distance between the point ra<sub>i</sub> of the
scatter plot and the centroid ra<sub>centroid</sub>, and n is the number of points ra<sub>i</sub>. Since the two quantities
pol<sub>i</sub> and rat<sub>i</sub> have different scales, it is essential to normalize them. The polarity
pol<sub>i</sub> is a probability, whereas the values of the ratings rat<sub>i</sub> can be normalized as
follows: rat<sub>i</sub> = (rat<sub>i</sub> - 3) / 2, so that rat<sub>i</sub> ∈ [-1, 1]. The indicator Disp represents
the divergence of the points ra<sub>i</sub> with respect to the centroid ra<sub>centroid</sub>.
- Disp is positive or zero; Disp = 0 means that all ra<sub>i</sub> are merged into
ra<sub>centroid</sub> (no dispersion).
- Disp increases when the points ra<sub>i</sub> move away from ra<sub>centroid</sub> (i.e. when the dispersion
increases).</p>
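        <p>The coordinates <inline-formula><tex-math><![CDATA[(\overline{pol}, \overline{rat})]]></tex-math></inline-formula> of the centroid ra<sub>centroid</sub> can be calculated in two
different ways. A simple way is to calculate the average of the points ra<sub>i</sub>; in this
case the centroid ra<sub>centroid</sub> corresponds to the average point of the coordinates
ra<sub>i</sub>(pol<sub>i</sub>, rat<sub>i</sub>). A finer way is to weight this average by the absolute
difference between the two coordinate values (polarity and rating).
a) Centroid based on average of dimensions. In this case, the coordinates
of the centroid ra<sub>centroid</sub> are computed as the average of polarities and
ratings as follows:</p>
        <disp-formula>
          <label>(3)</label>
          <tex-math><![CDATA[\overline{pol} = \frac{pol_1 + pol_2 + \dots + pol_n}{n}, \qquad \overline{rat} = \frac{rat_1 + rat_2 + \dots + rat_n}{n}]]></tex-math>
        </disp-formula>
        <p>b) Centroid based on weighted average of dimensions. In this case, the
coordinates of the centroid ra<sub>centroid</sub> are computed as the weighted average of
polarities and ratings as follows:</p>
        <disp-formula>
          <label>(4)</label>
          <tex-math><![CDATA[\overline{pol} = \frac{c_1 \, pol_1 + c_2 \, pol_2 + \dots + c_n \, pol_n}{n}, \qquad \overline{rat} = \frac{c_1 \, rat_1 + c_2 \, rat_2 + \dots + c_n \, rat_n}{n}]]></tex-math>
        </disp-formula>
        <p>where n is the number of points ra<sub>i</sub>. The coefficient c<sub>i</sub> is computed as follows:</p>
        <disp-formula>
          <label>(5)</label>
          <tex-math><![CDATA[c_i = \frac{|rat_i - pol_i|}{2n}]]></tex-math>
        </disp-formula>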
        <p>In this two-dimensional vector representation, our hypothesis is that a point
in this space is more important when the values of its two dimensions are further
apart. We believe that a negative aspect in a review with a high rating has
more weight, and vice versa. Consequently, a coefficient of importance is calculated for each
point in space. This coefficient is based on the absolute difference
between the values of the two dimensions. The division by 2n
represents a normalisation by the maximum value of the absolute difference
(max(|rat<sub>i</sub> - pol<sub>i</sub>|) = 2) and by n. For example, for a polarity of -1 and a rating
of 1, the coefficient is 1/n (|-1 - 1|/2n = 2/2n = 1/n), and for a polarity of 1
and a rating of 1, the coefficient is 0 (|1 - 1|/2n = 0).</p>
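        <p>The dispersion measure and the two centroid variants can be sketched as below. This is a minimal sketch under stated assumptions: ratings are on a 1 to 5 scale and are normalised as (rating - 3) / 2, polarities are already in [-1, 1], and all function names are hypothetical. The weighted centroid follows the formulas as written in the text, including the division by n.</p>
        <preformat><![CDATA[
```python
import math


def normalise_rating(rating):
    """Map a 1..5 rating to [-1, 1]."""
    return (rating - 3) / 2


def coefficient(pol, rat, n):
    """Importance coefficient c_i = |rat_i - pol_i| / (2n)."""
    return abs(rat - pol) / (2 * n)


def averaged_centroid(points):
    """Centroid as the plain average of polarities and ratings."""
    n = len(points)
    return (sum(p for p, _ in points) / n, sum(r for _, r in points) / n)


def weighted_centroid(points):
    """Centroid where each point is weighted by its coefficient c_i."""
    n = len(points)
    pol = sum(coefficient(p, r, n) * p for p, r in points) / n
    rat = sum(coefficient(p, r, n) * r for p, r in points) / n
    return (pol, rat)


def dispersion(points, centroid):
    """Mean Euclidean distance of the points to the centroid."""
    c_pol, c_rat = centroid
    total = sum(math.hypot(p - c_pol, r - c_rat) for p, r in points)
    return total / len(points)


# The two review-aspects of Table 1 as (polarity, normalised rating).
points = [(-0.9, normalise_rating(1)), (0.7, normalise_rating(4))]
disp_avg = dispersion(points, averaged_centroid(points))
disp_weighted = dispersion(points, weighted_centroid(points))
```
]]></preformat>
        <p>Passing a different centroid function is the only change needed to switch between the two configurations, which keeps the dispersion computation itself identical in both cases.</p>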
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Experimental Evaluation</title>
      <p>In order to validate our approach, experiments were carried out on reviews
collected from coursera.org. Our main objective in these experiments is
to evaluate the impact of considering sentiment analysis and the rating on
contradiction detection in the reviews around certain specific aspects
identified automatically, as well as the impact of the averaged and weighted
centroids on the contradiction intensity estimation.</p>
      <sec id="sec-4-1">
        <title>Description of Test Dataset</title>
        <p>DATA. To the best of our knowledge, there is no standard data set for evaluating
contradiction intensity. Therefore, 73,873 reviews and their ratings for 2244
English courses were extracted from Coursera via its API<sup>8</sup> and web page parsing.
More details about the statistics of our Coursera dataset are presented in table
3. Our full test dataset and its detailed statistics are publicly available<sup>9</sup>. Table
5 presents some statistics on 4 aspects among the 22 useful aspects, listed in table 4,
captured automatically from the reviews.
User Study. To obtain contradiction and sentiment judgements for a given
aspect, we conducted a user study as follows:
(a) 3 users were asked to assess the sentiment class of each review-aspect
provided by our system (see section 3.1). The users had to judge only its polarity;</p>
        <sec id="sec-4-1-1">
          <title>8 https://building.coursera.org/app-platform/catalog</title>
          <p>9 https://www.irit.fr/~Ismail.Badache/#projects
(b) 3 other users assessed the degree of contradiction between these
reviews-aspect, as shown in figure 2.</p>
          <p>On average, 6 reviews-aspect per course were judged manually for each aspect
(in total: 1320 reviews-aspect of 220 courses, i.e. 10 courses for each aspect). To
evaluate sentiments and contradictions in the reviews-aspect of each course, a
3-point scale was used for sentiments (Negative, Neutral, Positive) and a 5-point
scale for contradictions (Not Contradictory, Very Low, Low, Strong and Very
Strong; see figure 2). We computed the degree of agreement between assessors for
each aspect using Cohen's Kappa measure k. Since we have 3 assessors, the Kappa
value was calculated for each pair of assessors and then averaged.
The average k is 0.76 for sentiment assessors and 0.68 for contradiction
assessors, which corresponds to substantial agreement.
A correlation study was conducted (correlation is one of the official measures of SemEval tasks<sup>10</sup>),
using the Pearson coefficient, between the contradiction judgements given
by the assessors and our obtained results. In addition, the precision was
computed for each configuration. The configuration that uses the Naive Bayes-based
sentiment analyser is considered as the baseline in these experiments.
Remarks: First, the Naive Bayes sentiment analyser takes as a training set
50,000 reviews of IMDb movies<sup>11</sup> (due to the similarity of the vocabulary used
in the reviews on IMDb and Coursera), and as a test set our reviews-aspect from
Coursera. Second, this sentiment analysis system provides an accuracy of 79%.
Third, assessors' judgements on sentiments are considered as perfect (reference)
results and represent an accuracy of 100%.</p>
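          <p>The two evaluation statistics described above, the Kappa averaged over assessor pairs and the Pearson correlation, can be sketched in plain Python as follows. The function names and the toy labels are hypothetical; the real judgements are not reproduced here.</p>
          <preformat><![CDATA[
```python
import math
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa between two assessors' label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[lab] * count_b[lab] for lab in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both assessors used one identical label throughout
    return (observed - expected) / (1 - expected)


def mean_pairwise_kappa(labels_by_assessor):
    """Average of Cohen's kappa over every pair of assessors."""
    pairs = list(combinations(labels_by_assessor, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def pearson(xs, ys):
    """Pearson correlation coefficient (undefined for constant input)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Toy labels from three hypothetical assessors on four review-aspects.
assessors = [
    ["pos", "neg", "pos", "neg"],
    ["pos", "neg", "pos", "neg"],
    ["pos", "neg", "pos", "neg"],
]
agreement = mean_pairwise_kappa(assessors)
```
]]></preformat>
          <p>In the setting of this paper, pearson would be applied to the assessors' contradiction judgements mapped to numbers and to the dispersion values produced by the system; the mapping itself is not specified here.</p>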
          <p>
            In order to check the significance of the results compared to the baseline, we
conducted the Student's t-test [
            <xref ref-type="bibr" rid="ref25">25</xref>
            ]. We attached * (strong significance against
Baseline) and ** (very strong significance against Baseline) to the performance
number of each row in the tables when p-value &lt; 0.05 and p-value &lt; 0.01,
respectively. We discuss in the following the results of each configuration
we investigated (see table 6).
10 http://alt.qcri.org/semeval2016/task7/
11 http://ai.stanford.edu/~amaas/data/sentiment/
Config (1): Averaged Centroid. The results show that the dispersion
measurement based on the averaged centroid provides a positive correlation with
the judgements (Pearson: 0.45, 0.61, 0.68). Indeed, the more opposite the polarities between the
reviews-aspect, the more the set of reviews-aspect diverges from
the centroid, hence the increased dispersion intensity. In addition, the results
obtained using SentiNeuron (table 6 (a)) and the users' sentiment judgements
(table 6 (b)) surpass those of the baseline by
approximately 35% for (a) (Pearson: 0.45 vs 0.61) and 50% for (b)
(Pearson: 0.45 vs 0.68). In terms of precision, compared to the baseline, we record
an improvement rate of 23% for (a) when SentiNeuron is used, and 34% for (b)
when the users' sentiment judgements are used in the estimation of
contradiction intensity. Therefore, losing 21% in sentiment accuracy (100% - 79%) involves a 34%
loss in precision.
          </p>
          <p>Config (2): Weighted Centroid. The results of configuration (2) are also
positive (Pearson: 0.51, 0.80, 0.87). The results obtained by considering the
importance coefficient c<sub>i</sub> for each point of the space (review-aspect ra<sub>i</sub>) are better
than those obtained when this coefficient is ignored. The improvements
in terms of Pearson correlation are 13% using the Naive Bayes-based sentiment
model (table 6 (Baseline)), 31% using SentiNeuron (table 6 (a)), and 28%
using manual sentiment judgements (table 6 (b)). Indeed, the more divergent
the values of rating and polarity for a review-aspect, the higher the impact on
contradiction intensity. Also, the results in terms of precision and correlation for
configuration (2) presented in table 6 (b) are better (Precision: 0.91) than
(Baseline) (Precision: 0.70) and (a) when SentiNeuron is used (Precision: 0.88).
Therefore, the sentiment model is an important factor that impacts the estimation
of contradictions.</p>
          <p>Finally, table 7 shows the distribution of contradictions according to their
level (Very Low, Low, Strong or Very Strong) as well as the number of detected
and undetected contradictions for each configuration and for both systems (a)
and (b). These results also show that the best results are obtained
by configuration (2), which takes into account the weighted centroid. While we
were pleasantly surprised by the efficacy of our approach, we did not use the
best state-of-the-art sentiment analysis and aspect detection models. We
believe that improving these pre-processing models would enhance our contradiction
detection model significantly.</p>
          <p>This paper introduced an approach that aims at estimating contradiction
intensity, drawing attention to aspects on which users have contradictory reviews.
A contradiction exists if the sentiments around these reviews-aspect for the same
resource are diverse. Additionally, to quantify the contradiction, reviews-aspect
are exploited using a dispersion function, where the more opposite the polarity
and rating dimensions are, the greater the impact on the contradiction
intensity. The experiments conducted on the Coursera data set reveal the
effectiveness of our approach. Moreover, our dataset can be useful for the community.</p>
          <p>A potential limitation of our approach is its dependency on the quality of
the sentiment and aspect models. Moreover, full sentences are not processed; only a
predefined window of 5 words before and after the aspect is considered. Further
scale-up experiments on other types of data sets are also envisaged. A supervised
approach based on state-of-the-art learning methods could significantly
improve the prediction of the contradiction intensity level. Even with these simple
elements, the first results obtained encourage us to invest more in this track.</p>
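          <p>The fixed-window segmentation mentioned above can be sketched as follows. This is a simplified illustration under our own assumptions rather than the code used in the experiments: the helper name aspect_windows, whitespace tokenisation, and the punctuation stripping are hypothetical choices.</p>
          <preformat preformat-type="code">
```python
def aspect_windows(review, aspect, size=5):
    """Return the text windows of `size` words around each aspect occurrence."""
    tokens = review.split()  # naive whitespace tokenisation (assumption)
    target = aspect.lower()
    windows = []
    for index, token in enumerate(tokens):
        # Compare case-insensitively and ignore attached punctuation.
        if token.strip(".,;:!?()\"'").lower() == target:
            start = max(0, index - size)
            end = index + size + 1
            windows.append(" ".join(tokens[start:end]))
    return windows


review = "The lecturer was great but the slides were boring and far too long to follow"
segments = aspect_windows(review, "slides", size=2)
```
          </preformat>
          <p>Because the window ignores sentence boundaries, a segment may mix words from neighbouring clauses, which is one reason the quality of the downstream sentiment model matters.</p>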
        </sec>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>I.</given-names>
            <surname>Badache</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Boughanem</surname>
          </string-name>
          .
          <article-title>Harnessing social signals to enhance a search</article-title>
          .
          <source>In IEEE/WIC/ACM</source>
          , volume
          <volume>1</volume>
          , pages
          <fpage>303</fpage>
          –
          <lpage>309</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>I.</given-names>
            <surname>Badache</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Boughanem</surname>
          </string-name>
          .
          <article-title>Emotional social signals for search ranking</article-title>
          .
          <source>In SIGIR</source>
          , pages
          <fpage>1053</fpage>
          –
          <lpage>1056</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>I.</given-names>
            <surname>Badache</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Boughanem</surname>
          </string-name>
          .
          <article-title>Fresh and diverse social signals: any impacts on search?</article-title>
          <source>In CHIIR</source>
          , pages
          <fpage>155</fpage>
          –
          <lpage>164</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>R.</given-names>
            <surname>Balasubramanyan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.W.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Pierce</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.P.</given-names>
            <surname>Redlawsk</surname>
          </string-name>
          .
          <article-title>Modeling polarizing topics: When do different political communities respond differently to the same news?</article-title>
          <source>In ICWSM</source>
          , pages
          <fpage>18</fpage>
          –
          <lpage>25</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
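          <string-name>
            <given-names>M-C.</given-names>
            <surname>De Marneffe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rafferty</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Manning</surname>
          </string-name>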
          .
          <article-title>Finding contradictions in text</article-title>
          .
          <source>In ACL</source>
          , volume
          <volume>8</volume>
          , pages
          <fpage>1039</fpage>
          –
          <lpage>1047</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>S.</given-names>
            <surname>Dori-Hacohen</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Allan</surname>
          </string-name>
          .
          <article-title>Automated controversy detection on the web</article-title>
          .
          <source>In ECIR</source>
          , pages
          <fpage>423</fpage>
          –
          <lpage>434</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>S.</given-names>
            <surname>Dori-Hacohen</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Allan</surname>
          </string-name>
          .
          <article-title>Detecting controversy on the web</article-title>
          .
          <source>In CIKM</source>
          , pages
          <fpage>1845</fpage>
          –
          <lpage>1848</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>K.</given-names>
            <surname>Garimella</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. D. F.</given-names>
            <surname>Morales</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gionis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Mathioudakis</surname>
          </string-name>
          .
          <article-title>Quantifying controversy in social media</article-title>
          .
          <source>In WSDM</source>
          , pages
          <fpage>33</fpage>
          –
          <lpage>42</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>H.</given-names>
            <surname>Hamdan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Bellot</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Bechet</surname>
          </string-name>
          .
          <article-title>Lsislif: CRF and logistic regression for opinion target extraction and sentiment polarity analysis</article-title>
          .
          <source>In SemEval</source>
          , pages
          <fpage>753</fpage>
          –
          <lpage>758</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>S.</given-names>
            <surname>Harabagiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Hickl</surname>
          </string-name>
          , and
          <string-name>
            <given-names>F.</given-names>
            <surname>Lacatusu</surname>
          </string-name>
          .
          <article-title>Negation, contrast and contradiction in text processing</article-title>
          .
          <source>In AAAI</source>
          , volume
          <volume>6</volume>
          , pages
          <fpage>755</fpage>
          –
          <lpage>762</lpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>A.</given-names>
            <surname>Hassan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Abu-Jbara</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Radev</surname>
          </string-name>
          .
          <article-title>Detecting subgroups in online discussions by modeling positive and negative relations among participants</article-title>
          .
          <source>In EMNLP</source>
          , pages
          <fpage>59</fpage>
          –
          <lpage>70</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>A.</given-names>
            <surname>Htait</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Fournier</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Bellot</surname>
          </string-name>
          .
          <article-title>LSIS at SemEval-2016 task 7: Using web search engines for English and Arabic unsupervised sentiment intensity prediction</article-title>
          .
          <source>In SemEval</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>M.</given-names>
            <surname>Hu</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Mining and summarizing customer reviews</article-title>
          .
          <source>In KDD</source>
          , pages
          <fpage>168</fpage>
          –
          <lpage>177</lpage>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>M.</given-names>
            <surname>Jang</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Allan</surname>
          </string-name>
          .
          <article-title>Improving automated controversy detection on the web</article-title>
          .
          <source>In SIGIR</source>
          , pages
          <fpage>865</fpage>
          –
          <lpage>868</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>M.</given-names>
            <surname>Jang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Foley</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dori-Hacohen</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Allan</surname>
          </string-name>
          .
          <article-title>Probabilistic approaches to controversy detection</article-title>
          .
          <source>In CIKM</source>
          , pages
          <fpage>2069</fpage>
          –
          <lpage>2072</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Oh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>A hierarchical aspect-sentiment model for online reviews</article-title>
          .
          <source>In AAAI</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>S.M.</given-names>
            <surname>Mohammad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kiritchenko</surname>
          </string-name>
          , and
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhu</surname>
          </string-name>
          .
          <article-title>NRC-Canada: Building the state-of-the-art in sentiment analysis of tweets</article-title>
          .
          <source>In SemEval</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>A.</given-names>
            <surname>Mukherjee</surname>
          </string-name>
          and
          <string-name>
            <given-names>B.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Mining contentions from discussions and debates</article-title>
          .
          <source>In KDD</source>
          , pages
          <fpage>841</fpage>
          –
          <lpage>849</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>B.</given-names>
            <surname>Pang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lee</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Vaithyanathan</surname>
          </string-name>
          .
          <article-title>Thumbs up?: sentiment classification using machine learning techniques</article-title>
          .
          <source>In EMNLP</source>
          , pages
          <fpage>79</fpage>
          –
          <lpage>86</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>A.M.</given-names>
            <surname>Popescu</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Pennacchiotti</surname>
          </string-name>
          .
          <article-title>Detecting controversial events from twitter</article-title>
          .
          <source>In CIKM</source>
          , pages
          <fpage>1873</fpage>
          –
          <lpage>1876</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          21.
          <string-name>
            <given-names>S.</given-names>
            <surname>Poria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Cambria</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ku</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gui</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Gelbukh</surname>
          </string-name>
          .
          <article-title>A rule-based approach to aspect extraction from product reviews</article-title>
          .
          <source>In SocialNLP</source>
          , pages
          <fpage>28</fpage>
          –
          <lpage>37</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          22.
          <string-name>
            <given-names>M.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Jiang</surname>
          </string-name>
          .
          <article-title>Modeling interaction features for debate side clustering</article-title>
          .
          <source>In CIKM</source>
          , pages
          <fpage>873</fpage>
          –
          <lpage>878</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          23.
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Jozefowicz</surname>
          </string-name>
          , and
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          .
          <article-title>Learning to generate reviews and discovering sentiment</article-title>
          .
          <source>CoRR, abs/1704.01444</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          24.
          <string-name>
            <given-names>R.</given-names>
            <surname>Socher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Perelygin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chuang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Potts</surname>
          </string-name>
          .
          <article-title>Recursive deep models for semantic compositionality over a sentiment treebank</article-title>
          .
          <source>In EMNLP</source>
          , volume
          <volume>1631</volume>
          , pages
          <fpage>1631</fpage>
          –
          <lpage>1642</lpage>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          25.
          <string-name>
            <surname>Student</surname>
          </string-name>
          .
          <article-title>The probable error of a mean</article-title>
          .
          <source>Biometrika</source>
          ,
          <volume>6</volume>
          (
          <issue>1</issue>
          ):
          <fpage>1</fpage>
          –
          <lpage>25</lpage>
          ,
          <year>1908</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          26.
          <string-name>
            <given-names>I.</given-names>
            <surname>Titov</surname>
          </string-name>
          and
          <string-name>
            <given-names>R.</given-names>
            <surname>McDonald</surname>
          </string-name>
          .
          <article-title>Modeling online reviews with multi-grain topic models</article-title>
          .
          <source>In WWW</source>
          , pages
          <fpage>111</fpage>
          –
          <lpage>120</lpage>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          27.
          <string-name>
            <given-names>M.</given-names>
            <surname>Tsytsarau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.</given-names>
            <surname>Castellanos</surname>
          </string-name>
          .
          <article-title>Dynamics of news events and social media reaction</article-title>
          .
          <source>In KDD</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          28.
          <string-name>
            <given-names>M.</given-names>
            <surname>Tsytsarau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Denecke</surname>
          </string-name>
          .
          <article-title>Scalable discovery of contradictions on the web</article-title>
          .
          <source>In WWW</source>
          , pages
          <fpage>1195</fpage>
          –
          <lpage>1196</lpage>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          29.
          <string-name>
            <given-names>M.</given-names>
            <surname>Tsytsarau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Palpanas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Denecke</surname>
          </string-name>
          .
          <article-title>Scalable detection of sentiment-based contradictions</article-title>
          .
          <source>DiversiWeb</source>
          , WWW,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          30.
          <string-name>
            <given-names>P.D.</given-names>
            <surname>Turney</surname>
          </string-name>
          .
          <article-title>Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews</article-title>
          .
          <source>In ACL</source>
          , pages
          <fpage>417</fpage>
          –
          <lpage>424</lpage>
          ,
          <year>2002</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          31.
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          and
          <string-name>
            <given-names>C.</given-names>
            <surname>Cardie</surname>
          </string-name>
          .
          <article-title>A Piece of My Mind: A sentiment analysis approach for online dispute detection</article-title>
          .
          <source>In ACL</source>
          , pages
          <fpage>693</fpage>
          –
          <lpage>699</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          32.
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Raghavan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cardie</surname>
          </string-name>
          , and
          <string-name>
            <given-names>V.</given-names>
            <surname>Castelli</surname>
          </string-name>
          .
          <article-title>Query-focused opinion summarization for user-generated content</article-title>
          .
          <source>In COLING</source>
          , pages
          <fpage>1660</fpage>
          –
          <lpage>1669</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>