<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A Distributed Approach for the Analysis of Discussions in Twitter</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Teresa Alsinet</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Josep Argelich</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ramon Bejar</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jordi Planes</string-name>
          <email>jplanesg@diei.udl.cat</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Joel Cemeli</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Cristian Sanahuja</string-name>
          <email>csanahuja10@gmail.com</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>25001 Lleida</institution>
          ,
          <country country="ES">Spain</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>INSPIRES Research Center</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Polytechnic School, University of Lleida Jaume II</institution>
          ,
          <addr-line>69</addr-line>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2017</year>
      </pub-date>
      <fpage>45</fpage>
      <lpage>56</lpage>
      <abstract>
        <p>In a recent work we have developed an argumentative approach for discovering relevant opinions in Twitter. A Twitter discussion is modeled as a weighted argument graph where each node denotes a tweet, each edge denotes a criticism relationship between a pair of tweets of the discussion and each node is attached with a weight that denotes the social relevance of the corresponding tweet in the discussion. In the social network Twitter, a tweet always refers to previous tweets in the discussion, so the obtained underlying argument graph is acyclic. Based on this structural feature, in this work, we introduce a distributed algorithm for computing the set of globally accepted opinions of a Twitter discussion. The set of accepted opinions is extracted by mapping the weighted argument graph into a valued argumentation framework and it is computed as the biggest set of tweets of the discussion that satis es that it is consistent.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Motivation and Antecedents</title>
      <p>
        In order to understand what are the major accepted and rejected opinions in
di erent domains by Twitter users, in a recent work [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] we have developed a
system for analysis of discussions in Twitter.
      </p>
      <p>
        The system architecture has two main components: a discussion retrieval and
a reasoning system. The discussion retrieval component allows us to move from
a discussion in Twitter (a set of tweets) in natural language to a weighted graph
which is computed taking into account criticism relationships between tweets
and three di erent attributes of a tweet: the number of followers of the author,
the number of retweets and the number of favorites. The reasoning system
component maps the weighted graph into a weighted argumentation framework and
the set of socially accepted tweets in the discussion is evaluated from the weight
assigned to each tweet and the criticism relationships between the tweets of the
discussion, and it is computed as the ideal semantics [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] of a valued abstract
argumentation framework [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        In abstract argumentation [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], a graph is used to represent a set of
arguments and counterarguments. Each node is an argument and each edge denotes
an attack between arguments. Several di erent kinds of semantics for abstract
argumentation frameworks have been proposed that highlight di erent aspects
of argumentation (for reviews see [
        <xref ref-type="bibr" rid="ref15 ref3 ref4">3,4,15</xref>
        ]). Usually, semantics are given to
abstract argumentation frameworks in terms of extensions. For a speci c extension
an argument is either accepted, rejected, or undecided and, usually, there is a
set of extensions that is consistent with the semantic context.
      </p>
      <p>
        The system developed in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] builds a weighted argument graph for a Twitter
discussion, where each node denotes a tweet, each edge denotes a criticism
relationship between a pair of tweets of the discussion and each node is attached
with a weight that denotes the social relevance of the corresponding tweet in the
discussion and it is computed from some tweet's attributes. In the social network
Twitter, a tweet always answers or refers to previous tweets in the discussion,
so the obtained underlying argument graph is acyclic.
      </p>
      <p>Based on the fact that the graphs we obtain are acyclic, in this work, we
introduce and investigate a distributed implementation of the skeptical approach
based on the ideal semantics of a valued abstract argumentation framework. A
tweet defeats a second one if it criticizes the second one and it has at least the
same social relevance as the second one. The ideal semantics for valued abstract
argumentation guarantees that the set of tweets in the solution is the maximal
set of tweets that satis es that it is consistent, in the sense that there are no
defeaters among them. But the solution also satis es that it defends against
defeaters from the tweets outside the solution, in the sense we will see in the
formal de nition of ideal semantics.</p>
      <p>The social relevance is measured with a social valuation function that for
each tweet considers di erent information sources from the social network, such
as the number of followers of the author, the number of retweets and the number
of favorites. The distributed approach can be of special relevance for assessing
Twitter discussions that involve a large number of tweets and the system can
be applied in elds where identifying groups of tweets globally compatible or
consistent, but at the same time that are widely accepted, is of particular interest,
such as for instance for the assistance and guidance of marketing and policy
makers.</p>
      <p>After this introduction, in the next section, we formalize the structure to
model Twitter discussions and, in Section 3, we de ne the reasoning model for
computing their solutions. Then, in Section 4, we present a novel distributed
strategy for the implementation of the reasoning model based on the ideal
semantics for a valued abstract argumentation framework. We end the paper with
some conclusions and a discussion of future work. As far as we know, this is the
rst distributed algorithm presented to solve such problem.
2</p>
    </sec>
    <sec id="sec-2">
      <title>A weighted graph for Twitter discussions</title>
      <p>
        Following the approach proposed in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], in this section, we introduce a
computation structure (a weighted graph) to represent a Twitter discussion considering
only criticism relationships between pairs of tweets.
      </p>
      <p>De nition 1. (Twitter Discussion) A Twitter discussion is a non-empty set
of tweets. A tweet t 2 is a tuple t = (m; a; ; r; fv), where m is the up to 140
characters long message of the tweet, a is the author's identi er of the tweet,
r 2 N is the number of retweets and fv 2 N is the number of favorites. Let
t1 = (m1; a1; 1; r1; fv1) and t2 = (m2; a2; 2; r2; fv2) be a pair of tweets of a
Twitter discussion . We say that t1 answers t2 i t1 is a reply to the tweet t2
or t1 mentions (refers to) tweet t2.</p>
      <p>De nition 2. (Discussion Graph) The Discussion Graph (DisG) for a Twitter
discussion is the directed graph (T; E) such that
{ for every tweet in there is a node in T and
{ if tweet t1 = (m1; a1; 1; r1; fv1) answers tweet t2 = (m2; a2; 2; r2; fv2), with
a1 6= a2, and m1 criticizes the claim expressed in m2, there is a directed edge
(t1; t2) in E.</p>
      <p>Only the nodes and edges obtained by applying this process belong to T and E,
respectively.</p>
      <p>Although our system allows us to analyze any discussion in Twitter (set of
tweets), in this work we deal with discussions where a tweet answers previous
tweets in the discussion. Moreover, in our approach, (t1; t2) 2 E i t1 answers
t2 and the message of tweet t1 does not agree with the claim expressed in the
message of tweet t2. So, the answers between tweets whose messages are not
classi ed as criticisms, do not give rise to edges and, therefore, some nodes
(tweets) of a discussion graph can be disconnected. Thus, we can nd nodes for
which the input or the output degree, or both, is zero.</p>
      <p>Since the social network we are considering in this work is Twitter, every
tweet of a discussion can reply at most one tweet, but can mention many tweets,
and all of them are prior in the discussion. So, every tweet can answer and, in
turn, can criticize many prior tweets of the discussion, and thus, every tweet can
criticize many prior tweets from a same author and from di erent authors.</p>
      <p>
        From an implementation point of view, in order to check if a tweet criticizes
another tweet, we can use some of the components we have developed in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. On
the one hand, to check if a tweet t1 replies a tweet t2, the system can use the
data structure in the JSON format provided by the Twitter API. In particular,
this fact can be easily checked from the attribute in reply to status id of the
object of t1, which provides the tweet identi er to which t1 replies. To check
the set of mentions of t1, the system searches for all authors mentions in the
message of t1. Every mention of an author is stored in the message with a label
of the form: @&lt;author identifier&gt;. So, t1 mentions t2 whenever the author's
identi er of t2 is in the set of mentions of t1.
      </p>
      <p>On the other hand, to check if a tweet does not agree with the claim
expressed in a di erent tweet, the system uses an automatic labeling system based
on Support Vector Machines (SVM). Our SVM model for labeling relations
between tweets considers di erent attributes obtained from the tweets of an answer
(t1; t2). On the one hand, we have attributes that count the number of
occurrences of relevant words in the tweets t1 and t2. We have considered two kinds of
words: regular words and stopwords. We have considered the inclusion of
stopwords as attributes because the typical tweet is very short and the fraction of
stopwords that can be giving information about the kind of answer could be
relevant. For example, in the next tweet
the stopwords ... and !!! give information about the sentiment associated to
the tweet.</p>
      <p>
        On the other hand, we also consider attributes that have to be computed
from the text and from the additional information that comes with the tweets.
In particular, for each tweet these attributes are the number of images and the
number of URLs mentioned in the tweet, the number of positive and negative
emoticons and the sentiment expressed by the tweet. Our labeling system
incorporates a sentiment analysis computation module [
        <xref ref-type="bibr" rid="ref12 ref14">12,14</xref>
        ] that given the set of
words in a tweet it provides a sentiment value in the range [ 5; +5], where -5
is the most negative sentiment and +5 is the most positive sentiment. Finally,
the sentiment value is incremented (or decremented) considering every positive
(negative) emoticon.
      </p>
      <p>Since SVM follows a supervised learning approach, we rst have to train
a model from an already labeled data set of answers. To this aim, we have
collected a set of several Twitter discussions, on the Spanish language, and we
have manually labeled the answers in the discussions to be able to train a SVM
labeling model for Spanish discussions.</p>
      <p>
        The training collection contains 12 discussions and a total of 582 pairs of
tweets (answers). We have considered the creation of SVM models with di erent
number of regular words (w) and di erent number of stopwords (s). The words
in the collection are sorted by number of occurrences, and for an SVM model
with w regular words, we select the rst w most frequent regular words. The
stopwords have been obtained from the natural language toolkit (NLTK) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]
and have been also ordered by number of occurrences, so we also select the s
most frequent stopwords.
      </p>
      <p>
        Using this training collection, we have trained four models with di erent
values for w and s, in order to get a good labeling model. In order to compute
the sentiment for the tweets in this collection, we have taken the AFINN data3
used in [
        <xref ref-type="bibr" rid="ref12 ref14">12,14</xref>
        ] and we have translated the words to Spanish. See [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] for a detailed
description of the SVM training model that we have implemented for checking
criticism relationships between tweets from Spanish Twitter discussions.
De nition 3. (Weighted Discussion Graph) A weighted discussion graph (WDisG)
for a Twitter discussion is a tuple hT; E; R; W i, where
{ (T; E) is the DisG graph for ,
{ R is a nonempty set of ordered values and
{ W is a weighting scheme W : T ! R that assigns a weight value in R to
each tweet in T , representing the social relevance of the tweet.
      </p>
      <p>Regarding the implementation, we instantiate the set of ordered values R
to the natural numbers N and, for each node with tweet t = (m; a; ; r; fv), we
consider the following weighting scheme:</p>
      <p>
        W (t) = blog2( + 20 r + 40 fv + 1)c;
which takes into account not only the number of followers of the author, but
also the number of retweets and favorites. This function allows us to quantify
the orders of magnitude of the social relevance of tweets following the statistics
about tweets and retweets de ned in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], trying to give each attribute a weight
proportional to its relevance. From the statistics shown in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], we observe that
on weighting with twenty times the value of retweets and forty times the value
of favorites, the magnitudes of the three attributes are comparable and one
attribute does not dominate the others, since the number of followers is usually
much bigger than the number of retweets and favorites. We nally compute the
log2 function of the combined value, since we want to consider that one tweet is
more relevant than another only if such combined weight is at least two times
bigger for the rst tweet. We will refer to this weighting scheme as fl1r20fv40.
      </p>
      <p>Figure 1 shows a WDisG graph instance for a Twitter discussion obtained
from the political domain using the fl1r20fv40 weighting scheme. Each tweet
is represented as a node and each criticism answer as an edge. The root tweet
of the discussion is labeled with 0 (the tweet that starts the discussion) and the
other nodes are labeled with consecutive identi ers according to the temporal
generation order of the tweets in the social network. The nodes that appear
disconnected in the graph correspond to tweets that have participated in the
discussion in response to other tweets, but have not been classi ed as criticisms
answers (nodes 3 and 5). The discussion has a simple structure, possibly one
of the most frequent in Twitter. A root tweet starts the discussion and some
answers criticize it, and there are not many criticisms between non-root tweets.
The discussion contains 13 tweets and 14 criticisms answers. Nodes are colored
3 http://www2.compute.dtu.dk/~faan/data/AFINN.zip
in blue scale, where the darkness of the color is directly proportional to its weight
with respect to the maximum value in the discussion.</p>
      <p>Since our reasoning model de nes the set of socially accepted tweets of a
discussion as a set of tweets without e ective con icts, the solution is mainly
de ned by tweets that generate or receive criticism of other tweets and it is
presented in next section.</p>
    </sec>
    <sec id="sec-3">
      <title>An argumentation-based reasoning model</title>
      <p>
        Once we have introduced our formal representation model of Twitter discussions
with criticism relationships between pairs of tweets, the next key component is
the de nition of the reasoning model used to obtain the set of socially accepted
tweets. To this end, we use a valued abstract argumentation framework [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] for
modeling the weighted argumentation problem associated with a WDisG graph
and ideal semantics [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] for de ning the solution (the set of socially accepted
tweets).
      </p>
      <p>A valued argumentation framework (VAF) is a tuple hA; attacks; R; Val; Valprefi
where A is a set of arguments, attacks is an irre exive binary relation on A, R is
a nonempty set of values, Val is a valuation function Val : A ! R that assigns
to each argument in A a weight value in R, and Valpref R R is a
preference relation on R (transitive, irre exive and asymmetric), re ecting the value
preferences of arguments.</p>
      <p>Given a Twitter discussion and its WDisG graph G =hT; E; N; fl1r20fv40i
based on the weighting scheme fl1r20fv40 : T ! N, the VAF for G is the
tuple F = hT; E; N; fl1r20fv40;&gt;i, where each tweet in T results in an argument
of F , each criticism answer in E results in an attack between the arguments
of F , N is the set of valuations or weights of the arguments, the weighting
scheme fl1r20fv40 is the weighting valuation function of F , and the order
relation &gt; of N is the preference relation between the valuations or weights
of the arguments; i.e. a tweet t2 is more valued than a tweet t1 whenever
fl1r20fv40(t2) &gt; fl1r20fv40(t1).</p>
      <p>Then, a defeat relation (or e ective attack relation) between tweets
(arguments) is de ned as follows: defeats = f(t1; t2) 2 E j fl1r20fv40(t2) 6&gt;
fl1r20fv40(t1)g.</p>
      <p>Moreover, a set of tweets (arguments) S T is con ict-free if for all t1; t2 2
S; (t1; t2) 62 defeats, and a con ict-free set of tweets S T is maximally
admissible if for all t1 62 S, S [ ft1g is not con ict-free and, for all t2 2 S, if
(t1; t2) 2 defeats then there exists t3 2 S such that (t3; t1) 2 defeats. Finally, the
set of socially accepted tweets of , referred as the solution of , is computed as
the largest admissible con ict-free set of tweets S T in the intersection of all
maximally admissible con ict-free sets. Remark that the tweets of a discussion
that do not generate nor receive criticism, are always part of the solution.</p>
      <p>Figure 2 shows the VAF solution for the WDisG graph instance of Figure 1.
The nodes colored in blue are the tweets in the solution and the nodes colored
in gray are the rejected tweets, where the darkness of the color is directly
proportional to its weight. According to the fl1r20fv40 valuation function, the
tweets of the discussion are strati ed in three levels denoting their relevance
in the discussion. For each level, we nd the following sets of tweets: level 0:
f3; 2; 4; 5; 9; 10; 11g, level 1: f6; 7; 8; 12g, level 2: f0; 1g, being level 0 the lowest
level and level 2 the highest one. The solution contains 7 tweets (Tweets 1, 4,
7, 8, 9, 11 and 12) of the 13 tweets of the discussion, and 6 tweets are rejected
(Tweets 0, 2, 3, 5, 6 and 10). On the one hand, Tweet 1 defeats the root tweet,
since Tweet 1 attacks the root tweet and both have the same weight. The same
happens with Tweets 3, 5, 6 and 10, since Tweet 4 attacks Tweet 3, Tweet 11
attacks Tweets 5 and 10, and Tweets 7 and 12 attack Tweet 6. On the other
hand, Tweet 8 defeats Tweet 2, since Tweet 8 attacks Tweet 2 and Tweet 8 is
heavier than Tweet 2. Finally, Tweets 1, 4, 7, 8, 9, 11 and 12 are accepted since
they do not have any defeater in the solution. Notice that in this discussion with
high controversy around Tweets 0 and 6 (with a high number of attacks), we
end up rejecting both tweets due to the weight they get during the discussion.</p>
      <p>
        In [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] we implemented a reasoning system for computing the VAF solution of
a WDisG graph based on the algorithm for computing the ideal extension for an
argumentation framework presented in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], but adapting it to work with valued
arguments. Regarding the implementation we used an approach based on Answer
Set Programming (ASP) available in the argumentation system ASPARTIX [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ],
but we extended it to work with VAFs, as the current implementation in
ASPARTIX only works with non-valued arguments. To develop such extension we
modi ed the manifold ASP program explained in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] incorporating:
{ the valuation function for arguments,
{ the preference relation between argument valuations and
{ the defeat relation relation between arguments (e ective attack).
      </p>
      <p>Now in this work, we de ne a distributed strategy for implementing the
underlying reasoning algorithm for computing the ideal extension for a VAF.
The algorithm takes a WDisG graph of a Twitter discussion and outputs the set
of accepted tweets based on the valuation function, the preference relation and
the computation of the largest admissible con ict-free set of tweets according to
the defeat relation.
4</p>
    </sec>
    <sec id="sec-4">
      <title>Distributed solution computation</title>
      <p>
        We design a distributed strategy for computing the solution of a WDisG graph
using the distributed model of computation of Pregel [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. This model is
appropriate for our problem, because the input for a Pregel algorithm is a directed
graph, where the nodes can be in di erent states, and the goal of a distributed
algorithm in Pregel is to compute the state of each node based on the state of
the nodes' neighbors. Any Pregel algorithm starts initializing each node to some
initial state. Then, the distributed computation follows a sequence of supersteps
separated by global synchronization points until the algorithm nishes a point
where every node is happy with its current state. This computation model is
actually inspired by Valiant's Bulk Synchronous Parallel model [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ].
      </p>
      <p>Within each superstep the nodes compute their state in parallel, executing
a speci c function that computes the new state of the node taking into account
the possible messages sent by the nodes' neighbors in the previous superstep.
Then, as a byproduct of the computation of the node state, some messages may
be sent to some of the nodes' neighbors, that they will be processed in the
next superstep. The idea is that the messages are used by nodes to indicate
some change in their state, that could have some in uence on the state of their
neighbors in the next superstep. The superstep nishes when every node has
computed its state and has sent the necessary messages to its neighbors.</p>
      <p>In the computation model of Pregel, the input can be any directed graph.
However, the algorithm we present here works only with discussion graphs that
are acyclic, such that it allows us to solve discussions of big size with an e cient
polynomial time distributed algorithm, where the state of each node depends
only on the state of its attacking nodes. Observe that in the case of Twitter,
where a tweet only answers previous tweets, the discussion graphs are always
acyclic, so restricting the algorithm to consider only acyclic discussion graphs is
not a real restriction for Twitter.</p>
      <p>For the distributed computation of the solution (ideal extension) for a
discussion acyclic graph, we propose a Pregel algorithm where each node can be
in two states: accepted or not accepted (rejected), but a node also stores a
defeaters count, to keep track of the number of accepted defeaters the node has, to
actually compute its acceptance state. Initially, every node starts in the rejected
state and with its defeaters counter equal to zero.</p>
      <p>Algorithm 1 Compute the acceptance state for a node a
1: procedure a.CompAcceptance(i) . Update acceptance state of a at superstep i
2: a.storeCurrentAcceptanceState()
3: for all msg 2 a.received(i-1) do . Check defeaters count update
4: if (msg.type == attacker) then a.updateDefeaters( msg.value )
5: a.updateAcceptanceState()
6: if ( a.StateChanged() ) then . a changed to accepted or to not accepted
7: sendToAllNeighbors( msg( attacker, (a.weight(), a.isAccepted() ) ) )
8: else
9: a.VoteToHalt() . Vote to nish distributed computation</p>
      <p>Algorithm 1 shows the pseudocode of the function used by each node a
to compute its state in a superstep i. 4 It works as follows. In the superstep
i, a node a checks if its state has to be changed after updating its defeaters
counter. The node a updates its defeaters counter based on the received messages
(from the previous superstep i 1) from any nodes v connected with a with an
incoming edge (v; a). These messages are processed in the for loop of lines 3
and 4. There are two possible changes that every message can produce on the
defeaters counter of a. On the one hand, if in the previous superstep a node
v, and such that (v; a) is an edge of the graph, changed its state to accepted
then v sent a message to a of type attacker and with value (v:weight(); +1),
indicating to a that its defeaters counter should be increased by one but only
if v is a defeater of a (if v:weight() a:weight()). On the other hand, such
node v could instead have changed its state to not accepted, so in that case the
message sent to a will be also of type attacker but with value (v:weight(); 1),
indicating that its defeaters counter should be decreased by one if v is a defeater
of a. The possible addition or subtraction to its defeaters counter, caused by
a message msg sent by a node v is checked (and performed when needed) by
calling the function a:updateDef eaters(msg:value), where the value has the
format indicated previously. Then, after updating the defeaters counter the state
4 The pseudocode is written using object oriented notation, as the Pregel API is
written in C++. However, our actual implementation is based on the Pregel
implementation found on the Spark distributed programming framework, graphX, that is
written in Scala.
of a is updated calling the function a:updateAcceptanceState(), that checks if
the updated defeaters counter is greater than zero. Finally, we call the function
a:StateChanged() (in line 6) to check whether the state of a changed (from not
accepted to accepted or viceversa). If it changed, a sends an attacker message
to all its outgoing neighbors with value (a:weight(); a:isAccepted()), where the
function a:isAccepted() will return either +1 or -1 depending on whether the
state of a is accepted or not accepted. These sent messages will be processed
by the outgoing neighbors of a when the next superstep begins. If the state of
a did not change, then a votes towards nishing the distributed computation
of the solution (line 9), but only when all the nodes agree to nish, in a same
superstep, will the algorithm nish.</p>
      <p>Consider the execution of the algorithm when solving the example discussion
we have presented in Section 2, and whose solution is shown on Figure 2:
1. Before the rst superstep, all the nodes start in the not accepted state and
with defeaters counter equal to zero, so in the rst superstep, as there are
no incoming messages for any node, all of them will change their state to
accepted. Then, any node with outgoing edges (all except 3 and 0), sends a
message to its attacked nodes with value (node:weight(); +1). Observe that
nodes 4; 7; 8; 9; 11 and 12 will remain accepted for the rest of the execution
of the algorithm, as they will never receive any messages.
2. In the second superstep, the received messages produce that nodes 0; 1; 2; 3; 5; 6
and 10 change to not accepted, as they receive messages from attacking
accepted nodes that defeat them, so each one of these nodes (except 0 and 3)
will send a message with value (node:weight(); 1). Nodes 2; 3; 5; 6 and 10
will remain not accepted for the rest of the execution of the algorithm.
3. In the third superstep, the messages sent in the previous superstep produce
that nodes 0 and 1 change to accepted, and so node 1 will send a message
to node 0 with value (1:weight(); +1). Node 1 will remain accepted for the
rest of the execution.
4. Finally, in the fourth superstep the message received by node 0 from 1, as 1
is a defeater for 0, will change the state of 0 to rejected, being this its nal
state and the end of the execution of the algorithm as no more messages are
sent by any nodes.</p>
      <p>Observe that the number of supersteps is equal to the maximum path length of
the discussion graph, and although one can think about variants of this algorithm
where less messages are sent, it seems that in the worst case it is not possible the
have less supersteps than the maximum path length of the discussion graph. So,
for typical Twitter discussions, where one has many branches in the discussion
graph but not too deep, this algorithm may solve big discussions.</p>
      <p>We have started to evaluate the performance of this algorithm on real
Twitter discussions of di erent sizes, working with a small Spark cluster with ve
computers working with Linux. A table with preliminary results for a test set
of Twitter discussions can be found at the URL: https://www.dropbox.com/
s/u4yrz5an78d6dfw/lasttable.pdf?dl=0.
5</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusions and future work</title>
      <p>In this paper we introduce a distributed system for mining the set of globally
accepted tweets of a Twitter discussion. We model discussions with a weighted
argumentation graph, where each node denotes a tweet, each edge denotes a
criticism relationship between a pair of tweets of the discussion and each node
is attached with a weight, that denotes the social relevance of the corresponding
tweet in the discussion and it is computed from some tweet's attributes, such as
the number of followers of the author, the number of retweets and the number
of favorites.</p>
      <p>The set of accepted tweets is de ned following an skeptical approach based on
the ideal semantics of a valued argumentation framework. The ideal semantics
for valued argumentation guarantees that the set of tweets in the solution is the
maximal set of tweets that satis es that it is consistent, in the sense that there
are no defeaters among them, and that all of the tweets outside the solution are
defeated by a tweet within the solution.</p>
      <p>Our distributed strategy is based on the distributed computation model of
Pregel which is appropriate for our problem, because the input for a Pregel
algorithm is a directed graph, where the nodes can be in di erent states, and
the goal is to compute the state of each node based on the state of the nodes'
neighbors.</p>
      <p>In our system each node (tweet) can be in two states, accepted or not
accepted (rejected), and the algorithm works only with discussion graphs that are
acyclic, such that it allows us to solve discussions of big size with an e cient
polynomial time distributed algorithm, where the state of each node depends
only on the state of its attacking nodes. As far as we know, the system is the
rst distributed implementation of an argumentative reasoning algorithm for
social network analysis.</p>
      <p>As future work, we plan to extend the representation model to consider
support relationships between tweets and also to explore an e cient implementation
of a distributed algorithm for bipartite graphs.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Alsinet</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Argelich</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bejar</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Fernandez</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mateu</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Planes</surname>
          </string-name>
          , J.:
          <article-title>Weighted argumentation for analysis of discussions in Twitter</article-title>
          .
          <source>Int. J. Approx. Reasoning</source>
          <volume>85</volume>
          ,
          <issue>21</issue>
          {
          <fpage>35</fpage>
          (
          <year>2017</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Bench-Capon</surname>
            ,
            <given-names>T.J.M.:</given-names>
          </string-name>
          <article-title>Persuasion in practical argument using value-based argumentation frameworks</article-title>
          .
          <source>Journal of Logic and Computation</source>
          <volume>13</volume>
          (
          <issue>3</issue>
          ),
          <volume>429</volume>
          {
          <fpage>448</fpage>
          (
          <year>2003</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Bench-Capon</surname>
            ,
            <given-names>T.J.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dunne</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          :
          <article-title>Argumentation in arti cial intelligence</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>171</volume>
          (
          <issue>10</issue>
          -
          <fpage>15</fpage>
          ),
          <volume>619</volume>
          {
          <fpage>641</fpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Besnard</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hunter</surname>
            ,
            <given-names>A.:</given-names>
          </string-name>
          <article-title>A logic-based theory of deductive arguments</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>128</volume>
          (
          <issue>1-2</issue>
          ),
          <volume>203</volume>
          {
          <fpage>235</fpage>
          (
          <year>2001</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Bild</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          , Liu,
          <string-name>
            <given-names>Y.</given-names>
            ,
            <surname>Dick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.P.</given-names>
            ,
            <surname>Mao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.M.</given-names>
            ,
            <surname>Wallach</surname>
          </string-name>
          ,
          <string-name>
            <surname>D.S.:</surname>
          </string-name>
          <article-title>Aggregate characterization of user behavior in Twitter and analysis of the retweet graph</article-title>
          .
          <source>ACM Trans. Internet Techn</source>
          .
          <volume>15</volume>
          (
          <issue>1</issue>
          ), 4:
          <issue>1</issue>
          {4:
          <issue>24</issue>
          (
          <year>2015</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Bird</surname>
            ,
            <given-names>S.:</given-names>
          </string-name>
          <article-title>NLTK: the natural language toolkit</article-title>
          .
          <source>In: Proceedings of the 21st International Conference on Computational Linguistics</source>
          and
          <article-title>44th Annual Meeting of the Association for Computational Linguistics</article-title>
          ,
          <string-name>
            <surname>ACL</surname>
          </string-name>
          <year>2006</year>
          , Sydney, Australia. pp.
          <volume>17</volume>
          {
          <issue>21</issue>
          (
          <year>2006</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Dung</surname>
            ,
            <given-names>P.M.</given-names>
          </string-name>
          :
          <article-title>On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games</article-title>
          .
          <source>Arti cial Intelligence</source>
          <volume>77</volume>
          (
          <issue>2</issue>
          ),
          <volume>321</volume>
          {
          <fpage>357</fpage>
          (
          <year>1995</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Dung</surname>
            ,
            <given-names>P.M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Mancarella</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Toni</surname>
            ,
            <given-names>F.</given-names>
          </string-name>
          :
          <article-title>Computing ideal sceptical argumentation</article-title>
          .
          <source>Artif. Intell</source>
          .
          <volume>171</volume>
          (
          <issue>10</issue>
          -
          <fpage>15</fpage>
          ),
          <volume>642</volume>
          {
          <fpage>674</fpage>
          (
          <year>2007</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Dunne</surname>
            ,
            <given-names>P.E.</given-names>
          </string-name>
          :
          <article-title>The computational complexity of ideal semantics I: abstract argumentation frameworks</article-title>
          .
          <source>In: Proceedings of Computational Models of Argument, COMMA</source>
          <year>2008</year>
          , Toulouse, France. pp.
          <volume>147</volume>
          {
          <issue>158</issue>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Egly</surname>
            ,
            <given-names>U.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Gaggl</surname>
            ,
            <given-names>S.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woltran</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          : Aspartix:
          <article-title>Implementing argumentation frameworks using answer-set programming</article-title>
          .
          <source>In: Proceedings of the 24th International Conference on Logic Programming</source>
          ,
          <string-name>
            <surname>ICLP</surname>
          </string-name>
          <year>2008</year>
          . pp.
          <volume>734</volume>
          {
          <issue>738</issue>
          (
          <year>2008</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Faber</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Woltran</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          :
          <article-title>Manifold answer-set programs for meta-reasoning</article-title>
          .
          <source>In: Proceedings of Logic Programming and Nonmonotonic Reasoning</source>
          ,
          <string-name>
            <surname>LPNMR</surname>
          </string-name>
          <year>2009</year>
          . pp.
          <volume>115</volume>
          {
          <issue>128</issue>
          (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <surname>Hansen</surname>
            ,
            <given-names>L.K.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Arvidsson</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nielsen</surname>
            ,
            <given-names>F.A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Colleoni</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Etter</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          :
          <article-title>Good friends, bad news - a ect and virality in Twitter</article-title>
          . In: International Workshop on Social Computing, Network, and
          <string-name>
            <surname>Services</surname>
          </string-name>
          (SocialComNet
          <year>2011</year>
          ) (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <surname>Malewicz</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Austern</surname>
            ,
            <given-names>M.H.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Bik</surname>
            ,
            <given-names>A.J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Dehnert</surname>
            ,
            <given-names>J.C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Horn</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Leiser</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Czajkowski</surname>
          </string-name>
          , G.:
          <article-title>Pregel: a system for large-scale graph processing</article-title>
          .
          <source>In: Proceedings of the ACM SIGMOD International Conference on Management of Data</source>
          ,
          <string-name>
            <surname>SIGMOD</surname>
          </string-name>
          <year>2010</year>
          . pp.
          <volume>135</volume>
          {
          <issue>146</issue>
          (
          <year>2010</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <surname>Nielsen</surname>
            ,
            <given-names>F.A.:</given-names>
          </string-name>
          <article-title>A new anew: Evaluation of a word list for sentiment analysis in microblogs</article-title>
          .
          <source>In: Proceedings of the ESWC2011 Workshop on 'Making Sense of Microposts'</source>
          . pp.
          <volume>93</volume>
          {
          <issue>98</issue>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <surname>Rahwan</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Simari</surname>
            ,
            <given-names>G.R.</given-names>
          </string-name>
          :
          <source>Argumentation in Arti cial Intelligence</source>
          . Springer Publishing Company, 1st edn. (
          <year>2009</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <surname>Valiant</surname>
            ,
            <given-names>L.G.</given-names>
          </string-name>
          :
          <article-title>A bridging model for multi-core computing</article-title>
          .
          <source>J. Comput. Syst. Sci</source>
          .
          <volume>77</volume>
          (
          <issue>1</issue>
          ),
          <volume>154</volume>
          {
          <fpage>166</fpage>
          (
          <year>2011</year>
          )
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>