An Argument-based Approach to Mining Opinions from Twitter

Kathrin Grosse (2), Carlos I. Chesñevar (0, 1), Ana G. Maguitman (0, 1)

kgrosse@uos.de

(0) Artificial Intelligence Research and Development Laboratory, Department of Computer Science and Engineering, Universidad Nacional del Sur, Av. Alem 1253, (8000) Bahía Blanca, Argentina
(1) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina
(2) Institut für Kognitionswissenschaft, Universität Osnabrück, Osnabrück, Germany

Social networks have grown exponentially in use and in their impact on society as a whole. In particular, microblogging platforms such as Twitter have become important tools to assess public opinion on different issues. Recently, some approaches for assessing Twitter messages have been developed, identifying sentiments associated with relevant keywords or hashtags. However, such approaches have an important limitation, as they do not take into account contradictory and potentially inconsistent information which might emerge from relevant messages. We contend that the information made available in Twitter can be useful to extract a particular version of arguments (called "opinions" in our formalization) which emerge bottom-up from the social interaction associated with such messages. In this paper we present a framework for mining opinions from Twitter based on incrementally generated queries. As a result, we will be able to obtain an "opinion tree", rooted in the first original query. Distinguished, conflicting elements in an opinion tree lead to so-called "conflict trees", which resemble the dialectical trees traditionally used in defeasible argumentation.

1 Introduction and motivations

Social networks have grown exponentially in use and in their impact on society as a whole, aiming at different communities and providing differentiated services. In particular, microblogging has become a very popular communication tool among Internet users, with Twitter¹ being by far the most widespread microblogging platform. Twitter, created in 2006, enables its users to send and read text-based posts of up to 140 characters, known as "tweets". It has grown into a technology which makes it possible to assess public opinion on different issues. Thus, for example, it is now common to read newspaper articles referring to the impact of political decisions as measured by the associated positive or negative comments in Twitter. Symmetrically, policy makers make many of their claims and opinions public, having an influence on the citizenry,² prompting their "tweeting back" with further comments and opinions. As the audience of microblogging platforms and services grows every day, data from these sources can be used in opinion mining and sentiment analysis tasks [1].

As pointed out in [2], microblogging platforms (in particular Twitter) offer a number of advantages for opinion mining. On the one hand, Twitter is used by different people to express their opinion about different topics, and thus it is a valuable source of people's opinions. Given the enormous number of text posts, the collected corpus can be arbitrarily large. On the other hand, Twitter's audience varies from regular users to celebrities, company representatives, politicians, and even country presidents. Therefore, it is possible to collect text posts of users from different social and interest groups.

According to the Merriam-Webster online dictionary,³ an opinion can be seen as: a) a view, judgment, or appraisal formed in the mind about a particular matter; b) belief stronger than impression and less strong than positive knowledge; a generally held view; c) a formal expression of judgment or advice by an expert. Clearly, there is a natural link between opinion and argument. In many cases, opinions by themselves do not provide arguments, as they do not necessarily imply giving reasons or evidence for accepting a particular conclusion. However, from a meta-level perspective, policy makers devote much effort to analyzing the reasons underlying complex collections of opinions from the citizenry, as they indicate the willingness of the people to accept or reject some particular issue. A well-known example in this setting is the analysis of public opinion (e.g. through the quantitative measurement of opinion distributions through polls and the investigation of the internal relationships among the individual opinions that make up public opinion on an issue).

A fundamental need for policy makers is to back their decisions and agreements with reasons or opinions provided by citizens. They might even argue with other policy makers about why making a particular decision is advisable (e.g. "according to the last poll, 80% of the people are against the health system reform; therefore, the reform should not be carried out"). From this perspective, social networks like Twitter provide a rich knowledge base from which information could be collected and analyzed in order to enhance and partially automate decision-making processes. In particular, tweets have a rich structure, providing a number of record fields which make it possible to detect the provenance of the tweet (author), the number of re-tweets, followers, etc.

² E.g. the current UK Prime Minister David Cameron and the current US President Barack Obama can be followed on Twitter at @Number10gov and @BarackObama, respectively.
³ http://www.merriam-webster.com

We contend that the information made available from such tweets can be useful for modeling opinions which emerge bottom-up from the social interaction existing in Twitter. In our analysis, we will assume that opinions are arguments, which can be seen as particular instances of the "Argument from Majority" schema [3]. In this paper, we analyze the main elements characterizing an argument-based approach to mining opinions from Twitter based on incrementally generated queries. Given a query, we will model an opinion supporting it as a set of aggregated tweets along with a prevailing sentiment,⁴ which can be attacked by alternative counter-opinions. As a final result, we will be able to obtain a "conflict tree", rooted in the first original query, in a way that resembles dialectical trees in argumentation.

The rest of the paper is structured as follows. In Section 2 we present our proposal for characterizing Twitter-based arguments and their interrelationships. We will formalize the notion of opinion tree, which can be constructed from user queries, making it possible to assess alternative opinions associated with incrementally generated queries. A high-level algorithm for computing opinion trees is presented, along with a case study to illustrate our proposal. Section 3 discusses the relationship emerging from opinions in conflict, modeled as a conflict tree. Section 4 generalizes previous results using superior lattices, for both opinion and conflict trees. The benefits of applying this mathematical approach are discussed in Section 5. Section 6 reviews related work, and finally Section 7 summarizes the conclusions.

2 Twitter-based Argumentation Framework: viewing aggregated tweets as arguments

In this section we will describe how different elements in Twitter can be captured under an argumentative perspective [4, 5]. First we will characterize distinguished collections of tweets (obtained on the basis of a given query) as arguments with an associated prevailing sentiment. Such arguments will be called TB-arguments (Twitter-based arguments). Then, we will formalize interrelationships between TB-arguments, which lead to the notion of opinion tree.

2.1 Formalizing Aggregation of Twitter Messages

Twitter messages (tweets) are up to 140 characters long, with a number of additional fields which help identify relevant information within a message (sender, number of retweets associated with the message, etc.). In particular, we will focus on the presence of descriptors which are either hashtags (words or phrases prefixed with the symbol #, a form of metadata tag) or terms that tend to occur often in the context of a given topic. Hashtags are used within IRC networks to identify groups and topics, and in short messages on microblogging and social networking services such as Twitter, identi.ca or Google+ (which may be tagged by including one or more hashtags, with multiple words concatenated). Other good descriptors can be dynamically found by looking for terms that are frequently used in tweets related to the topic at hand. In the sequel we will assume that the term "descriptor" refers to either actual hashtags in Twitter or to relevant keywords found in tweets.

⁴ Several software tools have been recently developed for such an association, such as www.sentiment140.com or tweetsentiments.com.

Definition 1 (Tweet. Twitter Query). We define a tweet $T$ as a bag (or multiset) of terms $\{t_1, t_2, \ldots, t_k\}$, where every $t_i \in T$ is a string. A Twitter query (or just query) is a non-empty set $Q = \{d_1, d_2, \ldots, d_k\}$ of descriptors, where every $d_i \in Q$ is a string.

In the analysis that follows, we will assume that a tweet is just a bag of words, not taking into account the actual order of terms in the tweet. Additionally, we assume that the set of all currently existing tweets corresponds to a snapshot of Twitter messages at a given fixed time, as the Twitter database is highly dynamic. In our approach, a query $Q$ is any set of descriptors used for filtering some relevant tweets from $Tweets$ based on a given criterion $C$. In order to abstract away how such selection is performed, we will define an aggregation operator $AggTweets(Q, C)$. Formally:

Definition 2 (Tweet set. Aggregation Operator). Let $Tweets$ be the set of all currently existing tweets. We will write $2^{Tweets}$ to denote the set of all possible subsets of $Tweets$. Any element in $2^{Tweets}$ will be called a tweet set. Given a query $Q$ and a criterion $C$, we will define an aggregation operator $AggTweets(Q, C)$ which returns an element (tweet set) in $2^{Tweets}$ based on $Q$ and $C$.

The aggregation operator could be defined in several ways. For instance, suppose that $C_1$ is a criterion that indicates that only tweets posted between time timestamp1 and timestamp2 are to be selected. Then $AggTweets(Q, C_1) =_{def} \{ T \in Tweets \mid Q \subseteq T \text{ and } T \text{ satisfies } C_1 \}$ will be the set of tweets that contain all the terms of query $Q$ and have been posted in the time period [timestamp1, timestamp2]. Other examples of criteria that can be naturally applied are, for instance, requiring that those tweets $T$ were retweeted more than $n$ times, requiring that every user that posted tweets $T$ has at least $m$ followers, etc.
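To make the time-window criterion concrete, the following is a minimal Python sketch of one possible aggregation operator. The `Tweet` class, its `posted` field and the `between` helper are hypothetical names introduced only for illustration; the paper does not prescribe a data model.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Tweet:
    terms: tuple       # words of the tweet; the model ignores their order
    posted: datetime   # hypothetical timestamp field used by the criterion

    def bag(self) -> Counter:
        """The tweet as a multiset of terms, as in Definition 1."""
        return Counter(self.terms)


def between(start, end):
    """Criterion C1: keep tweets posted in the closed interval [start, end]."""
    return lambda tweet: start <= tweet.posted <= end


def agg_tweets(tweets, query, criterion):
    """Tweets that contain every descriptor of `query` and satisfy `criterion`."""
    return [t for t in tweets if set(query) <= set(t.bag()) and criterion(t)]


tweets = [
    Tweet(("#greece", "#eurozone", "crisis"), datetime(2012, 2, 1)),
    Tweet(("#greece", "holiday"), datetime(2012, 3, 1)),
    Tweet(("#greece", "#eurozone"), datetime(2011, 12, 1)),
]
c1 = between(datetime(2012, 1, 1), datetime(2012, 12, 31))
selected = agg_tweets(tweets, {"#greece", "#eurozone"}, c1)
```

Here `selected` contains only the first tweet: the second lacks "#eurozone" and the third falls outside the time window. Other criteria (retweet counts, follower counts) would simply be different predicates passed as `criterion`.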

Note that for the same query $Q$, different alternative criteria ($C_1, C_2, \ldots, C_k$) can lead to different distinguished elements in $2^{Tweets}$. As explained before, tweet sets can be associated with different feelings or sentiments. Even if in real life there may be many emotions in tweets (such as anger, happiness, and so on), we will assume here that there is a distinguished set $S$ of possible sentiments. Thus, given a query $Q$ and a criterion $C$, we assume that the tweet set $AggTweets(Q, C)$ is associated with a prevailing sentiment in $S$.⁵ We will consider that some sentiments might convey different, possibly conflicting feelings or emotions (e.g. anger and happiness; boredom and excitement, etc.). As before, we will abstract away which sentiments are potentially conflicting, as follows.

⁵ A possible range for $S$ could be positive, negative and neutral (as done for example in the platform Sentiment140.com). In this platform, prevailing sentiments associated with a tweet set are expressed by percentages.

Definition 3 (Sent and conflict mappings). Let $T \in 2^{Tweets}$ be a tweet set, and let $Sent : 2^{Tweets} \to S$ and $conflict : S \to 2^{S}$ be mappings. The sentiment $Sent(T)$ will be called the prevailing sentiment (or just sentiment) for $T$. For any sentiment $s \in S$, we will define $conflict(s)$ as a subset of $S$, such that: a) $s \notin conflict(s)$ (a sentiment is not in conflict with itself); b) for any $s' \in conflict(s)$, $s \in conflict(s')$ (the notion of conflict is symmetrical). Given two sentiments $s_1$ and $s_2$, we will say that they are in conflict whenever $s_2 \in conflict(s_1)$. For simplicity, given a sentiment $s \in S$, we will write $\overline{s}$ to denote any $s' \in conflict(s)$.

The previous elements will allow us to characterize the notions of TB-framework and TB-argument as follows:

Definition 4 (TB-framework). A Twitter-based argumentation framework (or TB-framework) is a 5-tuple $(Tweets, C, S, Sent, conflict)$, where $Tweets$ is the set of available tweets, $C$ is a selection criterion, $S$ is a non-empty set of possible sentiments, and $Sent$ and $conflict$ are the prevailing-sentiment and conflict mappings.

Definition 5 (TB-argument). Let $(Tweets, C, S, Sent, conflict)$ be a TB-framework. A Twitter-based argument (or TB-argument) for a conclusion (query) $Q$ is a 3-tuple $\langle Arg, Q, Sentiment \rangle$, where $Arg$ is $AggTweets(Q, C)$ and $Sentiment$ is $Sent(AggTweets(Q, C))$.

Example 1. Consider a TB-framework $(Tweets, C, S, Sent, conflict)$, where $C$ is defined as "all $T \in Tweets \mid timestamp(T) \geq$ 2012-01-01T00:00:00", $Q$ = {"#greece", "#eurozone"}, $S$ = {pos, neg, neutral}, and $conflict(pos) =_{def}$ {neg, neutral}, $conflict(neg) =_{def}$ {pos, neutral} and $conflict(neutral) =_{def}$ {pos, neg}. Then $Arg = AggTweets(Q, C)$ is the set of all possible tweets containing {"#greece", "#eurozone"} that have been published since January 1, 2012. Suppose that $Sent(AggTweets(Q, C))$ = negative. Then $\langle Arg$, {"#greece", "#eurozone"}, negative$\rangle$ is a TB-argument.
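As an illustration of Definitions 3 to 5, the sketch below checks the two conditions on the conflict mapping and builds a TB-argument as a 3-tuple. It rests on simplifying assumptions: tweets are modeled as plain sets of words (ignoring multiplicity), and the `LEXICON` and majority-vote `sent` function are hypothetical stand-ins for a real sentiment analysis tool.

```python
from collections import Counter

SENTIMENTS = {"pos", "neg", "neutral"}
CONFLICT = {
    "pos": {"neg", "neutral"},
    "neg": {"pos", "neutral"},
    "neutral": {"pos", "neg"},
}


def is_valid_conflict(conflict, sentiments):
    """Check Definition 3: no sentiment conflicts with itself, and conflict is symmetric."""
    for s in sentiments:
        if s in conflict[s]:
            return False
        if any(s not in conflict[other] for other in conflict[s]):
            return False
    return True


# Hypothetical toy lexicon standing in for a real sentiment tool.
LEXICON = {"crisis": "neg", "default": "neg", "recovery": "pos"}


def sent(tweet_set):
    """Prevailing sentiment: most common label among lexicon words, else neutral."""
    votes = Counter(LEXICON[w] for t in tweet_set for w in t if w in LEXICON)
    return votes.most_common(1)[0][0] if votes else "neutral"


def tb_argument(tweets, query, criterion):
    """Build the 3-tuple (Arg, Q, Sentiment) of Definition 5."""
    arg = frozenset(t for t in tweets if query <= t and criterion(t))
    return (arg, frozenset(query), sent(arg))


tweets = [
    frozenset({"#greece", "#eurozone", "crisis"}),
    frozenset({"#greece", "#eurozone", "default"}),
    frozenset({"#greece", "recovery"}),
]
argument = tb_argument(tweets, {"#greece", "#eurozone"}, lambda t: True)
```

With this toy data the argument for {"#greece", "#eurozone"} is supported by two tweets and carries the sentiment "neg", mirroring the shape of Example 1.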

2.2 Specificity in a TB-framework. Opinion trees

In the previous section we have shown how to express arguments for queries associated with a given prevailing sentiment. Such arguments might be attacked by other arguments, which in turn might be attacked, too. In argumentation theory, this leads to the notion of dialectical analysis [5], which can be associated with a tree-like structure in which arguments, counter-arguments, counter-counter-arguments, and so on, are taken into account. Our approach will be more generic, in the sense that for a given argument, the children nodes will correspond to more specific arguments that are not necessarily in conflict with the parent argument. Next we will formalize these notions.

Definition 6 (Query Equivalence). Let $(Tweets, C, S, Sent, conflict)$ be a TB-framework. Given two queries $Q_1$ and $Q_2$, we will say that $Q_1$ is equivalent to $Q_2$ whenever $AggTweets(Q_2, C) = AggTweets(Q_1, C)$.

Definition 7 (Query Subsumption). Let $(Tweets, C, S, Sent, conflict)$ be a TB-framework. Given two queries $Q_1$ and $Q_2$, we will say that $Q_1$ subsumes $Q_2$ whenever it holds that $AggTweets(Q_2, C) \subset AggTweets(Q_1, C)$.

Example 2. A query $Q_1$ formed by {"#greece"} subsumes the query $Q_2$ formed by {"#greece", "#eurozone"}, as all the tweets that are returned by $Q_2$ will be part of the tweets returned by $Q_1$, but not the other way around. Note that the subsumption relation is more general than the inclusion relation, since $Q_1$ subsumes $Q_2$ whenever $Q_1 \subset Q_2$ (as $AggTweets(Q_2, C) \subset AggTweets(Q_1, C)$). However, it is possible that $Q_1$ subsumes $Q_2$ even when $Q_1 \not\subset Q_2$.
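The difference between subsumption and plain query inclusion can be checked on a small data set. The sketch below assumes that subsumption is strict inclusion of the returned tweet sets, that tweets are sets of words, and that the criterion accepts every tweet; the helper names are illustrative only.

```python
def agg(tweets, query):
    """Tweet set returned for `query`, with a criterion that accepts everything."""
    return frozenset(t for t in tweets if query <= t)


def equivalent(tweets, q1, q2):
    """Definition 6: both queries return the same tweet set."""
    return agg(tweets, q1) == agg(tweets, q2)


def subsumes(tweets, q1, q2):
    """Definition 7 (read as strict inclusion): q2's tweets are a proper subset of q1's."""
    return agg(tweets, q2) < agg(tweets, q1)


tweets = [
    frozenset({"#greece", "#eurozone"}),
    frozenset({"#greece", "holiday"}),
    frozenset({"#greece", "#eurozone", "#ecb"}),
]

by_inclusion = subsumes(tweets, {"#greece"}, {"#greece", "#eurozone"})
without_inclusion = subsumes(tweets, {"#greece"}, {"#eurozone"})
same_tweets = equivalent(tweets, {"#eurozone"}, {"#greece", "#eurozone"})
```

In this data every tweet mentioning "#eurozone" also mentions "#greece", so {"#greece"} subsumes {"#eurozone"} although neither query is a subset of the other, which is the situation described at the end of Example 2.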

Definition 8 (Argument Specificity). Given a TB-framework with TB-arguments $\langle Arg_1, Q_1, Sent_1 \rangle$ and $\langle Arg_2, Q_2, Sent_2 \rangle$, we say that $\langle Arg_2, Q_2, Sent_2 \rangle$ is strictly more specific than $\langle Arg_1, Q_1, Sent_1 \rangle$, denoted $\langle Arg_2, Q_2, Sent_2 \rangle \prec \langle Arg_1, Q_1, Sent_1 \rangle$, if $Q_1$ subsumes $Q_2$. We will write $\langle Arg_2, Q_2, Sent_2 \rangle \preceq \langle Arg_1, Q_1, Sent_1 \rangle$ when $Q_1$ subsumes $Q_2$ or $Q_1$ is equivalent to $Q_2$.

Suppose that a TB-argument supporting the query "#greece" is obtained, with a prevailing sentiment neutral. If the original query $Q$ is extended in some way into a new query $Q'$ that is more specific than $Q$ (i.e. $Q' = Q \cup \{d\}$), it could be the case that a TB-argument supporting $Q'$ has a different (possibly conflicting) prevailing sentiment. For example, more specific opinions about Greece are related to other topics, such as vacations, politics, philosophy, etc. To explore all possible relationships associated with the TB-argument returned for a specified query $Q$ and criterion $C$, we can define an algorithm to construct an "opinion tree" recursively as follows:

1. We start with a TB-argument obtained from the original query $Q$ (i.e., $\langle A, Q, Sent \rangle$), which will be the root of the tree.
2. Next, we compute within $A$ all relevant descriptors that might be used to "extend" $Q$, by adding a new element ($NewTerm$) to the query, obtaining $Q' = Q \cup \{NewTerm\}$.
3. Then, a new argument for $Q'$ is obtained, which will be associated with a subtree rooted in the original argument $A$.

The high-level algorithm can be seen in Fig. 1. As stated before, note that our approach to opinion trees is more generic than the one used for dialectical trees in argumentation (as done e.g. in [6]), in the sense that for a given argument, the children nodes will correspond to more specific arguments that are not necessarily in conflict with the parent argument.

It is also easy to see that for any query $Q$, the algorithm BuildOT finishes in finite time: given that a tweet may not contain more than 140 characters, the number of contained descriptors is finite, and therefore the algorithm will eventually stop, providing an opinion tree as an output.

ALGORITHM BuildOT (Fig. 1)
INPUT: $Tweets$, $R$, $Q$, $C$   { Initially, $R = Tweets$. Each $T_i \in Tweets$ is represented as a multiset }
OUTPUT: Opinion tree $OT_Q$   { opinion tree rooted in $RootOT_Q$ }

$RootOT_Q$ := $\langle Arg, Q, Sentiment \rangle$, where $Arg$ is $AggTweets(Q, C)$ and $Sentiment$ is $Sent(AggTweets(Q, C))$
IF $|R| > threshold$   { Cardinality of $R$ determines maximum depth level }
THEN
    REPEAT
        $W$ := { $d$ | $d$ is the most frequent word in $\biguplus_{T_i \in R} T_i$ such that $d \notin Q \cup Stopwords$ } (a)
        IF $W \neq \emptyset$ THEN
            $Q_{new}$ := $Q \cup W$
            $Tweets_{QNew}$ := $AggTweets(Q_{new}, C)$
            $OT_{Q_{new}}$ := BuildOT($Tweets_{QNew}$, $R$, $Q_{new}$, $C$)
            PutSubtree($RootOT_Q$, $OT_{Q_{new}}$)
            $R$ := $R \setminus Tweets_{QNew}$
    UNTIL $W = \emptyset$
RETURN $OT_Q$

(a) For simplicity, we assume $W$ is a singleton. In a more general case, a distinguished element from $W$ could be selected according to some criterion (e.g. overall frequency in Twitter, etc.). The operation $\biguplus$ is the union operation on multisets.

As discussed before, the algorithm shown in Fig. 1 makes it possible to obtain an opinion tree from a given query $Q$, a criterion $C$, and the set $Tweets$ of all possible tweets. An additional parameter $R$ allows us to specify the set of tweets to be considered when searching for a new descriptor. Initially, $R = Tweets$. The cardinality of $R$ determines the threshold associated with the depth of the tree.

Consider the query $Q$ = {"#greece"}, and a criterion $C$ = { $T \in Tweets \mid T$ is among the 1000 most recent tweets }. A root TB-argument is computed for $Q$, $C$ and $Tweets$, obtaining an associated prevailing sentiment (neutral). If $|R|$ is above a given threshold value, the algorithm computes the most frequent word $d$ in $R$ such that $d$ is not already present in $Q \cup Stopwords$. The underlying idea is that any sibling node in the same level of the tree refers to terms not appearing already in previous siblings (from left to right). The set $Stopwords$ will usually include terms such as the, as, which, etc. In our example, $d$ = "coalition". A new TB-argument can now be built for query $Q'$ = {"#greece"} $\cup$ {"coalition"}, criterion $C$ and the prevailing sentiment calculated for the new subset of tweets $Tweets_{QNew}$. In the recursive call, the most frequent word is calculated for this subset (obtaining the result "democratic"), so that a new TB-argument for the query {"#greece", "coalition", "democratic"} is obtained, with a new associated prevailing sentiment. Note that within a particular instance of the recursive call, the REPEAT loop ensures that alternative ways of "extending" $Q$ are considered by adding a particular descriptor $d$, not repeating $d$ within the sibling nodes at a given level of the tree. The process is repeated until the threshold has been reached. At the end, the resulting opinion tree $OT_Q$ is returned.

[Figure 2: opinion tree for the original query $Q$ = {"#greece"}, with levels $|Q|$, $|Q| + 1$ and $|Q| + 2$, a legend for positive, negative and neutral sentiment, and node labels including coalition, via, #spain, #eurozone, #holiday, government, democratic, #italy, #ecb, #austerity and #crete.]
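The recursive construction can be sketched in Python as follows. This is one possible reading of BuildOT under stated assumptions, not the authors' prototype: the tree is a nested dictionary, the criterion accepts every tweet, `toy_sent` is a hypothetical sentiment function, ties between equally frequent words are broken alphabetically, and each recursive call works on the tweets covered by the extended query.

```python
from collections import Counter

STOPWORDS = {"the", "as", "which"}


def agg_tweets(tweets, query):
    """Tweets (tuples of words) that contain every descriptor of `query`."""
    return [t for t in tweets if query <= set(t)]


def build_ot(tweets, query, sent, threshold=1):
    """Return an opinion tree as nested dicts with query, sentiment and children."""
    arg = agg_tweets(tweets, query)
    node = {"query": frozenset(query), "sentiment": sent(arg), "children": []}
    remaining = list(arg)  # plays the role of R in Fig. 1
    while len(remaining) > threshold:
        counts = Counter(
            w for t in remaining for w in t
            if w not in query and w not in STOPWORDS
        )
        if not counts:  # W is empty: nothing left to extend the query with
            break
        word = max(sorted(counts), key=counts.get)  # alphabetical tie-break
        child_query = set(query) | {word}
        covered = agg_tweets(remaining, child_query)
        node["children"].append(build_ot(covered, child_query, sent, threshold))
        # Drop the covered tweets so later siblings use fresh descriptors.
        remaining = [t for t in remaining if word not in t]
    return node


def toy_sent(tweet_set):
    """Hypothetical sentiment: 'neg' if most tweets mention 'crisis', else 'neutral'."""
    negative = sum("crisis" in t for t in tweet_set)
    return "neg" if 2 * negative > len(tweet_set) else "neutral"


tweets = [
    ("#greece", "coalition", "democratic"),
    ("#greece", "coalition", "crisis"),
    ("#greece", "#spain", "crisis"),
    ("#greece", "holiday"),
]
tree = build_ot(tweets, {"#greece"}, toy_sent, threshold=1)
```

On this toy corpus the root is neutral and has two children, for {"#greece", "coalition"} (neutral) and {"#greece", "#spain"} (negative). The loop terminates because every iteration removes at least one tweet from `remaining`, and every recursive call adds a descriptor to the query.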

Figure 2 illustrates what the construction of an opinion tree for the query $Q$ = {"#greece"} looks like. Distinguished symbols ("+", "-", "=") are used to denote positive, negative and neutral sentiments, respectively. Note that the original query $Q$ has cardinality 1, and further levels in the opinion tree refer to incrementally extended queries (e.g. {"#greece", "coalition"}, or {"#greece", "#spain"}). Leaves correspond to arguments associated with a query $Q'$ which cannot be further expanded, as the associated number of tweets is too small for any possible query $Q' \cup W$. Furthermore, we can identify some subtrees in the opinion tree for {"#greece"} which consist of nodes that all have the same sentiment. In other words, further expanding a query into more complex queries does not change the prevailing sentiment associated with the root node. In other cases, expanding some queries results in a sentiment change (e.g. from {"#greece"} into {"#greece", "#spain"}). This situation will allow us to characterize conflict trees, in which we take into account opinions that attack each other, as discussed in the next section.

3 Conflict trees

Next we will provide a formal definition of conflict between TB-arguments. Intuitively, a conflict will arise whenever two arguments for similar queries lead to conflicting sentiments, assuming that the involved queries are related to each other by the subsumption relationship.

Definition 9 (Argument Attack). Given a TB-framework with TB-arguments $\langle Arg_1, Q_1, Sent_1 \rangle$ and $\langle Arg_2, Q_2, Sent_2 \rangle$ such that $Q_1$ subsumes $Q_2$, we say that $\langle Arg_2, Q_2, Sent_2 \rangle$ attacks $\langle Arg_1, Q_1, Sent_1 \rangle$ whenever $Sent_1$ and $Sent_2$ are in conflict.

Example 3. Consider two queries $Q_1$ = {"#greece"} and $Q_2$ = {"#greece", "#eurozone"}, such that $\langle Arg_1, Q_1, neutral \rangle$ and $\langle Arg_2, Q_2, negative \rangle$ are TB-arguments. Then $\langle Arg_2, Q_2, negative \rangle$ attacks $\langle Arg_1, Q_1, neutral \rangle$.

Note that in the previous situation, adding the descriptor "#eurozone" to the original query "#greece" involves a sentiment change. We will formalize this situation as follows:

Definition 10 (Sentiment-Preserving and Sentiment-Shifting Descriptor). Given an argument $\langle A_1, Q, Sent_1 \rangle$, $d$ is a sentiment-preserving (resp. sentiment-shifting) descriptor wrt $Q$ whenever there exists an argument $\langle A_2, Q \cup \{d\}, Sent_2 \rangle$ such that $Sent_1$ and $Sent_2$ are non-conflicting (resp. conflicting). The argument $\langle A_2, Q \cup \{d\}, Sent_2 \rangle$ will be called a sentiment-preserving (resp. sentiment-shifting) argument.

Given a particular query $Q$, note that several alternative expansions (supersets of $Q$) can be identified. We are interested in identifying the smallest superset of $Q$ which is associated with a sentiment-shifting argument. This gives rise to the following definition:

Definition 11 (Minimal-Shift Descriptor. Minimal-Shifting Relation). Let $(Tweets, C, S, Sent, conflict)$ be a TB-framework. Given two conflicting arguments $\langle A_1, Q_1, Sent_1 \rangle$ and $\langle A_2, Q_2, Sent_2 \rangle$, we will say that $Q_2$ is a minimal-shift descriptor wrt $Q_1$ iff $\langle A_2, Q_2, Sent_2 \rangle$ is a sentiment-shifting argument wrt $Q_1$ and $\nexists\, Q' \subset Q_2$ such that $\langle A', Q', Sent' \rangle$ is a sentiment-shifting argument wrt $Q_1$.

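We define a minimal-shifting relation "$\Rightarrow_{min}$" as follows: $\langle A_1, Q_1, Sent_1 \rangle \Rightarrow_{min} \langle A_2, Q_2, Sent_2 \rangle$ iff $\langle A_2, Q_2, Sent_2 \rangle$ attacks $\langle A_1, Q_1, Sent_1 \rangle$ and $Q_2$ is a minimal-shift descriptor wrt $Q_1$.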

Definition 12 (Conflict tree). Let $(Tweets, C, S, Sent, conflict)$ be a TB-framework. Given a query $Q$ and its associated argument $\langle A, Q, Sent \rangle$, we will define a conflict tree for $Q$ (denoted $C(Q)$) recursively as follows:

1. If there is no $\langle A_i, Q_i, Sent_i \rangle$ such that $\langle A, Q, Sent \rangle \Rightarrow_{min} \langle A_i, Q_i, Sent_i \rangle$, then $C(Q)$ is a conflict tree consisting of a single node $\langle A, Q, Sent \rangle$.
2. Let $\langle A_1, Q_1, Sent_1 \rangle$, $\langle A_2, Q_2, Sent_2 \rangle$, ..., $\langle A_k, Q_k, Sent_k \rangle$ be those arguments in the TB-framework such that $\langle A, Q, Sent \rangle \Rightarrow_{min} \langle A_i, Q_i, Sent_i \rangle$ (for $i = 1, \ldots, k$). Then $C(Q)$ is a conflict tree consisting of $\langle A, Q, Sent \rangle$ as the root node, with $C(Q_1), \ldots, C(Q_k)$ as its immediate subtrees.

Intuitively, a conflict tree depicts all possible ways of extending the original query $Q$ such that every extension (child node in the tree) corresponds to a sentiment change. Figure 2 illustrates what the construction of a conflict tree for the query $Q$ = {"#greece"} looks like, depicting nodes and arcs with dotted lines. Every node in the tree (except the root) is associated with a TB-argument which is a sentiment-shifting argument wrt its parent. Leaves correspond to nodes for which no further sentiment shift can be found.
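A conflict tree can be derived from an already computed opinion tree, as in the following sketch. It is an approximation under stated assumptions: only queries that appear in the opinion tree are considered, the opinion tree is given as nested dictionaries built with the hypothetical `node` helper, and the minimal-shift descriptors of a node are taken to be its closest descendants with a conflicting sentiment along each branch.

```python
CONFLICT = {
    "pos": {"neg", "neutral"},
    "neg": {"pos", "neutral"},
    "neutral": {"pos", "neg"},
}


def node(query, sentiment, children=()):
    """Hypothetical helper: an opinion-tree node as a plain dictionary."""
    return {"query": frozenset(query), "sentiment": sentiment, "children": list(children)}


def attacks(child_sentiment, parent_sentiment):
    """Definition 9, given that the child query extends the parent query."""
    return child_sentiment in CONFLICT[parent_sentiment]


def minimal_shifts(current, root_sentiment):
    """Closest descendants of `current` whose sentiment conflicts with the root's."""
    found = []
    for child in current["children"]:
        if attacks(child["sentiment"], root_sentiment):
            found.append(child)
        else:
            found.extend(minimal_shifts(child, root_sentiment))
    return found


def conflict_tree(root):
    """Recursive construction following the two cases of Definition 12."""
    shifts = minimal_shifts(root, root["sentiment"])
    return {
        "query": root["query"],
        "sentiment": root["sentiment"],
        "children": [conflict_tree(n) for n in shifts],
    }


opinion_tree = node({"#greece"}, "neutral", [
    node({"#greece", "coalition"}, "neutral", [
        node({"#greece", "coalition", "crisis"}, "neg"),
    ]),
    node({"#greece", "#spain"}, "neg", [
        node({"#greece", "#spain", "#holiday"}, "pos"),
    ]),
])
ct = conflict_tree(opinion_tree)
```

In the result, the sentiment-preserving node {"#greece", "coalition"} is skipped, so the root is attacked directly by {"#greece", "coalition", "crisis"} and by {"#greece", "#spain"}; the latter is in turn attacked by {"#greece", "#spain", "#holiday"}.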

4 Generalizing Opinion and Conflict Trees as Superior Lattices

Next we will show a formal lattice-based characterization of our approach. Superior lattices will account for a more generic view of opinion trees, identifying relevant sublattices based on an equivalence relation between TB-arguments. First we will review some background definitions to make our presentation self-contained.

Definition 13 (Partial Order. Partially Ordered Set). A partial order is a binary relation "$\preceq$" over a set $A$ which is reflexive, antisymmetric, and transitive, i.e., for all $a$, $b$, and $c$ in $A$, we have that (1) $a \preceq a$ (reflexivity); (2) if $a \preceq b$ and $b \preceq a$ then $a = b$ (antisymmetry); (3) if $a \preceq b$ and $b \preceq c$ then $a \preceq c$ (transitivity). A set with a partial order is called a partially ordered set (or just ordered set).

Definition 14 (Cover Relation). Given an ordered set $(A, \preceq)$, for two elements $a, b \in A$ we use $a \prec b$ to specify that $a \preceq b$ and $a \neq b$. Let $(A, \preceq)$ be an ordered set. Then for any $a, b \in A$ we say that $a$ covers $b$ if $b \prec a$ and there is no $c \in A$ such that $b \prec c \prec a$.

Definition 15 (Tree Order). An ordered set $(A, \preceq)$ is a tree if (1) there is a unique $a \in A$ such that $b \preceq a$ for all $b \in A$, and (2) for all $a, b, c \in A$, if $b$ covers $a$ and $c$ covers $a$, then $b = c$.

Definition 16 (Superior Lattice. Inferior Lattice. Lattice). Let $(A, \preceq)$ be an ordered set. Then for any $a, b \in A$, we will say that $c \in A$ is the least upper bound of $a$ and $b$ (also called the join of $a$ and $b$), denoted $c = a \vee b$, whenever i) $a \preceq c$ and $b \preceq c$; ii) if for $x \in A$ it holds that $x \succeq a$ and $x \succeq b$, then $c \preceq x$. An ordered set $(A, \preceq)$ is a superior lattice whenever for any pair of elements $a, b \in A$ there is a least upper bound element in $A$. The notions of greatest lower bound (or meet), denoted $c = a \wedge b$, and inferior lattice are defined analogously as the duals of the notions of least upper bound and superior lattice. An ordered set $(A, \preceq)$ that is both a superior lattice and an inferior lattice is called a lattice.

Definition 17 (Join-Homomorphism. Meet-Homomorphism. Lattice Homomorphism). The mapping $h$ from $(X, \preceq)$ to $(Y, \preceq)$ is a join-homomorphism provided that for any $a, b \in X$, $h(a \vee b) = h(a) \vee h(b)$. It is also said that "$h$ preserves joins." The notion of meet-homomorphism is defined analogously as the dual of the notion of join-homomorphism. The mapping $h$ is a lattice homomorphism if it is both a join-homomorphism and a meet-homomorphism.

In the rest of this section we will show that the above definitions provide a solid mathematical foundation for the study of TB-arguments. Note, in the first place, that the ordered set $(2^{Tweets}, \subseteq)$ is a lattice, as is the case for any power set of a given set, ordered by inclusion. The join is given by the union and the meet by the intersection of the subsets. More interestingly, it can be shown that for any query $Q$, the resulting opinion tree $OT_Q$ associated with $Q$ defines a tree order (see Def. 15).
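The claim about the power set can be verified by brute force on a small example. The sketch below computes least upper bounds and greatest lower bounds directly from Definition 16, over all subsets of three hypothetical tweet identifiers, and compares them with set union and intersection.

```python
from itertools import combinations


def powerset(items):
    """All subsets of `items`, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def least_upper_bound(a, b, elements):
    """Join per Definition 16: an upper bound of a and b below every other upper bound."""
    uppers = [x for x in elements if a <= x and b <= x]
    return next((c for c in uppers if all(c <= x for x in uppers)), None)


def greatest_lower_bound(a, b, elements):
    """Meet, defined dually to the join."""
    lowers = [x for x in elements if x <= a and x <= b]
    return next((c for c in lowers if all(x <= c for x in lowers)), None)


tweet_sets = powerset(["t1", "t2", "t3"])
joins_are_unions = all(
    least_upper_bound(a, b, tweet_sets) == a | b
    for a in tweet_sets for b in tweet_sets
)
meets_are_intersections = all(
    greatest_lower_bound(a, b, tweet_sets) == a & b
    for a in tweet_sets for b in tweet_sets
)
```

Both flags evaluate to true for this three-element universe, which is an instance of the general fact stated above rather than a proof of it.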

Lemma 1. Let $Q$ be a query and let $OT_Q$ be an opinion tree for $Q$ in a TB-framework $(Tweets, C, S, Sent, conflict)$. Then $(OT_Q, \preceq)$ defines a tree order.⁶

On the left-hand side of Figure 3 we illustrate an opinion tree as a tree order $(OT_Q, \preceq)$. Note that each element in $OT_Q$ is of the form $\langle Arg_i, Q_i, Sentiment_i \rangle$, while the order relation "$\preceq$" is defined as $\langle Arg_1, Q_1, Sentiment_1 \rangle \preceq \langle Arg_2, Q_2, Sentiment_2 \rangle$ if and only if $Q_2 \subseteq Q_1$. In this opinion tree we have indicated that some queries are equivalent (see Def. 6). As a consequence, we can identify a quotient set, where each member is an equivalence class $[\langle Arg_i, Q_i, Sentiment_i \rangle]$ defined as follows:

$[\langle Arg_i, Q_i, Sentiment_i \rangle] = \{ \langle Arg_j, Q_j, Sentiment_j \rangle \mid Q_j \text{ is equivalent to } Q_i \}.$

We show on the right-hand side of Figure 3 the quotient set resulting from the given opinion tree. Note that this new set is a superior lattice (see Def. 16). In general, any opinion tree induces a superior lattice $OL_Q$, which we will refer to as opinion lattice. This is formally stated in the following lemma:

Lemma 2. Let $Q$ be a query and let $OL_Q$ be the quotient set of $OT_Q$ by the query equivalence relation. Then $(OL_Q, \preceq)$ is a superior lattice.
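The quotient construction amounts to grouping arguments whose queries return the same tweet set. The sketch below assumes arguments are given as (tweet set, query, sentiment) triples with tweets identified by hypothetical integer ids; it only builds the equivalence classes and does not check the lattice property.

```python
from collections import defaultdict


def quotient_by_equivalence(arguments):
    """Group queries into classes of equivalent queries (same tweet set, Def. 6)."""
    classes = defaultdict(list)
    for arg, query, _sentiment in arguments:
        classes[frozenset(arg)].append(frozenset(query))
    return dict(classes)


arguments = [
    ({1, 2, 3}, {"#greece"}, "neutral"),
    ({1, 2}, {"#greece", "coalition"}, "neutral"),
    ({1, 2}, {"#greece", "government"}, "neutral"),
    ({1}, {"#greece", "coalition", "democratic"}, "pos"),
]
classes = quotient_by_equivalence(arguments)
merged = classes[frozenset({1, 2})]
```

Here the queries {"#greece", "coalition"} and {"#greece", "government"} select the same two tweets, so they collapse into a single class and the four arguments yield three members of the quotient set.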

Although an opinion lattice is typically more compact than an opinion tree, we might be interested in finding the minimal structure that reflects all existing conflicts between opinions for a given query $Q$. In other words, we want to build a minimal superior lattice $(CL_Q, \preceq)$ such that it is possible to define a join-homomorphism $h$ (see Def. 17) from $(OL_Q, \preceq)$ to $(CL_Q, \preceq)$. In addition, we will require that whenever $h(\langle A_i, Q_i, Sent_i \rangle) = \langle A_j, Q_j, Sent_j \rangle$ then $Sent_i$ and $Sent_j$ are non-conflicting. We will call $CL_Q$ the conflict lattice for $Q$. By applying a partitioning algorithm it is possible to obtain a conflict lattice from any opinion lattice. This transformation is illustrated in Figure 4.

⁶ Proofs not included for space reasons.

5 Discussion

Our mathematical characterization of opinion and conflict trees as superior lattices provides a natural foundation for the analysis of important concepts prevailing in argumentation theory. In particular, the use of conflict lattices to represent diverging arguments leads to the identification of the minimal structure that reflects the existing collective positions with respect to a topic of interest.

Argument specificity is a key notion in argumentation theory, as it is the first purely syntactic preference criterion proposed to compare arguments. In our framework, specificity goes hand in hand with the "$\preceq$" relation identified in the resulting superior lattices. The use of minimal structures to represent conflicting views facilitates the identification of specificity relations as well as the recognition of relevant (or irrelevant) elements in the argumentation space, as formalized by the notions of sentiment-shifting descriptors (or sentiment-preserving descriptors). Similarly, the minimal-shift relation "$\Rightarrow_{min}$" can be intuitively studied in the light of the resulting mathematical structures.

It must be remarked that our dialectical analysis of TB-arguments aims at modeling the possible space of alternatives associated with different (incrementally more specific) queries. In contrast, the dialectical analysis in standard argumentation frameworks [4, 5] aims at determining the ultimate status of a given argument at issue (in terms of some acceptability semantics).

6 Related Work

Our approach is inspired by recent research in integrating argumentation and social networks. In recent years, there has been growing interest in assigning meaning to streams of data from microblogging services such as Twitter, as well as some recent research on using argumentation for social networks.

To the best of our knowledge, Torroni & Toni [7] were the first to combine social networks and argumentation in a unified approach, coining the term bottom-up argumentation for the grass-roots approach to the problem of deploying computational argumentation in online systems. In this novel view, argumentation frameworks are obtained bottom-up starting from the users' comments, opinions and suggested links, with no top-down intervention of or interpretation by "argumentation engineers". As the authors point out, "topics emerge, bottom-up, during the underlying process, possibly serendipitously". In contrast with that proposal, in this paper we generalize this view by identifying arguments automatically from Twitter messages, also establishing conflict relationships in terms of sentiment analysis (and not specified at the meta-level using rules, as is the case in [7]). In [8], Abbas and Sawamura formalize argument mining from the perspective of intelligent tutoring systems. In contrast with our approach, they rely on a relational database, and their aim is not related to identifying arguments underlying social networks as done in this paper. In [9], Leite and Martins introduce a novel extension to Dung's abstract argumentation model, called Social Abstract Argumentation. Their proposal aims at providing a formal framework for social networks and argumentation, incorporating social voting and defining a new class of semantics for the resulting frameworks. In contrast with our approach, neither the automatic extraction of arguments from social network data nor the modeling of conflicts between arguments in terms of sentiment analysis is considered. In [10], Amgoud and Serrurier propose a formal argumentation-based model for classification, which generalizes the well-known concept learning model based on version spaces [11]. The framework shares some structural similarities with our approach (as a lattice-based characterization is also involved when contrasting hypotheses). However, the aims of the two approaches are different, as our proposal is not focused on solving classification tasks in a machine learning sense.

A related research area is formal concept analysis [12], which is a method for deriving conceptual structures out of data. As in our approach, the theory of partial orders is used to formally characterize these structures. However, it differs from our proposal in dealing with concepts rather than opinions and in not attempting to associate sentiments with the elements of the partial order. In addition, it does not deal with notions such as argument, conflict and attack.

It must be remarked that the rise of social media such as blogs and social networks has fueled interest in sentiment analysis. With the proliferation of reviews, ratings, recommendations and other forms of online expression, online opinion has turned into a kind of virtual currency for businesses looking to market their products, identify new opportunities and manage their reputations. Several research teams in universities around the world currently focus on understanding the dynamics of sentiment in e-communities through sentiment analysis. The EU-funded Cyberemotions consortium (see footnote 7) was created in 2009 to better understand collective emotional phenomena in cyberspace, with the help of knowledge and methods from the natural, social, and engineering sciences. Within this project, Thelwall et al. [13, 14] carried out a number of experiments to assess the feasibility of sentiment analysis within social networks, with a particular focus on Twitter. In contrast with our approach, that work considers neither opinion mining nor the analysis of alternative opinions (as modelled by conflict trees in our proposal).

Conclusions and Future Work

In this paper we have presented a novel approach which integrates argumentation theory and microblogging technologies, with a particular focus on Twitter. To the best of our knowledge, no other approach has been developed in a similar direction. We have also presented a definition of a Twitter-based argument for a query Q, whose support is the set of tweets associated with Q according to a given criterion. For such an argument, we also define a prevailing sentiment, obtained by means of sentiment analysis tools. This allowed us to characterize the notion of opinion tree, which can be recursively built by considering arguments associated with incrementally extended queries. We have implemented a prototype of our proposal as a proof of concept, which was used to compute the opinion tree for the case study presented in the paper.
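The recursive construction summarized above can be illustrated with a minimal sketch. It is not the prototype mentioned in the text: the toy corpus, the names `OpinionNode`, `matching`, `prevailing_sentiment` and `build_opinion_tree`, the word-containment matching criterion, the majority-vote sentiment, and the `max_depth` and `min_support` cut-offs are all assumptions made only for illustration. In a real setting, tweets would come from Twitter and their labels from a sentiment analysis tool.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical pre-labelled corpus: (tweet text, sentiment label).
TWEETS = [
    ("new phone great camera", "positive"),
    ("new phone great screen", "positive"),
    ("new phone battery awful", "negative"),
    ("new phone battery drains fast", "negative"),
    ("new phone battery fine", "positive"),
]


@dataclass
class OpinionNode:
    query: frozenset          # set of query terms
    sentiment: str            # prevailing sentiment of the supporting tweets
    support: int              # number of supporting tweets
    children: list = field(default_factory=list)


def matching(query, tweets):
    """Assumed criterion: a tweet supports a query if it contains all its terms."""
    return [(text, label) for text, label in tweets if query <= set(text.split())]


def prevailing_sentiment(support):
    """Majority label among the supporting tweets; ties (or no support) give 'neutral'."""
    ranked = Counter(label for _, label in support).most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        return "neutral"
    return ranked[0][0]


def build_opinion_tree(query, tweets, max_depth=2, min_support=2):
    """Recursively extend the query by one term taken from its supporting tweets."""
    support = matching(query, tweets)
    node = OpinionNode(query, prevailing_sentiment(support), len(support))
    if max_depth == 0:
        return node
    candidate_terms = sorted({w for text, _ in support for w in text.split()} - query)
    for term in candidate_terms:
        extended = query | {term}
        if len(matching(extended, tweets)) >= min_support:
            node.children.append(
                build_opinion_tree(extended, tweets, max_depth - 1, min_support)
            )
    return node


root = build_opinion_tree(frozenset({"phone"}), TWEETS, max_depth=1)
for child in root.children:
    print(sorted(child.query), child.sentiment, child.support)
```

In this sketch the same extended query can be reached along different branches (for example by adding two terms in either order), so a deeper tree may contain repeated queries; the depth bound keeps the example small.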

We have also presented a theoretical setting for analyzing Twitter-based arguments, associating a superior lattice rooted in the initial argument for the first given query. Based on the notion of attack between arguments, we have also established a refined order relationship between conflicting arguments. As a result, from every superior lattice associated with a given query Q, a conflict tree rooted in Q can be built, in which alternating opinions can be better contrasted. Given a node A (argument) associated with query Q0 with a prevailing sentiment s, every child node of A in a conflict tree corresponds to an argument for a more specific query Q0, which is in conflict with A as it is associated with a sentiment shift. Conflict trees allow us to explore the space of possible confronting opinions associated with a given opinion, using the specificity principle as traditionally used in argumentation for preferring arguments.
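As an illustration of how a conflict tree might be derived, the sketch below starts from an already computed tree of arguments and keeps, for each node, only the nearest more specific arguments whose sentiment differs from its own. The `Node` class, the `attackers` and `conflict_tree` functions, the example queries and the "nearest descendant with a different sentiment" reading of attack are assumptions made for this example; the formal definitions given earlier in the paper take precedence.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    query: str                # query associated with the argument
    sentiment: str            # prevailing sentiment of the argument
    children: list = field(default_factory=list)


def attackers(node, sentiment):
    """Nearest descendants of `node` whose sentiment differs from `sentiment`."""
    found = []
    for child in node.children:
        if child.sentiment != sentiment:
            found.append(child)                          # sentiment shift: a conflict
        else:
            found.extend(attackers(child, sentiment))    # same sentiment: look deeper
    return found


def conflict_tree(node):
    """Keep only conflicting, more specific arguments, level by level."""
    return Node(
        node.query,
        node.sentiment,
        [conflict_tree(a) for a in attackers(node, node.sentiment)],
    )


# Hypothetical opinion tree for the query "phone".
opinion = Node("phone", "positive", [
    Node("phone battery", "negative", [
        Node("phone battery fine", "positive"),
        Node("phone battery awful", "negative"),
    ]),
    Node("phone great", "positive", [
        Node("phone great price", "negative"),
    ]),
])

ct = conflict_tree(opinion)


def show(node, depth=0):
    """Print the tree with indentation proportional to depth."""
    print("  " * depth + f"{node.query} [{node.sentiment}]")
    for child in node.children:
        show(child, depth + 1)


show(ct)
```

Along every branch of the resulting tree the sentiment alternates, which mirrors the way arguments and counterarguments alternate in a dialectical tree. Treating any label difference (including a neutral label) as a conflict is a simplification of this sketch.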

Part of our future work is associated with deploying the ideas presented in this paper in a software product. As a basis for such deployment, visual tools for displaying and analyzing dialectical trees have already been developed for Defeasible Logic Programming [15]. We expect to use the underlying algorithms from these tools in our framework. We also expect to perform different experiments with hashtags associated with relevant topics, assessing the applicability of our approach in a real-world context. Finally, we are working on extending the current Twitter-based model to a more generic setting, in which opinions are collected from other social networks (such as Facebook; see footnote 8). Research in this direction is currently being pursued.

7 http://www.cyberemotions.eu/

Acknowledgments: This research is funded by Projects LACCIR R1211LAC004 (Microsoft Research, CONACyT and IDB), PIP 112-200801-02798, PIP 112-20090100863 (CONICET, Argentina), PGI 24/ZN10, PGI 24/N006, PGI 24/N029 (SGCyT, UNS, Argentina) and Universidad Nacional del Sur.

8 http://www.facebook.com

References

1. Martineau, J.: Identifying and Isolating Text Classification Signals from Domain and Genre Noise for Sentiment Analysis. PhD thesis, University of Maryland, Baltimore County, USA (2011)

2. Pak, A., Paroubek, P.: Twitter as a corpus for sentiment analysis and opinion mining. In Calzolari, N., Choukri, K., Maegaard, B., Mariani, J., Odijk, J., Piperidis, S., Rosner, M., Tapias, D., eds.: LREC, European Language Resources Association (2010)

3. Prakken, H., Reed, C., Walton, D.: Argumentation schemes and generalizations in reasoning about evidence. In: ICAIL. (2003) 32-41

4. Besnard, P., Hunter, A.: The Elements of Argumentation. The MIT Press, London, UK (2008)

5. Rahwan, I., Simari, G., eds.: Argumentation in Artificial Intelligence. Springer Verlag (2009)

6. García, A.J., Simari, G.R.: Defeasible logic programming: An argumentative approach. TPLP 4(1-2) (2004) 95-138

7. Torroni, P., Toni, F.: Bottom up argumentation. In: Proc. of First Intl. Workshop on Theoretical and Formal Argumentation (TAFA). IJCAI 2011, Barcelona, Spain. (2011)

8. Abbas, S., Sawamura, H.: Argument mining based on a structured database and its usage in an intelligent tutoring environment. Knowl. Inf. Syst. 30(1) (2012) 213-246

9. Leite, J., Martins, J.: Social abstract argumentation. In Walsh, T., ed.: IJCAI, IJCAI/AAAI (2011) 2287-2292

10. Amgoud, L., Serrurier, M.: Agents that argue and explain classifications. Autonomous Agents and Multi-Agent Systems 16(2) (2008) 187-209

11. Mitchell, T.M.: Generalization as search. Artif. Intell. 18(2) (1982) 203-226

12. Ganter, B., Wille, R.: Formal concept analysis - mathematical foundations. Springer (1999)

13. Thelwall, M., Buckley, K., Paltoglou, G.: Sentiment strength detection for the social web. JASIST 63(1) (2012) 163-173

14. Thelwall, M., Buckley, K., Paltoglou, G.: Sentiment in twitter events. JASIST 62(2) (2011) 406-418

15. Modgil, S., Toni, F., Bex, F., Bratko, I., Chesñevar, C., Dvořák, W., Falappa, M.A., Gaggl, S.A., García, A.J., González, M.P., Gordon, T.F., Leite, J., Možina, M., Reed, C., Simari, G.R., Szeider, S., Torroni, P., Woltran, S.: The Added Value of Argumentation: Examples and Challenges. In: Handbook of Agreement Technologies. Springer (2012) (in press)