Expressing Opinion Diversity

Andreea Bizău — Faculty of Mathematics and Computer Science, Babeș-Bolyai University, Cluj-Napoca, Romania — andreea.bizau@gmail.com
Delia Rusu — Artificial Intelligence Laboratory, Jožef Stefan Institute, Ljubljana, Slovenia — delia.rusu@ijs.si
Dunja Mladenić — Artificial Intelligence Laboratory, Jožef Stefan Institute, Ljubljana, Slovenia — dunja.mladenic@ijs.si

ABSTRACT
This paper describes a natural language processing methodology for identifying opinion diversity expressed within text. We achieve this by building a domain-driven opinion vocabulary, in order to be able to identify domain specific words and expressions. As a use case scenario, we consider Twitter comments related to movies, and try to capture opinion diversity by employing an opinion vocabulary, which we generate based on a corpus of IMDb movie reviews.

Categories and Subject Descriptors
I.2.7 [Natural Language Processing]: Text analysis.

General Terms
Algorithms, Design.

Keywords
Opinion mining, natural language processing, social networks.

1. INTRODUCTION
Information is expressed on the Web under a variety of forms, some of them more formal and standardized, like news articles, others more spontaneous and ad-hoc, like blogs or microblogs. One challenge is to tap into these sources and allow for a diverse representation of information on the same topic, presenting different points of view, opinions and arguments.

In this work we describe a natural language processing methodology for discovering the diversity of opinions expressed within text, which we deem to be an essential step towards expressing and presenting diverse information on the Web. In this context, we consider an opinion to be a subjective expression of sentiments, appraisals or feelings, and opinion words to be a set of keywords/phrases used in expressing an opinion. The orientation of an opinion word indicates whether the opinion expressed is positive, negative or neutral, while the totality of opinion words forms an opinion vocabulary. While opinion words can be analyzed in their base form (describing and conveying the opinion directly) and comparative form (conveying the opinion indirectly, by comparison with other entities), this research focuses only on base type opinion words.

In the context of the ever expanding world of social media and user generated content, instant access, world-wide coverage and diversity of perspective are the norm of the information flow. As an application of our approach, we propose to study the movie domain. There is a strong user interest in watching, tracking and discussing movies, generating highly diverse opinion content. Movies are subject to a variety of classifications, expanding the field of analysis. Moreover, the lifespan of a movie topic is longer than for usual topics, thus introducing a temporal dimension that can be further explored. Nowadays, accessing and assessing public opinion has taken on a new form. Social networking encourages the exchange of information and the sharing of opinions between individuals, friends and communities. Therefore, in our case study we directly address movie comments, as posted on Twitter, a popular social networking and microblogging website, and aim at identifying the diversity of opinions expressed in tweets related to movies. We determine a variety of polarized opinion words about a certain movie, and use these word frequency counts to obtain an overall aggregated opinion about the movie. Moreover, we can observe variations in opinions over time, related to a certain movie, by comparing the word frequency counts obtained from tweets belonging to a time interval (e.g. an hour, day, week).

The paper is structured as follows: in Section 2 we describe our algorithm for constructing a domain-driven opinion vocabulary, while Section 3 presents the Twitter movie comments use case. The last section of the paper is dedicated to conclusions and future work.

2. DOMAIN DRIVEN OPINION VOCABULARY
We start from the idea that expressing opinions is dependent on the topic's context, and we focus on the role of adjectives as opinion indicators; in the future we plan to broaden this line of work by including verbs and adverbs. The starting point is a domain-specific corpus, from which we determine a small number of seed opinion words that we further extend, thus forming a domain-driven opinion vocabulary.

There are three main approaches to constructing an opinion vocabulary: manual, dictionary based and corpus based. The manual approach is not in line with our work, as we are considering automatic, scalable approaches. The dictionary based approach provides a simple and efficient way of obtaining a good vocabulary. SentiWordNet [3] is a publicly available lexical resource; it tags all WordNet [4] synsets with three numerical scores (objective, positive, negative), offering a general opinion vocabulary with good coverage. However, the dictionary-based approach cannot account for the domain specific orientation of words, nor can it identify domain specific words and expressions. As an example, consider the word unpredictable. In most situations it expresses an undesirable quality (e.g. unpredictable car behavior), thus its orientation is negative; but in the movie domain, an unpredictable plot is something desired and indicates a positive opinion. In order to account for domain specificity, we decided to employ a corpus based approach.

Hatzivassiloglou and McKeown [6] showed the relevance of using connectives in gathering information about the orientation of conjoined adjectives. They emphasized that conjoined adjectives have the same orientation for most connectives, with some connectives reversing the relationship. The connectives are conjunctions used to join two or more adjectives together.
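The conjunction heuristic can be illustrated with a minimal sketch. The POS-tagged toy sentence and the window-based pairing below are illustrative assumptions for exposition only, not the parser-based procedure used in the paper:

```python
# Sketch of the conjunction heuristic: adjectives joined by "and"/"or"/"nor"
# are assumed to share orientation, those joined by "but"/"yet" to have
# opposite orientation. Toy adjacency-based pairing; the paper itself
# relies on a full parse tree.
SAME, OPPOSITE = "ContextSame", "ContextOpposite"
SAME_CONJ = {"and", "or", "nor"}
OPP_CONJ = {"but", "yet"}

def adjective_pairs(tagged_tokens):
    """tagged_tokens: list of (word, pos) pairs; returns (adj1, adj2, relation) triples."""
    pairs = []
    for i, (word, pos) in enumerate(tagged_tokens):
        if word in SAME_CONJ | OPP_CONJ:
            # nearest adjective (JJ) on each side of the conjunction
            left = [w for w, p in tagged_tokens[:i] if p == "JJ"]
            right = [w for w, p in tagged_tokens[i + 1:] if p == "JJ"]
            if left and right:
                rel = SAME if word in SAME_CONJ else OPPOSITE
                pairs.append((left[-1], right[0], rel))
    return pairs

sent = [("The", "DT"), ("action", "NN"), ("is", "VBZ"), ("mindless", "JJ"),
        ("and", "CC"), ("cliche", "JJ"), (",", ","), ("but", "CC"),
        ("amusing", "JJ")]
print(adjective_pairs(sent))
# → [('mindless', 'cliche', 'ContextSame'), ('cliche', 'amusing', 'ContextOpposite')]
```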
In our algorithm we used a subset of the possible conjunctions (and, or, nor, but, yet) that cover many common syntactic patterns and are easy to correlate with the adjectives they connect.

Other lines of research, like Kim and Hovy [7], try to identify opinion expressions together with their opinion holder, starting from a word seed list and using the WordNet synsets to determine the strength of the opinion orientation for the identified opinion words. Gamon and Aue [5] extend the Turney-style [9] approach of assigning opinion orientation to the determined candidate words, working under the assumption that in the opinion domain, opinion terms with similar orientation tend to co-occur at sentence level, while terms with opposite orientation do not.

Jijkoun et al. [10] propose a different style of approach, starting from an existing lexicon (clues) and focusing it. They perform dependency parsing on a set of relevant documents, resulting in triplets (clue word, syntactic context, target of sentiment) that represent the domain specific lexicon. Kanayama and Nasukawa [11] apply the idea of context coherency (the same polarity tends to appear successively) to the Japanese language. Starting from a list of polar atoms (the minimum syntactic structure specifying polarity in a predicative expression), they determine a list of domain specific words using the overall density and precision of coherency in the corpus. Pan et al. [12] propose a cross-domain classification method. Starting from a set of labeled data in a source domain and determining domain-independent words (features) that occur both in the source and the target domain, they construct a feature bipartite graph that models the relationship between domain-specific words and independent words. To obtain the domain specific words they use an adapted spectral clustering algorithm on the feature graph.

Based on these premises, we propose a method to construct an opinion vocabulary by expanding a small set of initial (seed) words with the aid of connectives. The method consists of four steps, as follows:

1. Given a positive word seed list and a negative word seed list, and making use of WordNet's synsets, we expand the initial seed lists based on the synonymy / antonymy relations. The initial words are assigned a score of 1 for positive words and -1 for negative words, respectively. We compute the orientation score for each newly found word by recursively processing the synsets for each seed word. A word can be found in synsets corresponding to different seed words, either in a synonymy or an antonymy relation. Another factor we take into account is the distance between the seed word and the currently processed word, as provided by the WordNet hierarchy. From these two considerations, a more formal way to compute the score s(w) of a word w to be added to the seed list is:

s(w) = r(o, w) * f^dist(o, w) * s(o)

where

r(o, w) = 1 if w is in a synonymy relation with o, and r(o, w) = -1 if w is in an antonymy relation with o,

and o is a seed word, dist(o, w) is the distance between o and w in the WordNet hierarchy, while f is a parameter for which we empirically assigned values between 0 and 1 (in our current implementation f = 0.9); in our future work we plan to determine its value by optimization. The result of this step is an expanded seed word list together with the orientation score of each word.

2. From a corpus of documents, we parse and extract all the adjectives and conjunctions, constructing a set of relationships between the determined words. There can be two types of relationships, indicating whether two or more words have the same context orientation (words connected by and, or, nor) or opposite orientation (words connected by but, yet). We will refer to them in the following algorithms as ContextSame and ContextOpposite relations, respectively.

Based on the determined relations, we can then construct a relationship graph G(W, E), where
- W = {set of determined adjectives} and
- E = {wiwj, where wi, wj from W, if there is a determined relationship between wi and wj}, each edge having a positive weight for the ContextSame relationship and a negative weight for the ContextOpposite relationship.

In what follows, we describe the algorithm for building the relationship graph G (see Figure 1).

1. G = ({}, {})
2. foreach document d in corpus
3.   foreach sentence s in d
4.     parseTree = GetParseTree(s)
5.     {w, c} = RetrieveWordsAndConjunctions(parseTree)
6.     ConstructRelationGraph(G, {w, c})
7.     HandleNegation(G, s)

Figure 1. The algorithm for constructing the relationship graph G.

We used a maximum entropy parser¹ to retrieve a sentence's parse tree, which we then analyze in the RetrieveWordsAndConjunctions procedure. We construct an adjective stack w and a conjunction stack c by extracting the relevant nodes according to their part-of-speech tags, and group them together based on the common parent node between the adjective nodes and the conjunction nodes. In ConstructRelationGraph, we add a node for each newly found adjective and add new edges to the relationship graph G according to each conjunction's behavior. Each edge has an associated weight with values between 0 and 1, determined by optimization. We handle the presence of negation in the sentence by reversing the type of the relation, if a negation is detected. For example, considering the sentence "Some of the characters are fictitious, but not grotesque", the initial relation between fictitious and grotesque would be a ContextOpposite relationship, but the presence of the negation converts it to a ContextSame relationship. We depict another example visually, in Figure 2.

Figure 2. The parse tree and analysis of the sentence "The action is mindless and cliché, but amusing". We identify mindless, cliché, amusing as adjectives (having the JJ tags), connected by and, but (having the CC tags).

¹ http://sharpnlp.codeplex.com/
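The seed expansion of step 1 can be sketched as follows. This is a minimal sketch under our own assumptions: the score decays geometrically with distance from the seed and flips sign through antonymy, and the toy synonym/antonym tables stand in for WordNet synsets; it is not the paper's implementation:

```python
# Sketch of seed-list expansion scoring: a candidate word one relation away
# from a word with score s gets sign * f * s, so a word at distance d from
# seed o ends up with roughly (+/-1) * f**d * score(o). Toy relation tables
# below are illustrative stand-ins for WordNet.
F = 0.9  # empirically chosen decay parameter, as in the paper

def expand_seed(seeds, synonyms, antonyms, max_depth=2):
    """seeds: {word: +1/-1}; synonyms/antonyms: {word: [related words]}."""
    scores = dict(seeds)
    frontier = [(w, s, 0) for w, s in seeds.items()]
    while frontier:
        word, score, depth = frontier.pop()
        if depth == max_depth:
            continue
        for relation, sign in ((synonyms, 1), (antonyms, -1)):
            for nxt in relation.get(word, []):
                new = sign * F * score  # one step away: multiply by r * f
                # keep the strongest score when a word is reachable twice
                if nxt not in scores or abs(new) > abs(scores[nxt]):
                    scores[nxt] = new
                    frontier.append((nxt, new, depth + 1))
    return scores

syn = {"good": ["great"], "great": ["superb"]}
ant = {"good": ["bad"]}
print(expand_seed({"good": 1.0}, syn, ant))
```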
3. The third step involves cleaning the resulting set of words and the relationship graph by removing stop words and self-reference relations. Consider the example "The movie has a good casting and a good plot". The algorithm detects a ContextSame relationship between the adjective good and itself. Since there is no useful context information we can use, we do not want such relations to influence the results of the scoring done in the next step.

4. In the fourth step, we determine the orientation of the words extracted from the corpus by applying an algorithm on the relationship graph obtained in the previous steps, inspired by the well-known PageRank algorithm [2]. For this, we define two score vectors, a positivity score sPos and a negativity score sNeg, respectively. We choose the final score to be the sum of the positivity and negativity scores. The sign of the score represents the word's orientation, that is, a positive score characterizes a positive opinion orientation, while a negative score characterizes a negative opinion orientation. The algorithm is presented in Figure 3, and described in what follows.

1. InitializeScoreVectors(sPos(W), sNeg(W))
2. do {
3.   foreach word wi in W
4.     foreach relation relij in relationship graph G that contains wi
5.       if relij is a ContextSame relation
6.         sPos(wi) += weight(relij) * prevSPos(wj)
7.         sNeg(wi) += weight(relij) * prevSNeg(wj)
8.       else if relij is a ContextOpposite relation
9.         sPos(wi) += weight(relij) * prevSNeg(wj)
10.        sNeg(wi) += weight(relij) * prevSPos(wj)
11.    NormalizeScores(sPos(wi), sNeg(wi))
12. } while more than 1% of the words wi in W change orientation

Figure 3. The algorithm for determining the orientation of words extracted from a corpus.

We initialize the score vectors based on the orientation scores of the expanded seed word list (see step 1). We assign the corresponding positivity or negativity score s(wj) for each adjective wj found in the seed list. For the opposite score we assign a very small value (ε), in order to allow for meaningful values when computing the score for ContextOpposite relations. A ContextSame relation reinforces the existing positive and negative scores of wi proportionally with the scores of wj. A ContextOpposite relation reinforces the negativity score of wj with respect to the positivity of wi, and the positivity score of wj with respect to the negativity score of wi.

3. USE CASE: TWITTER MOVIE COMMENTS
Concerning the movie domain, research was done on classifying movie reviews by overall document sentiment [8], but there are few lines of research connecting the movie domain with social media. Asur and Huberman [1] demonstrate how sentiments extracted from Twitter can be used to build a prediction model for box-office revenue.

Our aim is to see how well a domain specific vocabulary constructed from movie reviews performs when applied to analyzing tweets. We used a document corpus of 27,886 IMDb (Internet Movie Database) movie reviews³ and constructed a movie domain specific vocabulary according to the approach presented in Section 2. We retrieved 9,318 words, of which 4,925 have a negative orientation and 4,393 have a positive orientation. Table 1 shows a few examples of positive and negative adjectives extracted from the movie review corpus.

³ http://www.cs.cornell.edu/people/pabo/movie-review-data/

Table 1. Examples of adjectives that were extracted.
Positive words: surprised, original, chilling, undeniable, irresistible, speechless, stylized, amazed, provoking, shocking, undisputed, unforgettable, electrifying, enraptured, explosive, unanticipated, unforeseen, recommended
Negative words: breathless, syrupy, uninspiring, disturbing, forgettable, frustrating, mild, contrived, laughable, restrained, showy, preachy, amateur, dogmatic, edgeless, foreseeable, ordinary, standard, saleable, usual, predictable

For our tests, we crawled 220,387 tweets using the Twitter Search API⁶ over a two month interval, keyed on 84 movies spanning different genres and release dates. As search keywords we used the movie name and the movie tag, in order to increase the relevance of the results. We used a simple tokenizer to split the text of the retrieved tweets and kept the tokens that had a dictionary entry as adjectives. We then matched the tweet adjectives to our domain specific vocabulary. For all subsequent analysis we only considered adjectives that were used in tweets and also appeared in our vocabulary, since we were interested to see the relevance of our vocabulary in terms of actual usage and frequency over time. Without actually classifying each tweet, we counted the frequency of positive and negative opinion words that we identified in the collection of tweets. Examples of top opinion words identified for the highest and lowest ranking movies are shown in Table 2.

⁶ http://search.twitter.com/api/

Table 2. Top opinion words identified for the highest and lowest ranking movies in our search.
Inception (2010) — Positive words: good, great, awesome, amazing, favorite, fantastic, incredible, thrilling, different, speechless. Negative words: bad, confusing, weird, stupid, dumb, boring, predictable, horrible, disappointing.
Meet the Spartans (2008) — Positive words: funny, awesome, great. Negative words: bad, stupid, dumb, weird, silly, common, ridiculous, terrible.
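The matching-and-counting procedure can be sketched as follows. The tiny vocabulary and regex tokenizer below are illustrative stand-ins for the 9,318-word lexicon and the dictionary-based tokenizer used in our experiments:

```python
import re
from collections import Counter

# Sketch of the counting step: tokenize tweets, keep tokens found in the
# domain vocabulary, and tally positive vs. negative opinion words without
# classifying individual tweets. VOCAB is a toy fragment of the lexicon.
VOCAB = {"good": 1.0, "great": 0.9, "boring": -0.8, "predictable": -0.7}

def opinion_counts(tweets, vocab=VOCAB):
    counts = Counter()
    for tweet in tweets:
        for token in re.findall(r"[a-z]+", tweet.lower()):
            score = vocab.get(token)
            if score is not None:
                counts["positive" if score > 0 else "negative"] += 1
    return counts

tweets = ["#Inception was GREAT, really good",
          "Found #Inception boring and predictable"]
print(opinion_counts(tweets))  # → Counter({'positive': 2, 'negative': 2})
```

Grouping tweets by their timestamp before counting yields the per-interval (hour, day, week) counts used to track opinion variation over time.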
Table 3 presents a sample of the movies that we analyzed, showing for each movie the genre, the number of tweets, our score obtained by counting the positive opinion words, and the IMDb score. In Figure 4 we represent graphically the positive and negative opinion word counts for the movie Inception.

Table 3. A sample of the movies that we analyzed, showing for each movie the genre, number of tweets, our score obtained by counting the positive opinion words and the IMDb score.

Movie | Genre | Our score | IMDb score | Tweets
Inception (2010) | mystery, sci-fi, thriller | 66.52 | 8.9 | 19,256
Megamind (2010) | animation, comedy, family | 67.71 | 7.3 | 8,109
Unstoppable (2010) | drama, thriller | 63.67 | 7 | 15,349
Burlesque (2010) | drama, music, romance | 70.78 | 6.2 | 1,244
Meet the Spartans (2008) | comedy, war | 40.67 | 2.5 | 44
Pootie Tang (2001) | comedy, musical | 45.88 | 4.5 | 79
Matrix (1999) | action, sci-fi | 56.65 | 8.7 | 1,947
Blade Runner (1982) | drama, sci-fi, thriller | 56.65 | 8.3 | 407
Metropolis (1927) | sci-fi | 66.23 | 8.4 | 419

Figure 4. Word distribution for the movie Inception over 19,256 tweets.

In the cases presented in Table 3, there is a relationship between the number of positive opinion words and the rating from IMDb. One thing to notice is that in IMDb the movie ratings can be roughly grouped in three categories: ratings between seven and ten points accounting for good and very good movies, between five and seven points for average movies, and below five points for poor quality movies. Our positive opinion word count has a maximum of approximately 70 (or seven on a scale from zero to ten). In our future work we plan to conduct a series of experiments in order to determine if there exists a correlation between the two numbers: the IMDb rating and the number of positive opinion words. This involves collecting a higher number of movie related tweets (in the order of hundreds) in order to be able to report significant results.

4. CONCLUSION AND FUTURE WORK
In this paper, we presented an approach to identifying opinion diversity expressed within text, with the aid of a domain-specific vocabulary. As a use case, we processed a corpus of IMDb movie reviews, extracted a set of adjectives together with their opinion orientation, and used the generated opinion lexicon to analyze a different opinion source corpus, i.e. a tweet collection. For future work, we plan to further extend our algorithm to include opinion words expressed by verbs and adverbs, as well as more complex expressions. A second item is carrying out a set of experiments in order to determine the correlation between the positive opinion words for a given movie and the IMDb movie rating. Thirdly, from the lessons learned, we would look into applications in other domains, like product reviews.

5. ACKNOWLEDGMENTS
The research leading to these results has received funding from the Slovenian Research Agency and the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n°257790.

6. REFERENCES
[1] Asur, S. and Huberman, B. A. 2010. Predicting the Future With Social Media. In Proceedings of the ACM International Conference on Web Intelligence.
[2] Brin, S. and Page, L. 1998. The Anatomy of a Large-Scale Hypertextual Web Search Engine. In Proceedings of the 7th Conference on World Wide Web (WWW).
[3] Esuli, A. and Sebastiani, F. 2006. SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining. In Proceedings of the 5th LREC.
[4] Fellbaum, C. 1998. WordNet: An Electronic Lexical Database. MIT Press.
[5] Gamon, M. and Aue, A. 2005. Automatic Identification of Sentiment Vocabulary: Exploiting Low Association with Known Sentiment Terms. In Proceedings of the ACL Workshop on Feature Engineering for Machine Learning in NLP.
[6] Hatzivassiloglou, V. and McKeown, K. 1997. Predicting the Semantic Orientation of Adjectives. In Proceedings of the 35th Annual Meeting of the ACL.
[7] Kim, S.-M. and Hovy, E. 2004. Determining the Sentiment of Opinions. In Proceedings of COLING.
[8] Pang, B. and Lee, L. 2002. Thumbs up? Sentiment Classification Using Machine Learning Techniques. In Proceedings of EMNLP.
[9] Turney, P. D. 2002. Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews. In Proceedings of the 40th Annual Meeting of the ACL.
[10] Jijkoun, V., de Rijke, M. and Weerkamp, W. 2010. Generating Focused Topic-Specific Sentiment Lexicons. In Proceedings of the 48th Annual Meeting of the ACL.
[11] Kanayama, H. and Nasukawa, T. 2006. Fully Automatic Lexicon Expansion for Domain-Oriented Sentiment Analysis. In Proceedings of EMNLP.
[12] Pan, S. J., Ni, X., Sun, J.-T., Yang, Q. and Chen, Z. 2010. Cross-Domain Sentiment Classification via Spectral Feature Alignment. In Proceedings of the World Wide Web Conference (WWW).
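For concreteness, the iterative scoring of Figure 3 can be sketched as follows. The edge encoding, the sign convention (negative words carry negative scores, so the final score is sPos + sNeg), and the L1 normalization are our own assumptions for this sketch, not details fixed by the paper:

```python
# Sketch of the Figure 3 propagation over the relationship graph.
# edges: (wi, wj, weight, same) with same=True for ContextSame;
# seed_scores: {word: score in [-1, 1]} from the expanded seed list.
EPS = 1e-4  # small opposite-polarity score used at initialization

def propagate(words, edges, seed_scores, max_iter=50, tol=0.01):
    s_pos = {w: max(seed_scores.get(w, 0.0), EPS) for w in words}
    s_neg = {w: min(seed_scores.get(w, 0.0), -EPS) for w in words}
    for _ in range(max_iter):
        prev_pos, prev_neg = dict(s_pos), dict(s_neg)
        prev_sign = {w: prev_pos[w] + prev_neg[w] > 0 for w in words}
        for wi in words:
            for (a, b, weight, same) in edges:
                if wi not in (a, b):
                    continue
                wj = b if wi == a else a
                if same:  # ContextSame reinforces like polarities
                    s_pos[wi] += weight * prev_pos[wj]
                    s_neg[wi] += weight * prev_neg[wj]
                else:     # ContextOpposite swaps polarities
                    s_pos[wi] += weight * abs(prev_neg[wj])
                    s_neg[wi] -= weight * prev_pos[wj]
            norm = abs(s_pos[wi]) + abs(s_neg[wi])
            s_pos[wi], s_neg[wi] = s_pos[wi] / norm, s_neg[wi] / norm
        changed = sum(prev_sign[w] != (s_pos[w] + s_neg[w] > 0) for w in words)
        if changed <= tol * len(words):  # at most 1% of words flipped
            break
    return {w: s_pos[w] + s_neg[w] for w in words}

scores = propagate(["good", "nice", "bad"],
                   [("good", "nice", 0.5, True), ("nice", "bad", 0.5, False)],
                   {"good": 1.0, "bad": -1.0})
print(scores)  # "nice" inherits a positive orientation, "bad" stays negative
```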