Usage of Abstract Features in Semantic Sentiment Analysis

Mohammed Almashraee1, Kia Teymourian1, Dagmar Monett-Diaz2, and Adrian Paschke1

1 Freie Universität Berlin, Corporate Semantic Web Working Group
2 Department of Computer Science, Berlin School of Economics and Law
{almashraee,kia,paschke}@inf.fu-berlin.de
dagmar.monett-diaz@hwr-berlin.de

Abstract. Feature-based sentiment analysis can be performed on different types of object features. Some of these features concern technical aspects of the objects, while others are application-oriented. Application-oriented features are more abstract and can be of interest to a broader audience than the technical experts of specific products. In this paper, we propose an approach for extracting abstract object features from a set of sub-features. Our approach uses a knowledge base about the application domain to extract the sub-features related to high-level abstract features. On the basis of these related sub-features, it extracts the more abstract features that are only implicitly included in the analyzed text.

1 Introduction

People express their opinions about certain objects using features. For example, in the photography application domain, users of digital cameras express their opinions about camera features such as flash or lens. Consumers use these opinions to learn about the quality of a product and its aspects so that they can make the right purchase decisions [1,2].

Sentiment analysis of technically oriented features such as flash, lens, optical zoom, shutter, and sensor quality is interesting for professional photographers, who are familiar with the technical details of cameras and know which features are important for which photography modes. In contrast, non-expert users are interested in features that are more abstract and application-oriented.
For example, non-experts want to know whether the camera takes good pictures of kids or whether it can take landscape pictures during their vacations. Such high-level abstract features are mostly not explicitly mentioned in the review corpus, or are only implicitly mentioned in some of the review items. The main difference between abstract features and sub-features is that abstract features are non-technical aspects that rarely appear in the related reviews, while sub-features are technical aspects that can be found frequently in the review corpus.

In this paper, we propose an approach for the semantic sentiment analysis of abstract features based on their related sub-features. An abstract feature can be derived from the explicitly mentioned sub-features that are related to it. In general, the system performs sentiment analysis on the review text on the basis of an extracted set of sub-features. Table 1 shows some examples of such abstract features and their related sub-features.

Table 1. Examples of Abstract- and Sub-Features of Digital Cameras

Abstract-Features     Related Sub-Features
Night photo           Flash, Lens, Image Processor, Sensors
Portrait              Optical Zoom, Lens, Image Processor
Sports                Shutter, Image Processor, Flash, Sensors
Landscape             Optical Zoom, Shutter, Flash, Sensors
Kids Photography      Shutter, Image Processor, Sensors, Flash

As an example, consider the following review text (footnote 3):

“It automatically selects the best shooting settings for optimal quality based on the environmental factors (lightning I guess) to provide point’n’ shoot simplicity. 16.0 Megapixels, with loads of resolution pictures are still clear. High resolution is also good for producing biggest printouts. 5x Optical Zoom is sufficient in most cases. DIGIC 4 Image Processor is not as fast as DIGIC 5 though fast and powerful enough to give you advanced system options, provide quick-shoot with reliable performance and low battery consumption.
As far as I know DIGIC 4 is currently Canon’s most efficient processor for budget cameras. BTW it has some Eco mode, that is said to be providing even faster warm-up times and saves the standard battery, but I haven’t tested it yet. Very lightweight, just put it into your pocket, can take it everywhere. Like A2300 it lacks optical image stabilization, though it’s got digital image stabilization. 1/2.3” sensor, well, entry level CCD providing good pictures, not of a DSLR quality, that’s all I can say.”

Footnote 3: Review example from Amazon Online Store, http://www.amazon.com/Canon-PowerShot-A2500-Stabilized-2-7-Inch/dp/B00B5HE2UG/

2 Abstract Features in Semantic Sentiment Analysis

Our approach consists of the following processing tasks:

1. Feature Extraction: In this task, the related sub-features are identified.
2. Knowledge-based Feature Annotation: Using a knowledge-based annotator, the sub-features are annotated with their background knowledge resources.
3. Feature Preparation: The background knowledge for each annotated resource is retrieved from the knowledge base and used to enrich the features.
4. Sentiment Relation Calculation: Based on the relations of sub-features to abstract features specified in a background knowledge base, the sentiment of each abstract feature is calculated.

As a general conceptual solution, we propose to parse the text to collect features, names, name phrases and other parts which constitute the features. We split each review into sentences and then parse each sentence to extract the feature(s) it contains. For knowledge-based feature extraction, we propose to use a knowledge-based feature annotator that can recognize the names of concepts or entities mentioned in the text. Using knowledge-based resource annotation systems such as DBpedia Spotlight (footnote 4) or AlchemyAPI (footnote 5), it is possible to collect the target features from the review text. Such an entity annotation system can be used with a knowledge base built specifically for the application domain.
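The sentence splitting and annotation steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary `LEXICON` is a hypothetical stand-in for a domain knowledge base queried through an annotator such as DBpedia Spotlight, the function names are invented for illustration, and the type labels reuse the paper's camera-onto examples. The review string is a shortened excerpt of the example review.

```python
import re

# Hypothetical toy lexicon standing in for a domain knowledge base.
# A real system would call an entity annotator instead of this lookup.
LEXICON = {
    "digic 4": "camera-onto:Image Processor",
    "digic 5": "camera-onto:Image Processor",
    "megapixels": "camera-onto:Image Quality",
    "eco mode": "camera-onto:Camera Shooting Mode",
}


def split_sentences(review):
    """Naive splitter: break after '.', '!' or '?' when whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", review)
    return [p.strip() for p in parts if p.strip()]


def annotate(sentence):
    """Return (surface form, knowledge base type) pairs found in a sentence."""
    lowered = sentence.lower()
    return [(term, kb_type) for term, kb_type in LEXICON.items() if term in lowered]


def annotate_review(review):
    """Map each sentence that mentions a known feature to its annotations."""
    result = {}
    for sentence in split_sentences(review):
        hits = annotate(sentence)
        if hits:
            result[sentence] = hits
    return result


# Shortened excerpt of the example review (sentences truncated for brevity).
review = (
    "16.0 Megapixels, with loads of resolution pictures are still clear. "
    "DIGIC 4 Image Processor is not as fast as DIGIC 5. "
    "Very lightweight, just put it into your pocket."
)
annotations = annotate_review(review)
```

Substring matching keeps the sketch short but is far weaker than a real annotator, which would also handle synonyms, inflection and disambiguation.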
A knowledge-based feature annotation and feature preparation system can extract from the given example features such as “best shooting settings”, “lightning”, “good result”, “shoot simplicity”, “16.0 Megapixels”, “DIGIC 4 Image Processor” and “faster warm-up times”.

Annotation is the task of adding more information to an existing object such as text, an image, or a video. The major advantage of semantic annotation is that we can relate the entities to their knowledge base resources and thus extract background knowledge about them. As a general conceptual solution, the set of features produced by the feature extraction task is enriched and extended using entity recognition and ontological reasoning. The feature enrichment process is realized using a knowledge-based annotator. Examples of such features and their knowledge base types are shown in Table 2.

Table 2. Examples of Features and their Knowledge Base Types

Enriched Features     Knowledge Base Types
Shoot Setting         camera-onto:Camera Setting
Megapixels            camera-onto:Image Quality
Eco mode              camera-onto:Camera Shooting Mode
DIGIC 4               camera-onto:Image Processor
DIGIC 5               camera-onto:Image Processor

We propose to start with a set of ontological relationships that can be used to extract further knowledge resources, such as equivalence, direct hypernyms and direct hyponyms. This list can be extended with additional relationships depending on the structure of the ontology in use and on its granularity. The sentiment value of each resource can be computed based on the sentiment of the related sub-feature. We propose the following correspondences for ontological relationships:

1. equivalence = the same sentiment value is given to the sub-feature
2. hyperonymy = a factor to be applied to the sentiment value of the sub-feature
3. hyponymy = a factor to be applied to the sentiment value of the sub-feature

These factors should be specified manually in the ontology by domain experts who are familiar with the relations of sub-features to abstract features. The ontology should include the required knowledge about the application domain; in our example, it should conceptualize the camera concept and the world of photography in general, so that the related concepts can be extracted, e.g., for “Night Photography”.

As an example of abstract feature sentiment calculation, we consider the calculation for night photography and kids photography. By inferencing on an ontology about the relations of sub-features to each other and to abstract features, we can calculate different affecting factors that can be used for the sentiment calculation of abstract features. We extract the sentiments of the related sub-features in the whole product corpus. For example, we use the following sentiment calculation for the abstract features “Night Photography” and “Kids Photography”:

$$S_{NightPhotography} = 0.1 \cdot S_{Flash} + 0.3 \cdot S_{Lens} + 0.4 \cdot S_{ImageProcessor} + 0.2 \cdot S_{Sensors}$$

$$S_{KidsPhotography} = 0.1 \cdot S_{Shutter} + 0.5 \cdot S_{ImageProcessor} + 0.3 \cdot S_{Sensors} + 0.1 \cdot S_{Flash}$$

In the above example, the sentiment factors (e.g., 0.1, 0.5) of the sub-features are extracted using an ontology that includes the relations between abstract features and sub-features. Our approach depends heavily on the existence of an ontology that describes the relations between features and can be used for inferencing on feature relations.

Footnote 4: http://spotlight.dbpedia.org./
Footnote 5: http://www.alchemyapi.com

3 Conclusion and Future Work

Our main research question was: “To what extent is it possible to use ontological background knowledge to derive abstract upper-level features based on more technical sub-features?” To answer this question, we structured the solution into three main tasks, and within each task we tackled a number of sub-tasks.
The first task extracts features from reviews using natural language processing tools. The second task extends the collected features based on entity recognition and ontological reasoning. The third task finds relations between features and maps sub-features to their related abstract features.

We have been working on approaches for recognizing features relevant to the application domain and for extracting the relations between sub-features and their related abstract features. Our future work is to specify the details of background knowledge usage in the process of feature extraction; e.g., reasoning on background knowledge can help to identify features that are not explicitly connected to the abstract features in the ontology. We also need to find methods to relate specific features to more abstract ones and to evaluate these relations. Furthermore, we have to evaluate our approach on a larger corpus using a domain ontology.

References

1. S. Kim, J. Zhang, Z. Chen, A. H. Oh, and S. Liu, “A hierarchical aspect-sentiment model for online reviews,” in AAAI (M. desJardins and M. L. Littman, eds.), AAAI Press, 2013.
2. Y. Jo and A. H. Oh, “Aspect and sentiment unification model for online review analysis,” in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, WSDM ’11, (New York, NY, USA), pp. 815–824, ACM, 2011.
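As a supplementary illustration of the sentiment calculation in Section 2, the weighted sums for “Night Photography” and “Kids Photography” can be sketched in Python. The weights are copied from the two example formulas. The sub-feature scores, their scale of -1 to 1, and the names `WEIGHTS` and `abstract_sentiment` are assumptions made only for this sketch and do not come from the paper.

```python
# Weights copied from the two example formulas in Section 2.
WEIGHTS = {
    "Night Photography": {
        "Flash": 0.1, "Lens": 0.3, "Image Processor": 0.4, "Sensors": 0.2,
    },
    "Kids Photography": {
        "Shutter": 0.1, "Image Processor": 0.5, "Sensors": 0.3, "Flash": 0.1,
    },
}


def abstract_sentiment(abstract_feature, sub_sentiments):
    """Weighted sum of sub-feature sentiments for one abstract feature.

    Raises KeyError if a required sub-feature has no sentiment score.
    """
    weights = WEIGHTS[abstract_feature]
    return sum(w * sub_sentiments[name] for name, w in weights.items())


# Hypothetical sub-feature sentiment scores in [-1, 1]; not from the paper.
sub_sentiments = {
    "Flash": 0.5,
    "Lens": 1.0,
    "Image Processor": 0.5,
    "Sensors": -0.5,
    "Shutter": 0.0,
}

night = abstract_sentiment("Night Photography", sub_sentiments)
kids = abstract_sentiment("Kids Photography", sub_sentiments)
```

With these assumed scores, the negative sentiment on the sensors lowers both results, and it weighs more heavily on kids photography (weight 0.3) than on night photography (weight 0.2).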