=Paper= {{Paper |id=Vol-2620/paper6 |storemode=property |title=Aspect-Oriented Analytics of Big Data |pdfUrl=https://ceur-ws.org/Vol-2620/paper6.pdf |volume=Vol-2620 |authors=No’aman M. Ali |dblpUrl=https://dblp.org/rec/conf/balt/Ali20 }} ==Aspect-Oriented Analytics of Big Data== https://ceur-ws.org/Vol-2620/paper6.pdf
         Aspect-Oriented Analytics of Big Data

                      No’aman M. Ali 1,2 [0000-0002-3922-7136]

             1 Saint Petersburg State University,
               Faculty of Mathematics & Mechanics, St. Petersburg, Russia
             2 Port Said University,
               Faculty of Management Technology and Information Systems, Port Said, Egypt
                         no3man mohamed@himc.psu.edu.eg




       Abstract. Social media platforms are among the most significant
       contributors to big data; they enable consumers to share their views
       and opinions about products and services. These abundant reviews
       contain substantial and valuable knowledge and have become a
       significant resource for both consumers and firms. Enterprises
       therefore seek real-time insights and relevant information on how the
       market responds to products and services. The proposed framework
       applies sentiment analysis and aspect-based sentiment analysis in
       parallel to customer reviews to support decision-makers in the
       Marketing and Manufacturing domains. Our proposal presents a
       multilayer classifier for consumers’ reviews. The first layer
       categorizes reviews into aspect and non-aspect classes. The second
       layer breaks every review in the aspect class into opinion units based
       on the product aspects. Next, we plan to measure the polarity of the
       reviews and opinion units. Finally, we plan to visualize the results in
       the form of domain-oriented reports. We also describe the testing and
       evaluation criteria.

       Keywords: Big Data Analytics · Sentiment Analysis · Aspect-based
       Sentiment Analysis · Decision Making.



1    Introduction

Traditionally, organizations recognized that analyzing the data they own could
broadly improve their business performance by means of Business Intelligence
(BI) [1]. Several decision-making and forecasting domains depend on big data,
including business analysis, product development, loyalty, health care, tourism
marketing, and transportation. Big data can help organizations make smart and
effective business decisions by choosing the most appropriate and informative
strategic direction, increasing operational efficiency, providing better
customer service, etc. [2].
    Recently, there has been a steady increase in customers’ desire to express
their views or opinions about products and services. These abundant reviews
contain substantial and valuable knowledge and have become a significant
resource for both consumers and firms. Therefore, enterprises seek real-time
insights and relevant information on how the market responds to products and
services [3–5].

Copyright © 2020 for this paper by its authors. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0)

    The proposed framework applies Sentiment Analysis (SA) and Aspect-Based
Sentiment Analysis (ABSA) in parallel to customer reviews to support
decision-makers in the Marketing and Manufacturing domains. Our proposal
introduces two classes of reports, market-oriented and product-oriented, as
described in Section 3. Initially, we limit our analysis plan to electrical
products; in the future, the analysis may expand to include other types of
products or services.
    In this proposal, we present our ongoing research on developing an ABSA
model that comprises a multilayer classifier. We also survey related work in
the State of the Art section. We have designed an implementation and evaluation
plan to follow during the next period of the PhD. By following this plan, we
aim to develop a solution comparable to other models.
    The rest of our proposal has the following structure: in Section 2, we briefly
introduce some of the related works and directions regarding SA and ABSA.
Section 3 comprises the proposed framework and considerations. In Sections 4
and 5, respectively, we state our research process in the form of clarified steps
with identified objectives, as well as describe the desired testing and evaluation
scheme.


2   State of the Art

The broad development of e-commerce promotes the growing use of online markets
via electronic platforms such as Amazon, eBay, Walmart, Best Buy, and Wish.
In addition, the growth of social media platforms plays a significant role in
encouraging enterprises to give high priority to analyzing users’ activities
on such platforms [6].
    Big data analytics offers various solutions for obtaining real-time
insights and provides valuable information about how the market is responding
to products and campaigns [7–12]. Several techniques have been introduced for
analyzing consumers’ reviews to obtain such insights, among them sentiment
analysis. This technique comprises the automated process of analyzing textual
data and classifying opinions, as well as extracting properties of reviews
such as polarity, subject, and opinion holder [13–15]. Aspect-based sentiment
analysis, on the other hand, examines each review to recognize distinct aspects
and identify the corresponding sentiment for each one [16–18]. Unlike plain
sentiment analysis, it enables the association of specific sentiments with
various aspects of a product or service [19].
    Generally, these techniques enable enterprises to learn how the public
feels about something at a particular moment by analyzing emotions, attitudes,
or opinions toward various products or issues. They also enable enterprises to
track how consumers’ opinions change over time. Existing approaches are based
either on linguistic resources or on machine learning [20–23].




    Chong, A.Y.L., et al. [24] proposed a combination of sentiment analysis and
a neural network to examine the importance of each predictor variable for
online retailers’ sales predictions. They used datasets that contain predictor
variables such as online reviews, consumer sentiments, and online promotional
strategies. They observed that retailers could increase sales by carefully
specifying "how" and "where" to display online reviews and by increasing their
social interactions with consumers.
    Salehan, M., and Kim, D.J. [25] introduced an approach that examines the
predictors of readership and helpfulness of online consumer reviews (OCR) using
sentiment analysis for big data analytics. Online vendors could adopt the
approach to develop scalable, automated systems for sorting and classifying big
OCR data, which would be useful to both vendors and consumers.
    Wallaart, O., and Frasincar, F. [26] proposed a two-stage sentiment
analysis algorithm based on ABSA. The algorithm employs a lexicalized domain
ontology alongside neural networks with a rotatory attention mechanism and
works at the sentence level. They applied their model to the SemEval-2015 and
SemEval-2016 datasets, which include restaurant reviews. They found that
machine learning methods can effectively find words that carry sentiment, with
performance and accuracy that vary with the given aspect.
    Similarly, industrial enterprises seek to analyze user reviews to determine
how well a product suits user requirements, and to monitor the product life
cycle in the market in support of sustainable smart manufacturing [27]. They
also use reviews to develop future strategies for designing new products, to
offer redesigned versions of existing products, and to ensure that the
problems in the current versions are addressed successfully [28].


3   The Proposed Framework

The proposed work strives to support enterprises by exploiting the tremendous
amounts of consumer reviews available on social media platforms, electronic
markets, etc., and providing decision-makers with domain-oriented feedback.
Processing massive amounts of data is a challenging task because the diversity
of data types and structures makes data integration and storage difficult. We
plan to implement the framework using Apache Hadoop, an open-source framework
for distributed storage and processing of data, together with its MapReduce
programming model.
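
As a minimal sketch of the map and reduce phases that such an implementation
relies on, the following pure-Python example counts aspect mentions across
reviews. It does not use Hadoop itself; the function names (map_review,
reduce_counts), the sample reviews, and the aspect vocabulary are illustrative
assumptions and not part of the proposed system.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical aspect vocabulary for an electric iron (illustrative only).
ASPECTS = {"handle", "steam", "weight", "cable"}

def map_review(review):
    """Map phase: emit (aspect, 1) for each aspect word found in one review."""
    for word in review.lower().split():
        token = word.strip(".,!?;:")
        if token in ASPECTS:
            yield token, 1

def reduce_counts(pairs):
    """Reduce phase: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Made-up sample reviews.
reviews = [
    "The handle is comfortable, but the steam is weak.",
    "Great steam output and a long cable.",
    "Too heavy; the weight makes the handle hard to hold.",
]

# In Hadoop the mappers run on separate nodes; here they are simply chained.
aspect_counts = reduce_counts(chain.from_iterable(map_review(r) for r in reviews))
print(aspect_counts)
```

In a real deployment the same two functions would typically become the mapper
and reducer of a distributed job, with the framework handling the grouping of
keys between the two phases.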
     Categorizing reviews into aspect-based and non-aspect classes remains a
difficult task, since identifying entities is itself a challenge. Binary
classification is therefore the more appropriate formulation for this task. We
plan to apply SA and ABSA to the first and second classes, respectively, to
assist decision-makers in two primary fields: Marketing and Manufacturing.
     Regarding ABSA, the main task is to extract and identify entity and
attribute pairs. It involves extracting the opinion units that correspond to
the target entity. Additionally, recognizing sentiment words and classifying
them into predefined sets is vital. We plan to investigate a new scheme for
measuring polarity based on fuzzy sets, so that each review has a scored
polarity. The same scheme will be used in the second class to assign a
membership degree to suitable items from the set.
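
The proposal does not yet fix the fuzzy scheme, so the following is only a
sketch of what a fuzzy polarity score could look like. It assumes a polarity
score in [-1, 1] and three fuzzy sets with triangular membership functions;
the set names, the breakpoints, and the helper names (triangular,
polarity_memberships) are hypothetical.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical fuzzy sets over a polarity score in [-1, 1].
# The breakpoints are illustrative assumptions, not values from the proposal.
FUZZY_SETS = {
    "negative": (-2.0, -1.0, 0.0),
    "neutral": (-1.0, 0.0, 1.0),
    "positive": (0.0, 1.0, 2.0),
}

def polarity_memberships(score):
    """Return the membership degree of a polarity score in each fuzzy set."""
    return {name: triangular(score, *abc) for name, abc in FUZZY_SETS.items()}

memberships = polarity_memberships(0.5)
print(memberships)
```

With these assumed breakpoints, a mildly positive score belongs partly to the
neutral set and partly to the positive set, rather than being forced into a
single crisp label.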
    Finally, we plan to support decision-makers in these fields by providing
them with up-to-date, valuable insights in the form of domain-oriented reports.
This task involves visualizing and summarizing the data using Python to create
clear charts. In addition, the extracted attributes will be compared with the
real attributes of the product to identify the missing and the required sets.
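
One possible reading of this comparison is a pair of set differences, sketched
below. The attribute names are made up for illustration, and the
interpretation of "missing" and "required" is an assumption rather than a
definition taken from the proposal.

```python
# Hypothetical attribute sets; none of these values come from the proposal.
real_attributes = {"sole plate", "handle", "steam", "weight", "cable"}
extracted_attributes = {"handle", "steam", "weight", "auto shut-off"}

# Assumed meaning of "missing": real attributes that reviewers never mention.
missing = real_attributes - extracted_attributes
# Assumed meaning of "required": attributes reviewers mention that the
# product does not have.
required = extracted_attributes - real_attributes

print(sorted(missing))
print(sorted(required))
```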

3.1   Case Study: An Electric Clothes Iron
Product Aspects. The common parts of an electric clothes iron may include the
sole plate, pressure plate, heating element, cover plate, handle, pilot lamp,
etc. The main features considered when choosing an electric iron may include
the iron surface (e.g., stainless steel, Teflon, or ceramic coating),
availability of steam, electric power, weight, etc. Product parts are the
pieces that come with the original product (e.g., portfolio, base, extra
cable).

Sentiment Analysis. We plan to measure the overall performance of the product
and consumer satisfaction with it by performing SA in conjunction with users’
ratings, and to merge the outcome with the results of the aspect-based
analysis described next.

Aspect-based Sentiment Analysis. To compute the performance and quality of the
distinct components and parts of the product, we plan to extract a list of
product aspects. Every list is associated with exactly one product, which has
a unique identifier. Each item in a list is also assigned a unique identifier
to eliminate repetition, which allows one part to be shared across the lists
of several products that have one or more identical parts or features. During
the analysis, we will use these lists to measure the polarity of each aspect
separately. Next, we examine the collected aspect sentiments to draw
conclusions.
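
The list structure described above can be sketched with a shared registry of
aspects referenced by identifier. The class names (Aspect, AspectList), the
identifiers, and the two example products are hypothetical; the sketch only
illustrates how one aspect can be stored once and reused by several products.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Aspect:
    aspect_id: str   # unique identifier of the part or feature
    name: str

@dataclass
class AspectList:
    product_id: str                                  # exactly one product per list
    aspect_ids: list = field(default_factory=list)   # references into the registry

# One shared registry, so an aspect is stored once and referenced by id.
registry = {}

def register(aspect):
    """Store the aspect if its id is new and return the id."""
    registry.setdefault(aspect.aspect_id, aspect)
    return aspect.aspect_id

handle = register(Aspect("A1", "handle"))
steam = register(Aspect("A2", "steam"))
cord = register(Aspect("A3", "cable"))

iron_basic = AspectList("P100", [handle, cord])
iron_steam = AspectList("P200", [handle, steam, cord])

# Aspects shared by both products are found by intersecting the id lists.
shared = set(iron_basic.aspect_ids) & set(iron_steam.aspect_ids)
shared_names = sorted(registry[a].name for a in shared)
print(shared_names)
```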

Output. Results involve the evaluation of consumers’ satisfaction based on
their reviews. Reports will state the degree of suitability of a particular
component such as the handle; indicators may differ among users, for example
comfortable, regular, or hard. Another important output is the anticipation of
the parts or features that consumers need, in addition to the identification
of the main competing products for the current release. This information
allows the product to be redesigned to eliminate disadvantages and new
products to be designed that meet consumers’ needs.

4     Research Method
We have identified clear steps that we plan to follow in our proposed work to
achieve work objectives, as depicted in Fig. 1:





           Fig. 1. System design and work flow for the proposed framework.


Step 1. Gathering data from various sources like social media platforms, online
reviews (e.g., Amazon.com), surveys, etc. We plan to develop a web scraper to
collect data from the internet using Scrapy as it is an open-source framework
written in Python.

Step 2. Preprocessing the data to make it suitable for analysis, including
noise cleaning. This involves breaking a stream of text up into words
("tokens") using Python regular expressions.
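
A minimal sketch of such a tokenizer, using only the standard re module, is
shown below. The pattern and the function name tokenize are assumptions made
for illustration; the pattern keeps simple contractions and possessives
together and drops punctuation and emoticons.

```python
import re

# Lowercase alphanumeric runs, optionally followed by an apostrophe suffix
# such as "'s" or "'t" (illustrative pattern, not taken from the proposal).
TOKEN_PATTERN = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")

def tokenize(text):
    """Lowercase the text and return its word tokens, dropping punctuation."""
    return TOKEN_PATTERN.findall(text.lower())

tokens = tokenize("The iron's handle is GREAT!!! But it doesn't steam well :(")
print(tokens)
```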

Step 3. Classification of collected reviews into two classes: aspect-based
reviews and non-aspect reviews. This classification determines the type of
analysis to be applied.
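
The classifier itself is still to be developed. As a placeholder baseline
only, the rule below labels a review as aspect-based when it mentions any term
from a small aspect lexicon. The lexicon, the labels, and the function name
classify_review are hypothetical, and a trained binary classifier would
replace this rule in practice.

```python
import re

# Hypothetical aspect lexicon for an electric iron (illustrative only).
ASPECT_TERMS = {"handle", "steam", "weight", "cable", "plate", "coating"}

def classify_review(review):
    """Label a review 'aspect' if it mentions any known aspect term."""
    tokens = set(re.findall(r"[a-z]+", review.lower()))
    return "aspect" if tokens & ASPECT_TERMS else "non-aspect"

labels = [classify_review(r) for r in (
    "The handle gets hot after ten minutes.",
    "Love it, would buy again!",
)]
print(labels)
```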

Step 4. Extracting aspects from reviews. This step concerns the aspect-based
analysis only and involves identifying every entity-attribute pair (opinion
unit).
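
A simple way to illustrate opinion units is to split a review into clauses and
keep each clause that mentions a known aspect. This is a naive heuristic under
stated assumptions (a fixed aspect lexicon and clause boundaries at
punctuation, "but", and "and"), not the extraction method of the proposal; the
name extract_opinion_units is hypothetical.

```python
import re

# Hypothetical aspect lexicon (illustrative only).
ASPECT_TERMS = {"handle", "steam", "weight", "cable"}

def extract_opinion_units(review):
    """Split a review into clauses and keep those that mention an aspect."""
    clauses = re.split(r"[.;!?,]|\bbut\b|\band\b", review.lower())
    units = []
    for clause in clauses:
        words = re.findall(r"[a-z]+", clause)
        # sorted() keeps the output order deterministic when a clause
        # mentions more than one aspect.
        for aspect in sorted(ASPECT_TERMS.intersection(words)):
            units.append((aspect, " ".join(words)))
    return units

units = extract_opinion_units("The handle is comfortable, but the steam is weak.")
print(units)
```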

Step 5. Extracting and identifying the polarity of sentences by detecting senti-
ment words. This task will run over two levels: Opinion-Unit level and Sentence
level according to the type of analysis.
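
The difference between the two levels can be sketched with a toy lexicon-based
scorer. The lexicon, the scores, and the pre-split opinion units are
assumptions for illustration; the example only shows why a sentence-level
score can hide mixed opinions that the opinion-unit level exposes.

```python
import re

# Tiny hypothetical sentiment lexicon; a real system would use a full resource.
LEXICON = {"comfortable": 1, "great": 1, "weak": -1, "heavy": -1}

def polarity(text):
    """Sum lexicon scores of the words in text; the sign gives the polarity."""
    return sum(LEXICON.get(w, 0) for w in re.findall(r"[a-z]+", text.lower()))

sentence = "The handle is comfortable but the steam is weak."
# Opinion units assumed to be already extracted for this sentence.
opinion_units = {
    "handle": "the handle is comfortable",
    "steam": "the steam is weak",
}

sentence_score = polarity(sentence)                               # sentence level
unit_scores = {a: polarity(t) for a, t in opinion_units.items()}  # opinion-unit level
print(sentence_score, unit_scores)
```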




Step 6. Visualizing the results and delivering this feedback to decision-makers.
This includes creating an easy-to-understand visual report, accompanied by an
interpretation, that everyone in the company can understand. The visualization
can present either the combined results or separate results for each data
source.


5   Testing and Evaluation Criteria

Given the proposed framework that states the idea for solving the research prob-
lem and objectives to be achieved, in addition to the working mechanism of our
research method, we can compose the following plan for testing and evaluation
through various steps of work.


Testing. We plan to monitor the ongoing processes throughout the implementation
phases. We plan to input a random set of data to every separate task (e.g.,
tokenization, classification, aspect extraction) to check the quality and
efficiency of its outputs and to compare them with human-made processing.
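
The proposal does not name the metrics for this comparison. Assuming the
common choice of precision, recall, and F1 against human annotations, a
minimal sketch for the aspect extraction task could look as follows; the
function name and the sample sets are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Compare a set of predicted items with a human-annotated gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical outputs for one review: aspects found by the system vs. a human.
system_aspects = {"handle", "steam", "cable"}
human_aspects = {"handle", "steam", "weight", "coating"}

p, r, f1 = precision_recall_f1(system_aspects, human_aspects)
print(round(p, 3), round(r, 3), round(f1, 3))
```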


Evaluation. We plan to obtain published sales data for a particular product
over a specific period that follows the period to which the collected reviews
belong, and to compare our results and recommendations with this data to check
whether the achieved results are consistent with real data. Similarly,
concerning the needed features, we plan to survey similar products that have
already added these features. In addition, to evaluate the performance of the
proposed work comprehensively, we intend to experiment with widely used ABSA
datasets: the Laptops and Restaurant datasets of SemEval-16 Track 2 Task 5.


6   Summary and Future Work

The large volume of reviews that consumers produce when expressing their views
or opinions about products and services is a significant resource for both
consumers and firms; it contains substantial and valuable knowledge. There is
growing interest in analyzing such behavior. SA and ABSA are promising
approaches to analyzing these reviews. Researchers have introduced various
models for this task, some of which combine SA and ABSA with other techniques
such as neural networks and machine learning.
    In this paper, we have outlined our plan of study, which aims to develop a
consumer review analysis model for electrical products. We have explored and
briefly discussed some of the existing works. Then, we presented a description
of our approach with possible directions and objectives. In the future, we plan
to continue our studies by executing the steps outlined in Section 4 while
adhering to the evaluation criteria of Section 5.




Acknowledgments. I thank my supervisor, Prof. Boris Novikov, for the guidance,
encouragement, and advice he has provided throughout my doctoral studies, which
are still ongoing. He has given me valuable comments and feedback at various
stages of this research.

References
1. Gandomi, A., Haider, M.: Beyond the Hype: Big Data Concepts, Methods, and
   Analytics. International Journal of Information Management 35(2), 137–144 (2015).
2. Wang, H., Xu, Z., Fujita, H., Liu, S.: Towards Felicitous Decision Making: An
   Overview on Challenges and Trends of Big Data. Information Sciences 367-368,
   747–765 (2016). https://doi.org/10.1016/j.ins.2016.07.007
3. Hammou, B.A., Lahcen, A.A., Mouline, S.: Towards A Real-Time Processing Frame-
   work Based on Improved Distributed Recurrent Neural Network Variants with Fast-
   Text for Social Big Data Analytics. Information Processing & Management 57(1),
   102122 (2020). https://doi.org/10.1016/j.ipm.2019.102122
4. Zhao, Z., Wang, J., Sun, H., Liu, Y., Fan, Z., Xuan, F.: What Factors Influence On-
   line Product Sales? Online Reviews, Review System Curation, Online Promotional
   Marketing and Seller Guarantees Analysis. IEEE Access 8, 3920–3931 (2020).
5. Duan, Y., Edwards, J.S., Dwivedi, Y.K.: Artificial Intelligence for Decision Making
   in the Era of Big Data – Evolution, Challenges and Research Agenda. International
   Journal of Information Management 48, 63–71 (2019).
6. Akter, S., Wamba, S.F.: Big Data Analytics in E-Commerce: A Systematic Review
   and Agenda for Future Research. Electronic Markets 26(2), 173–194 (2016).
7. Jabbar, A., Akhtar, P., Dani, S.: Real-time Big Data Processing for Instantaneous
   Marketing Decisions: A Problematization Approach. Industrial Marketing Manage-
   ment, (2019). https://doi.org/10.1016/j.indmarman.2019.09.001
8. Malhotra, D., Rishi, O.P.: An Intelligent Approach to Design of E-Commerce
   Metasearch and Ranking System Using Next-Generation Big Data Analytics. Jour-
   nal of King Saud University - Computer and Information Sciences, (2018).
9. Zhao, Y., Xu, X., Wang, M.: Predicting Overall Customer Satisfaction: Big Data
   Evidence From Hotel Online Textual Reviews. International Journal of Hospitality
   Management 76, 111–121 (2019).
10. Liu, X., Shin, H., Burns, A.C.: Examining the Impact of Luxury Brand’s Social
   Media Marketing on Customer Engagement: Using Big Data Analytics and Natural
   Language Processing. Journal of Business Research, (2019).
11. Kumar, A., Shankar, R., Aljohani, N.R.: A Big Data Driven Framework for
   Demand-driven Forecasting with Effects of Marketing-mix Variables. Industrial
   Marketing Management, (2019). https://doi.org/10.1016/j.indmarman.2019.05.003
12. Zheng, K., Zhang, Z., Song, B.: E-Commerce Logistics Distribution Mode in Big-
   Data Context: A Case Analysis of JD.COM. Industrial Marketing Management,
   (2019). https://doi.org/10.1016/j.indmarman.2019.10.009
13. Taylor, E.M., O., C.R., Velásquez, J.D., Ghosh, G., Banerjee, S.: Web Opinion
   Mining and Sentimental Analysis. In: Velásquez, J.D., Palade, V., Jain, L.C. (eds.)
   Advanced Techniques in Web Intelligence-2: Web User Browsing Behaviour and
   Preference Analysis, pp. 105–126. Springer, Berlin, Heidelberg (2013).
14. Ramanujam, R.S., Nancyamala, R., Nivedha, J., Kokila, J.: Sentiment Analysis
   Using Big Data. In: International Conference on Computation of Power, Energy,
   Information and Communication (ICCPEIC), pp. 480–484. IEEE, Chennai, India
   (2015). https://doi.org/10.1109/ICCPEIC.2015.7259528




15. Yadav, K., Rautaray, S.S., Pandey, M.: A Prototype for Sentiment Analysis Using
   Big Data Tools. In: Mandal, J.K., Dutta, P., Mukhopadhyay, S. (eds.) First Inter-
   national Conference on Computational Intelligence, Communications, and Business
   Analytics (CICBA 2017), vol. 775, pp. 103–117. Springer, Singapore (2017).
16. Ma, Y., Peng, H., Cambria, E.: Targeted Aspect-Based Sentiment Analysis via Em-
   bedding Commonsense Knowledge into an Attentive LSTM. In: The Thirty-Second
   AAAI Conference on Artificial Intelligence (AAAI-18), pp. 5876–5883. Association
   for the Advancement of Artificial Intelligence, New Orleans, Louisiana, USA (2018).
17. Liu, N., Shen, B., Zhang, Z., Zhang, Z., Mi, K.: Attention-based Sentiment Rea-
   soner for Aspect-based Sentiment Analysis. Human-centric Computing and Infor-
   mation Sciences 9(35), 17 (2019). https://doi.org/10.1186/s13673-019-0196-3
18. Sun, C., Huang, L., Qiu, X.: Utilizing BERT for Aspect-Based Sentiment Analysis
   via Constructing Auxiliary Sentence. In: Burstein, J., Doran, C., Solorio, T. (eds.)
   2019 Annual Conference of the North American Chapter of the Association for
   Computational Linguistics: Human Language Technologies (NAACL-HLT), vol. 1,
   pp. 380–385. Association for Computational Linguistics, MN, USA (2019).
19. Bandari, S., Bulusu, V.V.: Survey on Ontology-Based Sentiment Analysis of Cus-
   tomer Reviews for Products and Services. Data Engineering and Communication
   Technology: Proceedings of 3rd ICDECT-2K19, pp. 91–101. Springer (2020).
20. Yang, H., Zeng, B., Yang, J., Song, Y., Xu, R.: A Multi-task Learning Model for
   Chinese-oriented Aspect Polarity Classification and Aspect Term Extraction. arXiv
   e-prints, (2019).
21. See-To, E.W.K., Ngai, E.W.T.: Customer Reviews for Demand Distribution and
   Sales Nowcasting: A Big Data Approach. Annals of Operations Research 270(1-2),
   415–431 (2018). https://doi.org/10.1007/s10479-016-2296-z
22. Schouten, K., Frasincar, F.: Ontology-Driven Sentiment Analysis of Product and
   Service Aspects. The Semantic Web: 15th International Conference, ESWC 2018,
   pp. 608–623. Springer International Publishing, Cham (2018).
23. Ghosh, M., Sanyal, G.: An Ensemble Approach to Stabilize the Features for
   Multi-Domain Sentiment Analysis Using Supervised Machine Learning. Journal of
   Big Data 5(1), 44 (2018). https://doi.org/10.1186/s40537-018-0152-5
24. Chong, A.Y.L., Li, B., Ngai, E.W.T., Ch’ng, E., Lee, F.: Predicting Online Prod-
   uct Sales Via Online Reviews, Sentiments, and Promotion Strategies: A Big Data
   Architecture and Neural Network Approach. International Journal of Operations &
   Production Management 36(4), 358–383 (2016).
25. Salehan, M., Kim, D.J.: Predicting the Performance of Online Consumer Reviews:
   A Sentiment Mining Approach to Big Data Analytics. Decision Support Systems
   81, 30–40 (2016). https://doi.org/10.1016/j.dss.2015.10.006
26. Wallaart, O., Frasincar, F.: A Hybrid Approach for Aspect-Based Sentiment Analy-
   sis Using a Lexicalized Domain Ontology and Attentional Neural Models. In: Hitzler,
   P.D.P., Fernández, M., Janowicz, K., Zaveri, A., Gray, A.J.G., Lopez, V., Haller,
   A., Hammar, K. (eds.) The Semantic Web: 16th International Conference, ESWC
   2019, pp. 363–378. Springer International Publishing (2019).
27. Ren, S., Zhang, Y., Liu, Y., Sakao, T., Huisingh, D., Almeida, C.M.V.B.: A Com-
   prehensive Review of Big Data Analytics Throughout Product Lifecycle to Support
   Sustainable Smart Manufacturing: A Framework, Challenges and Future Research
   Directions. Journal of Cleaner Production 210, 1343–1365 (2019).
28. Rehman, M.H.u., Yaqoob, I., Salah, K., Imran, M., Jayaraman, P.P.,
   Perera, C.: The Role of Big Data Analytics in Industrial Internet
   of Things. Future Generation Computer Systems 99, 247–259 (2019).
   https://doi.org/10.1016/j.future.2019.04.020



