An Advice Recommender System Based on Complaint Data
                           Analysis
                     Liang Yang                                        Daisuke Kitayama                              Kazutoshi Sumiya
      Kwansei Gakuin University, Japan                              Kogakuin University, Japan                Kwansei Gakuin University, Japan
         dui93794@kwansei.ac.jp                                    kitayama@cc.kogakuin.ac.jp                     sumiya@kwansei.ac.jp

ABSTRACT
Nowadays, there are a large number of users who post complaints
about a certain service on the Internet. Because users have vari-
ous values and views, even if they receive the same service, they
may complain in different ways. However, it is quite difficult to
respond to various user demands for service in real time and there
are almost no direct solutions when users feel dissatisfied with a
certain service. Therefore, in this paper, we propose an advice rec-
ommender system by analyzing complaint data from Fuman Kaitori
Center. First, the system generates query keywords according to
various user complaints about a certain service by calculating the
score of each query. Then suitable web pages containing advice
are recommended from the results of the query. This advice could                             Figure 1: Example of Advice Recommendation
address users’ dissatisfaction and respond to their various demands
in a comprehensive way. We also verify the usability of the proposed
system through a questionnaire survey.                                                2 RELATED WORK
CCS CONCEPTS                                                                          2.1 FKC Dataset
• Information System → Information Retrieval; • Query Pro-                            The FKC dataset has been used for several studies in recent years.
cessing → Query Suggestion.                                                           Mitsuzawa et al. [1] presented the FKC dataset which is from Fuman
                                                                                      Kaitori Center (FKC). "Fuman" means dissatisfaction in Japanese.
                                                                                      The FKC is a Japanese consumers’ negative opinion data collection
KEYWORDS                                                                              and analysis service. In our work, we used and analyzed the FKC
Recommender System, Query Extraction, Advice, Complaint Data                          dataset.
                                                                                         Hasegawa et al. [2] analyzed and visualized the contents of the
                                                                                      FKC dataset such as the distribution of users’ ages, jobs, and gender.
1    INTRODUCTION                                                                     In our work, we determined the target of the experiment based on
In recent years, many users post negative reviews about a certain                     their results.
service online. However, it is quite difficult to respond to various
user demands for service in real time as the service is provided by                   2.2    Topic Word Extraction
the company. In addition, there are almost no direct solutions when                   Sakai et al. [5] proposed a method to extract negative words as the
users feel dissatisfied with a certain service. Therefore, this paper                 expressions of dissatisfaction from blogs. They extracted nouns and
focuses on user complaints related to services and proposes a                         adjectives to build a dissatisfaction expression dictionary. In our work,
system to search for advice that could address users’ dissatisfaction                 we only extract nouns because nouns can explain and represent
by generating query keywords from complaint reviews [11]. This                        the content of users’ complaints.
advice contains merits of the service users may not be aware of and                      Hashimoto et al. [6] proposed a method to extract important
could respond to their different demands in a comprehensive way.                      topics from newspapers and detect social problems based on doc-
An example of advice recommendation is shown in Figure 1.                             ument clustering. Utsumi et al. [4] proposed a method to extract
    The remainder of this paper is structured as follows. Section                     technological solutions to social problems such as medical issues
2 presents a brief summary of related work. Section 3 introduces                      from the news. They extracted technological solution words by
the dataset we use for research and explains the proposed system.                     calculating the relevance of problems and technologies. They de-
Section 4 discusses the experimental results and the evaluation of                    fined the relevance calculation as problem relevancy and technical
the proposed system. Finally, Section 5 concludes this paper and                   relevancy. A higher value of relevancy indicated a higher possibility
discusses future work.                                                                of being able to extract a technological solution word. In our work,
                                                                                      we use this concept and extract the advice topic word by calculating
                                                                                      the relevance of the company and complaint topic. However, we
ComplexRec 2019, 20 September 2019, Copenhagen, Denmark
Copyright ©2019 for this paper by its authors. Use permitted under Creative Commons   hypothesize that a lower relevancy indicates a higher probability
License Attribution 4.0 International (CC BY 4.0).                                    that a word is an advice topic word.


                                                        Figure 2: System Overview


   Yoshida et al. [7] proposed a method to extract feature terms        3.2    Dataset
from the customer reviews of e-commerce sites in order to recom-        In this study, we analyze a dataset of complaints from the Fuman
mend similar items to users. They used polarity analysis to calculate   Kaitori Center, which is provided by Insight Tech Inc. through the
the degree of importance of feature words by counting the number        National Institute of Informatics. In this paper, we refer to the Fu-
of positive reviews, negative reviews, and positive ratings. In our     man Kaitori Center’s dataset as the FKC dataset. The Fuman Kaitori
work, we also use polarity analysis to evaluate advice topic words.     Center is a website on which users can post their complaints about
In addition, we weight words according to the result of polarity        topics such as products, services, education, work, and relation-
analysis.                                                               ships. Moreover, users get points when they post complaints that
                                                                        they can exchange for coupons for online shopping websites. This
2.3    Query Generation                                                 dataset contains about 5 million negative reviews that were posted
                                                                        from 18 March 2015 to 12 March 2017 by around 100,000 users.
Song et al. [9] and Kajinami et al. [10] proposed a system to gen-
                                                                        Each negative review contains the information shown in Table 1.
erate query keywords that can support a user’s search intention.
                                                                        In the FKC dataset, each category contains several subcategories, and
Kakimoto et al. [8] proposed a system to extract query keywords
                                                                        each subcategory contains several companies. In this paper, because
from the closed caption data of TV programs to recommend web
                                                                        we focus on user service complaints, our proposed system uses the
pages related to tourism and events based on users’ preferences.
                                                                        data fields for “company” and “text”.
In our work, we extract query keywords from negative reviews to
recommend web pages of advice with the aim of addressing a user’s
                                                                                  Table 1: Data Structure of the FKC dataset
dissatisfaction with a certain service.

                                                                                    Data Item                 Content
2.4    Recommender System using Complaint
       Data                                                                           post_id              complaint ID
                                                                                      user_id       Fuman Kaitori Center ID
Hayashi et al. [3] proposed a system to recommend appropriate                        category          complaint category
products for users according to their complaints. This system could                subcategory     detailed complaint category
directly resolve users’ dissatisfaction by recommending substitute                  company               company name
products. In our work, we propose a system to resolve user                           product               product name
complaints about services instead of products in an indirect way.                      text              negative review
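As a rough illustration of the record layout in Table 1, the sketch below models one FKC record and groups review texts by company, which are the two fields the proposed system uses. The field types, the `FKCRecord` and `reviews_by_company` names, and the sample values are assumptions made for illustration; the paper does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FKCRecord:
    """One complaint record with the data items listed in Table 1.

    All fields are typed as str here; that is an assumption.
    """
    post_id: str      # complaint ID
    user_id: str      # Fuman Kaitori Center ID
    category: str     # complaint category
    subcategory: str  # detailed complaint category
    company: str      # company name (may be empty)
    product: str      # product name
    text: str         # negative review


def reviews_by_company(records):
    """Group review texts by company, skipping records without a company label."""
    grouped = {}
    for record in records:
        if record.company:
            grouped.setdefault(record.company, []).append(record.text)
    return grouped


# Made-up sample records, not taken from the FKC dataset.
sample = [
    FKCRecord("1", "u1", "industry", "IT web services", "A", "", "delivery was late"),
    FKCRecord("2", "u2", "industry", "IT web services", "", "", "no company label"),
]
grouped = reviews_by_company(sample)
```

Only records labeled with a company name are kept, matching the restriction stated in Section 3.3.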

3 PROPOSED SYSTEM
3.1 System Overview                                                     3.3    Extraction of Company Names and
In this paper, we propose a system that analyzes complaint data                Complaint Topic Words
from the Fuman Kaitori Center to recommend advice in order              Our proposed system extracts company names from the FKC dataset
to address users’ dissatisfaction with a certain service. Figure 2      directly from the company field of each record. Next, we extract
shows the system flow of our proposed method. First, we extract         the complaint topic word by analyzing negative reviews. In this
the company name and complaint topic words by calculating the           paper, we only use the negative reviews that are labeled with the
importance of the nouns in the negative reviews. Second, we obtain      company name. To extract complaint topic words, we first extract
candidate search keywords of these extracted words. Then, the           all companies’ negative reviews for one subcategory and extract all
system extracts the advice topic word by calculating the relevancy      nouns from the negative reviews. Next, we calculate the importance
of the candidate keywords to the FKC dataset and scores them using      of each noun using the following equation.
morphological and polarity analyses. Third, we create the query by
combining the company name, complaint topic word, and advice
                                                                          \[ \frac{tf}{|A|} \times \frac{tf}{\sum_{d \in D} tf_d} \qquad (1) \]
topic word according to their various complaints about a certain
service. Finally, suitable web pages containing advice that could         Here, $tf$ is defined as the number of occurrences of a particular
address a user’s complaint are recommended from the results of          noun in the complaints for a certain company, $|A|$ is defined as the
the query.                                                              number of all nouns in the complaints for a certain company, and


$\sum_{d \in D} tf_d$ is defined as the number of occurrences of that      3.5    Generation of Web Search Queries
noun in the complaints for all companies. Finally, we extract all nouns   In this study, each company name and complaint topic word are
whose importance values are above the determined threshold value        matched with several advice topic words. To search for suitable
and define them as that company’s complaint topic words.                websites, we use an OR-based search method to acquire advice
                                                                        websites. Our proposed system generates the query based on one
3.4    Extraction of Advice Topic Words                                 company name, one complaint topic word, and one advice topic
To extract the advice topic word, we first obtain candidate search      word.
keywords of the company name and complaint topic word. Because
the FKC dataset is full of negative reviews, we hypothesize that        3.6    Recommendation of Advice
candidate keywords that are less relevant to the FKC dataset will       Our proposed system recommends suitable web pages containing
make better advice topic words. To verify this hypothesis, we calcu-    advice from the results of the query, which is based on users’ com-
late the relevance of these candidate keywords for each company         plaints. Figure 3 shows the user interface of our proposed system.
and each complaint topic word, and define these as “company relevancy”  First, the system generates several queries by analyzing a user’s neg-
and “complaint topic relevancy”. They are calculated using the          ative review. Next, the user can choose and browse web pages based
following equations.                                                    on their needs by a web search using the offered queries. The sys-
                                                                        tem recommends the advice information that could address the user’s
                                                                        dissatisfaction expressed in the negative review.
   \[ \text{company relevancy} = \frac{R_{cd}}{R_c} \qquad (2) \]

   \[ \text{complaint topic relevancy} = \frac{R_{td}}{R_t} \qquad (3) \]
   Here, $R_{cd}$ is defined as the number of occurrences of a given can-
didate keyword in complaints for the company in the FKC dataset,
and $R_c$ is defined as the number of negative reviews of that com-
pany. $R_{td}$ is defined as the number of occurrences of the candidate
keyword with the complaint topic word in the negative reviews                               Figure 3: User Interface
of the FKC dataset, and $R_t$ is defined as the number of negative
reviews with that complaint topic word.
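The two relevancy ratios can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: an "occurrence" is counted as a review that contains the keyword at least once, matching is plain substring search (a real system for Japanese text would rely on morphological analysis), and the function names and sample reviews are hypothetical.

```python
def company_relevancy(keyword, company_reviews):
    """Equation (2): R_cd / R_c for one candidate keyword and one company."""
    if not company_reviews:
        return 0.0
    # Assumption: each review counts at most once toward R_cd.
    r_cd = sum(1 for text in company_reviews if keyword in text)
    return r_cd / len(company_reviews)


def complaint_topic_relevancy(keyword, topic_word, all_reviews):
    """Equation (3): R_td / R_t for one candidate keyword and one topic word."""
    with_topic = [text for text in all_reviews if topic_word in text]
    if not with_topic:
        return 0.0
    r_td = sum(1 for text in with_topic if keyword in text)
    return r_td / len(with_topic)


# Made-up English reviews for illustration only.
reviews = [
    "the point balance expired",
    "point card was not accepted",
    "delivery was late",
    "how to use point is unclear",
]
company_rel = company_relevancy("card", reviews)               # 1 of 4 reviews
topic_rel = complaint_topic_relevancy("card", "point", reviews)  # 1 of 3 reviews
```

Under the paper's hypothesis, candidate keywords with low values of both ratios are the more promising advice topic words.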
   After that, to exclude some negative words as well as verbs          4 EXPERIMENT AND EVALUATION
and adjectives which do not help users acquire advice, we weight
candidate keywords using morphological and polarity analyses, as        4.1 Experiment
shown in Table 2.                                                       In this study, we conducted an experiment to extract the complaint
                                                                        and advice topic words in order to verify the feasibility of the proposed
          Table 2: Weight for Candidate Keywords                        system. For this experiment, we analyzed the subcategory of “IT
                                                                        web services” of the FKC dataset, which is under the category
                                                                        “industry.” We analyzed 1,000 negative reviews for each of three
                   Result of Analysis            Weight                 companies.
                      negative                      0.8                    First, we extracted the complaint topic words and determined
                         verb                       0.7                 different threshold values for each of the three companies. For com-
                      adjective                     0.7                 pany A, we extracted 186 complaint topic words above the threshold
              proper noun (place name)              0.7                 value of 0.00080. For company B, we extracted 144 complaint topic
           proper noun (organization name)          0.3                 words above the threshold value of 0.00076. For company C, we ex-
                    common noun                     0.3                 tracted 86 complaint topic words above the threshold value of 0.00080.
                     verbal noun                    0.1                 Table 3 shows examples of the complaint topic words for each com-
                                                                        pany. These examples show that each complaint topic word implies
                                                                        the object of different users’ dissatisfaction.
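The complaint topic word extraction used in this experiment (Equation (1) of Section 3.3 followed by a per-company threshold) can be sketched as follows. The sketch assumes that nouns have already been extracted per company, that $|A|$ counts noun tokens with repetitions, and that $D$ is the set of all companies in the subcategory; the function name and sample data are hypothetical.

```python
from collections import Counter


def complaint_topic_words(nouns_by_company, company, threshold):
    """Return {noun: importance} for nouns whose importance exceeds the threshold.

    nouns_by_company maps a company name to the list of nouns (with
    repetitions) extracted from that company's negative reviews.
    """
    tf = Counter(nouns_by_company[company])
    total_nouns = sum(tf.values())  # |A|: all nouns for this company

    tf_all = Counter()              # sum over d in D of tf_d
    for nouns in nouns_by_company.values():
        tf_all.update(nouns)

    importance = {
        noun: (count / total_nouns) * (count / tf_all[noun])
        for noun, count in tf.items()
    }
    return {noun: value for noun, value in importance.items() if value > threshold}


# Made-up noun lists for two companies.
nouns = {
    "A": ["delivery", "delivery", "price", "point"],
    "B": ["price", "setting", "setting", "point"],
}
topics_a = complaint_topic_words(nouns, "A", threshold=0.2)
```

In this toy data, "delivery" is frequent for company A and absent elsewhere, so it scores (2/4) × (2/2) = 0.5, while "price" and "point" score (1/4) × (1/2) = 0.125 and fall below the threshold.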
   Finally, we calculate the final score of the candidate keywords         Next, we extracted advice topic words from the candidate key-
by combining the arithmetic mean of the company and complaint           words that had a score less than 0.0043, 0.0020, and 0.0033 for
topic relevancies with the weight, as in the following equation.        companies A, B, and C, respectively. Some examples of these words
                                                                        are shown in Table 4. As Table 4 shows, the proposed method is
                                                                        sufficient for ranking candidate keywords.
   \[ \text{Score} = \frac{\text{company relevancy} + \text{complaint topic relevancy}}{2} \times \text{Weight} \qquad (4) \]
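Equation (4) with the weights of Table 2 can be sketched as below. The weight values are copied from Table 2; the dictionary keys, the function name, and the example relevancy values are assumptions for illustration, and the step that assigns an analysis result to a keyword (morphological and polarity analysis) is outside this sketch.

```python
# Weights from Table 2, keyed by the result of the analysis.
WEIGHTS = {
    "negative": 0.8,
    "verb": 0.7,
    "adjective": 0.7,
    "proper noun (place name)": 0.7,
    "proper noun (organization name)": 0.3,
    "common noun": 0.3,
    "verbal noun": 0.1,
}


def keyword_score(company_rel, topic_rel, analysis_result):
    """Equation (4): arithmetic mean of the two relevancies times the weight."""
    return (company_rel + topic_rel) / 2 * WEIGHTS[analysis_result]


# Hypothetical relevancy values for one candidate keyword.
noun_score = keyword_score(0.02, 0.04, "common noun")  # about 0.009
verb_score = keyword_score(0.02, 0.04, "verb")         # about 0.021
```

Because lower scores are preferred, the larger weights for negative words, verbs, and adjectives push those candidates away from selection.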
   After calculating the final score of each candidate keyword, we      4.2    Evaluation
determine the threshold value for each company. Candidate key-          In this paper, we conducted a questionnaire-based survey to eval-
words whose scores are under the threshold value become the advice      uate the usability and effectiveness of the proposed method. The
topic words.                                                            questionnaire-based survey contained the following three questions. For

              Table 3: Example Complaint Topic Words                  negative review. It not only shows that the proposed method is
                                                                      effective, but also indicates that ranking nouns by their calculated
 Company                       Complaint Topic Words                  importance performed well.
                 purchase, prime, delivery, review, delivery fee,                              Table 6: Result of Q2
      A                order, membership, return, post, gift,
                     cardboard box, price, sign, yamato, book
                          stamp, code, block, coin, group,                                                  k=1      k=3      k=5
      B           backup, lock, telephone call, camera, setting,                      Average of p@k         0.70    0.67     0.60
                    post, input, message, content, commercial                         Average of r@k         0.18    0.50     0.75
                    news, question, premium, answer, auction,                           F-measure            0.28    0.57     0.67
      C            article, title, navigation, mail, shopping, ID,
                 weather forecast, transaction, search, comment          The result of Q2 shows that when only the single query with the
                                                                      lowest score is offered, it is appropriate in 70% of cases for a web
              Table 4: Example Advice Topic Words                     search that addresses the complaints expressed in the negative
                                                                      reviews. This result indicates that scoring candidate keywords is
                    Complaint                                         effective. However, we found that the longer a candidate keyword
   Company                               Advice Topic Words           was, the lower its score tended to be when the candidate keywords
                   Topic Words                                        were generated. In the future, we plan to develop a method to
                                                                      check whether a candidate keyword is related to the complaint
                                           charge, present,           topic word to better exclude noise in the results.
          A            point               how to save up,
                                        how to use, credit card
                                        security, group friend,          For Q3, the result shows that 75% of the answers indicated
          B           setting           privacy, initialization,      satisfaction with the contents of the advice found with the chosen
                                          recommendation              queries. This result suggests that the proposed system could
                                           privilege, merit           address users’ dissatisfaction and that the recommended advice
          C          premium          cancellation of agreement,      responds to the demands of different users. Moreover, it implies
                                              magazine                that the proposed system can reduce users’ burden when searching
                                                                      for advice compared with a traditional search
                                                                      engine.
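The measures reported in Tables 5 and 6 can be sketched with the standard definitions of precision at k, recall at k, and the F-measure as their harmonic mean. The paper does not spell out its formulas, so these definitions, the function names, and the sample query ranking are assumptions.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are true positives."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k):
    """Fraction of all true positives that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical ranking of queries (lowest score first) and true positives,
# i.e. queries chosen in more than five answers.
ranked_queries = ["q1", "q2", "q3", "q4", "q5"]
true_positives = {"q1", "q3", "q5", "q9"}
p3 = precision_at_k(ranked_queries, true_positives, 3)
r3 = recall_at_k(ranked_queries, true_positives, 3)

# Harmonic means of the averaged values reported in Table 6.
f_from_table = [f_measure(0.70, 0.18), f_measure(0.67, 0.50), f_measure(0.60, 0.75)]
```

Applied to the averaged values in Table 6, the harmonic mean gives about 0.29, 0.57, and 0.67, which is close to the reported F-measure row; the small difference at k=1 may come from rounding of the averaged inputs.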
Q1, we extracted all nouns from the negative review for respondents
to choose from. For Q2, we provided 15 queries for each negative      5    CONCLUSION
review to choose from. Five of the queries’ candidate keywords were
                                                                      In this paper, we proposed a recommender system by analyzing
created by the authors. For Q3, we evaluated satisfaction with the
                                                                      complaint data to recommend suitable advice. We extracted query
result of advice recommendation.
                                                                      keywords from various user complaints about a certain service
Q1: Please choose one word that you think could represent the
                                                                      by calculating the score of each query. Then suitable web pages
dissatisfaction in the following review.
                                                                      containing advice are recommended from the results of the query.
Q2: Please choose the queries for which you think the contents returned
                                                                      In addition, we evaluated the effectiveness and usability of the
by a search using the query could address the complaints
                                                                      proposed system through a questionnaire survey, and the results
found in the negative review. (Multiple choices are allowed.)
                                                                      show that the generated query keywords would be useful for col-
Q3: Please make a web search with the queries you have chosen. Do
                                                                      lecting advice. Furthermore, the advice recommended
you feel satisfied with the contents of the advice?
                                                                      through the query keywords could address users’ dissatisfaction with a ser-
4.3   Result and Discussion                                           vice and respond to different user demands in a comprehensive
                                                                      way.
We collected the answers of 10 respondents, and the results are          In the future, we plan to evaluate the satisfaction of each query
shown in Table 5 and Table 6. We defined the nouns and queries        and analyze the result. Furthermore, we will consider new meth-
chosen in more than five answers as true positives.                   ods to obtain candidate query keywords that users would find hard
                                                                      to think of themselves, to enhance the usability of the proposed system.
                      Table 5: Result of Q1
                                                                      ACKNOWLEDGMENTS
                                          p@1                         In this paper, we used the FKC Data Set provided for research purposes
                      Average of p@k      0.60                        by the National Institute of Informatics in cooperation with Insight
                                                                      Tech Inc.
   The result of Q1 shows that when only the noun with the highest    REFERENCES
importance value is used, the appropriate complaint topic word         [1] Kensuke Mitsuzawa, Maito Tauchi, Mathieu Domoulin, Masanori Nakashima and
is extracted in 60% of cases from the                                      Tomoya Mizumoto. “FKC Corpus: a Japanese Corpus from New Opinion Survey


    Service,” In Proceedings of the Novel Incentives for Collecting Data and Annotation     [7] Tomoshi Yoshida and Daisuke Kitayama. “An Evaluation of Feature Term Extrac-
    from People: types, implementation, tasking requirements, workflow and results,             tion Method based on Polarity Analysis from Customer Reviews,” IEICE-DE2016-5
    Portorož, Slovenia, pp. 11–18, May 2016.                                                    vol. 116, no. 105, pp. 19–24, Jun. 2016.
[2] Tooru Hasegawa and Daisuke Kitayama. “The Visualization of Dissatisfaction              [8] Honoka Kakimoto, Toshinori Hayashi, Yuanyuan Wang, Yukiko Kawai, and
    Groups using Dissatisfaction Dataset,” DEIM Forum 2017, P7-1. (In Japanese)                 Kazutoshi Sumiya. “Query Keyword Extraction from Video Caption Data based
[3] Toshinori Hayashi, Yuanyuan Wang, Yukiko Kawai, and Kazutoshi Sumiya. “An                   on Spatio-Temporal Features,” Lecture Notes in Engineering and Computer Sci-
    E-Commerce Recommender System using Complaint Data and Review Data,”                        ence: Proceedings of the International MultiConference of Engineers and Computer
    Proc. of ACM IUI2018 Workshop on Web Intelligence and Interaction (WII 2018).               Scientists 2018, pp. 405–408, Mar. 2018.
[4] Kazuo Utsumi, Takashi Inui, Taiichi Hashimoto, Koji Murakami and Masamichi              [9] Ximei Song and Masao Takaku. “Study on Navigation support System based on
    Ishikawa. “Extraction of Critical Knowledge concerning Social Problems and their            User’s Search Intents,” ARG WI2, no. 9, 2016.
    Technological Solutions,” Socio Technology Research Journal, vol. 6, pp. 187–198,      [10] Tomoki Kajinami, Toshiyuki Ogasawara, Jhoji Komiya and Yasufumi Takama.
    Mar. 2009.                                                                                  “Application of Keyword Map to Decision Support through Exploratory Search,”
[5] T. Sakai and Ko Fujimura. “Discovering Latent Solutions from Expressions of                 2008 IEEE International Conference on Systems, Man and Cybernetics, pp. 2177–2181,
    Dissatisfaction in Blogs,” Information Processing Society of Japan, vol. 52, no. 12,        2008.
    pp. 3806–3816, Dec. 2011.                                                              [11] Liang Yang, Daisuke Kitayama, and Kazutoshi Sumiya. “Query Keyword Extrac-
[6] Taiichi Hashimoto, Koji Murakami, Takashi Inui, Kazuo Utsumi and Masamichi                  tion from Complaint Data for Collecting Advice,” Lecture Notes in Engineering and
    Ishikawa. “Topic Extraction and Social Problem Detection based on Document                  Computer Science: Proceedings of the International MultiConference of Engineers
    Clustering,” Socio Technology Research Journal, vol. 5, pp. 216–226, Mar. 2008.             and Computer Scientists 2019, pp. 347–351, Mar. 2019.