=Paper=
{{Paper
|id=Vol-1383/paper1
|storemode=property
|title=Deployment of Semantic Analysis to Call Center
|pdfUrl=https://ceur-ws.org/Vol-1383/paper1.pdf
|volume=Vol-1383
|dblpUrl=https://dblp.org/rec/conf/semweb/Kawamura14
}}
==Deployment of Semantic Analysis to Call Center==
<pdf width="1500px">https://ceur-ws.org/Vol-1383/paper1.pdf</pdf>
<pre>
     Deployment of Semantic Analysis to Call Center

                           Takahiro Kawamura                                                   Akihiko Ohsuga
                           and Shinichi Nagano                                   Graduate School of Information Systems,
               Corporate Research & Development Center,                           University of Electro-Communications,
                             Toshiba Corp.                                                         Japan


                      I.   INTRODUCTION
    In this paper, we present an application of text data triplifi-
cation for a business. Since this is an in-company application
from a laboratory to a division, we cannot describe it as “a
success story of business-relevant, industrial deployments of
Semantic Web technologies” in the CFP, although it will be
useful as a case study.
    Our company manufactures and sells consumer electronics
ranging from refrigerators to TV sets, and it has recently been
endeavoring to deal effectively with a number of inquiries
about product malfunctions, which are gathered at a call               Fig. 1.     Search flow of inquiry contents from Linked Data
center. Nowadays, moreover, if the response to an inquiry
is mishandled, users sometimes turn into complainers.                  a research contract with a certain amount of R&D expenses [1].
A bad reputation then spreads widely via social media, that
is, “flaming” occurs, and may greatly affect sales of all the            II.       TRIPLIFICATION OF SOCIAL MEDIA INFORMATION
company’s products. Making the response more problematic
for operators at the call center is the difficulty of distinguishing       To create a training dataset, we first divided each sen-
whether the malfunction that is the subject of the inquiry is          tence in the dataset into chunks of semantically consistent
caused by a user’s way of using the product or a problem               words by using part-of-speech (POS) analysis and syntactic
that accidentally occurs in an individual product, or caused           analysis, and then manually assigned one of eight properties,
by a problem common to the design or production phase of a             namely, Subject, Action, Object, Location, Time, Modifier,
particular model. In the case that an operator considers the           Because, and Other, to each chunk. We then used conditional
malfunction to be the user’s fault at the initial stage, and           random fields (CRF) as a learning model, which is an undi-
it subsequently turns out to be the manufacturer’s fault, a            rected graphical model for predicting a label sequence for an
firestorm may occur that may lead to lawsuits. The Consumer            input sequence. The key point of the proposed method is that we
Affairs Agency in Japan and several law firms warn that the            also constructed approximately 250 annotation rules using the
initial response to an inquiry is especially important in general.     result of syntactic analysis and the predefined ontology. For
However, since pernicious complainers exist, if the manufac-           example, a noun before the postpositional particle ‘WO’
turer always considers the inquiry to be the manufacturer’s            corresponds to OBJECT in a Japanese sentence, and a sentence
fault, the cost will soar.                                             after the word ‘NAZENARA’ (because) and the sentence before it
                                                                       have a causal relation. We then decided whether to adopt the
    Therefore, we proposed a method of comparing seman-                CRF estimation or the rule decision based on the estimation
tically analyzed social media information and the inquiry              probability of the CRF.
content. We triplify entries about product malfunctions on
social media, and convert them to a network of Linked Data                 In addition, we determined identities of values (chunks),
in advance. Then, by searching for the content of the inquiry          that is, entity linking, so that values of Subject, Object, etc.
to the call center in the network, we confirm whether the same         that have the same meaning refer to an identical node in the
issue is currently spreading on social media and whether the           network, as much as possible. Finally, we unified the values
inquiry is the tip of an iceberg. If there is a similar entry on       that are determined to be identical to a node whose label is a
social media, it is determined whether the inquiry content is          typical value.
a malfunction common to a model and, if so, the operator
offers a polite explanation to the user and a notification is sent          III.     MATCHING BETWEEN INQUIRY CONTENT AND
to a quality control (QC) section. Moreover, if the entry has                                   LINKED DATA
causal links connecting to users’ dissatisfaction and discontent,
a notification with high priority will be sent to the quality             Figure 1 presents the flow when an inquiry is received
control section.                                                       at a call center. When the call center receives an inquiry
                                                                       from a user, an operator records the summary of the inquiry
   We, that is, our laboratory, brought the above-mentioned            content as two or three sentences (call log). Each sentence is
advantages to the attention of a division of our company, which
manufactures and sells consumer electronics, and then received           [1] approx. ten million yen for half a year
                                                                               The accuracy of the Location property is lower than that
                                                                               of other properties because of the shortage of geographical
                                                                               names registered in the system. The low accuracy of the
                                                                               Time property seems to be attributable to the difficulty of
                                                                               distinguishing it from the Modifier property. We also confirmed
                                                                               that extraction of the causal relation is feasible, since the
                                                                               accuracy of the Because property is high.
                                                                                   The division to which we provided this result commented
                                                                               that the 94.1% extraction accuracy is satisfactory, but pointed
                                                                               out that on this occasion social media information was col-
                                                                               lected for a certain period and converted to a graph (Linked
                                                                               Data), and therefore the graph represents a snapshot. Opinions
                                                                               expressed on social media are continually changing from
                                                                               product release to malfunction discovery and manufacturers’
                                                                               responses, and thus such time-series variations should be
                                                                               represented in the graph. In addition, users’ complaints are of
                                                                               varying strength, and thus they should be divided into multiple
                                                                               stages from a weak complaint to a strong complaint. Therefore,
                                                                               we intend to prepare more detailed properties for representing
                                                                               various nuances of verbs.

                                                                               B. Matching between inquiry content and social media
                                                                                   In the experiment, we first extracted 220 call logs (sum-
                                                                               maries of inquiry contents described by operators) from 25,459
Fig. 2. Linked Data graph for an inquiry content (above) and corresponding     logs about our company’s TV sets for a month, September
social media information (below)                                               2012. We then compared them with social media information
                                                                               that was triplified as described in IV-A. Finally, the matching
triplified in the format of < S_i, V_i, O_i >, and then triples that           results between the call log and part of social media were
have the same structure as the sentence are searched for in the                manually checked, and then the accuracy of the matching was
triple store. As a result, if a triple with the same structure as              calculated. The result is shown in Table II.
the inquiry content is found, we determine that the problem                      TABLE II.   MATCHING ACCURACY OF INQUIRY CONTENTS (AVE.)
does not concern an individual product, but is common to a                                    No match                       Match
model. Moreover, the number of triples with the same structure                     No data    Triplification error    Precision    Recall
is regarded as the amount of discussion on social media. When                       9.1%             13.6%              88.2%      33.3%
querying the triple store to find S_s, V_s, O_s, we also use the
method of entity linking described in II. Example graphs of
social media entries and inquiry content are shown in Fig. 2,                      The precision of about 90% in matching call logs to the
where each sentence has a sentence ID node and at most eight                   social media graph indicates that it is possible to find the
properties.                                                                    social media entry that corresponds to a call log. Since the
                                                                               recall was low, however, it is difficult to deduce from this
                                                                               result how widely an issue is spreading on social media. The recall
 IV.    EXPERIMENTS ON TRIPLIFICATION AND MATCHING                             was low because there are several expressions that represent
A. Triplification of social media                                              the same condition and content on social media, and also the
                                                                               method of entity linking mentioned in II is insufficient to unify
    In an experiment, we collected entries about a TV set                      them.
manufactured and sold by our company from a well-known
review site in Japan [2], and then conducted labeling, learning,                   The division to which we provided this result commented
and estimation with the method described in Section II. The                    that, when an inquiry is received at a call center, time
dataset consists of 197 sentences covering three months                        constraints prevent an operator from performing a keyword
and was evaluated with 10-fold cross-validation. Table I shows                 search with appropriate keyword expansion to find the same
the combined result, which uses the CRF estimation when its                    entry as the inquiry content on social media, whereas this
estimation probability p > 0.6 and the rule decision otherwise.                system automates the comparison between call logs and social
       TABLE I.         EXTRACTION ACCURACY FOR EACH PROPERTY                  media using semantic search with word identification and word
 (%)          SUB.    OBJ.    ACT.    LOC.    TIME    MOD.    BCOZ    Ave.     relation. The comment also indicated that in future when the
 Precision    85.7    88.8    96.9    63.6   100.0    88.2   100.0    94.1     malfunction of a model is spreading on social media, an alert
 Recall      100.0    92.7    95.4    46.7    67.9    91.3   100.0    94.1     should be transmitted before receiving the call log.
                                                                                         V.     CONCLUSIONS AND FUTURE WORK
   The weighted average (Ave.), which weights each property                        Future work includes performance evaluations. We have
by its number of instances, indicates that the combined                        developed the system and are in the trial phase. In the future,
method we proposed achieved an accuracy of 94.1%.                              we intend to identify issues that may arise through the actual
  [2] http://kakaku.com                                                        operation of the system, and further improve the system.
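
As a minimal sketch of the selection step described in Sections II and
IV-A, the CRF label is kept when its estimation probability exceeds 0.6
and the rule decision is used otherwise. The Chunk class, the function
name choose_label, the example values, and the fallback to the CRF label
when no rule applies are illustrative assumptions, not part of the
deployed system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    """One chunk with both candidate labels (hypothetical structure)."""
    text: str
    crf_label: str             # label estimated by the CRF
    crf_prob: float            # estimation probability of that label
    rule_label: Optional[str]  # label from an annotation rule, or None


def choose_label(chunk: Chunk, threshold: float = 0.6) -> str:
    """Adopt the CRF label if p > threshold, else the rule decision."""
    if chunk.crf_prob > threshold:
        return chunk.crf_label
    if chunk.rule_label is not None:
        return chunk.rule_label
    # Assumption: when no rule applies, keep the CRF label.
    return chunk.crf_label


chunks = [
    Chunk("terebi WO", "Subject", 0.41, "Object"),  # low p: rule decides
    Chunk("kowareta", "Action", 0.93, None),        # high p: CRF decides
]
labels = [choose_label(c) for c in chunks]
print(labels)
```

In this sketch the first chunk follows the ‘WO’ rule because the CRF
estimate is below the threshold, while the second keeps the CRF label.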

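The matching step of Section III can be sketched as follows, assuming
an in-memory list in place of the triple store and a small alias table
in place of the entity linking of Section II. The names TYPICAL_VALUE,
link, normalize, and count_matches, as well as the example triples, are
hypothetical, and exact comparison of linked values is a simplification.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, verb, object)

# Hypothetical entity-linking table: surface form -> typical value.
TYPICAL_VALUE: Dict[str, str] = {
    "tv": "television",
    "tv set": "television",
    "shuts off": "turn off",
    "turns off": "turn off",
}


def link(value: str) -> str:
    """Map a value to its typical value; unknown values pass through."""
    key = value.strip().lower()
    return TYPICAL_VALUE.get(key, key)


def normalize(triple: Triple) -> Triple:
    """Apply entity linking to every element of a triple."""
    s, v, o = triple
    return (link(s), link(v), link(o))


def count_matches(inquiry: Triple, store: Iterable[Triple]) -> int:
    """Count stored triples with the same linked structure as the inquiry."""
    target = normalize(inquiry)
    return sum(1 for t in store if normalize(t) == target)


social_media = [
    ("TV set", "shuts off", "power"),
    ("tv", "turns off", "power"),
    ("remote", "loses", "battery"),
]
hits = count_matches(("TV", "turns off", "power"), social_media)
print(hits)
```

Under these assumptions, a count above zero would suggest that the
problem is common to a model rather than specific to one product, and
the count itself stands in for the amount of discussion on social media.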
</pre>