=Paper= {{Paper |id=Vol-2075/NLP4RE_paper12 |storemode=property |title=Managing Multi-Lingual User Feedback: The SUPERSEDE Project Experience |pdfUrl=https://ceur-ws.org/Vol-2075/NLP4RE_paper12.pdf |volume=Vol-2075 |authors=Fitsum Meshesha Kifetew,Anna Perini,Angelo Susi |dblpUrl=https://dblp.org/rec/conf/refsq/KifetewPS18 }} ==Managing Multi-Lingual User Feedback: The SUPERSEDE Project Experience== https://ceur-ws.org/Vol-2075/NLP4RE_paper12.pdf
             Managing Multi-Lingual User Feedback: the
                 SUPERSEDE project experience

                   Fitsum Meshesha Kifetew                     Anna Perini               Angelo Susi
                        kifetew@fbk.eu                        perini@fbk.eu              susi@fbk.eu
                                       Fondazione Bruno Kessler, Trento, Italy




                                                       Abstract
                       [Context & Motivation] In the SUPERSEDE project, methods and
                       tools have been developed to collect and analyze user feedback, to
                       identify relevant information for deciding which are the most impor-
                       tant requirements to be considered for the next release of a product.
                       [Question/problem] Even if the project proposal was to analyze feed-
                       back in the English language only, later it emerged that there was a
                       need to analyze multi-lingual (German, English) feedback. [Principal
                       ideas] We considered two different solutions: 1) translating user feed-
                       back from German to English, and processing it with the techniques
                       developed for the English language; 2) exploiting Natural Language
                       Processing (NLP) techniques for German to analyze directly the feed-
                       back in German. [Contribution] In this short report we describe this
                       project experience, summarizing main commonalities and differences
                       between the aforementioned solutions.




The SUPERSEDE¹ (SUpporting evolution and adaptation of PERsonalized Software by Exploiting contextual
Data and End-user feedback) project is an H2020 research and innovation project that proposes a feedback-
driven approach to the life cycle management of software services and applications, with the ultimate purpose of
improving users’ quality of experience. The SUPERSEDE tool suite provides feedback-gathering and monitoring
tools for collecting data about user experience. These data are then analyzed to obtain
information relevant for deriving software requirements, which developers take into account
when making decisions on software evolution, such as which (new) requirements to consider and with which
priority [MMK+ 17, BKM+ 17], and how to plan for the next release.
   The tools developed in the project are validated on three industrial use cases, representing software applica-
tions for different application domains, whose users can provide textual feedback in different languages. When
performing the industrial validation of the analysis tools for textual user feedback developed in the project, a
multi-lingual (i.e. German, English) issue arose and needed to be taken into account. This multi-lingual aspect
was not initially among the project’s objectives, i.e., all user feedback was assumed to be in English; however,
as the project progressed, it became clear that we needed to address it anyway. In this short report, we focus on
SEnerCon, one of the use cases, which runs a web application in the domain of household energy saving, called
interactive Energy Saving Account - iESA, currently deployed on the German market. In particular, we sum-
marize key aspects of the two different solutions implemented in the project to address the issue of multi-lingual
feedback analysis, namely (i) building analysis tools for textual feedback in English, and then validating them on
textual feedback from the industrial case study translated from German to English; (ii) building analysis tools
directly for German textual feedback.

Copyright © 2018 by the paper’s authors. Copying permitted for private and academic purposes.
  ¹ Project started in May 2015 and will end in April 2018. Website: www.supersede.eu

   The implementations of both solutions rest on a similar process that includes the following steps: (1) dataset
preparation, where manual annotation of feedback messages by type and sentiment is performed by a domain
expert. Type includes the following labels: Bug Report, Feature Request, Enhancement Request, and Other,
while sentiment is labeled as negative, neutral, and positive; (2) pre-processing, where uninformative tokens are
removed; (3) feature extraction, where different linguistic properties and sentiment are extracted; (4) feedback
classification, where machine-learning techniques are employed to train a classifier on a (portion of the) dataset.
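
   The four steps above can be sketched as follows. This is a minimal illustration only, not the SUPERSEDE implementation: the label sets come from the text, while the stop-word list, the bag-of-words features, the Naive Bayes classifier, and the example messages are all assumptions introduced here for illustration. The project tools extract richer linguistic and sentiment features and may use different learning techniques.

```python
import math
import re
from collections import Counter, defaultdict
from dataclasses import dataclass

# Label sets taken from the text; everything else below is illustrative.
TYPES = ("Bug Report", "Feature Request", "Enhancement Request", "Other")
SENTIMENTS = ("negative", "neutral", "positive")

# Hypothetical stop-word list standing in for "uninformative tokens".
STOP_WORDS = {"the", "a", "an", "is", "it", "to", "i", "and", "of", "when", "please"}


@dataclass
class Feedback:
    """Step 1: one feedback message annotated by a domain expert."""
    text: str
    type_label: str
    sentiment: str

    def __post_init__(self):
        if self.type_label not in TYPES or self.sentiment not in SENTIMENTS:
            raise ValueError("unknown type or sentiment label")


def preprocess(text):
    """Step 2: lowercase, tokenize, and drop uninformative tokens."""
    tokens = re.findall(r"[a-zäöüß]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def extract_features(text):
    """Step 3: bag-of-words counts (a stand-in for richer linguistic features)."""
    return Counter(preprocess(text))


class NaiveBayes:
    """Step 4: multinomial Naive Bayes with add-one smoothing (assumed classifier)."""

    def fit(self, feature_dicts, labels):
        self.label_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for feats, label in zip(feature_dicts, labels):
            self.token_counts[label].update(feats)
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, feats):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for token, freq in feats.items():
                score += freq * math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up training messages (not taken from the SEnerCon dataset).
train = [
    Feedback("The chart crashes when I open it", "Bug Report", "negative"),
    Feedback("Login fails with an error", "Bug Report", "negative"),
    Feedback("Please add an export to PDF", "Feature Request", "neutral"),
    Feedback("It would be great to add a dark mode", "Feature Request", "positive"),
]

type_model = NaiveBayes().fit(
    [extract_features(f.text) for f in train],
    [f.type_label for f in train],
)
predicted = type_model.predict(extract_features("The app crashes with an error"))
```

   A sentiment classifier could be trained in the same way by passing `f.sentiment` instead of `f.type_label` as the label. For the translation-based solution, the translation from German to English would take place before step 2.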
   The main differences between the implementations of the two solutions include: (a) the first solution required
an additional activity in the dataset preparation step, namely the translation of the feedback from German to
English by a domain expert at SEnerCon; (b) in the feature extraction step, different types of features were extracted
in the two solutions; in particular, for the first solution, combinations of the speech acts used in the messages were
extracted by applying a novel technique developed for English text [MKP17]. Moreover, since feedback
data were scarce at the beginning of the project, we used openly available datasets that closely
mimic the characteristics of the feedback data we analyze. In particular, we used user feedback from the issue
tracking system of the OpenOffice Writer application, which was available in English. Since the second solution
was implemented later in the project, we directly used the dataset of German feedback from the SEnerCon
use case, which was collected during the second year of the project.
   With both approaches we obtained reasonable results, considering that the available datasets
were very limited in size. For the first solution (translating to English), the dataset
from SEnerCon consisted of 575 messages translated from German to English and annotated by domain
experts. On this dataset, we obtained a classification accuracy of 83%; similar results were achieved for
sentiment. For the second solution (directly analyzing feedback in German), the dataset
consisted of 600 messages in German annotated by domain experts. The accuracy of the analysis was 59.20%
for classification and 65.81% for sentiment. It is important to note that the underlying machine learning
techniques applied in the two approaches also differ. In both cases, however, the dataset is
quite small; hence, as more user feedback data becomes available, the accuracy of the
trained models is expected to improve.
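
   To illustrate why dataset size matters for figures of this kind, the sketch below computes accuracy as the fraction of correctly labeled messages and attaches an approximate 95% Wilson score interval to it. It is a hedged illustration: the report does not describe the evaluation protocol, so treating each accuracy as if it were measured on all messages of the dataset as independent test items is an assumption made here, and the function names are hypothetical.

```python
import math


def accuracy(gold, predicted):
    """Fraction of messages whose predicted label matches the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def wilson_interval(acc, n, z=1.96):
    """Approximate 95% Wilson score interval for an accuracy measured on n items."""
    scale = 1 + z * z / n
    centre = (acc + z * z / (2 * n)) / scale
    half = z * math.sqrt(acc * (1 - acc) / n + z * z / (4 * n * n)) / scale
    return centre - half, centre + half


# Assumption: the reported accuracies are treated as if measured on the
# full datasets (575 and 600 messages); the actual protocol is not stated.
english_interval = wilson_interval(0.83, 575)
german_interval = wilson_interval(0.5920, 600)
```

   The interval narrows roughly with the square root of the number of evaluated messages, so a larger dataset makes the accuracy estimate more reliable in addition to providing more training data.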
   In conclusion, the choice between the two approaches depends, among other things, on the availability of
resources and the intended application of the tool. If the required expertise and the domain and language knowledge
are available in house at the time the analysis tools are developed, and potentially during future use of the
tool, then implementing the analysis tools to work directly on the feedback messages in the original language
(e.g., German) is the optimal choice. Otherwise, it is useful to consider the application scenario of the analysis
tool as well. If models are built once from the dataset and then used without the need for continuous
updates, then translating to English may be considered.

Acknowledgement
This work is a result of the SUPERSEDE project, funded by the H2020 EU Framework Programme under
agreement number 644018. We also thank the Future Media group of FBK for their contribution.

References
[BKM+ 17] Paolo Busetta, Fitsum Meshesha Kifetew, Denisse Muñante, Anna Perini, Alberto Siena, and Angelo
          Susi. Tool-supported collaborative requirements prioritisation. In COMPSAC (1), pages 180–189.
          IEEE Computer Society, 2017.
[MKP17]   Itzel Morales-Ramirez, Fitsum Meshesha Kifetew, and Anna Perini. Analysis of online discussions
          in support of requirements discovery. In Advanced Information Systems Engineering - 29th
          International Conference, CAiSE 2017, Essen, Germany, June 12-16, 2017, Proceedings, pages 159–174,
          2017.
[MMK+ 17] Itzel Morales-Ramirez, Denisse Muñante, Fitsum Meshesha Kifetew, Anna Perini, Angelo Susi, and
          Alberto Siena. Exploiting user feedback in tool-supported multi-criteria requirements
          prioritization. In 25th IEEE International Requirements Engineering Conference, RE 2017, Lisbon, Portugal,
          September 4-8, 2017, pages 424–429, 2017.