<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On Finding the Relevant User Reviews for Advancing Conversational Faceted Search</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Eleftherios Dimitrakis</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Konstantinos Sgontzos</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Panagiotis Papadakos</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Marketakis</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexandros Papangelis</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Stylianou</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yannis Tzitzikas</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Computer Science Department - University of Crete</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Computer Science - FORTH-ICS</institution>
          ,
          <country country="GR">Greece</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Speech Technology Group - Toshiba Research Europe</institution>
        </aff>
      </contrib-group>
      <fpage>22</fpage>
      <lpage>31</lpage>
      <abstract>
        <p>Faceted Search (FS) is a widely used exploratory search paradigm which is commonly applied over multidimensional or graph data. However, sometimes the structured data are not sufficient for answering a user's query. User comments (or reviews) are a valuable source of information that could be exploited in such cases to help the user explore the information space and decide which options suit them better (either through question answering or query-oriented sentiment analysis). To this end, in this paper we introduce and comparatively evaluate methods for locating the most relevant user comments related to the user's focus in the context of a conversational faceted search system. Specifically, we introduce a dictionary-based method, a word embedding-based method, and a combination of the two. The analysis and the experimental results showed that the combined method outperforms the other methods, without significantly affecting the overall response time.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>Faceted Search (FS) is a widely used exploratory search paradigm. It is used
whenever the user wants to find the desired item in a list of items (e.g.
products, hotels, restaurants, publications). Typically, FS offers exploratory
search over multidimensional or graph data. However, sometimes the structured
data are not enough for answering a user's query. User comments (or reviews)
are a valuable source of information that could be exploited in such cases to
help the user explore the information space and decide which options
suit them better. Indeed, user comments/reviews are available in various
applications of faceted search, e.g. for hotel booking and in product catalogs.</p>
      <p>
        Enabling interaction with FS through spoken dialogue is appropriate for
situations where the user cannot (or finds it inconvenient to) use their hands or eyes.
In such cases, the user interacts by voice and provides commands or poses
questions. If a question cannot be translated to a query over the structured
resources of the dataset, then the system cannot deliver any answer. In such
cases it is reasonable to resort to the available unstructured data, i.e. to users'
comments and reviews. Figure 1 illustrates the context. The objective is not to
provide the user with a direct answer but first to identify which of the user
comments are relevant to the user's question. Direct query answering is reasonable
only in cases where there is a single and credible source of unstructured data
(e.g. Wikipedia). This is not the case with user comments, since they can be
numerous and their content can be conflicting. If we manage to find the relevant
comments, then the system could either read these comments to the user, or
attempt to apply question answering if the user requests so, or apply any other kind
of analysis, e.g. sentiment analysis as in [
        <xref ref-type="bibr" rid="ref14 ref2">2, 14</xref>
        ]. In any case, spoken dialogue
interaction poses increased requirements on quality, since the system should not
"read" irrelevant comments, as reading costs the user time.
      </p>
      <p>[Figure 1. Labels recoverable from the diagram: Data Resources (multidimensional or graph data; user comments and reviews); Interaction System (Preference-enriched Faceted Search; Finding Related Comments; Mapping to Spoken Dialogue); User.]</p>
    </sec>
    <sec id="sec-4">
      <p>Note that instead of analyzing the user comments for estimating whether a
hotel is good or bad as a whole, the interaction that we propose enables the user
to get information about the particular aspects or topics that are important to
them, e.g. noise, cleanliness, the quality of the wifi, parking, courtesy and
helpfulness of staff, etc. The set of such topics is practically endless and we cannot
assume that structured data will exist for all of them. Therefore,
it is beneficial to have systems that are able to exploit associated unstructured
data, e.g. user comments and reviews. The problem is challenging because user
comments are usually short, meaning that it is hard to achieve an acceptable
level of recall. In this paper we focus on this problem, and we introduce methods
relying on hand-crafted and statistical dictionaries for identifying the relevant
comments. In addition, we describe an evaluation collection that we have created
for comparatively evaluating the introduced methods, as well as an ongoing
application and evaluation over a bigger, real dataset. In a nutshell, the
key contributions of this paper are: (a) we show how the FS interaction can
be extended for exploiting unstructured data in the form of user comments
and reviews, and (b) we introduce and comparatively evaluate four methods for
identifying the more relevant user comments in datasets related to the task of
hotel booking. The rest of this paper is organized as follows: Section 2 presents
the required background and related work. Section 3 describes the proposed
methods. Section 4 reports experimental results. Finally, Section 5 concludes
the paper and discusses directions for future research and work.</p>
      <sec id="sec-4-1">
        <title>2 Background and Related Work</title>
        <sec id="sec-4-1-1">
          <title>2.1 Background: Faceted Search and PFS</title>
          <p>
            Faceted search is the de facto standard in e-commerce and tourism services.
It is an interaction framework based on a multi-dimensional classification of
data objects, allowing users to browse and explore the information space in a
guided, yet unconstrained way through a simple visual interface [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ]. Features of
this framework include: (a) display of current results in multiple categorization
schemes (called facets, or dimensions, or just attributes), (b) display of facets and
values leading to non-empty results only, (c) display of the count information for
each value (i.e. the number of results the user will get by selecting that value),
and (d) ability to refine the focus gradually, i.e. it is a session-based interaction
paradigm in contrast to the stateless query-and-response dialogue of most search
systems. It is widely deployed in e-commerce (e.g.
eBay, booking.com), and its popularity and adoption are increasing. It has been
proposed and applied for web searching, for semantically enriching web search
results, for patent search, as well as for exploring RDF and Linked Data (e.g. see
[
            <xref ref-type="bibr" rid="ref16 ref4">4, 16</xref>
            ], as well as [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ] for a recent survey). The enrichment of faceted search with
preferences, hereafter Preference-enriched Faceted Search (PFS for short), was
proposed in [
            <xref ref-type="bibr" rid="ref12 ref20">12,20</xref>
            ]. PFS offers actions that allow the user to order facets, values,
and objects using best, worst, and prefer to actions (i.e. relative preferences), around
to actions (over a specific value), or actions that order them lexicographically, or
based on their values or count values. Furthermore, the user is able to compose
object-related preference actions using Priority, Pareto, Pareto Optimal (i.e.
skyline) and others. The distinctive features of PFS are that it allows expressing
preferences over attributes whose values can be hierarchically organized (and/or
multi-valued), it supports preference inheritance, and it offers scope-based rules
for automatically resolving the conflicts that may arise. As a result, the user is
able to restrict the current focus by using the faceted interaction scheme (hard
restrictions) that leads to non-empty results, and rank the objects of the focus
according to the expressed preferences. Recently, PFS has been used in various
domains, e.g. for offering a flexible process for the identification of fish species
[
            <xref ref-type="bibr" rid="ref17">17</xref>
            ], as a Voting Advice Application [
            <xref ref-type="bibr" rid="ref18">18</xref>
            ], as well as for data that also contain
geographical information [
            <xref ref-type="bibr" rid="ref6">6</xref>
            ].
          </p>
        </sec>
        <sec id="sec-4-1-2">
          <title>2.2 Related Works</title>
          <p>
            Conversational Faceted Search. Only a few works involve speech
interfaces on top of the faceted search paradigm: [
            <xref ref-type="bibr" rid="ref3">3</xref>
            ] exploits a speech user
interface over facets that index audio metadata associated with audio content (that
system is used for the Spoken Web, an alternative to the WWW based on audio
content, and the associated MediaEval Spoken Web Search Task), while a faceted
browser over Linked Data is described in [
            <xref ref-type="bibr" rid="ref7">7</xref>
            ], where commands in natural
language are translated to SPARQL queries. To the best of our knowledge, though,
the only work that combines spoken dialogue systems with faceted search is the
one presented in [
            <xref ref-type="bibr" rid="ref13">13</xref>
            ], where the described LD-SDS system is limited to spoken
dialogues over structured datasets (expressed in RDF). In this work we extend
conversational faceted search to also exploit available unstructured data (e.g.
user reviews). Note that tackling the same problem using only a single
large-scale source of unstructured data, e.g. Wikipedia (as described in [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ]), is much
easier, since in that case we do not have the source selection problem (selection
of user comments in our case), and the source contains many long texts;
therefore it is not difficult to achieve a good recall level.
          </p>
          <p>
            Similar Tasks. Two similar tasks, as regards the text size, from the area of
Question Answering are: (1) Machine Comprehension (MC) which aims at
identifying the answer boundaries from a given text passage and an input question
(e.g. [
            <xref ref-type="bibr" rid="ref1">1</xref>
            ] performs MC over Wikipedia), and (2) Answer Sentence Selection which
aims at identifying the right sentence from a list of candidate sentences, given
an input question (e.g. as in [
            <xref ref-type="bibr" rid="ref21">21</xref>
            ]).
          </p>
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>3 The Proposed Approach</title>
        <p>In §3.1 we describe an extension of the PFS interaction that also exploits
associated unstructured data, and in §3.2 we focus on the problem of finding the
relevant comments.</p>
        <sec id="sec-4-2-1">
          <title>3.1 The Interaction</title>
          <p>
            The user interacts with the system using actions corresponding to PFS actions,
i.e. actions that correspond either to hard constraints (i.e. filters) or soft
constraints (i.e. preferences). We shall use the term ofocus to refer to the restricted
set of objects (those remaining after applying all filters), and pfocus to refer to the first
bucket of the focus, which contains the most preferred objects. If the cardinality
of either of the above sets is below a configurable threshold (say 10) and
the user's question cannot be answered by the structured dataset, the system
resorts to the user comments. Note that if at some point in the
interaction the user's focus is big (i.e. min(|ofocus|, |pfocus|) exceeds the threshold) and the user asks
a question that cannot be answered by the structured dataset, then the system
suggests that the user "first refine the focus", in the sense that it is not useful
to ask questions of the form "quiet hotel in Rome" or "hotels with fast wifi
in London". In other words, we could say that the system enters this mode in
the so-called "End Game" phase of faceted search [
            <xref ref-type="bibr" rid="ref15">15</xref>
            ]. This choice has several
benefits:
(a) Applicability: It can be applied without requiring the comments to be indexed
a priori, and this enables the application of this model over RSS feeds and blog
comment hosting services (e.g. Disqus).
(b) Efficiency: Since the analysis will be done only for the comments of the
hotels in the focus, it is feasible to perform this analysis in real time.
(c) Less Noise, Better Quality: For the same reason as in (b), the quality of
the retrieved comments is expected to be higher in comparison to the quality of
retrieval over the entire set of comments (of all hotels).
          </p>
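          <p>The mode switch described above can be sketched as a small decision function. This is only an illustrative sketch, not the system's implementation: the function name, the returned labels and the treatment of a focus whose size equals the threshold are assumptions, while the default threshold of 10 is the example value given in the text.</p>
          <preformat><![CDATA[
```python
def next_action(ofocus_size, pfocus_size, answerable_by_structured_data, threshold=10):
    """Decide how to handle a user question (hypothetical helper).

    ofocus_size / pfocus_size: cardinalities of the filtered set of objects
    and of its first (most preferred) bucket.
    """
    if answerable_by_structured_data:
        # Structured data can answer the question, so no review analysis is needed.
        return "answer_from_structured_data"
    if min(ofocus_size, pfocus_size) > threshold:
        # Focus is still big: suggest that the user first refines it.
        return "ask_user_to_refine_focus"
    # Small focus: resort to the user comments of the objects in the focus.
    # Assumption: a size equal to the threshold is treated as small.
    return "search_user_comments"


print(next_action(382, 120, False))  # big focus
print(next_action(40, 3, False))     # small preferred bucket
```
]]></preformat>
          <p>Keeping the decision in one place makes the threshold easy to tune per dataset.</p>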
        </sec>
        <sec id="sec-4-2-2">
          <title>3.2 Finding the Relevant Comments</title>
          <p>We shall use a scoring function for estimating the relevance between an input
question $q$ and each user review $r_i$, where $i \ge 1$. Below we introduce four
scoring methods: (I) a Baseline, (II) a WordNet-based, (III) a Word2vec-based,
and (IV) a combination of (II) and (III).</p>
          <p>
            WordNet [
            <xref ref-type="bibr" rid="ref11">11</xref>
            ] is a lexical database for the English language comprising 166,000
$(f, s)$ pairs, where $f$ is a word-form and $s$ is the set of words that have the same
sense; it also includes relations between words and senses (like Synonymy,
Antonymy, Hypernymy, etc.). Word2Vec [
            <xref ref-type="bibr" rid="ref10">10</xref>
            ] is a method for transforming
individual words into vectors of low dimensionality, e.g. 300 (low in comparison to a
$|words|$-dimensional representation), so that their distances reveal their semantic
association (these representations are derived by training a two-layer neural
network). The motivation for the selection of the above methods is their ability
to capture semantically relevant reviews beyond the trivial task of exact string
matching, and their rich and domain-agnostic vocabulary.
          </p>
          <p>
            The process for identifying the more relevant comments, in any of the four
methods I-IV, consists of the following steps:
1) For each review $r_i$ we split its text into individual sentences and get a set
of sentences $r_{ij}$, where $1 \le j \le s$ and $s$ is the number of sentences in the
review. In this way, we can score each review based on its highest-scoring sentence.
2) Apply tokenization, removal of stop-words and punctuation, as well as
lemmatization (using Stanford CoreNLP [
            <xref ref-type="bibr" rid="ref8">8</xref>
            ]) both to the input question $q$ and to each
associated sentence $r_{ij}$ of $r_i$. We denote the results by $q\_words$ and $r_{ij}\_words$
respectively.
3) Construct the method-related representation of $q$ and each $r_{ij}$ (described
below).
4) Score and rank each review based on the defined relevance formula.
          </p>
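          <p>Steps 1 and 2 could be sketched as follows. This is a simplified illustration under stated assumptions: a naive regular-expression sentence splitter and tokenizer and a tiny hand-picked stop-word list stand in for Stanford CoreNLP, lemmatization is omitted, and all function names are hypothetical.</p>
          <preformat><![CDATA[
```python
import re

# Tiny illustrative stop-word list (an assumption; not the list used in the paper).
STOP_WORDS = {"a", "an", "the", "is", "was", "this", "that", "has",
              "about", "and", "but", "it", "of", "to", "in"}


def split_sentences(review):
    """Step 1 (naive): split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    return [p for p in parts if p]


def to_words(text):
    """Step 2 (simplified): lower-case, drop punctuation and stop-words.

    Returns a set of tokens; lemmatization is deliberately left out.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOP_WORDS}


review = "Great location. The room was noisy!"
print([to_words(s) for s in split_sentences(review)])
```
]]></preformat>
          <p>Each review thus becomes a list of token sets, one per sentence, which is the input expected by the scoring step.</p>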
          <p>
            Below we describe the representation and the scoring formula for each method.
I) Baseline: Here we just compute the maximum Jaccard Similarity between
the $q\_words$ set and the corresponding $r_{ij}\_words$ sets:
$$S(q, r_i) = \max_{r_{ij} \in r_i} \mathrm{JaccardSim}(q\_words, r_{ij}\_words)$$
II) WordNet: In this method we construct WordNet-based representations for
the $q$ and $r_{ij}$ sets. Specifically, for each word in $q\_words$ and $r_{ij}\_words$ we take
the union of the synonyms, antonyms and hypernyms, denoted by $wordNet(q)$
and $wordNet(r_{ij})$ respectively, as extracted from WordNet. The final score
is defined again using the maximum Jaccard Similarity:
$$S(q, r_i) = \max_{r_{ij} \in r_i} \mathrm{WNS}(q, r_{ij}),$$
where $\mathrm{WNS}(q, r_{ij}) = \mathrm{JaccardSim}(wordNet(q), wordNet(r_{ij}))$.
III) Word2vec: This method exploits the word2vec embeddings available in
the GoogleNews 300-dimensional pre-trained model (https://code.google.com/archive/p/word2vec/). Specifically, we get the
word2vec vector representations of all words in $q\_words$ and $r_{ij}\_words$, denoted
by $word2vec(q)$ and $word2vec(r_{ij})$ respectively. Then we apply the Word Mover's
Distance (WMD) [
            <xref ref-type="bibr" rid="ref5">5</xref>
            ], which calculates the minimum distance (in the vector space)
between the embedded words of the two sets. The score is defined as:
$$S(q, r_i) = \max_{r_{ij} \in r_i} \mathrm{WMS}(q, r_{ij}),$$
where $\mathrm{WMS}(q, r_{ij}) = 1 - \mathrm{WMD}_n(word2vec(q), word2vec(r_{ij}))$ and $\mathrm{WMD}_n$
denotes the distance normalized by dividing by the maximum WMD
over all comments.
          </p>
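          <p>As a rough illustration of the scoring formulas above, the following sketch implements the baseline (maximum Jaccard similarity over the sentences of a review) and the conversion of raw WMD values into similarities by dividing by the maximum distance. The function names are hypothetical, and returning 1.0 when all distances are zero is an assumption. Computing the WMD itself requires the embeddings and a transport solver (for example, gensim exposes a wmdistance method on its keyed vectors) and is not shown.</p>
          <preformat><![CDATA[
```python
def jaccard_sim(a, b):
    """Jaccard similarity of two token collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def baseline_score(q_words, sentences_words):
    """Method I: best Jaccard similarity between the question and any sentence."""
    return max((jaccard_sim(q_words, s) for s in sentences_words), default=0.0)


def wms_from_wmd(distances):
    """Convert raw WMD values into WMS = 1 - WMD / max(WMD).

    `distances` holds one precomputed WMD per sentence, over all comments.
    Assumption: if every distance is zero, every similarity is 1.0.
    """
    max_d = max(distances, default=0.0)
    if max_d == 0.0:
        return [1.0 for _ in distances]
    return [1.0 - d / max_d for d in distances]


q = {"noise", "problem"}
review = [{"great", "location"}, {"noise", "night"}]
print(baseline_score(q, review))
print(wms_from_wmd([0.5, 1.0, 2.0]))
```
]]></preformat>
          <p>The WordNet-based score follows the same pattern as the baseline, with each token set first expanded by its synonyms, antonyms and hypernyms.</p>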
          <p>IV) WordNet and Word2vec: Here we combine the two previous methods
through a weighted sum, arriving at the following definition of the score:
$$S(q, r_i) = w_{wN} \cdot \max_{r_{ij} \in r_i} \mathrm{WNS}(q, r_{ij}) + w_{w2v} \cdot \max_{r_{ij} \in r_i} \mathrm{WMS}(q, r_{ij})$$
where $w_{wN}, w_{w2v} \in [0, 1]$ and $w_{wN} + w_{w2v} = 1$.</p>
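          <p>A minimal sketch of the weighted combination follows, assuming the per-review maxima of the WordNet-based and Word2vec-based scores have already been computed. The function and parameter names are hypothetical; the default weight of 0.7 mirrors the best-performing WordNet weight reported in the evaluation.</p>
          <preformat><![CDATA[
```python
def combined_score(wns_max, wms_max, w_wn=0.7):
    """Method IV: weighted sum of the best WordNet and Word2vec sentence scores.

    wns_max: maximum WordNet-based similarity over the sentences of a review.
    wms_max: maximum Word2vec-based (WMD-derived) similarity over the sentences.
    w_wn:    weight of the WordNet part; the Word2vec weight is 1 - w_wn.
    """
    if not 0.0 <= w_wn <= 1.0:
        raise ValueError("w_wn must lie in [0, 1]")
    w_w2v = 1.0 - w_wn
    return w_wn * wns_max + w_w2v * wms_max


print(combined_score(0.2, 0.8))        # mixed score
print(combined_score(0.2, 0.8, 1.0))   # reduces to method II
print(combined_score(0.2, 0.8, 0.0))   # reduces to method III
```
]]></preformat>
          <p>Setting the weight to 1.0 or 0.0 recovers methods II and III respectively, so a single function can serve a sweep over weight combinations.</p>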
        </sec>
      </sec>
      <sec id="sec-4-3">
        <title>4 Evaluation</title>
        <sec id="sec-4-3-1">
          <title>4.1 Evaluation over the Collection FRUCE</title>
          <p>We constructed a small evaluation collection in order to compare the presented
methods. The collection consists of 40 hand-crafted user reviews/comments
related to hotels ($c_1, \ldots, c_{40}$) and 2 manually crafted queries ($q_1$ and $q_2$) related
to the topic of noise. The complete list of comments is web accessible (see footnote 5) and the
queries are the following: $q_1$ = "Has anyone reported a problem about noise?",
$q_2$ = "Is this hotel quiet?".</p>
          <p>For the needs of the evaluation we manually judged the relevance of the
collection's reviews to each query. Specifically, each review $c_i$ is labeled with 1 if
it is relevant, and with 0 otherwise. The relevant/irrelevant ratio in the collection
is 1/3.</p>
          <p>
            Quality. We measured the mean R-Precision and mean AveP over the
two queries $q_1$ and $q_2$ for all methods. Specifically, for method IV we
computed various weight combinations and chose the model that achieved the
highest mean AveP. Note that methods II and III correspond to the pairs
($w_{wN} = 1.0$, $w_{w2v} = 0.0$) and ($w_{wN} = 0.0$, $w_{w2v} = 1.0$) respectively. In our
case the maximizing weights were found to be $w_{wN} = 0.7$ and $w_{w2v} = 0.3$, with
mean AveP = 0.569 and mean R-Precision = 0.649. The corresponding scores
for method II were mean AveP = 0.398 and mean R-Precision = 0.449, while
method III achieved mean AveP = 0.366 and mean R-Precision = 0.4 (the
precision of Word2vec-based methods in analogous challenges [
            <xref ref-type="bibr" rid="ref9">9</xref>
            ] is around 55%,
i.e. similar to what we measured in our setting). In summary, IV outperforms both
II and III, while II slightly outperforms III. As expected, all of the above models
outperformed our baseline (mean AveP = 0.05 and mean R-Precision = 0.05),
as shown in Table 1. We have to stress, though, that the results of methods II
and IV could be further improved by combining other thesauri with WordNet
or by using an updated version of WordNet, since WordNet currently fails to provide
the synonyms, hypernyms and antonyms of many words. Further, since we
currently consider all possible senses of a word in the WordNet-based approach,
we might be introducing wrong terms in the $wordNet(q)$ and $wordNet(r_{ij})$ sets.
This problem can possibly be avoided with proper sense identification methods.
(Footnote 5: the collection is available at http://www.ics.forth.gr/isl/sar/resources/dataset/fruce)
          </p>
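          <p>For reference, the two quality measures can be computed from a ranked list of binary relevance labels as in the sketch below. It uses the standard definitions of R-Precision and Average Precision and assumes that the ranking covers the whole collection, so that every relevant review appears in it; the function names and the example labels are illustrative and are not taken from the FRUCE collection.</p>
          <preformat><![CDATA[
```python
def r_precision(ranked_labels):
    """Precision within the top R results, where R is the number of relevant items."""
    r = sum(ranked_labels)
    if r == 0:
        return 0.0
    return sum(ranked_labels[:r]) / r


def average_precision(ranked_labels):
    """Mean of the precision values measured at the rank of each relevant item."""
    hits = 0
    total = 0.0
    for rank, rel in enumerate(ranked_labels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean(values):
    """Arithmetic mean, used to average a measure over the queries."""
    return sum(values) / len(values)


# Hypothetical rankings for two queries (1 = relevant, 0 = irrelevant).
rankings = [[1, 0, 1, 0, 0], [1, 1, 0, 0, 0]]
print(mean([r_precision(r) for r in rankings]))
print(mean([average_precision(r) for r in rankings]))
```
]]></preformat>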
          <p>In addition, we plot a 2D diagram for each of the three models II, III, IV
(baseline excluded), where the y-axis represents the computed score for $(r_i, q_i)$
and the x-axis indicates its true binary relevance. The plots are shown in Figure
2. We can observe that the points are not separable by a threshold (a line parallel to the x-axis) in any of
the figures. However, approach IV
clearly improves the separation, assigning higher scores to the truly relevant
reviews, like III, and lower scores to the non-relevant ones, like II.</p>
          <p>Efficiency. All experiments were performed on a machine with 16GB of RAM.
Regarding speed, it is worth measuring: a) the time required for loading
the appropriate resources (Dataset, WordNet, Word2Vec) (i.e. Init Time), and
b) the time required for computing the similarity score of one query-review pair
(i.e. Execution Time). Note that the Init Time cost has to be paid only once,
while Execution Time affects the user interaction.</p>
          <p>Regarding Init Time, the most time-consuming resource is Word2Vec due to
its enormous size (491,061 ms), followed by the loading of the FRUCE dataset
(39,149 ms). The WordNet dictionary loads almost instantly (63 ms). The
Execution Time, on the other hand, is low for all methods (only 13 ms on
average). We only need about 1.5 seconds for analyzing and scoring 100 reviews
with 15 words on average. Table 2 shows the Execution Time for computing the
scores of the 40 reviews for all methods (the minimum values are in bold).</p>
        </sec>
        <sec id="sec-4-3-2">
          <title>4.2 Experiments over a Real Dataset</title>
          <p>We also evaluated the proposed methods over a real dataset that we scraped
from a travel website. This dataset contains information about 382
different hotels located in 4 different cities of Japan (Kyoto, Tokyo, Osaka, Kobe).
The extracted data are logically structured in facets so that they can be
directly plugged into the system, containing the following types of information:
(a) boolean values, used for describing the facilities of a hotel (e.g. free
wifi, free parking, etc.), (b) numerical values (integers or floats) for describing
quantitative values (e.g. price, review rating, distance from various points of
interest, etc.), (c) geographic values for describing the location, and (d) textual
values. The last category also includes the comments that review hotels, which
are categorized into comments with a positive and a negative aspect. We would
like to remark that almost all (more than 23 thousand) review comments that
we have extracted contain both a positive and a negative part. Table 3 shows
the total number of hotels and the average number of comments per hotel for
the 4 cities.</p>
          <p>Efficiency. The time required to load the user reviews is 186,769 ms. For
evaluating the execution time, we measured the time required for analysing and
scoring (according to $q_1$ and $q_2$) 2,000 randomly selected reviews, returning
the 10 most highly ranked ones. The minimum, maximum and average times
were 21 ms, 6,427 ms and 56 ms respectively (on average each review has 48
words), and the total time was 113,870 ms. It follows that the proposed method
is acceptable in terms of efficiency. Specifically, if we assume that we have 3
hotels in the current user focus and the average number of reviews per hotel is
61 (as shown in Table 3), we can score all reviews in around 10 seconds
(3 x 61 = 183 reviews at 56 ms each).</p>
          <p>Quality. Since the reviews are not annotated with binary relevance scores for
the two queries used, it is difficult to evaluate the quality of the scoring methods
on this collection. Annotating the whole collection is a laborious and time-consuming
task. However, we have started to manually annotate a part of the full
reviews for the two queries that we used in the FRUCE collection. For the
time being, we have marked 71 distinct comments, and identified 66 relevant and
76 irrelevant $(c_i, q_i)$ pairs. The average top-2 precision of method IV for the
2 queries, considering only the subcollection of 71 human-judged comments, is
0.5, while the average R-precision (R = 33) is again 0.5. We have noticed that
we would get higher results if the comments were clean, in the sense that the
collection has several spam comments that negatively affect the results. Currently,
we are in the process of cleaning the collection.</p>
        </sec>
      </sec>
      <sec id="sec-4-4">
        <title>5 Conclusion</title>
        <p>
          In the context of Faceted Search, quite often the structured data are not enough
for answering a user's query. In such cases the system could resort to related
textual comments (posed in natural language) for identifying those that could
be exploited for helping the user. This requires finding the most relevant
comments that (a) are associated with the most preferred objects, and (b) are
related to the user's question. Moreover, spoken dialogue interaction poses increased
requirements on quality, in order to avoid wasting the user's time by reading
irrelevant comments. To this end, we introduced a dictionary-based method that uses
WordNet, a word embedding-based method, specifically Word2vec, and one that
combines both. The analysis and the experimental results showed that
without dictionaries (either human-made or statistical ones), the
effectiveness of retrieving the relevant comments is very low even in a small
dataset. Specifically, the baseline method achieved mean AveP = 0.05 and
mean R-Precision = 0.05. However, the method that uses both WordNet
and Word2vec outperforms every other method, with mean AveP = 0.569 and
mean R-Precision = 0.649, taking on average only 13 ms to score a review.
We believe that the proposed method can be applied in several domains and for
various tasks, from booking services to product selection. As part of our future
work we plan to: (a) continue the quality evaluation over the real dataset, (b)
extend the system described in [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] with this functionality, (c) investigate the
applicability of comparative opinion mining and query-oriented sentiment
analysis, and (d) investigate how we could exploit external sources in cases where even
the user comments/reviews are not sufficient for answering a user's question.
        </p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>D.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fisch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Weston</surname>
          </string-name>
          ,
          and
          <string-name>
            <given-names>A.</given-names>
            <surname>Bordes</surname>
          </string-name>
          .
          <article-title>Reading Wikipedia to answer open-domain questions</article-title>
          .
          <source>CoRR, abs/1704.00051</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>K.</given-names>
            <surname>Cortis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Freitas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Daudert</surname>
          </string-name>
          , M. Hurlimann, M. Zarrouk,
          <string-name>
            <given-names>S.</given-names>
            <surname>Handschuh</surname>
          </string-name>
          , and
          <string-name>
            <given-names>B.</given-names>
            <surname>Davis</surname>
          </string-name>
          . Semeval
          <article-title>-2017 task 5: Fine-grained sentiment analysis on nancial microblogs and news</article-title>
          .
          <source>In Proceedings of the 11th International Workshop on Semantic Evaluation, SemEval@ACL</source>
          <year>2017</year>
          , Vancouver, Canada,
          <source>August 3-4</source>
          ,
          <year>2017</year>
          , pages
          <fpage>519</fpage>
          {
          <fpage>535</fpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>M.</given-names>
            <surname>Diao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Mukherjea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Rajput</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Srivastava</surname>
          </string-name>
          .
          <article-title>Faceted search and browsing of audio content on spoken web</article-title>
          .
          <source>In Proceedings of the 19th ACM international conference on Information and knowledge management</source>
          , pages
          <fpage>1029</fpage>
          –
          <lpage>1038</lpage>
          . ACM,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>S.</given-names>
            <surname>Ferre</surname>
          </string-name>
          .
          <article-title>Sparklis: an expressive query builder for sparql endpoints with guidance in natural language</article-title>
          .
          <source>Semantic Web</source>
          ,
          <volume>8</volume>
          (
          <issue>3</issue>
          ):
          <fpage>405</fpage>
          –
          <lpage>418</lpage>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>M.</given-names>
            <surname>Kusner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Kolkin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          .
          <article-title>From word embeddings to document distances</article-title>
          .
          <source>In International Conference on Machine Learning</source>
          , pages
          <fpage>957</fpage>
          –
          <lpage>966</lpage>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>P.</given-names>
            <surname>Lionakis</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          .
          <article-title>PFSgeo: Preference-enriched faceted search for geographical data</article-title>
          .
          <source>In OTM Confederated International Conferences "On the Move to Meaningful Internet Systems"</source>
          , pages
          <fpage>125</fpage>
          –
          <lpage>143</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>B. L.</given-names>
            <surname>Lopez-Ochoa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Sanchez-Cervantes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Alor-Hernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. A.</given-names>
            <surname>Abud-Figueroa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. A.</given-names>
            <surname>Olivares-Zepahua</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Rodríguez-Mazahua</surname>
          </string-name>
          .
          <article-title>An architecture based in voice command recognition for faceted search in linked open datasets</article-title>
          .
          <source>In International Conference on Software Process Improvement</source>
          , pages
          <fpage>174</fpage>
          –
          <lpage>185</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Surdeanu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Bauer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. R.</given-names>
            <surname>Finkel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bethard</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>McClosky</surname>
          </string-name>
          .
          <article-title>The Stanford CoreNLP natural language processing toolkit</article-title>
          .
          <source>In ACL (System Demonstrations)</source>
          , pages
          <fpage>55</fpage>
          –
          <lpage>60</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Corrado</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Efficient estimation of word representations in vector space</article-title>
          .
          <source>arXiv preprint arXiv:1301.3781</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <given-names>T.</given-names>
            <surname>Mikolov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. S.</given-names>
            <surname>Corrado</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Dean</surname>
          </string-name>
          .
          <article-title>Distributed representations of words and phrases and their compositionality</article-title>
          . In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>26</volume>
          , pages
          <fpage>3111</fpage>
          –
          <lpage>3119</lpage>
          . Curran Associates, Inc.,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Miller</surname>
          </string-name>
          .
          <article-title>WordNet: A lexical database for English</article-title>
          .
          <source>Commun. ACM</source>
          ,
          <volume>38</volume>
          (
          <issue>11</issue>
          ):
          <fpage>39</fpage>
          –
          <lpage>41</lpage>
          , Nov.
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          12.
          <string-name>
            <given-names>P.</given-names>
            <surname>Papadakos</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          .
          <article-title>Comparing the effectiveness of intentional preferences versus preferences over specific choices: a user study</article-title>
          .
          <source>International Journal of Information and Decision Sciences</source>
          ,
          <volume>8</volume>
          (
          <issue>4</issue>
          ):
          <fpage>378</fpage>
          –
          <lpage>403</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          13.
          <string-name>
            <given-names>A.</given-names>
            <surname>Papangelis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Papadakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kotti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Stylianou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.</given-names>
            <surname>Plexousakis</surname>
          </string-name>
          .
          <article-title>LD-SDS: Towards an expressive spoken dialogue system based on linked-data</article-title>
          .
          <source>In Search Oriented Conversational AI, SCAI 17 Workshop (co-located with ICTIR 17)</source>
          ,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          14.
          <string-name>
            <given-names>G.</given-names>
            <surname>Petrucci</surname>
          </string-name>
          and
          <string-name>
            <given-names>M.</given-names>
            <surname>Dragoni</surname>
          </string-name>
          .
          <article-title>An information retrieval-based system for multidomain sentiment analysis</article-title>
          . In F. Gandon, E. Cabrio, M. Stankovic, and A. Zimmermann, editors,
          <source>Semantic Web Evaluation Challenges</source>
          , pages
          <fpage>234</fpage>
          –
          <lpage>243</lpage>
          , Cham
          ,
          <year>2015</year>
          . Springer International Publishing.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          15.
          <string-name>
            <given-names>G. M.</given-names>
            <surname>Sacco</surname>
          </string-name>
          and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          .
          <article-title>Dynamic taxonomies and faceted search: theory, practice, and experience</article-title>
          , volume
          <volume>25</volume>
          . Springer Science &amp; Business Media
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          16.
          <string-name>
            <given-names>E.</given-names>
            <surname>Sherkhonov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Grau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Kharlamov</surname>
          </string-name>
          , and
          <string-name>
            <given-names>E. V.</given-names>
            <surname>Kostylev</surname>
          </string-name>
          .
          <article-title>Semantic faceted search with aggregation and recursion</article-title>
          .
          <source>In International Semantic Web Conference</source>
          , pages
          <fpage>594</fpage>
          –
          <lpage>610</lpage>
          . Springer,
          <year>2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          17.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Bailly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Papadakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Minadakis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Nikitakis</surname>
          </string-name>
          .
          <article-title>Using preference-enriched faceted search for species identification</article-title>
          .
          <source>International Journal of Metadata, Semantics and Ontologies</source>
          ,
          <volume>11</volume>
          (
          <issue>3</issue>
          ):
          <fpage>165</fpage>
          –
          <lpage>179</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          18.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          and
          <string-name>
            <given-names>E.</given-names>
            <surname>Dimitrakis</surname>
          </string-name>
          .
          <article-title>Preference-enriched faceted search for voting aid applications</article-title>
          .
          <source>IEEE Transactions on Emerging Topics in Computing</source>
          ,
          <volume>PP</volume>
          (
          <issue>99</issue>
          ):
          <fpage>1</fpage>
          –
          <lpage>1</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          19.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Manolis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Papadakos</surname>
          </string-name>
          .
          <article-title>Faceted exploration of RDF/S datasets: a survey</article-title>
          .
          <source>Journal of Intelligent Information Systems</source>
          , pages
          <fpage>1</fpage>
          –
          <lpage>36</lpage>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          20.
          <string-name>
            <given-names>Y.</given-names>
            <surname>Tzitzikas</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Papadakos</surname>
          </string-name>
          .
          <article-title>Interactive exploration of multidimensional and hierarchical information spaces with real-time preference elicitation</article-title>
          .
          <source>Fundamenta Informaticae</source>
          ,
          <volume>20</volume>
          :
          <fpage>1</fpage>
          –
          <lpage>42</lpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
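          21.
          <string-name>
            <given-names>L.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. M.</given-names>
            <surname>Hermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Blunsom</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Pulman</surname>
          </string-name>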
          .
          <article-title>Deep learning for answer sentence selection</article-title>
          .
          <source>CoRR, abs/1412.1632</source>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>