<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Supporting Real Estate Search Through Automatic Information Suggestion</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Goro Otsubo</string-name>
          <aff>LIFULL Co., Ltd., Tokyo, Japan</aff>
          <email>ohtsubogoro@lifull.com</email>
        </contrib>
      </contrib-group>
      <abstract>
        <p>Searching for real estate property is an uncommon task for most users. As a result, users are not familiar with the detailed search conditions that are useful for the search. In this paper, we propose using voice recognition to support real estate property search. First, the user sets vague search conditions with a GUI. The system then listens to the conversation between users. From the conversation, the system extracts keywords and suggests detailed search conditions, together with the real estate search results for those conditions. We discuss the system design, the algorithm used to link spoken words to detailed search conditions, and preliminary test results.</p>
      </abstract>
      <kwd-group>
        <title>Author Keywords</title>
        <kwd>Voice Recognition</kwd>
        <kwd>Multimodal</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>©2018. Copyright for the individual papers remains with the authors.
Copying permitted for private and academic purposes.
WII’18, March 11, 2018, Tokyo, Japan.</p>
      <p>A typical user can set broad conditions for a real estate search,
such as the town to search in, the layout of the property, and the price.
It is also possible to set detailed search conditions such as
"Pets allowed". However, in most cases the user does not know what
types of detailed search conditions are available or how to set them.
As a result, these conditions are not used effectively in the real
estate search process.</p>
      <p>To solve this issue, we assumed that a support system that
recommends appropriate detailed search conditions is needed to
assist real estate search. To build such a system, we chose to utilize
the conversation between users. Being asked to state detailed
requirements for a new property can be intimidating for the user.
Moreover, the user may not necessarily know the proper words for
searching real estate information. Therefore, we assumed that
a system that can extract detailed search conditions from
casual user conversation would effectively support
real estate property search.</p>
      <p>Interfaces for searching information by voice have been
studied for a long time, and commercial services that
use voice interfaces have become popular in recent years with the
spread of smartphones and home devices. Server-side speech
recognition has recently become widespread, and recognition
accuracy has improved dramatically, even for unspecified speakers.</p>
      <p>
        However, in most cases speech recognition is used for simple
web search and search in the app store [<xref ref-type="bibr" rid="ref10">10</xref>].
This is because, even though the precision of speech-to-text conversion
has improved, the next step, "understanding meaning",
remains a hard problem. According to the research
by Luger et al. [<xref ref-type="bibr" rid="ref8">8</xref>],
although users’ expectations of speech recognition are high,
current speech recognition interfaces cause users considerable stress.
      </p>
      <p>
        Given the technical difficulty described above, we chose
to use voice recognition in a supporting role. The user is
not assumed to talk directly to the search interface. Instead,
the system listens to the conversation between users, recognizes
keywords in the conversation, and tries to find appropriate
detailed search conditions for the real estate search. Matching
the words spoken by users to detailed search conditions is the
first challenge in this research. The frustration users feel with
a voice interface is that the system cannot understand the meaning
of the words they have spoken. We therefore need to match appropriate
detailed search conditions to the words in users’ conversation.
There have been various attempts to incorporate a speech interface
in a supporting role for exploratory search. Andolina
et al. proposed systems [<xref ref-type="bibr" rid="ref4">4</xref>][<xref ref-type="bibr" rid="ref5">5</xref>]
that extract keywords from users’ conversations to stimulate human
creative thinking. We assumed that a similar approach could be
effective in real estate search.
      </p>
      <p>
        Another reason why existing systems using speech recognition
frustrate users is that the system is not transparent
[<xref ref-type="bibr" rid="ref8">8</xref>]. In other words, when a
misunderstood answer comes back, the user has no way of knowing why the
system returned that response. For this reason, we need to make the
matching process between spoken (recognized) words and recommended
detailed conditions as clear as possible. We aim to achieve this by
showing every keyword examined while searching for detailed search
conditions, together with the relations between them.
In addition, an approach that avoids keyword typing by letting users
select and manipulate suggested keywords by touch has been proposed
[<xref ref-type="bibr" rid="ref7">7</xref>].
We assumed that a similar interaction would increase the
effectiveness of the system. Even if the system cannot
recognize the user’s intention correctly, the user may be able
to find and select interesting keywords displayed on the screen
and explore the related detailed search conditions.
      </p>
    </sec>
    <sec id="sec-2">
      <title>SYSTEM DESIGN</title>
      <p>This section describes the design of the system.
A screenshot of the developed system is shown in Figure 1.
The user sets the basic search conditions, namely the area to search,
the layout, and the price of the property, via the GUI. After
setting these conditions, the users talk freely with each other about
what they want in a new home. The system continuously monitors the
conversation and shows the recognized text in the lower part of the
screen. It automatically and continuously extracts keywords from the
conversation and tries to find related detailed search conditions such as
"Within 800 meters of a convenience store" or "Pets allowed".
The system also displays the real estate search results for each
detailed condition, with a link drawn between each detailed search
condition and the properties found. There were two
major challenges in developing the proposed system. First,
we needed an algorithm that finds detailed search
conditions from the users’ conversation data. Second, we needed an
interaction interface that effectively supports the user’s search for
real estate information even when the system does not recognize
the user’s intention correctly. We discuss these challenges
next.</p>
    </sec>
    <sec id="sec-matching">
      <title>MATCHING ALGORITHM BETWEEN DETAILED SEARCH CONDITION AND SPOKEN WORDS</title>
      <p>
One of the major frustrations with existing speech interfaces
is that the system recognizes only pre-programmed keywords,
while the user has no clue about which words to speak.
As a result, the system quite often fails to recognize the word
the user has spoken. Even when the user says a word with a
meaning similar to a keyword the system recognizes, the system
cannot understand the similarity between the two words.</p>
      <p>
        A similar difficulty is recognized in the field of question and
answer retrieval (hereafter Q&amp;A retrieval) [<xref ref-type="bibr" rid="ref6">6</xref>].
A major challenge for Q&amp;A retrieval is the word mismatch between the
user’s question and the question-answer pairs in the archive
[<xref ref-type="bibr" rid="ref11">11</xref>]. Many different approaches
have been proposed to solve this word mismatch problem.
      </p>
      <p>
        In this research, we used Word2Vec [<xref ref-type="bibr" rid="ref9">9</xref>].
With Word2Vec, each word is represented as a vector, so the similarity
between words can be calculated. The basic idea is to match spoken
words and detailed search conditions not by simple word matching, but
by matching the related words obtained by expanding both the spoken
words and the detailed search conditions with Word2Vec. The algorithm
is as follows.
      </p>
      <list list-type="order">
        <list-item><p>For each detailed search condition, manually set two to five related keywords.</p></list-item>
        <list-item><p>For each manually set keyword, search for related words using Word2Vec and record them as "expanded related keywords".</p></list-item>
        <list-item><p>Convert the conversation audio to a text string using a speech-to-text conversion function.</p></list-item>
        <list-item><p>Extract nouns and verbs from the recognized text and record them as "spoken words".</p></list-item>
        <list-item><p>For each spoken word, search for related words using Word2Vec and record them as "expanded related keywords".</p></list-item>
        <list-item><p>Look for a word that appears both in the expanded keywords of a spoken word and in the expanded keywords of a manually set keyword. If such a word is found, link the spoken word to the detailed search condition.</p></list-item>
      </list>
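      <p>The matching steps listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the system: the table RELATED is a hypothetical stand-in for the related-word lookup of a trained Word2Vec model, its entries and the manually set keywords are invented, and the speech-to-text and noun and verb extraction steps are assumed to have already produced the list of spoken words.</p>
      <preformat>
```python
from typing import Dict, List, Set

# Hypothetical stand-in for a trained Word2Vec model: each word maps to
# the related words a similarity query would return (invented entries).
RELATED: Dict[str, List[str]] = {
    "noise": ["sound leak", "soundproofing"],
    "quiet": ["soundproofing", "calm"],
    "dog": ["pet", "cat"],
    "pet": ["dog", "cat"],
}

# Manually set keywords per detailed search condition. The condition
# names appear in the paper; the keywords are illustrative assumptions.
CONDITION_KEYWORDS: Dict[str, List[str]] = {
    "Top floor": ["noise"],
    "Pets allowed": ["pet"],
}


def expand(word: str) -> Set[str]:
    """Return the expanded related keywords of a word."""
    return set(RELATED.get(word, []))


def link_conditions(spoken_words: List[str]) -> Dict[str, List[str]]:
    """Link each spoken word to every condition whose expanded keywords
    share at least one word with the spoken word's expanded keywords."""
    links: Dict[str, List[str]] = {}
    for spoken in spoken_words:
        spoken_expanded = expand(spoken)
        for condition, keywords in CONDITION_KEYWORDS.items():
            condition_expanded: Set[str] = set()
            for keyword in keywords:
                condition_expanded |= expand(keyword)
            if not spoken_expanded.isdisjoint(condition_expanded):
                links.setdefault(spoken, []).append(condition)
    return links


print(link_conditions(["quiet", "dog", "garden"]))
# prints: {'quiet': ['Top floor'], 'dog': ['Pets allowed']}
```
      </preformat>
      <p>With this toy data, "quiet" is linked to "Top floor" because its expansion shares "soundproofing" with the expansion of the manually set keyword "noise", while "garden" has no related words and is linked to nothing.</p>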
    </sec>
    <sec id="sec-3">
      <p>An example of matching is shown in Figure 2.</p>
      <p>
        Next, we discuss the training data set for Word2Vec.
The related words extracted by Word2Vec depend on the nature of the
training data set. As an example, Table 1 shows the related words
retrieved for the word "Noise" using Wikipedia [<xref ref-type="bibr" rid="ref3">3</xref>]
as training data (the original words are in Japanese).
In general, the words are ones used in connection with "Noise",
but they differ from what users associate with noise when
searching for a real estate property. Users who are concerned
about noise from the floor above may choose the "Top floor"
detailed search condition. Users who care about noise from the road
may choose "Higher than the second floor". In either case, those
detailed search conditions have no relation to the words extracted
using Wikipedia as training data.
      </p>
      <p>
        We also gathered text data from the web site All
About Japan [<xref ref-type="bibr" rid="ref1">1</xref>], which has a large
amount of text related to real estate. We found that the related words
extracted from this data better match users’ intuition when searching
for real estate. The related words retrieved for "Noise"
using All About Japan as training data are also shown in Table 1.
"Sound leak", for example, is something a user may be interested in
when searching for a property.
      </p>
      <p>However, the number of words in the All About Japan data set is not
necessarily sufficient. As a result, in many cases we could not extract
related words because the spoken word did not exist in the All About
Japan data set.
Considering the characteristics of each data set described above,
we decided to use both data sets for keyword extraction. We first
try to extract keywords using the All About Japan data set. If
no matching word is found, we then try the Wikipedia data set.</p>
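      <p>The two-stage lookup described above might look as follows. The sketch rests on assumptions: the two tables are hypothetical stand-ins for Word2Vec models trained on each corpus, their entries are invented, and "no matching word is found" is read here as "the domain-specific model returns no related words", which is only one possible reading of the text.</p>
      <preformat>
```python
from typing import Dict, List

# Hypothetical related-word tables standing in for two Word2Vec models,
# one per training corpus. All entries are invented for illustration.
DOMAIN_RELATED: Dict[str, List[str]] = {   # real estate corpus
    "noise": ["sound leak", "soundproofing"],
}
GENERAL_RELATED: Dict[str, List[str]] = {  # general encyclopedia corpus
    "noise": ["signal", "distortion"],
    "station": ["railway", "platform"],
}


def related_words(word: str) -> List[str]:
    """Prefer the domain-specific table; fall back to the general one."""
    domain_hits = DOMAIN_RELATED.get(word, [])
    if domain_hits:
        return domain_hits
    return GENERAL_RELATED.get(word, [])


print(related_words("noise"))    # prints: ['sound leak', 'soundproofing']
print(related_words("station"))  # prints: ['railway', 'platform']
```
      </preformat>
      <p>Trying the domain corpus first keeps real-estate-specific associations when they exist, and the general corpus only fills gaps in vocabulary coverage.</p>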
    </sec>
    <sec id="sec-interaction">
      <title>USER INTERACTION</title>
      <p>
With the algorithm described above, we expect to find
detailed search conditions better than with simple word matching.
However, errors remain, and we do not expect to find
the detailed search conditions with high precision.</p>
      <p>The other aspect to consider is the transparency of
the system, as described earlier. To ease user frustration,
we need to make the inference process clear to the user.
Considering these factors, we designed the interaction
interface shown in Figure 1. Not only the detailed search conditions
and spoken words, but also the manually set keywords and expanded
keywords are shown on the screen. Any word displayed on the screen
can be used as a search keyword by dragging and dropping it into the
text area in the lower part of the screen.</p>
      <p>As stated before, the precision of speech-to-text conversion and of
the search for detailed search conditions is not necessarily high. In
that case, the user may be frustrated if we display only "no results
found" or totally irrelevant results. In this research, we therefore
tried to display as many words as possible on the screen. Among those
words, it is probable that some will be interesting to the user.
If so, the user can start a new search through
interaction on the display rather than through voice recognition.
</p>
    </sec>
    <sec id="sec-evaluation">
      <title>EVALUATION</title>
      <p>
We have conducted two types of evaluation so far. First, we
evaluated how effective the search algorithm is for various user
inputs. Because we wanted to evaluate the algorithm itself,
we used text input rather than voice input to avoid
errors caused by speech recognition.</p>
      <p>
        We chose fifty sentences, such as "Good view and well-ventilated",
from a Q&amp;A site about real estate search [<xref ref-type="bibr" rid="ref2">2</xref>].
None of the sentences includes the exact wording of a detailed search
condition, so none of the conditions can be found
with a simple word match algorithm. For each sentence, we
selected the corresponding detailed search conditions. The system
displays up to four detailed search conditions. If at least one
displayed search condition was related to the input sentence, we
counted it as a success.
      </p>
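      <p>The success criterion can be stated compactly as a predicate. The function name, the argument names, and the example condition lists below are hypothetical; only the limit of four displayed conditions and the at-least-one rule come from the text.</p>
      <preformat>
```python
from typing import List, Set

MAX_DISPLAYED = 4  # the system displays up to four detailed search conditions


def is_success(suggested: List[str], relevant: Set[str]) -> bool:
    """True if at least one displayed condition is relevant to the sentence."""
    displayed = suggested[:MAX_DISPLAYED]
    return any(condition in relevant for condition in displayed)


# Invented example: the relevant condition is ranked fifth, so it is not
# displayed and the sentence does not count as a success.
ranked = ["A", "B", "C", "D", "Top floor"]
print(is_success(ranked, {"Top floor"}))      # prints: False
print(is_success(ranked[1:], {"Top floor"}))  # prints: True
```
      </preformat>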
      <p>As a result, 34 of the 50 sentences (68%) were evaluated as
successes. Of these 34 sentences, 9 include manually defined
extended keywords. Excluding them, 25 of the remaining 41 sentences
(61%) were searched successfully using the proposed algorithm.</p>
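      <p>The two percentages can be checked with a short calculation. The variable and function names are illustrative; the counts are the ones reported above, and the second rate removes the 9 sentences from both the successes and the total.</p>
      <preformat>
```python
def success_rate(successes: int, total: int) -> int:
    """Return the success rate as a whole-number percentage."""
    return round(100 * successes / total)


TOTAL_SENTENCES = 50
SUCCESSES = 34
WITH_MANUAL_KEYWORDS = 9  # successful sentences excluded from the second rate

overall = success_rate(SUCCESSES, TOTAL_SENTENCES)
adjusted = success_rate(SUCCESSES - WITH_MANUAL_KEYWORDS,
                        TOTAL_SENTENCES - WITH_MANUAL_KEYWORDS)
print(overall, adjusted)  # prints: 68 61
```
      </preformat>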
      <p>Second, we conducted a simple user evaluation with the current
system. We explained the system’s concept and functions, placed
the system beside us, and talked with the user about what type
of real estate property the user was interested in. Five users
participated in the test. We received consistent
responses from them. Every participant saw the value
of the system’s assistant role. In some cases, the user found
interesting results. Four out of five participants saw detailed
search conditions that they had never used in a real estate
search before. We also observed that users quite often tried
to use the voice interface as the main interface. Even though the
current system only extracts and searches with detailed search
conditions, users tried to search with spoken sentences such as
"Search property in Tokyo area". Although we designed
the system so that users set those search conditions with the
GUI, they often forgot that.</p>
      <p>From the results of the user test, we realized that the interaction
design should be improved so that users do not assume that the
system can recognize every request they might make. In
the current design, the GUI for setting search conditions is hidden
while the system listens to the conversation. In this state, the
user expects the system to understand anything they say.
To avoid this misunderstanding, we need to show the search
condition setting GUI up front. When the user speaks and does
not operate the GUI, we can show the current voice recognition
interface over the search condition setting GUI.</p>
    </sec>
    <sec id="sec-conclusion">
      <title>CONCLUSION AND FUTURE DIRECTION</title>
      <p>
We have developed a real estate search system that
recommends detailed search conditions from users’ conversation.
The evaluation showed mixed results. We confirmed
the potential of the proposed algorithm, but we also
recognized that the user interface design should be improved. Based on
the evaluation results, we will redesign the system interaction
and conduct further user tests to evaluate how effectively the
system can support the search for real estate information.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <year>2017</year>
          .
          <article-title>All About Japan: House, Real estate property</article-title>
          . https://allabout.co.jp/r_house/. (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <year>2017</year>
          .
          <article-title>OKWeb-Sumai(Housing)</article-title>
          . https://okwave.jp/c622.html. (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <year>2017</year>
          .
          <article-title>Wikipedia (Japanese)</article-title>
          . https://ja.wikipedia.org/. (
          <year>2017</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Salvatore</given-names>
            <surname>Andolina</surname>
          </string-name>
          , Khalil Klouche, Diogo Cabral, Tuukka Ruotsalo, and
          <string-name>
            <given-names>Giulio</given-names>
            <surname>Jacucci</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>InspirationWall: Supporting Idea Generation Through Automatic Information Exploration</article-title>
          .
          <source>In Proceedings of the 2015 ACM SIGCHI Conference on Creativity and Cognition</source>
          (C&amp;C '15). ACM, New York, NY, USA,
          <fpage>103</fpage>
          -
          <lpage>106</lpage>
          . DOI: http://dx.doi.org/10.1145/2757226.2757252
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <given-names>Salvatore</given-names>
            <surname>Andolina</surname>
          </string-name>
          et al.
          <year>2015</year>
          .
          <article-title>IntentStreams: Smart Parallel Search Streams for Branching Exploratory Search</article-title>
          .
          <source>In Proceedings of the 20th International Conference on Intelligent User Interfaces (IUI '15)</source>
          . ACM, New York, NY, USA,
          <fpage>300</fpage>
          -
          <lpage>305</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <given-names>Jiwoon</given-names>
            <surname>Jeon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          , and Joon Ho Lee.
          <year>2005</year>
          .
          <article-title>Finding Similar Questions in Large Question and Answer Archives</article-title>
          .
          <source>In Proceedings of the 14th ACM International Conference on Information and Knowledge Management (CIKM '05)</source>
          . ACM, New York, NY, USA,
          <fpage>84</fpage>
          -
          <lpage>90</lpage>
          . DOI: http://dx.doi.org/10.1145/1099554.1099572
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <given-names>Khalil</given-names>
            <surname>Klouche</surname>
          </string-name>
          , Tuukka Ruotsalo, Diogo Cabral, Salvatore Andolina, Andrea Bellucci, and
          <string-name>
            <given-names>Giulio</given-names>
            <surname>Jacucci</surname>
          </string-name>
          .
          <year>2015</year>
          .
          <article-title>Designing for Exploratory Search on Touch Devices</article-title>
          .
          <source>In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems (CHI '15)</source>
          . ACM, New York, NY, USA,
          <fpage>4189</fpage>
          -
          <lpage>4198</lpage>
          . DOI: http://dx.doi.org/10.1145/2702123.2702489
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <given-names>Ewa</given-names>
            <surname>Luger</surname>
          </string-name>
          and
          <string-name>
            <given-names>Abigail</given-names>
            <surname>Sellen</surname>
          </string-name>
          .
          <year>2016</year>
          .
          <article-title>"Like Having a Really Bad PA": The Gulf Between User Expectation and Experience of Conversational Agents</article-title>
          .
          <source>In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems (CHI '16)</source>
          . ACM, New York, NY, USA,
          <fpage>5286</fpage>
          -
          <lpage>5297</lpage>
          . DOI: http://dx.doi.org/10.1145/2858036.2858288
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <given-names>Tomas</given-names>
            <surname>Mikolov</surname>
          </string-name>
          .
          <year>2013</year>
          .
          <article-title>word2vec:Tool for computing continuous distributed representations of words</article-title>
          . (
          <year>2013</year>
          ). https://code.google.com/word2vec/.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Verto</surname>
          </string-name>
          .
          <year>2017</year>
          .
          <article-title>What’s the Future of Personal Assistant Apps?</article-title>
          (
          <year>2017</year>
          ). http://research.vertoanalytics.com/what-the-future-of-personal-assistant-apps-webinar-deck.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <given-names>Xiaobing</given-names>
            <surname>Xue</surname>
          </string-name>
          , Jiwoon Jeon, and
          <string-name>
            <given-names>W. Bruce</given-names>
            <surname>Croft</surname>
          </string-name>
          .
          <year>2008</year>
          .
          <article-title>Retrieval Models for Question and Answer Archives</article-title>
          .
          <source>In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08)</source>
          . ACM, New York, NY, USA,
          <fpage>475</fpage>
          -
          <lpage>482</lpage>
          . DOI: http://dx.doi.org/10.1145/1390334.1390416
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>