Event Data Collection for Recent Personal Questions

Masahiro Mizukami, Hiroaki Sugiyama, Hiromi Narimatsu
NTT Communication Science Laboratories
{mizukami.masahiro, sugiyama.hiroaki, narimatsu.hiromi}@lab.ntt.co.jp

Abstract

In human-human conversation, people frequently ask questions about the person they are talking with. Since such questions are also asked in human-agent conversations, previous research developed a Person DataBase (PDB), which consists of question-answer pairs evoked by a pre-defined persona, to answer users' questions. A PDB contains static information such as names, favorites, and experiences; it therefore cannot answer questions about events that occurred after it was built. In other words, this approach does not address questions about recent events (recent personal questions), e.g., "Have you seen any movies lately?" Since recent personal questions are frequently asked in casual conversation, conversational agents must answer them to maintain a conversation. In this paper, we collect event data consisting of a large number of experiences and behaviors from daily life, which enables agents to answer recent personal questions. We analyze the data and show that it is effective for answering such questions.

1 Introduction

Questions about a conversational partner are called "personal questions" and are an essential means of expressing interest in the partner. Such questions frequently occur in casual human-human conversations, and Nisimura et al. showed that they occur in both human-human and human-agent conversations [Nisimura et al., 2003]. Answering them adequately is an essential factor in the development of conversational agents [Sugiyama et al., 2017].

To answer personal questions, previous works developed the Person DataBase (PDB), which consists of question-answer pairs evoked by a pre-defined persona [Batacharia et al., 1999; Sugiyama et al., 2014]. Although this approach covers a wide variety of personal questions, developing a high-quality PDB is expensive. The cost makes it difficult to update a PDB constantly; consequently, a PDB usually contains only static information that rarely changes over time. Therefore, conversational agents using a PDB cannot answer questions about recent events, such as "What did you have for dinner yesterday?" It is also easy to imagine that invariant responses to recent personal questions make conversational agents unnatural; agents should spend different days, as people do, and it is more natural for them to return different answers to recent personal questions. To solve this problem, it helps to prepare another kind of data that expresses an agent's recent experiences, so that the agent can answer such questions about recent events (recent personal questions).

One simple idea is to collect data that express such recent events, like a diary updated by the user. Previous work on response generation leveraged diaries and microblogs as corpora that include people's recent personal information [Li et al., 2016]. Even though this approach seems reasonable, a handcrafted-data-driven approach such as PDB has practical advantages in controllability and reliability. In this paper, we collected event data from participants over short- and long-term periods. The collected data is hand-crafted, high-quality, and easy to update (by adding each new day's data). Through analysis, we clarify the potential of event data to answer questions about recent behaviors and experiences in casual conversation.

2 Related Works

As mentioned in the introduction, the PDB is the research most closely related to answering user questions. Batacharia et al. developed a PDB about Catherine, a 26-year-old female living in New York City [Batacharia et al., 1999]. To cover more questions and different personas, Sugiyama et al. developed a PDB with six personalities, including a 20-year-old female, a 50-year-old male, and robots [Sugiyama et al., 2014]. Both PDBs contain only static information; therefore, they cannot answer recent personal questions. To answer recent questions with a PDB, we would have to update its contents constantly, which is too expensive. The difficulty of updating a PDB lies in the relationships between its question-answer (QA) pairs: when the content of a base QA changes (e.g., the answer to "Do you have any pets?" changes from "Yes, I have a dog." to "No, I don't."), the related QAs must be changed to match the changed base QA (e.g., the answer to "Do you have a dog?" must also change from "Yes, I have a dog." to "No, I don't.").
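The cascading-update problem described above can be made concrete with a small sketch. This is a hypothetical toy structure, not the actual PDB format: it simply groups QA pairs by the underlying fact they share, so that changing one fact forces every dependent answer to change together.

```python
# Minimal sketch (hypothetical structure, not the real PDB schema) of why
# updating a PDB is expensive: QA pairs that depend on the same base fact
# must all be updated consistently.
pdb = {
    "Do you have any pets?": "Yes, I have a dog.",
    "Do you have a dog?": "Yes, I have a dog.",
}

# QA pairs grouped by the underlying fact they share (assumed mapping).
fact_to_questions = {
    "owns_dog": ["Do you have any pets?", "Do you have a dog?"],
}

def update_fact(fact, new_answer):
    """Propagate a changed base fact to every dependent QA pair."""
    for q in fact_to_questions[fact]:
        pdb[q] = new_answer

update_fact("owns_dog", "No, I don't have a dog.")
print(pdb["Do you have a dog?"])  # → "No, I don't have a dog."
```

In a real PDB the dependency graph is large and implicit, which is exactly why manual constant updating is costly.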
Because a PDB contains many such complicated QA relations, updating it is difficult and expensive. The merit of a PDB, shared by handcrafted-data-driven methods in general, is the ability to generate answers based on facts, with consistency grounded in the data. Such approaches answer questions with consistent replies and without lying, and this fact-based consistency has the potential to improve the performance of conversational agents.

Although there are many studies on response generation for conversational agents [Ritter et al., 2011; Inaba and Takahashi, 2016], few focus on consistency with an agent's personality. Persona-based conversation models treat personality as a speaker embedding to increase sentence quality [Li et al., 2016]; this is the state-of-the-art model for generating an agent's responses from an embedding vector that expresses its personality. The approach has the potential to answer recent personal questions, but it has two critical problems. The first is that it cannot promise to answer without lying; this problem is closely related to PDB research. A hand-crafted database approach such as PDB returns responses that reflect the right personality unless a question is matched incorrectly. In contrast, neural-network-based approaches often answer with response sentences that do not exist in the training data, since such models are optimized only to maximize the naturalness of responses. Even though this approach has the potential to answer recent personal questions, it offers no guarantee that its answers exist in the training data.

The second problem is that such a model does not maintain consistency with the day, the time, and past events. When we ask a conversational agent a question such as "What did you eat last night?", this approach always gives the same reply, such as "I ate ramen." The QA pair looks natural when we check it in isolation, but eating the same food for dinner every day is unnatural in an agent's daily life. To establish long-term human-agent conversation and make conversational agents more natural, we must solve this invariance problem of responses. Using date and time information to train the speaker embedding might help, but it is easy to imagine that such a model would require far more training data than is currently available.

In this paper, we therefore created event data for answering recent personal questions in casual conversation. Like PDB, this is a handcrafted-data-driven approach, which is essential for verifying fact-based answering, and we use it as a first step toward a function that answers questions about recent experiences based on facts.

3 Data Collection

To answer recent personal questions, which ask about recent experiences and behaviors, we collect consistent data from humans in the form of events that express those experiences and behaviors. The event data must be collectable from participants at low cost, because we need to update it constantly. We also have to collect data from various participants, because we do not know what kind of persona influences events.

We recruited 62 Japanese-speaking participants, with roughly equal numbers of both genders and ages ranging from their 10s to their 60s, and collected their daily experiences and behaviors as event data. They wrote down 20 events every day, and at least two events every four hours. For each event we collect its name, reasons, time, and impressions, because these aspects are frequently asked about in casual conversation. Participants were instructed not to write any descriptions involving privacy. Writing events down in this diary-like way is lower-cost than the PDB collection method. Specifically, we prepared an Excel file and asked participants to fill in the four aspects (name, reasons, time, and impressions), one per column. The format of this Excel file is simple: one line per event, one sheet per day, and one file per participant.

An event includes four aspects:
1. Event name: What happened? What did you do?
2. Event reasons: Why did it happen? Why did you do it?
3. Event time: selected from the following four-hour time blocks: 0:00-4:00, 4:00-8:00, 8:00-12:00, 12:00-16:00, 16:00-20:00, or 20:00-24:00.
4. Event impressions: How did you feel?
For the reasons and impressions aspects, participants can write more than one phrase, separated by spaces.

Event name | Event reason | Event time | Event impressions
Played a mobile phone game | Habit before going to bed, To get daily bonus | 0:00-4:00 | Happy
Read a novel on a mobile phone | Habit before going to bed, To induce sleep | 0:00-4:00 | Fun, Sleepy
Got up | To prepare a lunch box | 4:00-8:00 | Sleepy, Tired
Went back to sleep | To rest before going to the office | 4:00-8:00 | Sleepy
Got up | To go to the office | 8:00-12:00 | Sleepy
Ate breakfast and put on makeup | To go to the office, Hungry | 8:00-12:00 | Delicious, Tired
Drove a car while listening to music | To go to the office, To motivate myself | 8:00-12:00 | Happy, Fun
Worked | I'm a worker, To get a salary | 10:00-12:00 | Difficult
Ate lunch | Recess | 12:00-16:00 | Delicious
Listened to music | To relax | 12:00-16:00 | Fun, Sleepy
Worked | I'm a worker, To get a salary | 12:00-16:00 | Difficult
Worked overtime | To send mail | 16:00-20:00 | Tired
Drove a car | To go shopping | 16:00-20:00 | Sleepy
Went shopping | The shop received my reserved products | 20:00-24:00 | Happy, Fun
Took a bath | To refresh myself | 20:00-24:00 | Warm, Sleepy, Pleasant
Ate dinner | Prepared for me | 20:00-24:00 | Delicious
Did travel preparations | To go on a trip tomorrow | 20:00-24:00 | Tired, Pleasure
Looked for things | I lost something I had bought | 20:00-24:00 | Sad, Laughing
Played a mobile phone game | Habit before going to bed | 20:00-24:00 | Happy, Sleepy
Went to bed | To prepare for tomorrow | 20:00-24:00 | Sleepy

Table 1: Examples of collected event data
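The four-aspect event record described above can be sketched as a small data structure. This is an illustrative sketch, not the paper's actual tooling: the `Event` class and `parse_row` helper are hypothetical names, and this sketch separates multiple reasons/impressions with commas rather than the spaces the participants used.

```python
from dataclasses import dataclass
from typing import List

# The six four-hour blocks participants chose the event time from.
TIME_BLOCKS = ["0:00-4:00", "4:00-8:00", "8:00-12:00",
               "12:00-16:00", "16:00-20:00", "20:00-24:00"]

@dataclass
class Event:
    name: str               # Event name: what happened / what you did
    reasons: List[str]      # Event reasons: why it happened (may be several)
    time: str               # Event time: one of the TIME_BLOCKS
    impressions: List[str]  # Event impressions: how you felt

def parse_row(row):
    """Parse one spreadsheet row (name, reasons, time, impressions).
    Reasons and impressions are comma-separated in this sketch."""
    name, reasons, time, impressions = row
    assert time in TIME_BLOCKS, f"unknown time block: {time}"
    return Event(name,
                 [r.strip() for r in reasons.split(",")],
                 time,
                 [i.strip() for i in impressions.split(",")])

e = parse_row(("Took a bath", "To refresh myself", "20:00-24:00",
               "Warm, Sleepy, Pleasant"))
print(e.name, e.time, e.impressions)
```

One event per line, one sheet per day, and one file per participant then maps naturally onto a list of such records per (participant, day) pair.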
We define two groups for collecting data. One is the long-term group, which collects many days of data from a few participants; this facilitates comparisons between participants. The other is the short-term group, which collects a few days of data from many participants; this is necessary to gather a variety of event data. Five participants wrote 20 events per day for 30 days (long-term group), and 57 participants wrote 20 events per day for seven days (short-term group); in total, we collected 10,980 events. Table 1 shows examples of events collected from one participant in the short-term group. The example shows that we obtain a variety of events even from a single participant.

4 Data analysis

We analyze the collected data from two viewpoints to show that it helps answer recent personal questions related to personality and date. First, the tendency of events varies among participants; this shows that we have to reflect a participant's characteristics to answer recent personal questions. Second, the tendency of events varies with the day of the week; this shows that we have to reflect the day of the week and update event data constantly.

To analyze event tendencies, we first categorized the collected events, since slightly different event names prevent us from counting occurrences of the same event. For example, we wish to treat "Went to school" and "Went to high school" as the same event. To group such similar events, we performed word-based hierarchical clustering using word2vec trained on Wikipedia data.

Next, we highlight the differences in event tendencies among participants and among days. We calculate the frequency distribution of events for each participant and each day, and compare these distributions using the Jensen-Shannon (JS) divergence. This comparison clarifies two relationships: distributions of event frequency depend on the participant, and each participant's distribution differs from day to day.

Cluster | Event name | Size
E1 | Cleaned up (掃除をした) | 2480
E2 | Got up (起床した) | 1238
E3 | Drove a car (車を運転した) | 565
E4 | Took a meal (ご飯を食べた) | 1478
E5 | Drank a drink (飲み物を飲んだ) | 1095
E6 | Watched TV (テレビを見た) | 633
E7 | Took a bath (お風呂に入った) | 436
E8 | Went to a toilet (トイレに行った) | 911
E9 | Ate lunch (昼食を摂った) | 1356
E10 | Went to bed (寝た) | 788

Table 2: List of the 10 clusters' representative events

Cluster | Event name | Size
E1 | Looked at SNS on a PC (PCでSNSを閲覧した) | 117
E2 | Played a game (ゲームをした) | 232
E3 | Read mails (メールをチェックした) | 269
E4 | Cooked dinner (夕食の支度をした) | 581
E5 | Worked (仕事をした) | 260
E6 | Did the laundry (洗濯をした) | 1021
E7 | Got up (起床した) | 876
E8 | Going to bed (就寝する) | 146
E9 | Worked (仕事した) | 216
E10 | Took the train (電車に乗った) | 202
E11 | Came back home by car (車で帰宅した) | 131
E12 | Drove a car (車を運転した) | 232
E13 | Ate breakfast (朝食を食べた) | 992
E14 | Took a meal (ご飯を食べる) | 486
E15 | Drank coffee (珈琲を飲んだ) | 518
E16 | Washed dishes (食器を洗った) | 577
E17 | Watched a video (動画を見た) | 233
E18 | Watched TV (テレビを見た) | 274
E19 | Watching TV (テレビを見る) | 126
E20 | Took a bath (お風呂に入った) | 436
E21 | Went to a toilet (トイレに行った) | 222
E22 | Went shopping (買い物に行った) | 689
E23 | Read a newspaper (新聞を読んだ) | 218
E24 | Tidying up dishes (食器を片づける) | 540
E25 | Ate lunch (昼食を摂った) | 138
E26 | Talked with guests (来客と話した) | 143
E27 | Sent a child to sleep (子供を寝かせた) | 317
E28 | Woke up (起きた) | 143
E29 | Slept in bed (ベッドで寝た) | 300
E30 | Slept (寝た) | 345

Table 3: List of the 30 clusters' representative events
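The categorization above (converting an event name to a vector by summing the word2vec vectors of its tokens, then clustering the vectors hierarchically) can be sketched in pure Python. This is a toy illustration under loud assumptions: the 2-d word vectors stand in for word2vec vectors trained on Wikipedia, whitespace splitting stands in for MeCab tokenization, and the naive centroid-linkage merge rule stands in for Ward's method.

```python
# Toy word embeddings standing in for word2vec trained on Wikipedia
# (hypothetical values, for illustration only).
WORD_VECS = {
    "ate": [1.0, 0.1], "lunch": [0.9, 0.2], "dinner": [0.8, 0.3],
    "drove": [0.0, 1.0], "car": [0.1, 0.9],
}

def event_vector(name):
    """Sum the word vectors of an event name's tokens, as the paper sums
    word2vec vectors of MeCab tokens restored to base form."""
    vec = [0.0, 0.0]
    for tok in name.lower().split():
        w = WORD_VECS.get(tok, [0.0, 0.0])
        vec = [a + b for a, b in zip(vec, w)]
    return vec

def dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def agglomerative(events, n_clusters):
    """Naive bottom-up clustering with centroid linkage; a simplified
    stand-in for the Ward's-method clustering used in the paper."""
    clusters = [[e] for e in events]
    cents = [event_vector(e) for e in events]
    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(cents[i], cents[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        ni, nj = len(clusters[i]), len(clusters[j])
        merged = clusters[i] + clusters[j]
        cent = [(a * ni + b * nj) / (ni + nj)
                for a, b in zip(cents[i], cents[j])]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        cents = [c for k, c in enumerate(cents) if k not in (i, j)] + [cent]
    return clusters

events = ["ate lunch", "ate dinner", "drove car"]
print(agglomerative(events, 2))
```

With real data one would use trained embeddings and `scipy.cluster.hierarchy` instead; the point here is only the shape of the pipeline: tokenize, embed by summation, then merge nearest clusters.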
4.1 Event clustering

We performed hierarchical clustering to find similar events in the collected data [Larsen and Aone, 1999]. This clustering is both an analysis in itself and a necessary procedure for comparing events among participants or days at the cluster level. Before clustering, we trained word2vec [Mikolov et al., 2013] on Wikipedia articles; word2vec is useful for converting event names into word embeddings. For clustering, we tokenize event names with MeCab [Kudo, 2006] and restore the tokens to their base forms. We then compute a vector for each event by summing the word2vec vectors of its tokens, and cluster these vectors with Ward's method [Szekely and Rizzo, 2005]. Figure 1 shows a dendrogram and a heatmap of the vectors. To examine how the results depend on the number of clusters, we show hierarchical clustering results for ten clusters and for 30 clusters; Tables 2 and 3 list the resulting clusters. Each listed event name is the event nearest the center of its cluster, and the cluster size is the number of events the cluster contains.

[Figure 1: Result of hierarchical clustering for events]

From Table 2, we obtain common events that could happen to anyone, such as Got up, Took a meal, Drank a drink, Took a bath, and Ate lunch. In contrast, from Table 3 we obtain more detailed events that seem tied to specific personas, such as Looked at SNS on a PC, Played a game, and Watched a video. Events that indicate a participant's characteristics are important for highlighting differences between participants; therefore, in the following analysis sections we use the 30-cluster result to compare events between participants and between days.

Note that we chose the number of clusters based on a few preliminary experiments. A clustering method that determines the number of clusters from cluster variances or entropy has the potential to improve clustering performance; we will tackle a better model for handling event data in future work.

In Figure 1, we find a few particularly bright clusters that contain very similar events; in contrast, some less bright clusters contain events that are not so similar. Since the hierarchical clustering successfully forms clusters, we can benefit from using its results for data analysis: a clustering method that considers word meaning through word2vec can gather events with almost the same meaning.

4.2 Event analysis for each participant

First, we analyzed events across participants. To highlight differences between participants, we calculated each participant's distribution over the 30 clusters in Table 3 and compared these distributions using the JS divergence. The JS divergence averaged over all participant pairs was 0.39; the minimum was 0.063 and the maximum 0.77, both found among participants in the short-term group. The average is not close to 0, which means that distributions of event frequency differ across participants.

To analyze event tendencies in detail, Table 4 shows the cluster-assignment counts for each participant in the long-term group.

Cluster | Event name | P1 | P2 | P3 | P4 | P5
E1 | Looked at SNS on a PC | 1 | 0 | 7 | 3 | 9
E2 | Played a game | 0 | 0 | 2 | 97 | 0
E3 | Read mails | 34 | 16 | 6 | 5 | 0
E4 | Cooked dinner | 73 | 22 | 53 | 5 | 8
E5 | Worked | 0 | 38 | 14 | 57 | 0
E6 | Did the laundry | 25 | 72 | 32 | 67 | 39
E7 | Got up | 63 | 16 | 50 | 96 | 13
E8 | Going to bed | 0 | 0 | 11 | 0 | 22
E9 | Worked | 9 | 3 | 32 | 8 | 14
E10 | Took the train | 6 | 15 | 0 | 0 | 7
E11 | Came back home by car | 90 | 0 | 3 | 0 | 1
E12 | Drove a car | 0 | 18 | 28 | 4 | 0
E13 | Ate breakfast | 85 | 82 | 54 | 55 | 80
E14 | Took a meal | 2 | 6 | 20 | 2 | 62
E15 | Drank coffee | 39 | 18 | 1 | 60 | 69
E16 | Washed dishes | 2 | 13 | 30 | 25 | 36
E17 | Watched a video | 7 | 23 | 3 | 5 | 0
E18 | Watched TV | 61 | 7 | 0 | 1 | 0
E19 | Watching TV | 0 | 0 | 29 | 0 | 2
E20 | Took a bath | 23 | 25 | 32 | 20 | 47
E21 | Went to a toilet | 1 | 0 | 20 | 0 | 0
E22 | Went shopping | 24 | 127 | 42 | 8 | 17
E23 | Read a newspaper | 44 | 5 | 1 | 4 | 13
E24 | Tidying up dishes | 7 | 6 | 54 | 38 | 19
E25 | Ate lunch | 0 | 0 | 18 | 6 | 2
E26 | Talked with guests | 2 | 3 | 13 | 1 | 3
E27 | Sent a child to sleep | 2 | 11 | 15 | 3 | 33
E28 | Woke up | 0 | 32 | 1 | 0 | 35
E29 | Slept in bed | 0 | 30 | 20 | 30 | 12
E30 | Slept | 0 | 12 | 9 | 0 | 57

Table 4: Cluster assignments of events for each participant in the long-term group

Most participants have different distributions of events, but E4, E6, E7, E9 (and E5), E13, E14, E15, E16, E20, E22, E23, E24, E26, and E27 occurred more than once for every participant in the long-term group. E4, E6, E16, E22, and E24 mainly contain housework such as washing, cleaning, cooking, and shopping. E7, E13, E14, E15, and E20 indicate physiological needs such as eating, drinking, sleeping, and taking a bath. E9 (and E5) mainly indicates working, E23 indicates reading, and E26 indicates talking. The last, E27, contains various events such as child-rearing, hobbies, and school life. Such basic, life-related events were observed for almost all participants. In contrast, entertainment-related events such as Played a game were observed only for specific participants, such as P3.

These results show that participants have different event tendencies. This indicates that we should collect data for each persona in order to answer recent personal questions.
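The distribution comparison used above can be sketched as follows. The JS divergence is the symmetrized, bounded variant of the KL divergence; the participant counts below are hypothetical, not the paper's real data.

```python
from math import log2
from collections import Counter

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions given
    as dicts mapping cluster id -> probability."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    def kl(a, b):
        return sum(v * log2(v / b[k]) for k, v in a.items() if v > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def distribution(cluster_counts):
    """Normalize raw per-cluster event counts into a distribution."""
    total = sum(cluster_counts.values())
    return {k: v / total for k, v in cluster_counts.items()}

# Hypothetical event-cluster counts for two participants: one who mostly
# commutes by car, one who mostly plays games.
p1 = distribution(Counter({"E12 Drove a car": 90, "E13 Ate breakfast": 85}))
p2 = distribution(Counter({"E2 Played a game": 97, "E13 Ate breakfast": 55}))

print(js_divergence(p1, p1))       # identical distributions -> 0.0
print(js_divergence(p1, p2) > 0)   # differing habits -> positive divergence
```

Computing this for every pair of participants (or every pair of days of the week) gives exactly the comparison matrices discussed in Sections 4.2 and 4.3. Note that `scipy.spatial.distance.jensenshannon` returns the square root of this quantity.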
4.3 Event statistics for each day

Second, we analyzed events across days. As in Section 4.2, we counted each participant's events over the 30 clusters in Table 3, summed the cluster counts of all participants for each day of the week, and computed the JS divergences between days of the week, shown in Table 5.

      | Mon.  | Tue.  | Wed.  | Thu.  | Fri.  | Sat.  | Sun.
Mon.  |   -   | .005  | .005  | .005  | .006  | .011* | .013*
Tue.  | .005  |   -   | .005  | .007  | .006  | .013* | .019*
Wed.  | .005  | .005  |   -   | .004  | .004  | .008* | .011*
Thu.  | .005  | .007  | .004  |   -   | .005  | .011* | .017*
Fri.  | .006  | .006  | .004  | .005  |   -   | .013* | .017*
Sat.  | .011* | .013* | .008* | .011* | .013* |   -   | .005
Sun.  | .013* | .019* | .011* | .017* | .017* | .005  |   -

Table 5: JS divergences between days of the week (* marks the top 10 highest scores)

We focus on the JS divergences between weekdays and weekends. These high values (marked in Table 5) show the difference between weekday and weekend behavior; furthermore, the small JS divergence scores are concentrated among weekdays and among weekend days. This result shows that participants lead different lives on weekdays and on weekends. Although easy to imagine, this result reconfirms the importance of answering according to the day. Therefore, we need day-dependent data to answer questions that ask about events.

5 Discussion

In this section, we discuss the potential of our collected data to answer recent personal questions. Our discussion follows the "comparison with the conversation corpus" in Sugiyama et al. [Sugiyama et al., 2014], whose PDB covers 41.3% of the questions in real conversations and which explains why the other questions were excluded. The top reason for exclusion is "limited by specific words, date, or time," e.g., "What did you eat for lunch today?" or "Where did you go this summer vacation?"; such questions make up about 71.2% of all excluded questions. We mainly focus on these excluded questions and present case studies that our collected data can answer; tackling such questions helps address the future work left by the previous research.

First, we collected the same 286 questions excluded by Sugiyama et al. [Sugiyama et al., 2014] and extracted the 204 questions that were excluded as "limited by specific words, date, or time." The previous work noted that these questions are particularly difficult to answer with consistent 5W1H answers. We examined these questions and identified those that can be answered using event data; Table 6 shows examples of questions that can be answered based on an event.

As Table 6 shows, some questions about the speaker's recent behavior can be answered from our collected data. For example, we can answer a question such as "What did you eat for lunch today?" with "I ate curry and rice." by using an event such as "Ate curry and rice." In this manner, we can generate an answer utterance based on the event matched to a question. These results show the possibility of answering, with our collected data, part of the questions that were left as unsolved future work for PDB.

We can also answer questions that ask for opinions. Such questions frequently occur after a disclosure or after an answer to a first question. To answer with opinions, we use the event-impressions aspect; Table 7 shows examples of questions asking opinions about events. Specifically, when a conversational agent discloses "I watched a movie." and the user asks "Do you like it?", the agent can answer "It was fun." using the event-impression aspect of our collected data. Question answering based on the conventional PDB cannot handle such questions, which continue the same topic as the previous turn. Being able to answer questions about the details of the same event shows the potential to improve the question-answering function toward deeper talk.

However, some questions could not be answered with our collected data: questions that ask about the agent's past customs and questions that ask about the agent's future. To answer questions about the agent's future, we would have to prepare other data, such as plans made by the agent; such plans may require an approach like the belief-desire-intention model, which differs from our event data. To answer questions about the agent's past customs, we need data that indicates habitual events and experiences. Such data seems closely related to our event data, because habitual events and experiences may emerge from the accumulation of recent events. In future work, we will clarify the relationship between past customs and events and propose a method that derives past customs from accumulated recent events.

These analyses and case studies show the potential to answer recent personal questions that the previous PDB could not answer: our collected data helps answer not only questions about events but also questions asking opinions. Problems remain for questions about past customs and the future. In future work, we will tackle questions about past customs, such as habitual events, using our event data. We will also clarify the volume and frequency needed to collect sufficient event data: How many events do we need per day? How many times per day should we ask participants to write? For how many days should they write events? Beyond data collection, to develop conversational agents that answer recent personal questions with event data, we must propose a method that finds the events matching a user's recent personal question.
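The matching step the paper leaves to future work can be sketched naively. This is a hypothetical stand-in, not the authors' method: it scores events by content-word overlap with the question, where a real system would need tokenization, synonyms, and embedding similarity.

```python
def match_event(question, events):
    """Return the event name sharing the most words with the question;
    a naive stand-in for the question-event matching method the paper
    identifies as future work."""
    q_words = set(question.lower().rstrip("?").split())
    def overlap(event):
        return len(q_words & set(event.lower().split()))
    return max(events, key=overlap)

events = ["Ate a curry and rice", "Played video games", "Watched a movie"]
print(match_event("Did you play video games lately?", events))
# → "Played video games"
```

Given the matched event, its name fills the answer and its impressions aspect supplies an opinion for follow-up questions such as "Do you like it?".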
Question | Answer | Event
Did you eat lunch today? | Yes, I ate. | Ate lunch
What did you eat for lunch today? | I ate curry and rice. | Ate a curry and rice
Did you play video games lately? | Yes, I played video games. | Played video games
What video games did you play lately? | I played smartphone games. | Playing smartphone games
What kind of games did you play lately? | Smartphone games. | Playing smartphone games
Did you watch a movie? | Yes, I watched a movie. | Watched a movie
Did you go out somewhere recently? | I went to the nearby French restaurant. | Went to the neighboring French restaurant
Did you go out somewhere recently? | I went to the spa. | Going to the spa

Table 6: Examples of questions that can be answered based on an event

First question | First answer / Disclosure | Question | Opinion | Event | Event impressions
- | I watched a video | Do you like it? | It is fun. | Watched a video | Fun
- | I use a PC only for presentations and playing games | Do you like it? | Yes, I love it. | Played a PC game | Fun / Tiresome
Did you go to the theater recently? | Yes, I watched "The Dark Knight Rises" and "Library Wars" | How was it? | It was fun. | Watched a movie | Fun

Table 7: Examples of questions that ask for an impression or evaluation

6 Conclusion

In this paper, we collected 10,980 events expressing recent experiences and behaviors to help conversational agents answer questions about recent experiences. We analyzed the collected data to highlight the tendencies of events for each participant and each day of the week, and showed the necessity of event data for making conversational agents more natural. Our analysis shows that event data reflects participants' characteristics and day-of-week dependencies, yielding two findings about event tendencies. First, event tendencies depend on the participant: we should collect event data that matches each conversational agent's persona. Second, event tendencies depend on the day of the week: we should collect event data for each day to make a conversational agent's answers more natural.

In the discussion, we followed the previous works and presented case studies that our collected event data can answer. The event data helps answer recent personal questions about the agent's events, such as "What did you eat for lunch today?"; the results therefore show the potential to achieve our first goal of answering part of the questions that the previous PDB could not answer. Furthermore, the event-impressions aspect helps answer opinion questions such as "Do you like it?"; this continued question answering shows the potential to improve the question-answering function toward deeper talk.

In future work, we will clarify the volume and frequency needed to collect sufficient event data, and develop conversational agents that answer recent personal questions using the collected event data.

References

[Batacharia et al., 1999] B. Batacharia, D. Levy, R. Catizone, A. Krotov, and Y. Wilks. Converse: A conversational companion. In Machine Conversations, pages 205-215. Springer, 1999.
[Inaba and Takahashi, 2016] Michimasa Inaba and Kenichi Takahashi. Neural utterance ranking model for conversational dialogue systems. In Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 393-403, 2016.
[Kudo, 2006] Taku Kudo. MeCab: Yet another part-of-speech and morphological analyzer. http://mecab.sourceforge.jp, 2006.
[Larsen and Aone, 1999] Bjornar Larsen and Chinatsu Aone. Fast and effective text mining using linear-time document clustering. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 16-22. ACM, 1999.
[Li et al., 2016] Jiwei Li, Michel Galley, Chris Brockett, Georgios P. Spithourakis, Jianfeng Gao, and Bill Dolan. A persona-based neural conversation model. arXiv preprint arXiv:1603.06155, 2016.
[Mikolov et al., 2013] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. arXiv preprint arXiv:1301.3781, 2013.
[Nisimura et al., 2003] Ryuhei Nisimura, Yohei Nishihara, Ryosuke Tsurumi, Akinobu Lee, Hiroshi Saruwatari, and Kiyohiro Shikano. Takemaru-kun: Speech-oriented information system for real world research platform. In Proceedings of the International Workshop on Language Understanding and Agents for Real World Interaction, 2003.
[Ritter et al., 2011] Alan Ritter, Colin Cherry, and William B. Dolan. Data-driven response generation in social media. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 583-593. Association for Computational Linguistics, 2011.
[Sugiyama et al., 2014] Hiroaki Sugiyama, Toyomi Meguro, Ryuichiro Higashinaka, and Yasuhiro Minami. Large-scale collection and analysis of personal question-answer pairs for conversational agents. In International Conference on Intelligent Virtual Agents, pages 420-433. Springer, 2014.
[Sugiyama et al., 2017] Hiroaki Sugiyama, Toyomi Meguro, and Ryuichiro Higashinaka. Evaluation of question-answering system about conversational agent's personality. In Dialogues with Social Robots, pages 183-194. Springer, 2017.
[Szekely and Rizzo, 2005] Gabor J. Szekely and Maria L. Rizzo. Hierarchical clustering via joint between-within distances: Extending Ward's minimum variance method. Journal of Classification, 22(2):151-183, 2005.