Utterance Embedding for Detecting Argumentative Topics in Assembly Minutes

Hiroto Yano1,*,†, Soichiro Yasumoto1,† and Kazuhiro Takeuchi1,†
1 Osaka Electro-Communication University

Abstract
Meeting minutes from government and local assemblies are comprehensive documents that meticulously record the deliberations and discussions of each member. These resources provide crucial information about the background of decisions and make it possible to trace a proposal's path through discussion to final approval. Unlike ordinary texts, minutes encapsulate multi-speaker dialogues, making it imperative to identify argumentative topics in which participants exchange different viewpoints on the matter at hand. This paper presents a novel computational model, rooted in machine learning, that uses speaker alternation patterns to transform each utterance into a vector representation. This model lays the foundation for analyzing complex textual data as graph representations and holds promise for applications in Explainable Artificial Intelligence (XAI) by aiding in the verification of complex textual summarization. Using these vectorized utterances, we then form clusters that capture the argumentative topics and extract discriminative keywords related to the discussion. The effectiveness of our approach is assessed by contrasting the extracted words with manually written tree-structured summaries.

Keywords
Argument Graph Mining, Argumentative Topics, Utterance Embedding

1. Introduction

The government and local council meeting minutes (hereafter referred to simply as assembly minutes), where each member's statements and discussions are recorded and made public, differ from documents created by individuals in that they describe the process of resolving differences of opinion among multiple people. From the assembly minutes, one can verify the background of a particular proposal and how it was discussed and approved.
In contrast to one-way mass communication, as exemplified by news texts, much information exchange, such as email and social networking applications, is conducted by multiple participants. Dialogue is the most basic form of dynamic information exchange, but it is often characterized as highly individualized, verbose, and repetitive. In the field of natural language processing (NLP), the AMI Meeting Corpus [1] is an early work that paved the way for sophisticated analyses of multi-party interactions in meeting environments. With its rich set of annotations and transcriptions, the corpus has helped researchers delve deep into the nuances of human communication. Several studies address dialogue summarization: for example, Liu et al. (2019) [2] collected a dialogue summary dataset from DiDi customer service center logs, and Gliwa et al. (2019) [3] created the SAMSum corpus.

The 2nd International Workshop on Knowledge Graph Reasoning for Explainable Artificial Intelligence, December 9, 2023, Tokyo, Japan
* Corresponding author.
† These authors contributed equally.
mi22a012@oecu.jp (H. Yano); mi23a006@oecu.jp (S. Yasumoto); takeuchi@osakac.ac.jp (K. Takeuchi)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), ISSN 1613-0073

In this paper we focus on assembly minutes, which are not mere transcriptions of casual dialogue. Assembly minutes are structured records of the dialogue in a meeting, and as such, they have characteristics of formal documents as well as of dialogues. Consequently, deciphering these documents requires a specialized set of skills. This study uses hand-written summaries of the minutes, each of which is presented as a tree structure.
These tree representations serve as essential clues in the complicated process of deciphering and analyzing assembly minutes. Current AI summarization systems cannot explain how they analyze and summarize an original text. Our proposed method uses a large-scale language model to identify argumentative topics that can help decipher the structure of these minutes. Unlike general dialogues, where argumentative topics emerge and evolve through the natural alternation of utterances, assembly minutes present a more rigid structure in which such spontaneous alternations are absent. To address this, we employ a model that measures the semantic proximity between utterances, thereby facilitating a more nuanced analysis of the assembly minutes. Argumentative topics are detected by keyword weighting, using methods such as tf-idf and LDA (Latent Dirichlet Allocation), which are common in textual keyword analysis. Specifically, based on the vector representation of each utterance produced by the trained model, the utterances are clustered to obtain sets of utterances that most closely match an argumentative topic. Treating each such set of utterances as a document, tf-idf scores are computed, and the top five ranked words are extracted and compared with the words in the summary text.

2. Analysis of Assembly Minutes

In politics, it is important to accurately and fairly inform citizens of the assembly's decision-making process to ensure transparency and fairness. One such method is to record and publish 'assembly minutes' of the statements of council participants that were deliberated and discussed by the local government council. From publicly available assembly minutes, citizens can obtain political information, such as whether each participant took a position in favor of or against a particular issue of interest. However, it is difficult to read through the vast number of publicly available assembly minutes.
In addition, the recent proliferation of digital media has increased the need for fact-checking to detect fake news and to verify the truth and accuracy of information [4, 5]. To enable citizens to use assembly minutes as primary information, it is important to improve their searchability and visibility by identifying their discussion structure.

2.1. Argument Scheme

An argument scheme [6] is a general template or structure that represents a common pattern of reasoning or argumentation, defined to provide a framework for constructing and analyzing arguments. The theory of schemes is built from specific propositions, statements, and argumentation patterns. For example, given the following statements:

• Global temperatures are rising.
• Ice in the polar regions is melting.

the argument that follows from these two statements could be: "Global temperatures are rising, therefore ice in the polar regions is melting." This sentence links a cause (rising global temperatures) to an effect (melting polar ice). In this example, the argumentation pattern is 'cause-and-effect', and the underlying premise of the argument can be stated as "If global temperatures rise, then ice in the polar regions melts." This premise is generally accepted based on scientific consensus, and the relationship is commonly observed in scientific or fact-based arguments.

Analyzing the process of argumentation in assembly (or meeting, discussion) minutes using the concept of the argument scheme is called argument mining. In recent years, research in natural language processing has aimed at identifying argument structures in natural language text. The identification of argument structures involves a variety of tasks, such as separating arguments, classifying argument components into claims and premises, and identifying argument relations.
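The cause-and-effect scheme illustrated above can be represented as a simple data structure. The class and field names below are our own illustration, not part of any standard argumentation toolkit:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ArgumentScheme:
    """A general pattern of reasoning: explicit premises plus a
    conclusion licensed by an (often implicit) underlying premise."""
    pattern: str              # e.g. 'cause-and-effect'
    premises: List[str]       # explicit statements
    implicit_premise: str     # the generally accepted warrant
    conclusion: str

    def render(self) -> str:
        # Join the premises and draw the conclusion.
        return f"{' '.join(self.premises)} Therefore, {self.conclusion}"

# The global-warming example from the text, encoded in this structure.
scheme = ArgumentScheme(
    pattern="cause-and-effect",
    premises=["Global temperatures are rising."],
    implicit_premise="If global temperatures rise, then ice in the polar regions melts.",
    conclusion="ice in the polar regions is melting.",
)
```

Calling `scheme.render()` reproduces the argument sentence given in the running text, while `implicit_premise` keeps the unstated warrant explicit for analysis.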
Within argument structure identification, the task we focus on is identifying the topics of an argument. Another advantage of applying argument schemes to minutes is their ability to visualize and map the structure of an argument. By breaking down an argument into its fundamental components, including premises, inferences, and conclusions, argument schemes can make the underlying logic of an argument more transparent and easier to understand. This can be particularly helpful in complex discussions where multiple arguments and counter-arguments are being made and where it might otherwise be difficult to keep track of the various points being put forward.

Figure 1 shows an example of the visualization of a discussion. In the figure, Speaker A makes two statements in support of one claim. Speaker B, who responds to A, asserts a disagreement with A's opinion and also forms a structure that is critical of A's supporting opinion. By identifying argumentation patterns for each utterance and tracking the connections between elements, it is possible to analyze the discussion process.

[Figure 1: Argument Scheme — Speaker A claims X ("I like baseball") and supports it with Y ("breathtaking tension with each pitch", "The game between the pitcher and the hitter..."); Speaker B opposes X ("I don't like it much") and gives a critical opinion of the supporting rationale ("the time is too long and I get bored")]

The difference between a discussion and a text lies in the foundational structure. Unlike conventional texts, which are composed of sentences, a discussion is composed of alternating utterances from multiple speakers. As shown in Figure 1, speakers A and B take turns contributing to the discussion. The basis for analyzing such discussions is the adjacency pair of utterances from different speakers [7, 8, 9].

Let the target discussion d_k be a sequence of exchanges:

d_k = {e_1, e_2, e_3, ...}

It can also be described as a sequence of utterances:

d_k = {u_1, u_2, u_3, ...}

where:

• d_k is a specific discussion.
• In the first representation, each e_i denotes an exchange in the discussion.
• In the second representation, each u_i denotes an utterance in the discussion.
• The subscript i (whether in e_i or u_i) is a positive integer that represents the position of the exchange or utterance in the sequence.

Exchange e_i (or adjacency pair): a pair of consecutive utterances u_i and u_{i+1} where the speaker of u_i is not the same as the speaker of u_{i+1}:

e_i = (u_i, u_{i+1})   such that   s(u_i) ≠ s(u_{i+1}),

where s(u_i) and s(u_{i+1}) represent the speakers of utterances u_i and u_{i+1}, respectively.

2.2. Keyword based analysis

In the fields of information retrieval and text processing, quantifying the importance of words within a document relative to a larger corpus is essential for various tasks. One well-known method is term frequency–inverse document frequency (tf-idf). This numerical statistic quantifies the importance of a term not only by its frequency in a single document but also by offsetting its commonality across a larger collection of documents, providing an adjusted measure of term importance. The final tf-idf score of a term in a document is the product of its term frequency (tf) and inverse document frequency (idf) scores. Term frequency is derived directly from the bag-of-words (BoW) representation by counting the number of times a term appears in a document. Inverse document frequency is calculated from the frequency of a term across all documents in the corpus.

In parallel, Latent Dirichlet Allocation (LDA) is another cornerstone of text analytics. Unlike tf-idf, which weights individual terms, LDA is concerned with discovering latent thematic structures present in the corpus.
It aims to represent documents as mixtures of topics, where each topic corresponds to a distribution over words. Notably, both tf-idf and LDA use the basic bag-of-words (BoW) model. Although this model is simple in its representation, focusing on the occurrence of words in a document and ignoring their order, it remains effective in many text-processing scenarios.

Rather than relying on documents, this study adopts a discussion-based framework. Although the assembly minutes that are the subject of this paper are typically represented as text, a discussion is not a monologue but consists of utterances from multiple speakers. The unit corresponding to the document in text processing is a discussion characterized by the contributions of multiple interlocutors, where interactions emerge from the interplay of utterances among multiple speakers. The purpose of this paper is to provide such quantifications of word importance, tailored to assembly minutes. A text provided as assembly minutes contains multiple discussions that took place on a single day. Since each discussion is considered independent, each discussion can be treated as a "document" in tf-idf terms. In other words, when analyzing the minutes, a "document" corresponds to a discussion. To state the problem precisely, we use the following notation:

• D is the target set of discussions; it corresponds to the text of the assembly minutes for a specific day.
• d_i is a distinct discussion in D.

LDA is a probabilistic topic model used to discover topics from a collection of documents. It assumes that each document consists of a mixture of several topics, and each topic is represented as a distribution over the words in the vocabulary. While LDA is generally effective in identifying sets of words associated with specific topics, its direct application to the context of real-world discussions can be challenging: the direct relationship between a question and its answers is not considered in the basic LDA model.
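The discussion-as-document view can be made concrete with a small stdlib-only sketch: exchanges are extracted exactly as in the adjacency-pair definition of Section 2.1 (consecutive utterances with different speakers), and document frequency is counted over discussions rather than over conventional documents. The data layout here (a discussion as a list of `(speaker, text)` tuples) is our illustrative choice:

```python
from collections import Counter

def adjacency_pairs(utterances):
    """Exchanges e_i = (u_i, u_{i+1}) where s(u_i) != s(u_{i+1});
    each utterance is a (speaker, text) tuple."""
    return [
        (utterances[i], utterances[i + 1])
        for i in range(len(utterances) - 1)
        if utterances[i][0] != utterances[i + 1][0]
    ]

def document_frequency(discussions):
    """Treat each discussion d_i in D as one 'document' and count,
    for every word, in how many discussions it occurs."""
    df = Counter()
    for discussion in discussions:
        # A set per discussion: a word counts at most once per d_i.
        df.update({w for _, text in discussion for w in text.split()})
    return df

# Toy data (invented for illustration).
disc1 = [("Questioner", "about the subway line"),
         ("Respondent 1", "the subway line opens soon")]
disc2 = [("Questioner", "about blood donation")]
```

Here `adjacency_pairs(disc1)` yields one exchange, and `document_frequency([disc1, disc2])` gives "about" a document frequency of 2 because it appears in both discussions, while "subway" gets 1 despite occurring twice inside `disc1`.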
Figure 2 shows the plate notation for LDA with Dirichlet-distributed topic-word distributions. The posterior distribution is intractable to compute directly, so approximation techniques such as Gibbs sampling or variational inference are used. LDA's strength lies in its ability to reveal underlying topics within text, aiding document classification, information retrieval, and content summarization. The BoW model represents text documents as vectors, where each dimension corresponds to a unique word from the entire corpus and the value represents the frequency or presence of that word in the document; it disregards word order, focusing only on occurrence and frequency. LDA is a generative probabilistic model designed to uncover latent topics within a collection of documents, depicting documents as combinations of topics, with each topic being a probability distribution over words. Its outputs, characterized by clusters of co-occurring words, can be interpreted as topics. The problem here is that topics in assembly minutes are not those found in ordinary text or dialogue, but those that form the basis for analyzing structured meeting minutes in a highly empirical way.

[Figure 2: Plate notation for LDA with Dirichlet-distributed topic-word distributions — hyperparameters α and η, topic-word distributions β over k topics, per-document topic distributions θ, topic assignments z, and observed words w, over N words and M documents]

3. Data

This paper proposes a method for detecting the topics being discussed among multiple speakers in actual assembly minutes, which forms a basis for constructing an argument scheme. For the actual assembly minutes, we use the minutes of the Tokyo Metropolitan Assembly meetings (minutes of TMA), in which the utterances made in actual assemblies are recorded. For the evaluation of the proposed method, on the other hand, we use the Tokyo Metropolitan Assembly Bulletin (TMAB), in which the utterances of specific speakers and the question-and-answer sessions in response to those utterances are manually summarized.
Figure 3 shows the relationship between the discussions in the meeting minutes and the summaries in TMAB. The summary in the figure is simple, providing a single topic shared between the two speakers and summarizing each speaker's argument as a short phrase. Each speaker's summarized phrase is often shorter than the original utterances it refers to; each speaker's assertion about a topic is expressed in as few summary phrases as possible.

There is a difference between general discussions and those in the actual minutes. Of course, the constituent units of text are sentences and those of discussions are utterances, but the actual minutes do not consist of general spoken utterances themselves. In a manner of speaking, the minutes can be described as 'pseudo-written language.' One characteristic of this pseudo-written language is that it often forms lengthy sentences, as shown in Table 1. Such long sentences are difficult to handle, so as a preprocessing step for meeting analysis, we divide the pseudo-sentences into smaller units than sentences, which we refer to as 'clauses.' That is, we analyze which topic each clause of an utterance relates to.

Another, much larger difference exists between utterances in general discussions and those in the actual minutes: in the minutes in Figure 3, the speakers do not follow the simple speaker alternation of a general discussion. Moreover, as the manually provided structural summary of TMAB in the figure shows, there are multiple speakers responding to a single speaker's question. This is due to the fact that the assembly is planned and facilitated in advance.

[Figure 3: Relationship between a discussion in assembly minutes and its summary in TMAB — discussions 1–5 in the minutes of TMA, each consisting of a questioner's statement followed by statements of respondents 1–3, map to argumentative topics 1–3 in TMAB, each pairing a summary of the questioner's question with summaries of the corresponding respondents' answers]

We regard the summary provided by TMAB as reflecting an important requirement for the analysis of the assembly minutes. Specifically, one questioner raises multiple argumentative topics, and multiple people answer each argumentative topic. Therefore, without detecting the argumentative topic for each pair of questioner and respondent from the minutes, it is not possible to perform the usual structural analysis of the discussion. In this paper, we propose a method to detect argumentative topics in assembly minutes and evaluate the detected argumentative topics against TMAB's manual summaries.

4. Proposed Method

In this section, we present a new computational model rooted in machine learning that uses speaker alternation patterns in meeting minutes as the basis for training data. Figure 4 shows a schematic of the entire proposed method. As shown in Figure 4, the proposed method extracts argumentative topics by the following procedure:

• Fine-tune SBERT by pairing the utterances of the questioner and the respondents
• Create utterance vectors using the fine-tuned SBERT
• Cluster the utterance vectors to form pseudo-documents
• Calculate tf-idf over the formed pseudo-documents and extract the top five words of each

The details of each procedure are explained in the following sections.

Table 1: Example utterances in TMA

Questioner: As a member of the Liberal Democratic Party of the Tokyo Metropolitan Assembly, I would like to ask several questions regarding issues of the metropolitan government. I look forward to clear answers from the Governor and other directors. The construction of the Tsukuba Express is being carried out amid great expectations of the people of Tokyo. Ltd. and the three prefectures concerned have announced that the 58-km line between Akihabara and Tsukuba is scheduled to open simultaneously in the fall of 2005. Until now, there was a time when even achieving the opening target of fiscal 2005 was in doubt due to delays in some land acquisitions and other factors, but what is the rationale behind this clarification of the opening date? The Tsukuba Express was originally planned as a national project to alleviate congestion on the Joban Line and to provide access to Tsukuba Science City, as well as to provide a large supply of residential land, and the so-called Housing Railway Law was enacted by the Diet. However, compared to when it was conceived in 1989, the country's socioeconomic situation has changed dramatically today. In fact, the latest announcement revised downward the transportation demand to less than 300,000 persons per day. Since the initial plan was for approximately 600,000 passengers, we have no choice but to estimate the demand at half of that figure, and we expect to face severe business management challenges after the opening. ...

Respondent 1: I would like to answer a general question from Councilor Naoki Takashima. Regarding the idea of abolishing the special industrial district building ordinance, in order to strengthen Tokyo's manufacturing and other industries, it is necessary to create an environment that allows factories to be rebuilt or expanded as freely as possible. In response to the Tokyo Metropolitan Government's request, last July the national government abolished the Law on Industrial Restriction, which regulates the location of factories in urban areas. The Tokyo Metropolitan Government is also committed to strengthening its industrial capacity through the early elimination of restrictions on factory locations and various forms of support. We will continue to coordinate with the municipalities to repeal the special industrial district building ordinance. The other questions will be answered by the Director General concerned.

Respondent 2: I would like to answer five questions regarding the new Joban Line, etc. First, regarding the timing of the opening of the new Joban Line, by the end of last year, discussions on the intersection with the Sobu Nagareyama Electric Railway, which had been an issue in the construction process, were completed, and the necessary railroad land was almost completely secured. Based on this, and as a result of coordination among the parties involved, major civil works are expected to be completed by the end of the fiscal year. Furthermore, taking into consideration the progress of the construction of the station building and facilities, as well as the process of running tests of the trains, we have come to the conclusion that the station will open in the fall of 2005. Next, regarding the economic effects of the Joban New Line, according to a study conducted in 1997 by the Joban New Line Project Promotion Council, which was organized mainly by private companies, the direct effects of the Joban New Line project are estimated to be approximately 1 trillion yen for the construction and operation of the railroad and approximately 6 trillion yen for housing development and public investment in areas along the line, for a total of approximately 7 trillion yen over the 30 years from 1996. ...

[Figure 4: Overview of the entire proposed method — 1. fine-tuning: questioner–respondent utterance pairs are used to fine-tune SBERT; 2. embedding: the fine-tuned SBERT maps utterances to utterance vectors; 3. clustering: the utterance vectors are clustered into pseudo-documents; 4. TF-IDF: the top 5 words are extracted from each pseudo-document]

4.1. Proposed model for utterance embedding

In order to determine the argumentative topic from the meeting minutes, it is necessary to find utterance clauses in the questioner's utterance that are strongly related to the argumentative topic. Also, as mentioned in Section 3, the method must detect the argumentative topic for each pair of questioner and respondent, since one questioner raises multiple argumentative topics and multiple people respond to each of them. Therefore, by learning the utterances of the questioner and the respondent as a pair, we form utterance vectors that represent the argumentative topic.

Sentence-BERT [10] (hereafter SBERT) is used to form the utterance vectors. SBERT is a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. In this paper, we fine-tune SBERT using training data consisting of questioner–respondent utterance pairs and triples with unrelated utterances from the minutes.

4.2. Training

We create training data for fine-tuning SBERT from the meeting minutes. Figure 5 shows a schematic of creating the training data. The training data is created with each respondent's utterance in each discussion as the anchor, the questioner's utterance as the positive, and the utterances of the other respondents as the other.
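The anchor/positive/other construction just described can be sketched as follows. This is a toy illustration of the triplet layout only; in the actual method such triplets would be fed to SBERT's triplet objective, and the dictionary keys used here are our own naming:

```python
def make_triplets(discussion):
    """Build (anchor, positive, other) triplets from one discussion:
    anchor   = one respondent's utterance,
    positive = the questioner's utterance,
    other    = a different respondent's utterance."""
    q = discussion["questioner"]
    rs = discussion["respondents"]
    return [
        (anchor, q, other)
        for i, anchor in enumerate(rs)
        for j, other in enumerate(rs)
        if i != j  # the 'other' must come from a different respondent
    ]

# One discussion with a questioner and three respondents,
# as in the schematic of Figure 5 (placeholder strings).
disc = {"questioner": "Q", "respondents": ["R1", "R2", "R3"]}
triplets = make_triplets(disc)
```

With three respondents this yields six triplets, e.g. `("R1", "Q", "R2")`: the anchor R1 is pulled toward the questioner Q and pushed away from the other respondent R2.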
Using this data, SBERT is trained so that, within the same discussion in the meeting minutes, the utterance vectors of a questioner and a respondent are close, while the utterance vectors of different respondents are far apart.

[Figure 5: Training data created from the discussions in the meeting minutes — for each discussion, each respondent's utterances serve as the anchor, the questioner's utterances as the positive, and the remaining respondents' utterances as the other]

Figure 6 shows the structure of SBERT's training model. A Dense layer is added after the pooling layer in order to learn the utterance clauses in the meeting minutes as low-dimensional vectors; in this paper, training is performed with 10 dimensions. The vector representations of the utterance clauses are then formed using the fine-tuned SBERT on the same meeting minutes used for training.

4.3. Clustering based on neighbor pairs

Clustering is performed on the formed utterance vectors to obtain sets of utterances that are strongly related to an argumentative topic. Since SBERT is trained on questioner–respondent pair data, each cluster is a set of utterances related to one argumentative topic.

4.4. Extracting words for discussion

We extract words corresponding to the argumentative topic from the formed clusters. Inspired by BERTopic [11], this paper extracts words corresponding to argumentative topics using tf-idf, where all utterance clauses in a cluster are treated as one pseudo-document. BERTopic is a topic model that extracts consistent topic representations through class-based tf-idf. First, document embeddings are created using SBERT.
Next, the created document embeddings are grouped by clustering, and class-based tf-idf is used to generate topic representations for the resulting clusters. An important aspect of BERTopic is that it generates topic representations by clustering document embeddings obtained from a large-scale language model and ranking terms with class-based tf-idf. Other embedding techniques can also be used if the language model generating the document embeddings is fine-tuned for semantic similarity.

[Figure 6: Training of the Sentence-BERT model — two BERT encoders with shared weights process Clause A and Clause B; each is followed by pooling and a Dense layer, producing embeddings u and v, and a softmax classifier takes (u, v, |u−v|)]

Since each cluster is a set of utterance clauses that are strongly related to an argumentative topic, a pseudo-document that discusses a single argumentative topic can be created by concatenating all the utterance clauses belonging to the cluster. Important keywords for the argumentative topic are then extracted by weighting words with tf-idf over the created pseudo-documents.

Figure 7 shows the method of extraction and evaluation from the clusters. First, the utterance clauses belonging to each cluster are concatenated and used as a pseudo-document for tf-idf word weighting. Next, the utterance vectors are used to filter out unnecessary clauses that do not contribute to the identification of keywords: the centroid of each cluster's utterance vectors is calculated, the similarity between the centroid and each utterance vector is measured, and utterance clauses with low similarity are deleted. After filtering, the words with the top 5 tf-idf values are extracted. These top 5 words are compared with the argumentative topics contained in TMAB, and the method is evaluated by the partial match ratio of the words, i.e., how well keywords related to the argumentative topic are extracted.
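The centroid-based filtering step described above can be sketched with a stdlib-only toy; the similarity threshold `min_sim` is an illustrative parameter, not a value from the paper:

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def filter_cluster(clauses, vectors, min_sim=0.5):
    """Drop clauses whose utterance vector is dissimilar to the
    cluster centroid, so they do not dilute the tf-idf ranking."""
    c = centroid(vectors)
    return [cl for cl, v in zip(clauses, vectors) if cosine(v, c) >= min_sim]

# Toy cluster: two clauses point one way, one points the opposite way.
kept = filter_cluster(["clause a", "clause b", "clause c"],
                      [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]])
```

On this toy input the outlying third clause has negative cosine similarity to the centroid and is removed, leaving the first two clauses for tf-idf weighting.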
[Figure 7: How topics in TMAB are compared with the extracted words — the clauses in each cluster are concatenated into a pseudo-document, tf-idf is applied, and the top 5 words per cluster are compared with TMAB topics such as 'About the new subway line' and 'Measures to promote shopping district development']

5. Experiments and Results

We perform comparative experiments with LDA and tf-idf. LDA is given the combined text of all utterances of the questioner and respondents in a day's worth of assembly minutes and extracts the top five words for each topic. For tf-idf, the document frequency (DF) is computed from a day's worth of meeting minutes and the term frequency (TF) from the combined text of all utterances of the questioner and respondents in the same day's discussions, and the top five words are extracted. As with the proposed method, the evaluation is based on the partial agreement rate between the top five extracted words and the argumentative topics contained in TMAB.

The word agreement rates between the TMAB summaries and the top five words extracted by the proposed method, LDA, and tf-idf are shown in Table 2. Table 2 confirms that the proposed method extracts more keywords strongly related to the argumentative topic than the other methods. This suggests that, in the comparison methods, because the questioner asks multiple questions in a single text, information belonging to the other questions could not be separated out when weighting words, and thus keywords related to the argumentative topic were not extracted. In the proposed method, clustering with utterance vectors to create clusters of question utterances that are related to the respondents' answer utterances is found to contribute significantly to the extraction of keywords related to the argumentative topic.
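The partial-match evaluation between the extracted top-5 words and a TMAB topic phrase can be sketched as follows. This is a simplification under our own assumptions (substring overlap in either direction); the paper's exact matching criterion may differ:

```python
def partial_match_count(topics, extracted):
    """Count topics for which at least one extracted top-5 word
    partially matches the topic phrase (substring either way)."""
    hits = 0
    for topic, words in zip(topics, extracted):
        if any(w in topic or topic in w for w in words):
            hits += 1
    return hits

# Toy example modeled on the result tables (strings only for illustration).
topics = ["smart interchange", "blood donation"]
extracted = [["interchange", "babysitter"],   # 'interchange' matches
             ["2033", "Europe"]]              # no match
```

With these inputs, one of the two topics is counted as matched, so the partial agreement rate would be 1/2.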
Table 2 Comparison of proposed method, LDA, and tf-idf Target Proposed Method LDA tf-idf data 1 49 21 9 1 data 2 48 22 9 0 data 3 47 17 7 0 ave. 48 20.0 8.333 0.333 Also, • Example of agreement for all methods • Proposed method, example matched by LDA • Example matched only by the proposed method • Example of only LDA matched The table below shows four examples. Table 3 shows an example of the results matched by all methods. Table 4 shows examples of results matched by the proposed method and LDA. Table 5 shows an example of results matched only by the proposed method. Table 6 shows an example of results that matched only the LDA. Table 3 One example where the proposed method, LDA, and tf-idf are all Targeted. Target smart interchange Proposed Method ‘interchange’, ‘Tachikawa City’, ‘babysitter’, ‘Yuriko’, ‘metropolitan bus’ LDA ‘interchange’, ‘babysitter’, ‘shopping district’, ‘ene’, ‘electric power’ tf-idf ‘interchange’, ‘aggregate’, ‘speech and behavior’, ‘401’, ‘arrest’ Table 4 An example of the proposed method and LDA only when target is applied. Target blood donation Proposed Method ‘blood donation’, ‘loss’, ‘foodstuff’, ‘sightseeing’, ‘phenomenon’ LDA ‘blood donation’, ‘foodstuff’, ‘loss’, ‘densely wooded area’, ‘goods’ tf-idf ‘2033’, ‘this year’, ‘imagination’, ‘Europe’, ‘general’ Table 5 shows that LDA extracts keywords related to another question of the questioner, while the proposed method does not extract keywords related to another question, suggesting that the use of questioner and answer pairs in the proposed method is effective for the problem of determining the argumentative topic from the meeting minutes. Table 5 An example where only the proposed method is targeted. 
Target           national health insurance
Proposed Method  'soft', 'insurance', 'some time ago', 'beginning', 'lighting'
LDA              'the US armed forces', 'self-employed', 'Yokota', 'accident and sickness benefits', 'freelance'
tf-idf           'script', 'Tottori prefecture', '2013', 'origin', 'fair'

Table 6
An example where only LDA matches the target.
Target           Tokyo Big Sight
Proposed Method  'analog', '100', 'fraud', 'pile', 'paper-based'
LDA              'venue costs', 'Tokyo Big Sight', 'return', 'job training', 'advertisement'
tf-idf           'INTEX', 'ability development', 'lengthening', 'cancel', 'Kawamura'

6. Conclusion

We clustered utterance vectors and extracted keywords related to argumentative topics with tf-idf. By fine-tuning SBERT on training data that pairs the questioner's utterances with the respondent's, we obtained utterance vectors that represent the argumentative topic. The results suggest that the proposed method, unlike the comparison methods, copes with the problem of a questioner asking multiple questions in a single text and may therefore be able to identify argumentative topics.

This paper developed a computational model that represents utterances as vectors by fine-tuning a large language model so as to incorporate speaker alternation within an argumentative context. The model made it possible to cluster the utterances in meeting minutes. Using the CLS vectors of BERT without fine-tuning would have been an appropriate baseline; however, we ran out of time in the validation experiments while exploring fine-tuning methods for better comparisons. In future work, we would like to establish a better baseline and evaluate the validity of this study more precisely. We also plan to improve the accuracy of the proposed method by examining the summaries on which it failed to agree. In addition, while this study uses tf-idf for word extraction, we would like to consider using a probabilistic model to generate the argumentative topic.
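The pairing step mentioned above, in which the questioner's and respondent's utterances form training pairs for fine-tuning, can be sketched as follows. The adjacency heuristic (each question is paired with the immediately following answer utterance) and the role-labeled input format are our assumptions; the paper does not specify the exact pairing procedure. Pairs produced this way could then feed a contrastive fine-tuning objective for SBERT.

```python
def build_qa_pairs(utterances):
    """Build (question, answer) training pairs from role-labeled utterances.

    `utterances` is an assumed format: a list of (role, text) tuples in
    speaking order, with roles 'questioner' and 'respondent'. Each question
    is paired with the first respondent utterance that follows it.
    """
    pairs = []
    pending_question = None
    for role, text in utterances:
        if role == "questioner":
            pending_question = text  # remember the latest question
        elif role == "respondent" and pending_question is not None:
            pairs.append((pending_question, text))
            pending_question = None  # later answer turns are not re-paired
    return pairs
```
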
Acknowledgments

This work was supported by JSPS KAKENHI Grant Number 23H03462.

References

[1] J. Carletta, S. Ashby, S. Bourban, M. Flynn, M. Guillemot, T. Hain, J. Kadlec, V. Karaiskos, W. Kraaij, M. Kronenthal, et al., The AMI meeting corpus: A pre-announcement, in: International Workshop on Machine Learning for Multimodal Interaction, Springer, 2005, pp. 28–39.
[2] C. Liu, P. Wang, J. Xu, Z. Li, J. Ye, Automatic dialogue summary generation for customer service, in: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 1957–1965.
[3] B. Gliwa, I. Mochol, M. Biesek, A. Wawer, SAMSum corpus: A human-annotated dialogue dataset for abstractive summarization, arXiv preprint arXiv:1911.12237 (2019).
[4] Y. Kimura, H. Shibuki, H. Ototake, Y. Uchida, K. Takamaru, M. Ishioroshi, T. Mitamura, M. Yoshioka, T. Akiba, Y. Ogawa, M. Sasaki, K. Yokote, T. Mori, K. Araki, S. Sekine, N. Kando, Overview of the NTCIR-15 QA Lab-PoliInfo-2 task, in: Proceedings of the 15th NTCIR Conference on Evaluation of Information Access Technologies, 2020, pp. 101–112.
[5] Y. Kimura, H. Shibuki, H. Ototake, Y. Uchida, K. Takamaru, M. Ishioroshi, M. Yoshioka, T. Akiba, Y. Ogawa, M. Sasaki, K. Yokote, K. Kadowaki, T. Mori, K. Araki, T. Mitamura, S. Sekine, Overview of the NTCIR-16 QA Lab-PoliInfo-3 task, in: Proceedings of the 16th NTCIR Conference, 2022, pp. 156–174.
[6] D. Walton, C. Reed, F. Macagno, Argumentation Schemes, Cambridge University Press, 2008.
[7] M. Higashiyama, K. Inui, Y. Matsumoto, Learning sentiment of nouns from selectional preferences of verbs and adjectives, in: Proceedings of the 14th Annual Meeting of the Association for Natural Language Processing, 2008, pp. 584–587.
[8] T. D. Midgley, S. Harrison, C. MacNish, Empirical verification of adjacency pairs using dialogue segmentation, in: Proceedings of the 7th SIGdial Workshop on Discourse and Dialogue, 2006, pp. 104–108.
[9] E. Jamison, I. Gurevych, Adjacency pair recognition in Wikipedia discussions using lexical pairs, in: Proceedings of the 28th Pacific Asia Conference on Language, Information and Computing, 2014, pp. 479–488.
[10] N. Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using Siamese BERT-networks, in: EMNLP/IJCNLP (1), 2019.
[11] M. Grootendorst, BERTopic: Neural topic modeling with a class-based TF-IDF procedure, arXiv preprint arXiv:2203.05794 (2022).