<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Detecting and tracking ongoing topics in psychotherapeutic conversations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ilyas Chaoua</string-name>
          <email>ilyaschaoua@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Diego Reforgiato Recupero</string-name>
          <email>diego.reforgiato@unica.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Sergio Consoli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aki Härmä</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Rim Helaoui</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Philips Research</institution>
          ,
          <addr-line>High Tech Campus 34, 5656 AE Eindhoven</addr-line>
          ,
          <country country="NL">The Netherlands</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Cagliari, Mathematics and Computer Science Department</institution>
          ,
          <addr-line>Via Ospedale 72, 09124, Cagliari</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>One of the key aspects of a psychotherapeutic conversation is the understanding of the topic dynamics driving the dialogue. This may provide insights into the therapeutic strategy adopted by the counselor for a specific patient, offering the opportunity to build artificial intelligence (AI) based methods for recommending the most appropriate therapy for a new patient. In this paper, we propose a method able to detect and track topics in real-life psychotherapeutic conversations based on Partially Labeled Dirichlet Allocation. Topic detection helps in summarizing the semantic themes used during the therapeutic conversations and in predicting a specific topic for each talk-turn. The conversation is modeled by means of a distribution of ongoing topics propagating through each talk sequence. Tracking of topics aims at exploring the dynamics of the conversation and at offering insights into the underlying conversation logic and strategy. We present an alternative way to look at face-to-face conversations, in conjunction with a new approach that combines topic modeling and transition matrices to elicit valuable knowledge.</p>
      </abstract>
      <kwd-group>
        <kwd>Conversational AI</kwd>
        <kwd>Psychotherapeutic conversations</kwd>
        <kwd>Topic detection and modeling</kwd>
        <kwd>Partially Labeled Dirichlet Allocation</kwd>
        <kwd>Transition matrices</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        The theoretical and technological advances in several disciplines of linguistics,
computer science, and healthcare have made possible the recent investigation of
therapeutic conversation analysis as a growing field of research [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. Computational learning
techniques have been leveraged to extract useful information from human interactions
through the identification and exploration of unusual patterns [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ].
      </p>
      <p>Therapeutic conversation methods such as Cognitive Behavior Therapy (CBT)
refer to a range of therapies that can help treat mental health problems, emotional
challenges, and some psychiatric disorders by changing the patient’s ways of
thinking and behaving. Accordingly, these therapeutic methods create a new way of looking at
severe psychological issues, helping patients move towards a solution and gain a
better understanding of themselves. The treatment is usually a face-to-face conversation
in which the counselor interacts directly with the patient to understand his or her feelings, e.g.
confidence, anxiety, or depression, as well as their causes. During the
conversation, counselor and patient create a sequence of spoken sentences, each characterized
by a certain topic, giving in this way a thematic structure to the whole therapeutic
conversation. By adopting a set of techniques and conversational strategies coming from
clinical practice, the counselor aims at solving the behavioral and psychological problems
of the patient by properly reacting to the patient and redirecting the conversation
towards certain themes.</p>
      <p>
        As a result, investigating and modeling the human-to-human dialogues in this kind
of context may serve as a guide for the development of AI-based dialoguing systems
able, e.g., to recommend the most appropriate therapeutic strategy for the
practitioner to adopt with a new patient [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. In this context, topic detection and tracking (TDT) has
been the subject of intensive study since the beginning of natural language processing
(NLP) [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] and artificial intelligence (AI) research. One aim of TDT is to identify the
appearance of new topics and to follow their reappearance and evolution [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        In this paper we investigate patient-therapist interactions by modeling the
propagation of the topics addressed during a given therapeutic conversation by using Partially
Labeled Latent Dirichlet Allocation (PLDA) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ]. Traditional Latent Dirichlet
Allocation (LDA) is one of the most popular topic models in the literature and is based on a
bag-of-words approach. Since the data we deal with are partially labeled, PLDA is of
interest, because it is likely to produce better results than classic LDA. PLDA is often
very useful for applications with a human in the loop, since the induced classes correspond
well with human categorization, as represented by the provided label space.
      </p>
      <p>The study has been conducted on a dataset of 1729 real-life transcribed
psychotherapeutic conversations, each made of a number of talk-turns, which we will describe
further in Section 4. In our method we first identify the most common topics used
within the psychology corpus. The PLDA model takes as input the given conversations
and detects significant words for each topic. Secondly, the trained PLDA model is able
to determine the potential topic addressed in each talk-turn. The flow of talk-turns is then
transformed into a sequence of potential topics within each conversation. Finally, the
semi-supervised PLDA topic model is evaluated by computing its coherence over the
most significant words for each topic. Our final aim is to find the quintessential
patterns in therapeutic conversations and to understand the topic switches according to the
adopted dialogue strategy and topics propagation dynamics. In our method we
distinguish the topic changes driven by the counselor and the ones prompted by the patient,
and two topic transition matrices are constructed accordingly. These matrices are able to
characterize the conversation and to provide important hints towards the understanding
of the topics propagation dynamics.</p>
      <p>The remainder of the paper is organized as follows. In Section 2 we discuss the
related work in the literature on automatic topic detection methods and therapeutic
dialogue analysis, which helps establish the basis for the present work. In Section 3 we
describe how topic modeling algorithms work, specifying their main characteristics.
Afterwards, we describe the data used for the experiments and the preprocessing
steps followed in Section 4. Section 5 presents our approach to TDT by illustrating the developed
methodology. Lastly, Section 6 discusses how we evaluate our model, while Section 7
ends the paper with conclusions and directions for future research.</p>
    </sec>
    <sec id="sec-2">
      <title>Related Work</title>
      <p>
        Current trends in therapeutic conversations research focus on the digitalization of
spoken interactions and on the recommendation of the most appropriate treatments,
employing computational techniques such as NLP which offer the potential to extract
knowledge from consultation transcripts. Authors in [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] combined robust
communication theory used in healthcare with a visual text analytic technique called
Discursis (http://www.discursis.com/) to analyze conversational behavior in consultations. Discursis is a visual text
analytic tool for analyzing human communication that automatically builds an
inherent language model from a given transcribed conversation, mines its conceptual content
for each talk-turn, and creates a visual brief. The resulting report can identify
communication patterns present during the discussion, including the degree of engagement
between interlocutors, to help understand the conversation dynamics. In medical
consultations, the classification of conversations suffers from critical weaknesses, including
demanding performance requirements, time-consuming workflows, and non-standardized annotation
systems. To overcome these shortcomings, authors in [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] built an automated
annotating system employing a Labeled LDA [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] model to learn the relationships between
a transcribed conversation and its associated annotations. Those annotations refer to
the subjects and patient symptoms discussed during the therapeutic conversations. The
resulting system automatically identifies and assigns those annotations to individual
talk-turns within a given conversation. Differently, contributors in [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] examined the use of
an LDA [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] topic model as an automatic annotation tool to explore topics and to predict
the therapy outcomes of the conversation. The authors assumed that the automated
detection of topics does not aim at predicting symptoms, but that it can be used to predict
some essential factors such as patient satisfaction and ratings of therapy quality. The
findings from both approaches show that identification and tracking of topics can
give clinicians useful information, helping them better identify
patients who may be at risk of dropping out of treatment. Analyzing human communication,
the authors in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] converted transcribed conversations to time series by developing a
discourse visualization system, a text analysis model and a set of quantitative metrics
to identify significant features. The system was able to understand the topics used by
specific participants, and to generate reports within a single conversation. The method
can be used to observe the structure and patterns of interactions and to reconstruct the
dynamics of the communication, including the consistency levels of topics discussed
between participants and the timing of topic changes. Contributors in [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] proposed a
conceptual dynamic latent Dirichlet allocation (CDLDA) model for TDT in
conversational text content. Unlike the traditional LDA model, which detects topics only
through a bag-of-words technique, CDLDA considers essential information including
speech acts, semantic concepts, and hypernym definitions in E-HowNet (http://ckip.iis.sinica.edu.tw/taxonomy) [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. It
essentially extracts the dependencies between speech acts and topics, where hypernym
information makes the topic structure more complete and extends the richness of the original
words. Experimental results revealed that the proposed approach outperforms the
conventional Dynamic Topic Models [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], LDA, and support vector machine models,
achieving excellent performance for TDT in conversations. Authors in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] presented
OntoLDA for the task of topic labeling, utilizing an ontology-based topic model along
with a graph-based topic labeling method (i.e., a topic labeling method based on the
ontological meaning of the concepts included in the discovered topics). The model
represents each topic as a multinomial distribution over concepts, and each concept as
a distribution over words. Compared to classical LDA, the model improved
the topic coherence score by combining ontological concepts with probabilistic topic
models in a unified framework applied to different kinds of text collections.
Contributors in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] presented an approach to improve human-agent dialogues using
automatic identification and tracking of dialogue topics, exploiting the contextual
knowledge provided by the Wikipedia category system. The approach works by
mapping the different utterances to Wikipedia articles and by treating their relevant
Wikipedia categories as likely topics. As a result, the detection method was able to
recognize a topic without a priori knowledge of the subject category it belongs to.
      </p>
    </sec>
    <sec id="sec-3">
      <title>Topic Modeling</title>
      <p>Topic models are a family of probabilistic approaches that aim at discovering latent
semantic structures in large collections of documents. Based on the assumption that meanings are
relational, they interpret topics or themes within a set of documents as
probability distributions over words. As a result, a document is viewed
as a combination of topics, while a topic is viewed as a blend of words.</p>
      <p>
        One of the most widely used statistical language models for this purpose is Latent
Dirichlet Allocation (LDA), introduced by Blei et al. [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. LDA is a generative approach:
it assumes that documents in a given corpus are generated by repeatedly picking a
topic, and then a word from that topic according to the distribution of all observed words in
the corpus given that topic. LDA aims at learning these distributions and inferring the
(hidden) topics given the (observed) words of the documents [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. Given the nature of
our data, which include partial annotations, we employ the following two variants of
LDA.
      </p>
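      <p>Before turning to the variants, the generative view just described can be inverted in practice with collapsed Gibbs sampling. The following toy sketch (plain Python; the corpus, topic count, and hyperparameter values are our own illustrative assumptions, not this paper’s experimental setup) infers topic assignments by repeatedly resampling each token’s topic from its full conditional distribution:</p>
      <preformat>
```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics, alpha=0.01, beta=0.01, iters=150, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab = {w for d in docs for w in d}
    V = len(vocab)
    ndk = [[0] * n_topics for _ in docs]               # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                                # topic totals
    z = []                                             # per-token topic assignments
    for d, doc in enumerate(docs):                     # random initialization
        zd = []
        for w in doc:
            k = rng.randrange(n_topics)
            zd.append(k)
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
        z.append(zd)
    for _ in range(iters):                             # resample every token's topic
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                            # remove current assignment
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # full conditional: p(z=k) proportional to (ndk+alpha)(nkw+beta)/(nk+V*beta)
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + V * beta)
                           for t in range(n_topics)]
                r = rng.random() * sum(weights)
                k, acc = 0, weights[0]
                while r > acc:
                    k += 1; acc += weights[k]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return ndk, nkw

docs = [["mom", "dad", "family"], ["drink", "smoke", "drug"],
        ["mom", "family", "child"], ["smoke", "drink", "alcohol"]]
ndk, nkw = lda_gibbs(docs, n_topics=2)
```
      </preformat>
      <p>Each document ends up with a topic-count vector, a proxy for its topic mixture; with the low symmetric priors used above, thematically similar documents tend to concentrate their counts on the same topic.</p>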
      <p>
        Labeled Latent Dirichlet Allocation (LLDA) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ] is a supervised version of LDA
that constrains it by defining a one-to-one correspondence between topics and
human-provided labels. This allows LLDA to learn word-label correspondences.
Figure 1 illustrates the probabilistic graphical model of LLDA.
      </p>
      <p>
        Partially Labeled Latent Dirichlet Allocation (PLDA) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] is a semi-supervised version
of LDA which extends it with constraints that align some learned topics with
human-provided labels. The model exploits the unsupervised learning of topic models to explore
the latent themes within each label, as well as unlabeled themes in the large collection
of data. As illustrated in Figure 2, PLDA assumes that a document’s words are drawn
from a document-specific mixture of latent topics, where each topic is represented as
a distribution over words, and each document can use only those topics that are in a
topic class associated with one or more of the document’s labels. This approach enables
PLDA to detect extensive patterns in language usage correlated with each label.
      </p>
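      <p>The topic-class constraint can be made concrete with a small sketch (the label map, topic counts, and helper below are hypothetical illustrations, not taken from our experiments): each label owns a block of topics, and every document may additionally use a set of shared background (unlabeled) topics.</p>
      <preformat>
```python
def allowed_topics(doc_labels, topics_per_label, labels, n_background):
    """Return the topic ids a document may use under PLDA's topic-class constraint."""
    allowed = []
    for lab in doc_labels:
        # each label owns a contiguous block of `topics_per_label` topics
        start = labels.index(lab) * topics_per_label
        allowed.extend(range(start, start + topics_per_label))
    n_labeled = len(labels) * topics_per_label
    # background (unlabeled) topics are shared by every document
    allowed.extend(range(n_labeled, n_labeled + n_background))
    return allowed

labels = ["Parenting", "Addiction", "Depression"]
print(allowed_topics(["Parenting", "Depression"], 2, labels, n_background=1))
# With 2 topics per label and 1 background topic: [0, 1, 4, 5, 6]
```
      </preformat>
      <p>During inference, a document’s tokens are only ever assigned topics from this allowed set, which is what ties the learned word distributions back to the human label space.</p>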
    </sec>
    <sec id="sec-4">
      <title>Experimental dataset</title>
      <p>For our experiments, we use a dataset consisting of a collection of psychotherapeutic
transcripts available for research. The conversations have been transcribed and collected
according to the guidelines of the American Psychological Association (APA, http://www.apa.org). An
approval to use the collection was granted by an Internal Committee of Biomedical
Experiments (ICBE) of Philips after a review of the agreements, the consent procedures,
and the data handling plan by legal and privacy experts.</p>
      <sec id="sec-4-1">
        <sec id="sec-4-1-1">
          <title>Data description</title>
          <p>Our dataset consists of 1729 transcripts of one-to-one conversations with a total of 340,455
talk-turns, 75,732 unique terms, and more than 9 million words. Each transcript has
on average 200 talk-turns and eight words per talk-turn. The transcripts are also extended with
meta-data consisting of the corresponding school of psychotherapy, counselor-patient
information such as gender, age range, and sexual orientation, as well as a table of topics
discussed during the therapeutic conversation. The table of topics contains two different
kinds of information:
– Subjects: they are organized into three hierarchical levels. The top level is the most
general, whereas the other two are more precise. For example, the word Family
could correspond to a top-level topic, while Family violence and Child abuse would
be associated with the second and third levels respectively. Up to 575 subjects have
been used across the three levels in total.
– Symptoms: there are 79 symptoms (e.g. Depression, Anger, Fear) as defined in the
DSM-IV manual (https://dsm.psychiatryonline.org).</p>
        </sec>
        <sec id="sec-4-1-2">
          <title>Preprocessing of the meta-data</title>
          <p>Given the high number of items in the table of topics, we applied the following steps
to merge similar topics and reduce the number of subjects and symptoms to 18 and 16
respectively:
1. Eliminate all the subjects and symptoms that occur in less than 3% of the dataset;
2. Group together all the subjects belonging to the same Wikipedia category (https://en.wikipedia.org/wiki/Category:Main_topic_classifications), regardless
of their position in the given hierarchical structure;
3. Assign a label to the new subject according to the psychology topics table from the APA.
For example, Parent-child relationship and Family are mapped to a new subject from
the APA known as Parenting;
4. Reduce the number of symptoms by using the DSM-IV manual with the expert
support of a counselor. In particular, we group highly correlated symptoms into a
representative one. For example, Sadness and Hopelessness are merged into the
symptom Depression.</p>
          <p>The final set is depicted in Figure 3.</p>
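          <p>The reduction steps above can be sketched as follows (a simplified illustration with a hypothetical mapping table; the real mapping was derived from the APA topics table and the DSM-IV with expert support):</p>
          <preformat>
```python
from collections import Counter

def merge_topics(annotations, apa_map, min_frac=0.03):
    """Reduce a per-conversation topic table: drop rare items, remap to coarse labels.
    `annotations` is a list of topic-label lists, one per conversation;
    `apa_map` (hypothetical) sends fine-grained subjects to APA-style labels."""
    n = len(annotations)
    counts = Counter(t for conv in annotations for t in set(conv))
    # step 1: keep only items occurring in at least min_frac of conversations
    keep = {t for t, c in counts.items() if c / n >= min_frac}
    # steps 2-4: remap each surviving item to its merged representative label
    return [sorted({apa_map.get(t, t) for t in conv if t in keep})
            for conv in annotations]

apa_map = {"Parent-child relationship": "Parenting", "Family": "Parenting",
           "Sadness": "Depression", "Hopelessness": "Depression"}
convs = [["Family", "Sadness"], ["Parent-child relationship"], ["Family"]]
print(merge_topics(convs, apa_map, min_frac=0.3))
```
          </preformat>
          <p>Applying such a mapping conversation-wide is what shrinks the original 575 subjects and 79 symptoms to the 18 and 16 merged items used in the rest of the paper.</p>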
        </sec>
        <sec id="sec-4-1-3">
          <title>Preprocessing the conversation text</title>
          <p>
            Using the NLTK platform (http://www.nltk.org/), we applied a number of NLP pre-processing steps to the
dataset [
            <xref ref-type="bibr" rid="ref21">21</xref>
            ]. The resulting corpus consists of 2,849,457 tokens (14,274 unique ones)
and a total of 268,478 talk-turns. The performed steps include:
– tokenization, which transforms a text into a sequence of tokens;
          </p>
        </sec>
      </sec>
      <sec id="sec-4-4">
        <p>
          – removal of all punctuation, stop words, numbers, words that appear frequently
in the text but carry little content (e.g., ”mm-hmm”), and words that occur
in less than five documents, keeping only nouns, verbs, and adjectives. This was
achieved by using a unigram part-of-speech tagger [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ] to identify the types of words
in each talk-turn;
– removal of the 100 most common words, of talk-turns with one word only,
and of words shorter than three characters.
        </p>
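        <p>A simplified, self-contained version of this pipeline (regular-expression tokenization instead of NLTK, with tiny illustrative stop-word and filler lists; the part-of-speech and document-frequency filters are omitted) looks as follows:</p>
        <preformat>
```python
import re

STOPWORDS = {"the", "a", "and", "to", "i", "you", "it"}   # tiny illustrative list
FILLERS = {"mm-hmm", "uh", "um"}                          # low-content tokens

def preprocess(talk_turn):
    """Simplified version of the pipeline: tokenize, drop punctuation and numbers,
    stop words, fillers, and words shorter than three characters.
    (POS filtering and document-frequency cuts from the paper are omitted.)"""
    tokens = re.findall(r"[a-z-]+", talk_turn.lower())
    return [t for t in tokens
            if t not in STOPWORDS and t not in FILLERS and len(t) >= 3]

print(preprocess("Mm-hmm, I was... so angry at my mom, you know?"))
```
        </preformat>
        <p>Each cleaned token list then serves as one document for the topic model described in the next section.</p>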
      </sec>
    </sec>
    <sec id="sec-5">
      <title>The proposed approach for TDT</title>
      <p>The proposed automatic TDT method has three phases: the detection of topics, the
assignment of topic labels to talk-turns, and finally the tracking of the propagation of topics over
the conversation.</p>
      <sec id="sec-5-1">
        <title>Detection of Topics</title>
        <p>
          The topic detection was performed using a PLDA implementation based on the Stanford
Topic Modeling Toolbox (TMT, https://nlp.stanford.edu/software/tmt/tmt-0.4/). The model requires the definition of parameters such
as the number of hidden topics to discover, the hyperparameters α and β (see Figure 2),
and a vast amount of short text as input for training purposes. To enlarge the number of
training documents, we defined each talk-turn as a document, and we associated each document
(talk-turn) with the corresponding topics from the table of topics of the corresponding
transcript, because PLDA is in general useful only when each document has more than
one label associated with it. As a result, we obtained a broader set of documents with
higher word co-occurrences. More in detail, we needed to specify how many new topics
(different from those in the table of topics) the model had to discover. Experimentally this
number was set to 20. As a further input, we fed the PLDA with the list of 34 topics, and
the values of α and β were set to 0.01. In total, 268,478 talk-turns were obtained after
the preprocessing step. Moreover, we used the CVB0 algorithm [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] where the overall
number of iterations was set to 150. After training the model we obtained a list of topics and
the associated learned words, as shown in Table 1. The method also provided a
per-document topic distribution for each talk-turn. An example is illustrated in Figure 4,
where the five topics with the highest likelihood in a conversation are shown (i.e. Stress
and Job; Suicide and Death; Sexuality; Depression; Fear), each with the corresponding
talk-turns.
        </p>
      </sec>
      <sec id="sec-5-2">
        <title>Assignment of topics</title>
        <p>The PLDA returns the topics and the associated terms for each talk-turn (document).
Based on the terms of each document, we can then determine how likely each
document is to be associated with a topic. Table 1 shows a compilation of the terms learned by
the trained PLDA topic model. There we list the top ten terms for each topic. In the first
column, one can see a discovered topic and its associated words, whereas the second
and third columns indicate two of the 34 topics already known and their related words.
The terms in the same topic tend to be similar, particularly for subjects and
symptoms. For example, Parenting contains the members of the family, such as mom, mother,
dad, etc. Moreover, Addiction includes terms close to alcohol and drugs (drinking,
smoke, etc.). On the other hand, the discovered topics have their own inherent
interpretations and hold words that are not covered by the annotations in the psychotherapy
corpus. For example, Topic-5 includes similar terms, but their meaning (work) is far
from any annotation in the APA. For this reason, for the tracking task, we only used the
34 elements present in our table of topics.</p>
        <sec id="sec-5-2-1">
          <p>In addition to the 54 topics (34 known and 20 discovered by PLDA) and their
relevant terms, we aimed to know the likelihood of each topic in each talk-turn. As such,
another output of the PLDA topic model consisted of the partition of the documents
into a set of 54 topic proportions, called per-document topic distributions, where each
talk-turn was represented as a combination of topics with different proportions. As
already explained, Figure 4 shows an example of the potential ongoing topics in
each talk-turn within a therapeutic conversation from the dataset.
The goal was to understand how the known topics propagate, are localized, and change
for each talker in conversations. We already used PLDA to identify a potential topic
for each talk-turn from the table of topics, converting the face-to-face conversation into a
sequence of topics for each speaker. For the topic tracking task we added a new topic,
annotated as Meaningless talk, which we associated with talk-turns that provide poor
semantic content or language, or non-verbal communication (e.g. ”Yahh!!, Mm-hmm”).
We built two topic transition matrices (TTMs) to understand how the topics change
from one talk-turn to the next. Topic changes can be seen as a dynamic mechanism
that frequently occurs inside a conversation, where speakers move from one topic to
another, and a change can be initiated by either speaker. We recognize three kinds of changes:
1. The counselor keeps talking about the same topic to the patient from the previous
talk-turn, and vice versa;
2. The counselor moves to a new topic after the talk-turn of the patient;
3. The patient moves to a new topic after the talk-turn of the counselor.</p>
          <p>More in detail, we constructed a patient-to-counselor TTM CPk that describes all the
topic changes within conversation k. In particular, CPk[i, j] is the number of times
that topic i changes into topic j in conversation k. We merged the CPk matrices together
by summing up corresponding elements, obtaining our final matrix CP. Similarly, we
built a counselor-to-patient matrix PC by using the topic changes defined earlier with
counselor and patient switched. The difference between the two matrices is illustrated
in Figure 5, which shows the engagement patterns between counselors and patients,
providing a new way to describe one-to-one conversations. There are three possibilities
depending on whether the resulting value is lower than -10 (black), between -10 and
+10 (gray), or greater than +10 (white). The diagonal of the matrix in Figure 5 gives an
idea of the first type of topic change, which corresponds to the “resistance level”
on the same topic; it has 17 gray values, which supports the observation that speakers tend to
continue talking on the same topic. It also has twelve black values and six white values.
The former means that the counselor switches topics twice as often as the patient does. A
possible explanation is that the counselor aims at searching for other correlated symptoms
or subjects that could point to a mental disorder. The other values of the matrix
describe the second and third types of topic changes; the numbers of white and black values
are approximately equal, which means that the conversations, in general, proceed
without a perceivable tactic. Nevertheless, some rows and columns are mostly negative
or positive, like Parenting, and indicate the use of some strategies. The counselor
often switches the topic if the previous one was Mania, Medication or Patient-Counselor
Relations. Instead, he/she frequently starts a new topic if the patient’s talk contains
little semantic content (Meaningless Talk). Conversely, the patient often switches topics
if the previous topic was related to Parenting, Friendship, Sexual dysfunction, Crying,
or Stress-and-Work. In conclusion, the TTMs lead to a clear understanding of how and
when topics change, giving important insights to the counselor for CBT.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Evaluation</title>
      <p>
        The evaluation of the performance of a topic model is not an easy task. In most cases,
topics need to be manually evaluated by humans, who may express different opinions
and annotations. The most common quantitative way to assess a probabilistic model
is to measure the log-likelihood of a held-out test set, i.e., its perplexity. However,
the authors in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] have shown that, surprisingly, perplexity and human judgment are
often not correlated, and models with low perplexity may infer less semantically meaningful topics. A potential
solution to this problem is provided by topic coherence, which is a typical way to
qualitatively assess topic models by examining the most likely words in each topic. For
such a purpose, we employed Palmetto (http://aksw.org/Projects/Palmetto.html), a tool to compute the topic coherence of a given
word set with six different methods. The one that we selected for our purposes was
the C_V method [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], which uses word co-occurrences from the English Wikipedia
and has been proven to correlate highly with human ratings. C_V is based on
a one-set segmentation of the top words and a measure that uses normalized pointwise
mutual information. The one-set segmentation computes the cosine similarity between
each top-word vector and the sum of all top-word vectors. The coherence value is
then the arithmetic average of these similarities and represents an intuitive measure of
the goodness of the topics produced by PLDA. In this work, we evaluated our PLDA
topic model for topic detection using C_V coherence. In particular, we gave the top five
terms (according to the weight of PLDA shown in Table 1) for each of the 34 topics as
input, obtaining as output a satisfactory coherence amongst all the detected topics.
Indeed, on average a topic coherence value larger than 50% was obtained, which is
already recognized in the research community as an acceptable coherence score for
a TDT model. This further substantiates the validity and potential of our method.
      </p>
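      <p>The essence of this coherence computation can be sketched as follows (a simplified stand-in for C_V: boolean document co-occurrence on a toy corpus instead of Palmetto’s sliding windows over Wikipedia):</p>
      <preformat>
```python
import math

def npmi(w1, w2, docs, eps=1e-12):
    """Normalized pointwise mutual information of two words over boolean
    document co-occurrence (a simplification of Palmetto's sliding windows)."""
    n = len(docs)
    p1 = sum(w1 in d for d in docs) / n
    p2 = sum(w2 in d for d in docs) / n
    p12 = sum(w1 in d and w2 in d for d in docs) / n
    return math.log((p12 + eps) / (p1 * p2 + eps)) / -math.log(p12 + eps)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def cv_like_coherence(top_words, docs):
    """One-set segmentation: compare each word's NPMI vector against the
    vector sum over all top words, then average the cosine similarities."""
    vecs = [[npmi(w, w2, docs) for w2 in top_words] for w in top_words]
    total = [sum(col) for col in zip(*vecs)]
    return sum(cosine(v, total) for v in vecs) / len(vecs)

docs = [{"mom", "dad", "family"}, {"mom", "family"}, {"drug", "smoke"}, {"dad", "family"}]
score = cv_like_coherence(["mom", "dad", "family"], docs)
```
      </preformat>
      <p>A set of top words that tend to co-occur yields similar NPMI vectors and thus a score close to 1, while unrelated words pull the average down.</p>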
    </sec>
    <sec id="sec-7">
      <title>Conclusions</title>
      <p>Constituting a crucial aspect of the analysis and modeling of counselor-patient
conversations, the automatic TDT task in a psychotherapeutic conversation poses a significant
challenge. In this paper we implemented a topic detector which efficiently characterizes
therapeutic discussions in consultations. We exploited PLDA together with state-of-the-art NLP
techniques and topic coherence evaluation systems. Furthermore, we computed TTMs
to capture the dynamics of each ongoing topic in the conversations, understanding how
each interlocutor influences the dialogue and when he/she prefers to switch
topics. Knowing how topics change and propagate can be used by counselors
to drive the discussion and to detect a patient’s emotional state during the therapeutic
conversation. These aspects of interaction are critical for all mental health specialists, as
they are involved in patients’ health concerns. We conclude that PLDA and TTMs may
be of benefit to the therapeutic conversational speech analysis community, having
high potential for real-life applications in psychotherapy.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Allahyari</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Kochut</surname>
          </string-name>
          .
          <article-title>Automatic Topic Labeling Using Ontology-Based Topic Models</article-title>
          .
          <source>In 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA)</source>
          , pages
          <fpage>259</fpage>
          -
          <lpage>264</lpage>
          ,
          <year>December 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D.</given-names>
            <surname>Angus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. E.</given-names>
            <surname>Smith</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Wiles</surname>
          </string-name>
          .
          <article-title>Human Communication as Coupled Time Series: Quantifying Multi-Participant Recurrence</article-title>
          .
          <source>IEEE Transactions on Audio, Speech, and Language Processing</source>
          ,
          <volume>20</volume>
          (
          <issue>6</issue>
          ):
          <fpage>1795</fpage>
          -
          <lpage>1807</lpage>
          ,
          <year>August 2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Angus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Watson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Gallois</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Wiles</surname>
          </string-name>
          .
          <article-title>Visualising conversation structure across time: insights into effective doctor-patient consultations</article-title>
          .
          <source>PloS One</source>
          ,
          <volume>7</volume>
          (
          <issue>6</issue>
          ):e38014,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Asuncion</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Welling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Smyth</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.W.</given-names>
            <surname>Teh</surname>
          </string-name>
          .
          <article-title>On Smoothing and Inference for Topic Models</article-title>
          .
          <source>UAI '09</source>
          , pages
          <fpage>27</fpage>
          -
          <lpage>34</lpage>
          , Arlington, Virginia, United States,
          <year>2009</year>
          . AUAI Press.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bangalore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Di Fabbrizio</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Stent</surname>
          </string-name>
          .
          <article-title>Learning the Structure of Task-Driven Human-Human Dialogs</article-title>
          .
          <source>IEEE Transactions on Audio, Speech, and Language Processing</source>
          ,
          <volume>16</volume>
          (
          <issue>7</issue>
          ):
          <fpage>1249</fpage>
          -
          <lpage>1259</lpage>
          ,
          <year>September 2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>D.M.</given-names>
            <surname>Blei</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.D.</given-names>
            <surname>Lafferty</surname>
          </string-name>
          .
          <article-title>Dynamic Topic Models</article-title>
          .
          <source>ICML '06</source>
          , pages
          <fpage>113</fpage>
          -
          <lpage>120</lpage>
          , New York, NY, USA,
          <year>2006</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>D.M.</given-names>
            <surname>Blei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.Y.</given-names>
            <surname>Ng</surname>
          </string-name>
          , and
          <string-name>
            <given-names>M.I.</given-names>
            <surname>Jordan</surname>
          </string-name>
          .
          <article-title>Latent Dirichlet Allocation</article-title>
          .
          <source>J. Mach. Learn. Res.</source>
          ,
          <volume>3</volume>
          :
          <fpage>993</fpage>
          -
          <lpage>1022</lpage>
          ,
          <year>March 2003</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Breuing</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Wachsmuth</surname>
          </string-name>
          .
          <article-title>Talking topically to artificial dialog partners: Emulating humanlike topic awareness in a virtual agent</article-title>
          . volume
          <volume>358</volume>
          .
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>J.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Boyd-Graber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gerrish</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          , and
          <string-name>
            <given-names>D.M.</given-names>
            <surname>Blei</surname>
          </string-name>
          .
          <article-title>Reading Tea Leaves: How Humans Interpret Topic Models</article-title>
          .
          <source>NIPS'09</source>
          , pages
          <fpage>288</fpage>
          -
          <lpage>296</lpage>
          , USA,
          <year>2009</year>
          . Curran Associates Inc.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>W.T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.C.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.L.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.S.</given-names>
            <surname>Chung</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.J.</given-names>
            <surname>Chen</surname>
          </string-name>
          .
          <article-title>E-HowNet and Automatic Construction of a Lexical Ontology</article-title>
          .
          <source>COLING '10</source>
          , pages
          <fpage>45</fpage>
          -
          <lpage>48</lpage>
          , Stroudsburg, PA, USA,
          <year>2010</year>
          . Association for Computational Linguistics.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chen</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Liu</surname>
          </string-name>
          .
          <article-title>Development and research of Topic Detection and Tracking</article-title>
          .
          <source>In 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS)</source>
          , pages
          <fpage>170</fpage>
          -
          <lpage>173</lpage>
          ,
          <year>August 2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>P.</given-names>
            <surname>Drew</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chatwin</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Collins</surname>
          </string-name>
          .
          <article-title>Conversation analysis: a method for research into interactions between patients and health-care professionals</article-title>
          .
          <source>Health Expectations: An International Journal of Public Participation in Health Care and Health Policy</source>
          ,
          <volume>4</volume>
          (
          <issue>1</issue>
          ):
          <fpage>58</fpage>
          -
          <lpage>70</lpage>
          ,
          <year>March 2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>G.</given-names>
            <surname>Gaut</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Steyvers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z. E.</given-names>
            <surname>Imel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. C.</given-names>
            <surname>Atkins</surname>
          </string-name>
          , and
          <string-name>
            <given-names>P.</given-names>
            <surname>Smyth</surname>
          </string-name>
          .
          <article-title>Content Coding of Psychotherapy Transcripts Using Labeled Topic Models</article-title>
          .
          <source>IEEE Journal of Biomedical and Health Informatics</source>
          ,
          <volume>21</volume>
          (
          <issue>2</issue>
          ):
          <fpage>476</fpage>
          -
          <lpage>487</lpage>
          ,
          <year>March 2017</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Gelbukh</surname>
          </string-name>
          .
          <article-title>Natural language processing</article-title>
          .
          <source>In Fifth International Conference on Hybrid Intelligent Systems (HIS'05)</source>
          , 1 pp.,
          <year>November 2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Howes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Purver</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>McCabe</surname>
          </string-name>
          .
          <article-title>Investigating Topic Modelling for Therapy Dialogue Analysis</article-title>
          .
          March
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.W.</given-names>
            <surname>Mohr</surname>
          </string-name>
          and
          <string-name>
            <given-names>P.</given-names>
            <surname>Bogdanov</surname>
          </string-name>
          .
          <article-title>Introduction-topic models: What they are and why they matter</article-title>
          .
          <source>Poetics</source>
          ,
          <volume>41</volume>
          (
          <issue>6</issue>
          ):
          <fpage>545</fpage>
          -
          <lpage>569</lpage>
          ,
          <year>2013</year>
          . Special issue: Topic Models and the Cultural Sciences.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>N.P.P.</given-names>
            <surname>Khin</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.N.</given-names>
            <surname>Aung</surname>
          </string-name>
          .
          <article-title>Analyzing Tagging Accuracy of Part-of-Speech Taggers</article-title>
          .
          <source>Advances in Intelligent Systems and Computing</source>
          , pages
          <fpage>347</fpage>
          -
          <lpage>354</lpage>
          . Springer, Cham,
          <year>August 2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ramage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Hall</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Nallapati</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.D.</given-names>
            <surname>Manning</surname>
          </string-name>
          .
          <article-title>Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-labeled Corpora</article-title>
          .
          <source>EMNLP '09</source>
          , pages
          <fpage>248</fpage>
          -
          <lpage>256</lpage>
          , Stroudsburg, PA, USA,
          <year>2009</year>
          .
          <article-title>Association for Computational Linguistics</article-title>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>D.</given-names>
            <surname>Ramage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.D.</given-names>
            <surname>Manning</surname>
          </string-name>
          , and
          <string-name>
            <given-names>S.</given-names>
            <surname>Dumais</surname>
          </string-name>
          .
          <article-title>Partially Labeled Topic Models for Interpretable Text Mining</article-title>
          .
          <source>KDD '11</source>
          , pages
          <fpage>457</fpage>
          -
          <lpage>465</lpage>
          , New York, NY, USA,
          <year>2011</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>M.</given-names>
            <surname>Röder</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Both</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Hinneburg</surname>
          </string-name>
          .
          <article-title>Exploring the Space of Topic Coherence Measures</article-title>
          .
          <source>WSDM '15</source>
          , pages
          <fpage>399</fpage>
          -
          <lpage>408</lpage>
          , New York, NY, USA,
          <year>2015</year>
          . ACM.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>A.K.</given-names>
            <surname>Uysal</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Gunal</surname>
          </string-name>
          .
          <article-title>The Impact of Preprocessing on Text Classification</article-title>
          .
          <source>Inf. Process. Manage.</source>
          ,
          <volume>50</volume>
          (
          <issue>1</issue>
          ):
          <fpage>104</fpage>
          -
          <lpage>112</lpage>
          ,
          <year>January 2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.F.</given-names>
            <surname>Yeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.S.</given-names>
            <surname>Tan</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.H.</given-names>
            <surname>Lee</surname>
          </string-name>
          .
          <article-title>Topic detection and tracking for conversational content by using conceptual dynamic latent Dirichlet allocation</article-title>
          .
          <source>Neurocomputing</source>
          ,
          <volume>216</volume>
          (
          <issue>Supplement C</issue>
          ):
          <fpage>310</fpage>
          -
          <lpage>318</lpage>
          ,
          <year>December 2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>