<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1016/j.aiopen.2023.08.012</article-id>
      <title-group>
        <article-title>Toward a Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Pavlos Constas</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vikram Rawal</string-name>
          <email>vikram.rawal@mail.utoronto.ca</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Matthew Honorio Oliveira</string-name>
          <email>matthewhonorio.oliveira@mail.utoronto.ca</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Andreas Constas</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Aditya Khan</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Kaison Cheung</string-name>
          <email>siukai.cheung@mail.utoronto.ca</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Najma Sultani</string-name>
          <email>najma.sultani@mail.utoronto.ca</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Carrie Chen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Micol Altomare</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Akzam</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jiacheng Chen</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vhea He</string-name>
          <email>vhea.he@mail.utoronto.ca</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lauren Altomare</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Heraa Muqri</string-name>
          <email>heraa.muqri@mail.utoronto.ca</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Asad Khan</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nimit Amikumar Bhanshali</string-name>
          <email>nimit.bhanshali@mail.utoronto.ca</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Youssef Rachad</string-name>
          <email>youssef.rachad@mail.utoronto.ca</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Guerzhoy</string-name>
          <email>guerzhoy@cs.toronto.edu</email>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>1</volume>
      <fpage>517</fpage>
      <lpage>520</lpage>
      <abstract>
        <p>We propose a reinforcement learning (RL)-based system that would automatically prescribe a hypothetical patient medication that may help the patient with their mental health-related speech disfluency, and adjust the medication and the dosages in response to zero-cost frequent measurement of the fluency of the patient. We demonstrate the components of the system: a module that detects and evaluates speech disfluency on a large dataset we built, and an RL algorithm that automatically finds good combinations of medications. To support the two modules, we collect data on the effect of psychiatric medications for speech disfluency from the literature, and build a plausible patient simulation system. We demonstrate that the RL system is, under some circumstances, able to converge to a good medication regime. We collect and label a dataset of people with possible speech disfluency and demonstrate our methods using that dataset. Our work is a proof of concept: we show that there is promise in the idea of using automatic data collection to address speech disfluency.</p>
      </abstract>
      <kwd-group>
        <kwd>disfluency</kwd>
        <kwd>ASR</kwd>
        <kwd>reinforcement learning</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Venue</title>
      <p>Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024, Vancouver, BC, Canada. CEUR Workshop Proceedings (ceur-ws.org).</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Speech disfluency is a common medical issue. It can be caused by, among other factors, conditions such as depression, anxiety, and insomnia (see Section 6). Speech disfluency includes stuttering as well as issues like pauses that are too long, repetitions, “false starts,” and “repairs” of previous utterances [
        <xref ref-type="bibr" rid="ref2 ref3">1</xref>
        ]. We propose a Reinforcement Learning-based system for helping physicians adjust medication to minimize speech disfluency. We develop a system for detecting how disfluent the person’s speech is, and a subsystem for minimizing the speech disfluency by finding a combination of medications that works using reinforcement learning.
      </p>
      <p>We train our disfluency detection system to predict the labels we assigned to clips in the dataset we collected. To demonstrate the feasibility of the RL subsystem, we construct a patient simulation. We measure the precision with which our speech disfluency detection subsystem can measure disfluency, and obtain from the literature the plausible timespans and onset times for the effects of medications. We then run a patient simulation with plausible parameters, and show that the RL algorithm can find strategies to minimize the speech disfluency of our plausibly-simulated patients.</p>
      <p>To evaluate our subsystems, we collect a dataset of public videos of people with possible disfluencies and label the dataset using a scalable strategy that allows us to obtain precise and standardized ratings by having each video be rated by multiple raters.</p>
      <p>The rest of the paper is organized as follows: we explain our data collection and labelling process. We then describe our disfluency rating process, and report results on that subsystem. We then describe our patient simulation process and report results of the RL system’s performance on the simulated patients. For the patient simulation to be plausible, we connect the patient simulation to how precisely speech can be rated for fluency by our system and to the plausible effects of medications. (Note that if the measurement of speech fluency is too noisy and/or the medications’ effect is too subtle or the onset is too long, learning would likely not be possible.) Finally, we summarize our literature search results for medications that could plausibly affect speech fluency.</p>
    </sec>
    <sec id="sec-2-1">
      <title>2. Data Collection</title>
      <p>The objective of our data collection process is to obtain a series of audio samples from individuals with possible mental health-related speech disfluency, across a period of time. We collected 19 channels from searching on YouTube for mental health-related vlog channels by YouTubers, as well as the D-vlog, a dataset of channels of YouTubers with depression [<xref ref-type="bibr" rid="ref4">2</xref>].</p>
      <p>For each YouTuber represented in the videos, we scraped their channel for other videos which contained significant stretches of unedited spoken audio. Terms used to query for videos from each channel were subsets of the following keywords: {“depression”, “story”, “vlog”, “depression vlog”, “anxiety”, “tested”, “figure”, “rambling”, “issues”, “anxiety vlog”, “webcam”}. For each video, only the audio was extracted. In total, we obtained 195 audio clips. There are 9 to 11 audio clips for each channel, with an average of 10 audio clips per channel.</p>
      <sec id="sec-2-1-1">
        <title>2.1. Methods</title>
        <p>The rating process consisted of two stages, each stage lasting approximately one week. In each stage, raters randomly received several channels and were asked to rate audio samples from the dataset. This was arranged so that each audio sample in the dataset would have up to 3 raters. Each rater received different channels in the different stages. To ensure independent evaluation, raters were advised against sharing their assessments with each other.</p>
        <p>Data from the initial stage was not used in our experiments. The round was used for acquainting raters with the variation of disfluency observed in the dataset. At the end of this phase, each rater was privately given summary statistics regarding their ratings in the round, including the mean and standard deviation of their ratings across audio samples, as well as a spreadsheet containing a measure of their bias for each audio sample (where bias is the distance of their rating from the mean rating across all raters for that audio sample). This process was aimed at allowing raters to recognize possible inconsistencies and biases in their “internal model” of disfluency. The ratings from the second stage were the finalized ratings that would be used for fine-tuning our disfluency-detection system. The ratings were standardized, as described below.</p>
      </sec>
      <sec id="sec-2-1-2">
        <title>2.2. Rating System</title>
        <p>We devised a rating system to assess the severity of the disfluency in the video data. The authors acted as raters for the videos. The 19 YouTuber channels in the dataset were examined for disfluencies in them. Raters were tasked with assessing the disfluency severity in each video on a scale of 1 to 7, which was adapted from the Stuttering Severity Instrument-Third Edition (SSI-3) [3]. (But note that “disfluency” is a more general term than “stuttering.”)</p>
      </sec>
    </sec>
    <sec id="sec-2-2">
      <title>3. Rater Performance Analysis</title>
      <p>In this Section, we analyze the rater data, and show that raters are somewhat consistent in their ratings of the same clips. This indicates that we can use the standardized ratings (see below) as targets when estimating the fluency of speakers in audio clips.</p>
      <sec id="sec-2-2-1">
        <title>3.1. Data Model</title>
        <p>To assess the performance of the raters, we conducted a regression analysis. The model we utilized was r_ij = b_i + d_j + ε_ij, where r_ij is the rating given to audio clip j by rater i, b_i is the rater bias, d_j is the true average disfluency of the channel, and ε_ij is the random error (see [4] for a similar model). This model is estimated using least-squares regression.</p>
        <p>Using this model, the performance of the raters was assessed by randomly splitting the dataset into a 70% training set, and a 30% validation set.</p>
      </sec>
      <sec id="sec-2-2-2">
        <title>3.2. Analysis</title>
        <p>Below, we perform an exploratory analysis of the non-standardized ratings.</p>
        <p>We compute the Root-Mean-Square Error (RMSE) on the training set and the validation set when predicting the disfluency scores using the data model. The RMSE on the training set was 0.8/6.0 (on a scale of 0 to 6 rather than 1 to 7 as in the input) and the RMSE on the validation set was 0.9/6.0. The validation RMSE we would obtain if the data model predicted the average rating every time would be 1.4/6.0. The R² value of the model on the training set was 0.44, indicating that the rater coefficients and the clip coefficients have explanatory power.</p>
        <p>For each clip in the validation set, we compute the standard deviation of the ratings assigned by different raters to the same clip. The median standard deviation is 0.6/6.0. This suggests that the median disagreement between raters was just over half a rating point on a given clip. The standard deviations are given on a scale of 6.0 since the scores range from 1 to 7.</p>
      </sec>
      <sec id="sec-2-1-3">
        <title>3.3. Standardized Ratings</title>
        <p>Different raters use different standards for fluency. We therefore obtained standardized ratings. We accomplish this by subtracting the rater bias b_i (see Section 3.1) for rater i from each rating r_ij by rater i. Then, when we compute the average standardized rating for every clip, we average ratings that are actually on the same scale.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4. Disfluency Pipeline</title>
      <p>
        In this section, we describe our subsystem for assessing the disfluency of a person in the input clip. We use Whisper [5] (https://github.com/openai/whisper) to transcribe the audio. We then use an Auto-Correlational Neural Network-based tagger [
        <xref ref-type="bibr" rid="ref1">6</xref>
        ] to tag the Whisper transcript. Finally, we fine-tune GPT-2 [7] on the tagged transcript as input in order to predict the disfluency scores we assigned.
      </p>
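      <p>The three-stage pipeline can be sketched end-to-end as follows (a minimal illustration: the stage bodies are hypothetical stand-ins for the actual Whisper, DT-ACNN, and GPT-2 components, and the heuristic scoring is ours, not the authors’ method):</p>

```python
def transcribe(audio_path):
    # stand-in for Whisper ASR; filler tokens such as "uh" and "um" are kept
    return "i uh i think that that is uh fine"

def tag_disfluencies(transcript):
    # stand-in for the DT-ACNN tagger: label each word fluent/disfluent
    fillers = {"uh", "um"}
    words = transcript.split()
    tags = []
    for i, w in enumerate(words):
        repeated = i > 0 and words[i - 1] == w  # crude repetition check
        tags.append("disfluent" if w in fillers or repeated else "fluent")
    return list(zip(words, tags))

def predict_disfluency_score(tagged):
    # stand-in for the fine-tuned GPT-2 regressor: here, just the
    # fraction of disfluent words scaled to the 0-6 rating range
    n_disfluent = sum(1 for _, t in tagged if t == "disfluent")
    return 6.0 * n_disfluent / len(tagged)

tagged = tag_disfluencies(transcribe("clip.wav"))
print(round(predict_disfluency_score(tagged), 2))  # → 2.0
```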
      <sec id="sec-3-1">
        <title>4.1. Transcribing Audio with Whisper</title>
        <p>The YouTube videos are transcribed using the Automatic Speech Recognition (ASR) model Whisper. Tokens such as “uh”, “um”, etc. were included in the transcript.</p>
      </sec>
      <sec id="sec-3-2">
        <title>4.2. Disfluency Tagging</title>
        <p>
          The parsed text transcripts were subsequently fed into a Disfluency Tagging Auto-Correlational Neural Network (DT-ACNN) [
          <xref ref-type="bibr" rid="ref1">6</xref>
          ], a system designed to categorize each word within the text transcript as either “fluent” or “disfluent”. In [
          <xref ref-type="bibr" rid="ref1">6</xref>
          ], the Switchboard corpus of conversational speech [8] was used. For the task of predicting a per-word “fluent” or “disfluent” label, the authors report a recall of 90.0%, a precision of 82.8%, and an F1 score of 86.2% on that dataset. The reported results indicate the effectiveness of the DT-ACNN model in disfluency detection.
        </p>
        <p>(Figure 1: learning curves for the disfluency prediction task; MSE vs. training epoch, epochs 0 to 50.)</p>
      </sec>
      <sec id="sec-3-3">
        <title>4.3. Fine-tuning GPT-2</title>
        <p>We fine-tune GPT-2 to predict the average standardized disfluency score assigned by the raters who rated the clip, from the tagged Whisper transcripts as well as from the words-per-minute (WPM) measure.</p>
        <p>For the regression task, we train with embedding size 768, using the Mean Squared Error (MSE) loss, and the AdamW optimizer with parameters β1 = 0.9, β2 = 0.999, ε = 10⁻⁹. The token limit of GPT-2 is 1024. Inputs that exceed this limit were truncated. The following hyperparameters were used during training: a learning rate of 4.5 × 10⁻⁴, a batch size of 4 (dictated by computational limitations), with weight decay parameter 0.01, for 50 epochs.</p>
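      <p>The training setup above can be summarized as a configuration mapping (a sketch assuming PyTorch-style AdamW parameter names; this is not the authors’ code):</p>

```python
# Hyperparameters for the GPT-2 regression fine-tuning described above.
config = {
    "embedding_size": 768,
    "loss": "MSE",
    "optimizer": "AdamW",
    "betas": (0.9, 0.999),   # beta_1, beta_2
    "eps": 1e-9,
    "learning_rate": 4.5e-4,
    "batch_size": 4,
    "weight_decay": 0.01,
    "epochs": 50,
    "max_tokens": 1024,      # GPT-2 context limit; longer inputs are truncated
}

def truncate(token_ids, limit=config["max_tokens"]):
    # drop tokens beyond the model's context limit
    return token_ids[:limit]

print(len(truncate(list(range(2000)))))  # → 1024
```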
        <p>P-tuning [9] was used. In this approach, a soft prompt
with a set of 100 tokens is introduced at the beginning
of the input. These tokens aid in guiding the model
during classification. The model uses a Prompt Encoder to
optimize the prompt, with an encoding layer comprised
of 128 units. The model performance was evaluated by
randomly splitting the dataset into an 80% training set,
and a 20% validation set.</p>
      </sec>
      <sec id="sec-3-4">
        <title>4.4. Results</title>
        <p>The learning curves for the disfluency prediction task are
in Fig. 1. We observe that our system is currently able to
predict the validation rating to within about 0.15/6 of the
actual rating on average (the standard error is obtained
by taking the square root of the MSE) for YouTubers not
in the training set.</p>
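        <p>The per-rating error quoted above is obtained as the square root of the validation MSE; for instance, with a hypothetical MSE of 0.0225 rating points squared:</p>

```python
import math

# the reported average prediction error is the square root of the MSE
mse = 0.0225           # hypothetical validation MSE, in rating points squared
rmse = math.sqrt(mse)  # average error on the 0-6 rating scale
print(round(rmse, 2))  # → 0.15
```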
      </sec>
    </sec>
    <sec id="sec-4">
      <title>5. Patient Simulation and Reinforcement Learning</title>
      <sec id="sec-5-1">
        <title>5.1. Overview</title>
        <p>In this Section, we explore the plausibility of using RL together with signals from our speech disfluency detector to find an effective medication regimen for people with speech disfluency.</p>
        <p>We first describe how we simulate people with speech disfluency in a plausible way. We then demonstrate that our RL algorithm could find an effective medication regimen in a plausible scenario.</p>
      </sec>
      <sec id="sec-5-2">
        <title>5.2. Prior Work: RL for Medication Adjustment</title>
        <p>Reinforcement Learning (RL) for medication adjustment has been proposed in several contexts. Oh et al. evaluate an offline RL algorithm learned on South Korea’s national health insurance system to prescribe diabetes medication [10]. Javad et al. similarly propose an offline RL algorithm [11]. They measure the performance of the system based on the concordance to the prescription actually made, as well as by analyzing outcomes where the system’s recommendation and the actual recommendation in the data disagreed. Sun et al. explored an approach for Type 2 Diabetes treatment [12]. They merged a knowledge-driven model, informed by clinical guidelines, with a data-driven deep reinforcement learning model. The knowledge-driven model uses data from the Singapore Health Services Diabetes Registry, which contains over 189,520 patients and their Type 2 Diabetes medication prescriptions, to narrow down a list of viable medications. The data-driven model then applies a Deep Q Network (DQN), which also learns from the historical patient data, to rank the candidate medications selected by the knowledge-driven model based on expected long-term rewards.</p>
        <p>Nemati et al. developed a clinician-in-the-loop framework for heparin dosing, leveraging data from the MIMIC-II intensive care unit database [13]. This study engaged an interactive agent in simulated dosing trials, learning from the outcomes to refine decision-making processes. Similarly, Anzabi Zadeh et al. utilized deep reinforcement learning in the context of warfarin dosing for patients with blood clotting issues, with an emphasis on individualized dosing due to warfarin’s narrow therapeutic range [14]. In this method, they frame the problem as a Markov decision process (MDP) and employ an agent within a Pharmacokinetic/Pharmacodynamic (PK/PD) model to simulate dose-responses of virtual patients, in which the agent learns the best dose-duration pair through experience replay.</p>
        <p>We model a medication administration environment, aiming to determine the most effective medication regime for people experiencing depression, anxiety, insomnia, and resulting speech fluency issues. The person’s health state evolves based on Hidden Markov Models (HMMs). Each health issue (depression, anxiety, insomnia) has its unique HMM, governing how the patient’s state progresses. The patient’s observed speech fluency is also influenced by these health states.</p>
        <p>The Medication object represents different types of medications, each with varying effects on the aforementioned health issues. These effects include beneficial impacts on the conditions and potential side effects. Medications have properties like dosage, half_life, and time_to_effect, which dictate how they function over time.</p>
        <p>We simulate people with disfluency by evolving the HMM state. The patient model has the following attributes:
• Depression, Anxiety, Insomnia Scores: These attributes represent the initial underlying conditions of the patient, each represented as an integer between 1 and 5. A higher number denotes a more severe state.
• Depression, Anxiety, Insomnia Hidden Markov Models: Models that represent how the severity of the patient’s depression, anxiety, and insomnia changes over time based on their initial states and also through interaction with medicine. These directly impact the observed speech fluency.
• Speech Fluency Score: Indicates the patient’s natural ability to speak fluently, modelled as a continuous value between 0 and 1.
• Medication Accumulation: A list that keeps track of all medications that are currently in the patient’s system.</p>
        <p>Alongside this, we also model an individual medication with the following attributes:
• Name: The name of the medication.
• Depression, Anxiety, Insomnia Effects: Captures the medication’s average effect and variability on each condition given the standard dosage.
• Dosage: The amount of medication administered relative to the standard dose (e.g., Dosage = 1.5 means 1.5x the standard dose). This attribute scales the effects of the medication on the patient.
• Time to Effect: The number of days it takes for the medication to start showing effects.
• Half-Life: The number of days it takes for the medication dosage in the patient’s system to reduce by half.</p>
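        <p>The dosage, half-life, and time-to-effect attributes above can be sketched as a toy model (a minimal illustration under our own naming and numbers, not the authors’ implementation):</p>

```python
class Medication:
    """Toy medication model: the dose decays exponentially per its half-life,
    and the medication contributes no effect before time_to_effect days."""

    def __init__(self, name, dosage, half_life_days, time_to_effect_days):
        self.name = name
        self.dosage = dosage                    # multiple of the standard dose
        self.half_life = half_life_days
        self.time_to_effect = time_to_effect_days

    def remaining_dose(self, days_since_administration):
        # exponential decay: dose * 0.5 ** (t / half_life)
        return self.dosage * 0.5 ** (days_since_administration / self.half_life)

    def is_active(self, days_since_administration):
        # the medication only shows effects after its onset time
        return days_since_administration >= self.time_to_effect

# hypothetical medication: standard dose, 2-day half-life, 14-day onset
med = Medication("med_A", dosage=1.0, half_life_days=2.0, time_to_effect_days=14)
print(round(med.remaining_dose(2.0), 3))  # one half-life later → 0.5
print(med.is_active(7))                    # before onset → False
```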
      </sec>
      <sec id="sec-5-4">
        <title>5.4. Hidden Markov Model (HMM)</title>
        <p>A Hidden Markov Model (HMM) is a statistical model that represents sequences of observable data as well as hidden states. The sequences of observable data are generated based on hidden states, which cannot be directly observed. Here, the “observable data” is the patient’s speech fluency score, while the “hidden states” are the underlying depression, anxiety, and insomnia conditions that affect the severity of the disfluency. We base this model off of the fact that a patient’s underlying levels of depression [15], insomnia [16], and anxiety [17] have an impact on their speech fluency.</p>
        <p>Each health condition — depression, anxiety, and insomnia — has its associated HMM. The key components of these HMMs are:
• The initial probability distribution over the initial state of the condition. Initialized as a uniform distribution, indicating that any severity level is equally likely at the start.
• The transition matrix, which defines the probability of transitioning from one state (severity level) to another in consecutive time steps. For instance, if a patient is currently at a severity level of 3 for depression, the transition matrix will dictate the probability of them improving to level 2, worsening to level 4, or remaining at level 3 in the next step.
• We use a Gaussian Hidden Markov Model, as the observable context is assumed to be generated from a Gaussian distribution. The means and covariances define these distributions for each hidden state.
• Each state has a mean context emission, set to be the same in the depression, anxiety, and insomnia states.</p>
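        <p>The components above can be sketched as a toy Gaussian HMM for one condition (a simplified three-state illustration with made-up transition and emission parameters, not the authors’ five-state model):</p>

```python
import random

# Toy 3-state Gaussian HMM for one condition (severity levels 1..3 for brevity).
STATES = [1, 2, 3]
# TRANSITION[s] gives P(next state | current state s); each row sums to 1
TRANSITION = {1: [0.80, 0.15, 0.05],
              2: [0.20, 0.60, 0.20],
              3: [0.05, 0.15, 0.80]}
# Gaussian emission parameters per hidden state: (mean fluency, std. dev.)
EMISSION = {1: (0.9, 0.05), 2: (0.7, 0.05), 3: (0.5, 0.05)}

def step(state, rng):
    """Advance the hidden severity state and emit an observable fluency score."""
    next_state = rng.choices(STATES, weights=TRANSITION[state])[0]
    mean, std = EMISSION[next_state]
    observation = rng.gauss(mean, std)  # Gaussian emission for the new state
    return next_state, observation

rng = random.Random(0)
state = rng.choice(STATES)  # uniform initial distribution
for _ in range(3):
    state, obs = step(state, rng)
    print(state, round(obs, 2))
```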
        <p>See Fig. 2 for a diagram.</p>
      </sec>
      <sec id="sec-5-5">
        <title>5.5. RL Environment</title>
        <p>At each step, an agent can choose to administer a specific medication from the available list. The environment then evolves based on the medication’s effects and the underlying psychiatric state of the patient model, by the use of a transition matrix that maps the current psychiatric state to a new state based on a probability distribution that models the dynamic and evolving nature of underlying psychiatric states [18]. The agent receives a reward based on the patient’s measured fluency.</p>
        <p>We use the LinUCB [19] algorithm to learn the optimal
medication strategy. The goal is to maximize the patient’s
speech fluency.</p>
        <p>The effects of the medication on the patient model are implemented by applying the effects of the medication on each condition to that condition’s transition matrix.</p>
        <p>(Figure 2: the HMM for one condition, with hidden severity states 1 through 5, each emitting an observable output.)</p>
      </sec>
      <sec id="sec-5-6">
        <title>5.6. Medication Selection Algorithm</title>
        <p>We use the increase in speech fluency as our reward. Speech fluency is modelled as a linear function:
f = 0.1 ⋅ d + 0.2 ⋅ a + 0.3 ⋅ i + 0.4 ⋅ b
where f is current speech fluency; d, a, i are the patient’s current depression, anxiety, and insomnia scores respectively, labelled on a 5-point scale, where 1 represents no symptoms and 5 represents the most severe symptoms; and b is the patient’s baseline fluency.</p>
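        <p>A minimal sketch of this linear fluency model (the variable names are ours, standing in for the symbols in the formula; the weights are the ones given in the text):</p>

```python
def speech_fluency(depression, anxiety, insomnia, baseline):
    """Linear fluency model with the weights given in the text.

    depression, anxiety, insomnia: severity scores on a 1-5 scale
    baseline: the patient's baseline fluency
    """
    return 0.1 * depression + 0.2 * anxiety + 0.3 * insomnia + 0.4 * baseline

# hypothetical patient with mild symptoms and a baseline fluency of 0.8
print(round(speech_fluency(1, 1, 1, 0.8), 2))
```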
        <p>The implementation of the LinUCB algorithm observes the current state of the patient; then, for each medication that is part of the environment, it estimates the reward using a linear approximation. An upper confidence bound is calculated for the estimated reward, and the medication with the highest upper confidence bound is chosen according to the equation: a_t = arg max_a ( x_a(t)ᵀ θ_a + α √( x_a(t)ᵀ A_a⁻¹ x_a(t) ) )</p>
        <p>where x_a(t) is the feature vector for action a at time t, θ_a is the parameter vector for action a which we want to estimate, A_a is the design matrix for action a, and α is the hyperparameter controlling the exploration/exploitation trade-off [20]. In our implementation of the LinUCB algorithm, we chose the value α = 10.0.</p>
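        <p>A self-contained sketch of the LinUCB arm-selection rule above (the disjoint-arms variant as in [20]; the two-dimensional features, the arm statistics, and all names are illustrative assumptions, not the authors’ code):</p>

```python
import math

def mat_vec(M, v):
    # matrix-vector product for small list-of-lists matrices
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def inv_2x2(M):
    # closed-form inverse of a 2x2 matrix
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[ M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det,  M[0][0] / det]]

def linucb_choose(arms, x, alpha):
    """Pick the arm maximizing x^T theta_hat + alpha * sqrt(x^T A^-1 x).

    arms: dict arm_name -> (A, b), with A = I + sum of x x^T and b = sum of r x
    x: current context (feature) vector of length 2
    """
    best_arm, best_ucb = None, -math.inf
    for name, (A, b) in arms.items():
        A_inv = inv_2x2(A)
        theta = mat_vec(A_inv, b)  # ridge-regression estimate of theta_a
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(sum(xi * yi for xi, yi in zip(x, mat_vec(A_inv, x))))
        ucb = mean + alpha * width
        if ucb > best_ucb:
            best_arm, best_ucb = name, ucb
    return best_arm

# two hypothetical medications; med_B has accumulated higher rewards
arms = {
    "med_A": ([[2.0, 0.0], [0.0, 2.0]], [0.2, 0.1]),
    "med_B": ([[2.0, 0.0], [0.0, 2.0]], [0.9, 0.4]),
}
print(linucb_choose(arms, [1.0, 0.5], 1.0))  # → med_B
```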
      </sec>
      <sec id="sec-5-7">
        <title>5.7. Results</title>
        <p>In our simulations, we run the RL algorithm and keep track of disfluency over time. We inject noise into the algorithm’s simulated measurements of disfluency to simulate the fact that our disfluency detection system does not measure disfluency perfectly. In the experiments reported here, we inject a minimal amount of noise, corresponding to the high precision with which we can measure disfluency.</p>
        <p>We define the success of a trial as an improvement of over 0.5σ in speech fluency. We define failure as a deterioration of over 0.5σ in speech fluency. Here, σ ≈ 0.1 is the standard deviation of fluency in the dataset.</p>
        <p>Figures 3 and 4 show examples of a successful and an unsuccessful run, respectively. (Figure 3: convergence to higher speech fluency; Figure 4: lack of convergence to higher speech fluency; both plot normalized speech fluency over 200 days.) Across 500 patient simulation runs, we found a high potential for reinforcement learning to correctly apply medication effects to reduce speech disfluency, with 52% of runs showing success and 9% of runs demonstrating failure.</p>
        <p>The average fluency across these runs was 0.66/1.00 with a standard deviation of 0.1. The success rate of the simulations was 52%, with a failure rate of approximately 16%. Success and failure are defined as runs terminating greater than 0.5σ above and lower than 0.5σ below the initial fluency level, respectively.</p>
        <p>These results support the possibility of the use of reinforcement learning to improve speech fluency under the studied conditions. Our preliminary results indicate that if speech disfluency can be measured to within 10% of the true score and the medications have plausible properties (similar to the ones seen in Section 6), then reinforcement learning is a possible method to dose medications so as to minimize speech disfluency. However, the variability in outcomes and the presence of failed simulations that prompted theoretical patient deterioration indicate that further research is needed in improving the model’s accuracy and understanding factors contributing to failures, which will be important for applying these findings in a clinical setting.</p>
      </sec>
      <sec id="sec-5-8">
        <title>6. Medication Literature Review</title>
        <p>A systematic literature review was conducted to determine common medications used to treat major depressive disorder (“depression”) and other mental illnesses that affect speech fluency. Table 1 indicates 23 medications whose onset time and response rates were used to inform the reinforcement learning simulation.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>7. Ethical Considerations</title>
      <p>In this paper, we outline and evaluate a proposal to adjust medications automatically in order to improve the speech fluency of a simulated patient. Although administering medications automatically is sometimes done (e.g., with insulin pumps), this can only be done after thorough clinical trials and with informed consent from the patient. Patients must be thoroughly informed about the nature of the automated system, its potential risks and benefits, and their rights in the decision-making process.</p>
      <p>This extends beyond initial consent, and includes ongoing […]</p>
      <p>[15] P. Fossati, L. Bastard Guillaume, A.-M. Ergis, J.-F. Allilaire, Qualitative analysis of verbal fluency in depression, Psychiatry Research 117 (2003) 17–24. URL: https://www.sciencedirect.com/science/article/pii/S0165178102003001. doi:10.1016/S0165-1781(02)00300-1.</p>
      <p>[16] M. M. Jacobs, S. Merlo, P. M. Briley, Sleep duration, insomnia, and stuttering: The relationship in adolescents and young adults, Journal of Communication Disorders 91 (2021) 106106. URL: https://www.sciencedirect.com/science/article/pii/S0021992421000290. doi:10.1016/j.jcomdis.2021.106106.</p>
      <p>[17] Z. Wang, M. Tang, M. Larrazabal, E. Toner, M. Rucker, C. Wu, B. Teachman, M. Boukhechba, L. Barnes, Personalized state anxiety detection: An empirical study with linguistic biomarkers and a machine learning pipeline (2023). doi:10.48550/arXiv.2304.09928.</p>
      <p>[18] C. Gauld, D. Depannemaecker, Dynamical systems in computational psychiatry: A toy-model to apprehend the dynamics of psychiatric symptoms, Frontiers in Psychology 14 (2023). doi:10.3389/fpsyg.2023.1099257.</p>
      <p>[19] E. Nelson, D. Bhattacharjya, T. Gao, M. Liu, D. Bouneffouf, P. Poupart, Linearizing contextual bandits with latent state dynamics, in: Uncertainty in Artificial Intelligence, PMLR, 2022, pp. 1477–1487.</p>
      <p>[20] L. Li, W. Chu, J. Langford, R. E. Schapire, A contextual-bandit approach to personalized news article recommendation, in: Proceedings of the 19th International Conference on World Wide Web, 2010, pp. 661–670.</p>
      <p>[21] M. Wilson, J. Tripp, Clomipramine (2019).</p>
      <p>[22] H. Abdul-Baki, I. I. El Hajj, L. ElZahabi, C. Azar, E. Aoun, A. Skoury, H. Chaar, A. I. Sharara, A randomized controlled trial of imipramine in patients with irritable bowel syndrome, World Journal of Gastroenterology: WJG 15 (2009) 3636.</p>
      <p>[23] C. A. Townsend, Selegiline transdermal patch (Emsam) for major depressive disorder, American Family Physician 77 (2008) 505.</p>
      <p>[24] J. J. Moore, A. Saadabadi, Selegiline, in: StatPearls [Internet], StatPearls Publishing, 2022.</p>
      <p>[25] T. Tenjin, S. Miyamoto, Y. Ninomiya, R. Kitajima, S. Ogino, N. Miyake, N. Yamaguchi, Profile of blonanserin for the treatment of schizophrenia, Neuropsychiatric Disease and Treatment (2013) 587–594.</p>
      <p>[26] L. Dean, Venlafaxine therapy and CYP2D6 genotype (2020).</p>
      <p>[27] R. D. Gibbons, K. Hur, C. H. Brown, J. M. Davis, J. J. Mann, Benefits from antidepressants: synthesis of 6-week patient-level outcomes from double-blind placebo-controlled randomized trials of fluoxetine and venlafaxine, Archives of General Psychiatry 69 (2012) 572–579.</p>
      <p>[28] S. L. Cincotta, J. S. Rodefer, Emerging role of sertindole in the management of schizophrenia, Neuropsychiatric Disease and Treatment (2010) 429–441.</p>
      <p>[29] J. S. Maan, T. Duong, A. Saadabadi, Carbamazepine (2018).</p>
      <p>[30] Z. Tolou-Ghamari, M. Zare, J. M. Habibabadi, M. R. Najafi, A quick review of carbamazepine pharmacokinetics in epilepsy from 1953 to 2012, Journal of Research in Medical Sciences: the official journal of Isfahan University of Medical Sciences 18 (2013) S81.</p>
      <p>[31] A. C. Pande, J. G. Crockatt, D. E. Feltner, C. A. Janney, W. T. Smith, R. Weisler, P. D. Londborg, R. J. Bielski, D. L. Zimbroff, J. R. Davidson, et al., Pregabalin in generalized anxiety disorder: a placebo-controlled trial, American Journal of Psychiatry 160 (2003) 533–540.</p>
      <p>[32] J. R. Strawn, L. Geracioti, N. Rajdev, K. Clemenza, A. Levine, Pharmacotherapy for generalized anxiety disorder in adult and pediatric patients: an evidence-based treatment review, Expert Opinion on Pharmacotherapy 19 (2018) 1057–1070.</p>
      <p>[33] K. Chokhawala, S. Lee, A. Saadabadi, Lithium, National Library of Medicine, National Center for Biotechnology Information, PubMed Central (2022).</p>
      <p>[34] T. Hui, A. Kandola, L. Shen, G. Lewis, D. Osborn, J. Geddes, J. Hayes, A systematic review and meta-analysis of clinical predictors of lithium response in bipolar disorder, Acta Psychiatrica Scandinavica 140 (2019) 94–115.</p>
      <p>[35] D. F. Ionescu, R. C. Shelton, L. Baer, K. H. Meade, M. B. Swee, M. Fava, G. I. Papakostas, Ziprasidone augmentation for anxious depression, International Clinical Psychopharmacology 31 (2016) 341.</p>
      <p>[36] T. Zhao, T.-W. Park, J.-C. Yang, G.-B. Huang, M.-G. Kim, K.-H. Lee, Y.-C. Chung, Efficacy and safety of ziprasidone in the treatment of first-episode psychosis: an 8-week, open-label, multicenter trial, International Clinical Psychopharmacology 27 (2012) 184–190.</p>
      <p>[37] F.-G. Pajonk, Risperidone in acute and long-term therapy of schizophrenia: a clinical profile, Progress in Neuro-Psychopharmacology and Biological Psychiatry 28 (2004) 15–23.</p>
      <p>[38] J. Peuskens, Risperidone in the treatment of patients with chronic schizophrenia: a multi-national, multi-centre, double-blind, parallel-group study versus haloperidol, The British Journal of Psychiatry 166 (1995) 712–726.</p>
      <p>[39] M. J. Allen, S. Sabir, S. Sharma, GABA receptor (2018).</p>
      <p>[40] M. Panebianco, S. Al-Bachari, J. L. Hutton, A. G. Marson, Gabapentin add-on treatment for drug-resistant focal epilepsy, Cochrane Database of Systematic Reviews (2021).
[41] F. Lavergne, I. Berlin, A. Gamma, H. Stassen, prospective study in chinese population,
NeuropsyJ. Angst, Onset of improvement and response to chiatric Disease and Treatment (2017) 515–526.
mirtazapine in depression: a multicenter naturalis- [54] J. Cookson, P. E. Keck Jr, T. A. Ketter, W. Macfadden,
tic study of 4771 patients, Neuropsychiatric disease Number needed to treat and time to
response/remisand treatment 1 (2005) 59–68. sion for quetiapine monotherapy eficacy in acute
[42] H. R. Song, W.-M. Bahk, Y. S. Woo, J.-H. Jeong, Y.- bipolar depression: evidence from a large,
randomJ. Kwon, J. S. Seo, W. Kim, M.-D. Kim, Y.-C. Shin, ized, placebo-controlled study, International
cliniS.-Y. Lee, et al., Eficacy and tolerability of generic cal psychopharmacology 22 (2007) 93–100.
mirtazapine (mirtax) for major depressive disorder: [55] J.-S. Lee, J.-H. Ahn, J.-I. Lee, J.-H. Kim, I. Jung,
C.multicenter, open-label, uncontrolled, prospective U. Lee, J.-Y. Lee, S.-I. Lee, C.-Y. Kim, Dose pattern
study, Clinical Psychopharmacology and Neuro- and efectiveness of paliperidone extended-release
science 13 (2015) 144. tablets in patients with schizophrenia, Clinical
Neu[43] K. A. Fariba, A. Saadabadi, Topiramate (2020). ropharmacology 34 (2011) 186–190.
[44] Y.-T. Liu, G.-T. Chen, Y.-C. Huang, J.-T. Ho, C.-C. [56] A. Kumar, S. Balan, Fluoxetine for persistent
develLee, C.-C. Tsai, C.-N. Chang, Efectiveness of dose- opmental stuttering, Clinical neuropharmacology
escalated topiramate monotherapy and add-on ther- 30 (2007) 58–59.
apy in neurosurgery-related epilepsy: A
prospective study, Medicine 99 (2020).
[45] G. Lewis, L. Dufy, A. Ades, R. Amos, R. Araya,</p>
      <p>S. Brabyn, K. S. Button, R. Churchill, C. Derrick,
C. Dowrick, et al., The clinical efectiveness of
sertraline in primary care and the role of depression
severity and duration (panda): a pragmatic,
doubleblind, placebo-controlled randomised trial, The</p>
      <p>Lancet Psychiatry 6 (2019) 903–914.
[46] M. F. Flament, R. Lane, R. Zhu, Z. Ying, Predictors
of an acute antidepressant response to fluoxetine
and sertraline, International clinical
psychopharmacology 14 (1999) 259–276.
[47] H. K. Singh, A. Saadabadi, Sertraline (2019).
[48] M. H. Trivedi, A. J. Rush, S. R. Wisniewski, A. A.</p>
      <p>Nierenberg, D. Warden, L. Ritz, G. Norquist, R. H.</p>
      <p>Howland, B. Lebowitz, P. J. McGrath, et al.,
Evaluation of outcomes with citalopram for depression
using measurement-based care in star* d:
implications for clinical practice, American journal of</p>
      <p>Psychiatry 163 (2006) 28–40.
[49] Z. Jia, J. Yu, C. Zhao, H. Ren, F. Luo, Outcomes and
predictors of response of duloxetine for the
treatment of persistent idiopathic dentoalveolar pain: A
retrospective multicenter observational study,
Journal of Pain Research (2022) 3031–3041.
[50] N. Parikh, M. Yilanli, A. Saadabadi,
Tranyl</p>
      <p>cypromine (2017).
[51] W. T. Heijnen, J. De Fruyt, A. I. Wierdsma, P.
Sienaert, T. K. Birkenhäger, Eficacy of tranylcypromine
in bipolar depression: a systematic review, Journal
of clinical psychopharmacology 35 (2015) 700–705.
[52] J. Waugh, K. L. Goa, Escitalopram: a review of its
use in the management of major depressive and
anxiety disorders, CNS drugs 17 (2003) 343–362.
[53] K. Jiang, L. Li, X. Wang, M. Fang, J. Shi, Q. Cao, J. He,</p>
      <p>J. Wang, W. Tan, C. Hu, Eficacy and tolerability of
escitalopram in treatment of major depressive
disorder with anxiety symptoms: a 24-week, open-label,
Clomipramine
Imipramine
Selegiline
Low remission rate (20%);
[21]
80.6% patients after 12-week
treatment, compared to 48.0%
in the placebo [22]
33-40% [23]
Depression, PTSD, OCD,
panic disorder, social anxiety
disorder [47]
Depression, social anxiety
disorder, PTSD [48]
Depression, anxiety [49]
Major depressive episodes
[50]
Depression, generalized
anxiety disorder (GAD),
obsessive compulsive disorder
(OCD) and panic attacks [52]
Schizophrenia, manic,
psychotic and depressive
episodes [54]
Psychotic disorders including
schizophrenia [55]
OCD, certain eating
disorders, panic attacks [56]</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <source>6-12 weeks [21] 4-5 weeks [22] 1-2 weeks [23] 4-6 weeks [25] 4-6 weeks [26] 4-6 weeks [28] First few days [29] 3 days [31] 1-3 weeks [33] 1-2 weeks [35] 4 weeks [37] 1-4 weeks [39] 1 week [41] 2-4 weeks (epilepsy)</source>
          ,
          <source>3 months (migraines) [43] Within 6 weeks [45]</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <article-title>1-2 weeks to start working, 4-6 weeks for full benefit</article-title>
          [
          <volume>48</volume>
          ]
          <fpage>2</fpage>
          -4 weeks [49]
          <fpage>1</fpage>
          -
          <lpage>2</lpage>
          weeks to start working, 6
          <article-title>-8 weeks for full improvement [50] 1-2 weeks to start working, 6-8 weeks for full improvement</article-title>
          [
          <volume>52</volume>
          ]
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <article-title>1-2 weeks to start working, 2-3 months for full improvement</article-title>
          [
          <volume>54</volume>
          ]
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <article-title>2-8 weeks for full improvement</article-title>
          [
          <volume>55</volume>
          ]
          <fpage>4</fpage>
          -5 weeks [56]
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>