=Paper=
{{Paper
|id=Vol-3649/Paper13
|storemode=property
|title=Deploying AI Methods for Mental Health in Singapore: From Mental Wellness to Serious Mental Health Conditions
|pdfUrl=https://ceur-ws.org/Vol-3649/Paper13.pdf
|volume=Vol-3649
|authors=Creighton Heaukulani,Ye Sheng Phang,Janice Huiqin Weng,Jimmy Lee,Robert J.T. Morris
|dblpUrl=https://dblp.org/rec/conf/aaai/HeaukulaniPWLM24
}}
==Deploying AI Methods for Mental Health in Singapore: From Mental Wellness to Serious Mental Health Conditions==
<pdf width="1500px">https://ceur-ws.org/Vol-3649/Paper13.pdf</pdf>
<pre>
                                Deploying AI Methods for Mental Health in Singapore:
                                From Mental Wellness to Serious Mental Health Conditions
                                Creighton Heaukulani1,∗, Ye Sheng Phang1, Janice Huiqin Weng1, Jimmy Lee2,3 and
                                Robert J. T. Morris1,4
                                1 MOH Office for Healthcare Transformation, Singapore
                                2 Institute of Mental Health, Singapore
                                3 Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
                                4 Yong Loo Lin School of Medicine, National University of Singapore, Singapore


                                               Abstract
                                               We describe our results from the implementation of machine learning and AI methods in three digital health initiatives
                                               serving individuals across the mental health spectrum in Singapore. The first initiative is Project HOPES, which we launched
                                               in 2019 for patients with serious mental illnesses. Originating as an observational study on digital phenotypes (collected via
                                               smartphones and wrist wearables) of 100 patients with schizophrenia, the tool has now been introduced as a service within a
                                               tertiary setting and has expanded to include patients with depression. The strategy dynamically prioritizes patients for early
                                               review by care coordinators according to need, which may avoid hospitalizations. Machine learning is used to predict clinical
                                               status, i.e., symptoms and functioning, and to predict relapses and other adverse clinical events; the latter can be done at
                                               92% sensitivity and 90% specificity with the available digital biomarkers. The second initiative we describe is mindline.sg
                                               (www.mindline.sg), a platform for mental wellness in the general population that we created in 2020. Through a public-facing
                                               website, we deliver over 800 resources including wellness education, clinically validated self-assessments and triaging, and
                                               interactive resources, including an AI chatbot. Launched at the height of the COVID-19 pandemic, with all its attendant
                                               stresses, the platform had been visited by between 10% and 20% of the national population by the end of 2023. The
                                               third initiative we describe is Let’s Talk (https://letstalk.mindline.sg), an online peer-support mental health network, which
                                               was co-created with youth advocates. The need for this platform was discovered through extensive studies with youth who
                                               expressed a desire for human-based support beyond the proliferation of digital solutions. In its first year, the site has been
                                               visited by over 80,000 unique users. Trained moderators review content on the site for safety and accuracy, and qualified
                                               therapists provide professional support through the free and anonymous Ask-A-Therapist service. To scale this service with a
                                               growing user base, we have been trialing the use of generative models to aid our therapists in finding relevant resources
                                               according to a user’s need and to encourage empathetic writing.

                                               Keywords
                                               digital phenotyping, schizophrenia, depression, mental wellness, AI chatbots, large language models, digital health


                                1. Introduction

                                Mental health conditions comprise one of the largest burdens of disease worldwide, especially when
                                measured in Years Lived with Disability (YLDs). Stigma is prevalent in many countries and cultures,
                                which discourages help-seeking. At the MOH Office for Healthcare Transformation and the Institute of
                                Mental Health in Singapore, we adopt a population health approach starting from each end of the
                                mental health spectrum and “working our way in”. On the one hand, we serve the needs of patients
                                with serious mental illnesses, which are defined by the US National Institute of Mental Health
                                (NIMH) as “mental, behavioral, or emotional disorder[s] resulting in serious functional impairment,
                                which substantially interferes with or limits one or more major life activities.”¹ On the other
                                hand, we serve the mental wellness needs of the population through mental health promotion, where a
                                local agency, the Singapore Association for Mental Health, defines mental wellness as “a positive
                                state of mental health [that] is more than the absence of mental illness,” in which an individual is
                                “able to think, feel and act in ways that create a positive impact on your physical and social
                                well-being.”² Our solutions include digital health tools that increase the continuity of care,
                                shifting care from the hospital/clinic into the home and from healthcare providers into the hands of
                                individuals. In this article, we describe our results from the implementation of machine learning
                                and AI in three digital health initiatives deployed in our strategy, together with the challenges we
                                continually confront as we deploy these tools across the mental health ecosystem in Singapore.

                                ¹ https://www.nimh.nih.gov/health/statistics/mental-illness
                                ² https://www.samhealth.org.sg/understanding-mental-health/what-is-mental-wellness/

                                Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024, Vancouver, BC, Canada
                                ∗ Corresponding author.
                                creighton.heaukulani@moht.com.sg (C. Heaukulani); yesheng.phang@moht.com.sg (Y. S. Phang);
                                janice.weng@moht.com.sg (J. H. Weng); jimmy_lee@imh.com.sg (J. Lee);
                                robert.morris@moht.com.sg (R. J. T. Morris)
                                © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License
                                Attribution 4.0 International (CC BY 4.0).


CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
1.1. Serious Mental Illnesses: Digital Phenotyping and AI in Schizophrenia and Depression

Patients in Singapore with schizophrenia are usually treated in the specialist or hospital setting.
After discharge to the community, it is not uncommon to see relapses. Around 80% of patients suffer
at least one relapse within five years of initial remission [1], which often results in emergency
room visits or rehospitalizations. Relapse is extremely disruptive to a patient’s life and
rehospitalization incurs large costs. Relapsing patients often exhibit psychotic symptoms such as
hallucinations, delusions, or disordered thinking. They might also display changes in sleep
behaviors and mood, social withdrawal, and disorganized behaviors. An emerging strategy to detect
such changes involves digital phenotyping, defined as the “moment-by-moment quantification of the
individual-level human phenotype in situ using data from personal digital devices” [2]. In 2019, we
began the HOPE-S observational study [3], for which we developed the HOPES digital phenotyping
platform described in detail by Wang et al. [4]. As of the end of 2023, the platform has been in
continuous operation with patients and clinicians for over four years. Events are currently
recorded from the user’s smartphone, including mobility (derived from obfuscated GPS coordinates),
tapping speed on the keyboard, ambient light, screen time, accelerometry, and sociability indices
(derived from calls, SMS and WhatsApp calls and messaging). Events are also captured from a wrist
wearable device measuring heart rate, heart rate variability, activity (through step counts), and
sleep (including staging and efficiency).
   Over the course of the study, we continuously collected digital phenotyping data from 100
patients with schizophrenia (each patient was followed up for a six-month period), with clinical
assessments performed every six weeks to measure symptoms and functioning. The total data collected
throughout the trial consists of over 220 million events. We found generally high compliance in
wearing of the wrist device (91% of all possible data was successfully collected in the week
following enrolment), which required patients to wear the device at all times, including to sleep,
and successful data collection from the smartphone (82% of all possible data was collected), which
only required patients to not close the App in the background [5]. We note, however, that this high
compliance rate was likely aided by a modest inconvenience fee that was provided to the trial
patients. Some patients to whom the study was offered declined to participate for reasons including
privacy or intrusiveness, leaving us with the interesting challenge as to how we can ameliorate and
obviate these concerns in the future. In our initial analyses associating clinical status with
digital markers, we found that the following (and other) digital measures were predictive of poor
symptoms and functioning: irregular sleep habits (including increased time spent awake in bed and
in light stage sleep), decreased steps and GPS mobility, decreased text messages sent, slowed
tapping speed, and increased heart rate while asleep, among others [5, 6].
   We have also investigated the use of machine learning to predict adverse clinical events,
including relapse of psychosis. For these purposes, we defined a relapse as a rehospitalization due
to psychosis symptoms or a significant deterioration in clinical status defined using a validated
clinician-assessed scale measuring general psychopathology symptoms. Other clinical events include
emergency room visits, readmissions due to reasons other than psychosis, and unscheduled clinical
visits. For these predictive models, we explored both unsupervised learning approaches to anomaly
detection, including time series smoothing and forecasting methods (including the method originally
used by Henson et al. [7]) and isolation forests, as well as supervised learning approaches
utilizing generalized linear models, random forests, and gradient boosting trees. The models first
establish a patient’s baseline (on the multivariate data) and attempt to detect deviations from
that baseline at the individual level. Retrospective analyses from our observational trial indicate
that we can detect adverse clinical events. The model that performed best varied by situation. For
example, when using all digital measures (constituting a very high-dimensional and not very
interpretable feature space), the gradient boosting tree performed best. But when we restricted the
dataset to the one or two most interpretable features from each sensor (which are studied by
clinical staff to explain the model), the isolation forest always performed best. The best model to
use therefore depends on the operating mode, and ultimately will be guided by clinical
requirements.
   Here, we briefly report indicative performance metrics from the gradient boosting tree model. A
more thorough account of the methodology and predictive results will appear in upcoming
publications. A classification setup is used, as studied by Ben-Zeev et al. [8], and sensitivity,
specificity, and the harmonic mean score (between sensitivity and specificity) are reported. In our
clinical setting, we are willing to accept a reasonable number of false alarms (i.e., lower
algorithm specificity) to keep sensitivity high. A false positive alarm may result in the care
coordinator giving the participant a call to check in or sending them an inquiring and supportive
text message. We therefore give more emphasis to sensitivity, i.e., being able to sense a
deterioration in patient health, even if mild. This is acceptable for our clinical partners who
envision a shift from the traditional reactive model of care to a proactive one where early
detection and intervention might provide extra support and therefore prevent adverse clinical
events. We therefore explored optimization of the following weighted harmonic mean score:

    H_\beta = (1 + \beta^2) \, \frac{\text{sensitivity} \times \text{specificity}}{(\beta^2 \times \text{specificity}) + \text{sensitivity}}

which has the interpretation that sensitivity is β times as important as specificity (where we note
that β can be greater than or less than one depending on whether sensitivity or specificity is
valued more). This measure is directly analogous to the weighted F-score used in classification. In
the experiments that follow, we report H_2, where sensitivity is considered twice as important as
specificity.
   Boxplots of H_2 over ten 85%/15% training/testing dataset splits for two machine learning models
are shown in fig. 1. The median score (over the test sets) using the gradient boosting tree is
H_2 = 91.0% (91.5% sensitivity, 89.6% specificity). Note that the displayed sensitivity and
specificity scores are the median scores for these metrics over the test sets; they do not
correspond to the components used to compute the reported H_2 score. This predictive power is high
but may be difficult to achieve in a clinical service (versus a controlled study) where compliance
with wearable and smartphone data collection without financial incentives could be more
challenging. The study by Cohen et al. [9] indicates that this real-world compliance could drop to
50%. Missing and delayed data uploads are common in clinical practice. Experiments on subsets of
the features suggest that performance could drop to H_2 = 85.6% (87.4% sensitivity, 78.7%
specificity) with a sparse model containing a subset of measures that are relatively easier to
collect (steps, heart rate, accelerometer, screen time, taps in Apps, and tapping speed).

[Figure 1 (plot not reproduced): boxplots titled "android_complete feature model optimized for
hm_beta2"; y-axis: score (0.70 to 0.95); x-axis: metric (sensitivity, specificity,
harmonic_mean(b=2)); legend ml_model: RandomForest-100, GradBoostTree-100.]
Figure 1: Boxplots of performance metrics for two popular machine learning models on predicting
adverse clinical events in schizophrenia, including relapses.

   To explore those digital measures that appear most important for prediction, we fit the gradient
boosting tree model on the prototypical measures from each feature and display the “relative
feature importance weights” (the impurity-based feature importance measures implemented by most
packages) in fig. 2. In this experiment, we see that tapping speed is considered the most important
feature, followed by GPS-based and wrist device measures, including distance travelled, time at
home, sleep efficiency score, and number of steps. This example only considers a small subset of
the features, however, and we do see such rankings change depending on the experimental setup and
the feature set provided.

[Figure 2 (plot not reproduced): horizontal bar chart; x-axis: relative feature importance weight
(0.00 to 0.30); y-axis features, top to bottom: tapping_speed_ms, hometime_hrs, num_text_msg_sent,
dist_travelled_kms, sleep_efficiency_score, total_steps, accelerometry_mean, awake_in_bed_hrs,
rem_sleep_hrs, sleep_total_hrs, HR_bpm_resting, HRV_resting, screen_time_hrs,
ambient_light_lumens.]
Figure 2: Ranking by feature importance weight in the gradient boosting tree model for the
prototypical measures from each sensor.

   Having successfully completed the above study, we are now piloting the use of this model as a
preventative service at the Institute of Mental Health (a large mental healthcare facility) in
Singapore. This service is now being extended to serve patients with schizophrenia as well as
patients with mood disorders, including major depression. For this new HOPES Clinical Service,
patient-facing components for the smartphone App have been developed to supplement what was purely
passive monitoring in the previous observational trial. Supported on both Android and iOS, patients
may now interact with the App through Ecological Momentary Assessments (EMAs), wherein they may
answer some questions as to how they are feeling at the time of a prompt, and they may additionally
record what factors or stressors might be contributing to those feelings. Such data collection is
sometimes referred to as active monitoring. The EMA responses are transmitted
[Figure 3 (image not reproduced): digital phenotyping and EMA signals (EMAs, sleep, activity, heart
rate, sociability indices, mobility patterns, tapping speed, screen time) feed clinician-defined
algorithms and machine learning routines, which drive the dashboard.]

Figure 3: The clinical care coordinator dashboard in the HOPES service for serious mental illnesses surfaces patients according
to R/Y/G triage status, as determined by clinician-defined rules on the digital phenotyping data. Machine learning algorithms
may only ever increase the severity of a patient’s status.
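
The escalation-only behaviour described in this caption can be sketched in a few lines of Python.
This is a minimal illustration under stated assumptions, not the HOPES implementation: the names
Status, rule_based_status, combine_status and prioritize, and the step and sleep thresholds, are
hypothetical and introduced here only for the example. The paper states only that
clinician-defined rules set the R/Y/G status and that machine learning may raise it but never
lower it.

```python
from enum import IntEnum
from typing import Dict, List


class Status(IntEnum):
    """Triage status ordered by severity, so max() returns the more severe one."""
    GREEN = 0
    YELLOW = 1
    RED = 2


def rule_based_status(mean_daily_steps: float, nights_poor_sleep: int) -> Status:
    """Hypothetical clinician-defined rule; thresholds are illustrative only."""
    if nights_poor_sleep >= 5 or mean_daily_steps < 500:
        return Status.RED
    if nights_poor_sleep >= 2 or mean_daily_steps < 2000:
        return Status.YELLOW
    return Status.GREEN


def combine_status(rule_status: Status, ml_status: Status) -> Status:
    """The ML signal may raise severity above the rule-based status, never lower it."""
    return max(rule_status, ml_status)


def prioritize(caseload: Dict[str, Status]) -> List[str]:
    """Order patients with the most severe status first (ties keep input order)."""
    return sorted(caseload, key=lambda patient_id: -caseload[patient_id])


if __name__ == "__main__":
    caseload = {
        # Rules say GREEN and the model agrees: stays GREEN.
        "patient_a": combine_status(rule_based_status(6000, 0), Status.GREEN),
        # Rules say GREEN but the model flags YELLOW: escalated to YELLOW.
        "patient_b": combine_status(rule_based_status(6000, 0), Status.YELLOW),
        # Rules say RED; a GREEN model output cannot lower it.
        "patient_c": combine_status(rule_based_status(300, 1), Status.GREEN),
    }
    print(prioritize(caseload))  # ['patient_c', 'patient_b', 'patient_a']
```

Taking the maximum of the two statuses keeps the rule-based status as a floor, so every flag shown
to a care coordinator remains explainable by the clinician-defined rules or is a strictly more
cautious escalation.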


to the care coordinators to aid in care and supplement the digital phenotyping data, which together
surface patients on a clinical dashboard. Our strategy is to have clinicians in a care monitoring
center regularly observe the anomaly detection signals on the dashboard, and with the help of
digital tools that can explore a patient’s data the clinical staff may decide to take further
interventions, including earlier clinical review for a patient.
   The clinical dashboard is designed to achieve the effectiveness of Intensive Case Management
[10] in the presence of large caseloads. Cost-effective staffing dictates that the caseload of care
coordinators (caring for patients that have returned to the community) is prioritized by a
positional ordering that is determined by a Red/Yellow/Green (R/Y/G) status, shown in fig. 3. A
Green (G) status indicates normal behavior and requires no action. A Yellow (Y) status indicates
moderate deviation of behavior from normal (or possibly oncoming and escalating symptoms) and
triggers automated interventions (described later). A Red (R) status indicates persistent and
significant deviation from normal and requires review by the clinical team. To ensure patient
safety, it is important that this prioritization be effective and based on an explainable clinical
rationale. Hence a rule-based and fully explainable set of criteria has been developed (working
closely with the clinicians who oversee the welfare of the patients), which determines whether a
patient is flagged Y or R or left in the G state. This explainable property is also desirable to
assist a care coordinator when they call the patient: they may explain what they have noticed in
the digital markers (e.g., “you don’t seem to have been very active lately” or “it seems you
haven’t been sleeping well”).
   Patients with detected needs of lower acuity (i.e., those in the Y state) are supported with
automated interventions comprising digital therapeutics delivered through the HOPES App. These
interventions, based on the EMAs as well as the passive data, are timely and individualized. For
example, sleep exercises are delivered when patients have poor sleep for two nights or more, and
mindfulness exercises are delivered when patients indicate low mood on the EMAs. Such interventions
are known as Ecological Momentary Interventions (EMIs), and some of these digital therapeutic
exercises are inspired by cognitive behavioral therapy (CBT).

1.2. Our value proposition and strategy

Digital phenotyping enables the continual sensing of needs both to drive automated EMIs and to
trigger stepped-up human-based care. The previous standard of care had no such real-time sensing
capabilities and was entirely reactive and episodic. This strategy strengthens the connectivity
between care team and patient and augments the mission to increase continuity of care and extend
care beyond the clinic into the community. We initially focused on schizophrenia because we
anticipated quite strong indicative behavior-based signals. We are now scaling both “up” and “out”
by moving into other diagnosed conditions such as depression and into adjacent populations, such as
well populations, and into physical health conditions.

1.3. The role of AI and machine learning

Machine learning is used to predict a patient’s clinical status based on the multivariate set of
digital markers, i.e., potential predictors of a patient’s current symptom severity and
functioning. This may avert the need to come in for clinical visits as is currently required under
standard care. Machine learning is also used in the longitudinal prediction of relapses and may
raise the color coding of severity and prioritization of the patient
for attention by the care coordinator. We also note that AI is used in a self-contained way in some
of the EMI tools in the form of a mental wellness chatbot, which we will describe in the next
section on our digital mental wellness platform, mindline.sg.
   The machine learning models utilized for clinical event prediction include traditional time
series smoothing, isolation forests, generalized linear models, random forests, and gradient
boosting trees. The methods used for discovering associations between clinical scales and digital
markers were multiple linear regression and multilevel/hierarchical linear regression. A detailed
description of these methods is beyond the scope of this paper and will appear in upcoming
publications.
   Explainable signals are important for the clinical service. The multivariate nature of our
digital phenotyping data aids clinicians in the interpretation of alerts, for which we provide
digital tools that allow clinicians to keep track of patient data. It is important to note that our
machine learning-based relapse prediction algorithm is only permitted to increase patient severity
(from G to Y or from Y to R), which may prompt attention from a care coordinator (see fig. 3). The
AI algorithm may never relegate a patient to lower acuity. This is not only for safety, but also
aligns with Singapore’s national regulatory guidelines for predictive models in clinical service.
As further experience is gained in the detection and management of patients with digital
phenotyping, we may be able to move to a wider use of AI algorithms and to allow them to play a
more definitive role in the selection of patients for clinical review. As new regulatory guidelines
emerge and are navigated, pivots to our design and strategy may occur.

1.4. What’s next?

So far, the bulk of our experience has been with schizophrenia. Our service, however, has expanded
to include patients with depression and takes a transdiagnostic approach, which may justify further
research trials to develop and refine algorithms for depression in the local context. Additionally,
an upcoming research study will evaluate our clinical service. Finally, we are currently expanding
our digital phenotyping strategy to mental wellness. In this way, our digital phenotyping and
machine learning tools are moving “inward” toward mild severity and well-populations.

2. Mental Wellness in the Community: Tools for the Well and Mild-to-Moderate Needs

Many countries and cultures face persistent challenges to mental health awareness and promotion,
including stigma toward mental disorders, reluctance to seek help, low mental health literacy, a
lack of trained mental health personnel, and underdeveloped mental healthcare ecosystems. The
COVID-19 pandemic created a surge in mental healthcare needs, demanding new impetus to address
these shortcomings. The pandemic also accelerated the adoption and acceptance of digital health
solutions, creating a new opportunity for innovative approaches to address mental healthcare needs.
   In June 2020, we launched a Web App, mindline.sg (www.mindline.sg), serving as a digital mental
health resource website that has grown to include over 800 curated resources, a clinically
validated self-assessment tool for depression and anxiety, and a fully integrated AI chatbot
developed for mental health applications by Wysa³, a leading partner in the field. More recently,
we added curricular and structured learning materials with Intellect⁴, another leading partner in
digital mental health. The landing page and a therapeutic exercise with the chatbot are shown in
fig. 4. The self-assessment tool comprises the conventional GAD-7 and PHQ-9 scales and triages
users into well, mild, moderate, and crisis levels. The triage status allows us to customize
content and recommend appropriate therapeutic exercises. Tailored products for youth and working
adults are also provided to the public. Business-to-Business (B2B) customizations and engagements
serve a range of ecosystem partners, including workplace partners and educational institutions.
Resources on common risk factors (such as financial, employment and caregiver stress) are included,
addressing a broad spectrum of determinants of mental health. The platform was developed to be
anonymous and to contain authoritative and localised content for various levels of distress, with a
focus on wellness and mild needs. Moderate-to-severe needs are primarily served by detection
(through the self-assessment triage and AI chatbot) and referral to professional support, which
includes counselling centres and emergency 24/7 services, all according to a clinician-designed
protocol.
   The platform has shown remarkable uptake. The site has been visited by between 10% and 20% of
the (targeted) national population. The uncertainty in this estimate is due to the anonymity
feature of the site (we can only detect cookie IDs), which is a key feature enabling barrier-free
access. If unique users visit from multiple machines or browsers, we may record them more than
once. Indeed,

³ www.wysa.com
⁴ https://intellect.co
Figure 4: (Left) The mindline.sg landing page with the AI chatbot and self-assessment triage tools in the top two panels.
(Middle and Right) The Wysa AI chatbot directs a user to an exercise inspired by cognitive behavioral therapy.


it is a learning from our implementation that anonymity does limit our ability to evaluate the platform.
Another learning has been that successfully scaling the platform required expansive and sustained digital
marketing efforts, as well as strategic ecosystem partnerships through the B2B products, and investment in
partnership with educational institutions and healthcare providers. About 60% of our user acquisitions
come from ads that we post on social media; the next most frequent acquisitions are from search, direct
entry of the URL, or use of a QR code from our flyers and posters. Referrals also occur from B2B partner
sites. The most popular resources used on the platform are sleep aids. “Mood check-ins” and the
self-assessment tool are also popular. The distribution of moods includes “tired”, “unmotivated”,
“anxious”, “positive”, “frustrated” and “sad” (in order of decreasing frequency). Scores for GAD-7 and
PHQ-9 show a moderate number in a state of crisis; however, we believe that this frequency is affected by
anonymous users trying out different answers to the questions to trigger different triage levels and
seeing what resources are offered, mainly out of curiosity. We do not view this kind of usage negatively,
as we feel that it is important for users to be educated as to what resources are available in times of
need, either for themselves or a friend or relative. We have published these results in both process and
impact evaluation studies [11, 12].

2.1. Our value proposition and strategy
The goal of the mindline.sg platform is to empower individuals in the community to take charge of their
own mental health and to provide them the tools they need to offer basic support (“first aid”) to
themselves and those around them, all through the ease and convenience of a barrier-free digital solution.
This aligns with our strategy to improve population health through digital tools that enable
self-empowerment and self-management. Such a strategy transfers some of the care from the system onto the
individual and their significant others and moves some care from the clinical setting into the community
and the home.

2.2. The role of AI and machine learning
A natural language processing (NLP)-based chatbot from Wysa is deployed to engage, triage, chat with, and
direct the users to a range of relaxation, mindfulness, and CBT-inspired exercises. The Wysa chatbot is
designed by a team of psychologists actively involved in patient counseling and has been subjected to
numerous studies evaluating effectiveness [13, 14] (see footnote 5). Beyond the chatbot, however, we have
limited data collected on the site (again, due to the anonymity feature), which limits any machine
learning and AI efforts. A mobile App is presently being developed to give users an alternative option,
which may

5 https://www.wysa.com/clinical-evidence
be able to better leverage AI to improve user experience and benefit.

2.3. What’s next?
The present wellness tool is mainly designed to serve those who are well or have mild conditions, with
referral to human-based resources for moderate and crisis cases. We are also exploring the incorporation
of clinical adjunct tools such as validated and localized internet CBT (iCBT) tools [15], which can be
used by mental health-trained primary care providers. The mobile App will enable longitudinal data
tracking and clinical management, which in turn may provide new opportunities for AI and machine learning
in the service. We have already conducted a feasibility study of the tool in this population [16].

3. Digitally-Enabled Peer and Professional Mental Wellness Support for Youth
After extensive workshops, focus group discussions, and co-design sessions with youth (including those
with lived experience), we discovered a desire for social support and meaningful human interactions
delivered in a safe online environment. In particular, human-based support is now specifically sought
amongst the proliferation of purely digital self-management solutions. We therefore co-created with youth
advocates an online peer support network called Let’s Talk (https://letstalk.mindline.sg). The platform
soft-launched in October 2022 and has been piloting since. By the end of December 2023, the site had
received over 80,000 unique visitors (as measured by Google Analytics). This peer-support network’s value
proposition, over other platforms such as Reddit, includes close oversight and management by trained
moderators and professional therapists. The moderators and therapists maintain a constructive and
supportive atmosphere in the forum; other comparable forums have suffered from trolling, toxicity, spam,
and scams. It is noted that some “medically themed” forums are overly commercial or are used by
practitioners to advertise their own practices. Let’s Talk also provides an Ask-a-Therapist service where
users can pose a question to a panel of qualified professional therapists whom we have engaged; their
response is asynchronous but usually occurs within 24 hours. The therapists follow a protocol defined by a
Clinical Advisory Panel to deal with pressing needs and crisis conditions.
   As of December 2023, from among the over 80,000 visitors to the site, over 6,000 have registered an
anonymous user account (which is required to post content but is not required to access the forum and read
posts). There have been over 2,800 posts, and over 370 Ask-a-Therapist questions have been answered. As
the platform’s user base scales, we anticipate challenges in moderating the platform and responding to
questions in a timely manner while maintaining quality. We have therefore started trialing the use of
large language models (LLMs) to assist our staff therapists in searching for relevant content from a
trusted knowledge base (mindline.sg) based on the user’s need (inferred from the posted question). We used
GPT-3.5 from OpenAI, which we fine-tuned using over 300 question-answer pairs from the Ask-a-Therapist
service. Retrieval-augmented generation (RAG) [17] is used to produce the most suitable resources from the
knowledge base, which is indexed from all resources in mindline.sg. While the therapist may copy-and-paste
recommended resources (and their descriptions) from this tool, they remain fully responsible for the
content in their reply. The LLM assistant has helped our staff therapists with close to 30 responses so
far, and 88% of those responses have been rated as helpful by the therapist. In Figure 5, we show
information on the Ask-A-Therapist service and screenshots on how our therapists use the LLM assistant.

3.1. Our value proposition and strategy
Youth mindline is the tailored youth product within the mindline.sg platform described in the previous
section, and it serves as a companion site to Let’s Talk. Where youth mindline enables self-management and
self-empowerment through digital-only “self-help” tools, Let’s Talk offers a purely human-based form of
therapy and engagement. We believe the platform may also address determinants of mental health based on
sociability and that users offering support to others has benefits for the helper and the helped. The
strategy provides a low-barrier means of accessing professional support and provides this support (which
is at the individual level) at scale.

3.2. The role of AI and machine learning
The LLM assistant exemplifies one of our strategies for generative models in healthcare in which an AI
agent acts as an assistant to care providers. As we scale, we envision that such agents can dramatically
reduce the time to search for a pool of potentially optimal therapies for a patient, client, or user’s
exhibited needs at the appropriate time, saving the clinician or care provider time. A sufficiently large
pool of resources, as is the case in mindline.sg, can guard against repetition or a narrow set of
recommended resources.

Figure 5: (Left) The Ask-A-Therapist service on the Let’s Talk digital peer support forum for youth. (Middle) The Telegram
interface for the LLM therapist assistant used by Let’s Talk staff therapists to retrieve therapeutic content relevant for a post.
(Right) Feedback is collected from therapists possibly enabling future AI improvement efforts.
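
The retrieval step behind an assistant of this kind can be illustrated with a small, self-contained sketch.
It is not the authors' implementation: the paper states only that a fine-tuned GPT-3.5 model and
retrieval-augmented generation over an index of mindline.sg resources are used. Here the knowledge base
entries, the function names, and the prompt wording are all hypothetical, and a plain TF-IDF cosine
similarity is swapped in for whatever retriever the deployed system uses. The sketch stops at assembling a
prompt and does not call any language model.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the knowledge base; titles and descriptions are invented.
KNOWLEDGE_BASE = [
    {"title": "Sleep better tonight",
     "text": "Audio exercises for insomnia, trouble falling asleep and restful nights."},
    {"title": "Coping with exam stress",
     "text": "Study planning and breathing exercises to manage exam stress at school."},
    {"title": "Supporting a friend",
     "text": "How to listen to and support a friend who feels sad or lonely."},
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(docs):
    """Return (idf, vectors): smoothed IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(d["title"] + " " + d["text"]) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(tokens).items()} for tokens in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, docs, idf, vectors, k=2):
    """Return up to k documents with non-zero similarity to the question, best first."""
    counts = Counter(t for t in tokenize(question) if t in idf)
    q_vec = {t: tf * idf[t] for t, tf in counts.items()}
    scored = [(cosine(q_vec, v), d) for v, d in zip(vectors, docs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def build_prompt(question, resources):
    """Assemble the text that a language model would receive; no model is called here."""
    listing = "\n".join(f"- {r['title']}: {r['text']}" for r in resources)
    return (
        "Suggest resources for the therapist to consider, using only the list below.\n"
        f"Resources:\n{listing}\n"
        f"Question: {question}\n"
    )


idf, vectors = build_index(KNOWLEDGE_BASE)
question = "I have trouble falling asleep"
hits = retrieve(question, KNOWLEDGE_BASE, idf, vectors)
print(build_prompt(question, hits))
```

Restricting the prompt to retrieved entries is what keeps suggestions inside the trusted knowledge base;
the therapist still reviews and owns the final reply, as described in the text.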


3.3. What’s next?
Future plans for AI and machine learning in Let’s Talk include using LLMs to review therapist responses to
check for or encourage a more empathetic tone. We may also use LLMs to train our peer supporter volunteers
to a standardized level. Finally, LLMs or other NLP techniques can be used to continually assess content
on the site to detect toxicity, spam, and misinformation.

4. Conclusion
The initiatives described have shown significant uptake by patients under care or users among our
population. They confirm the promise of usefulness of digital and AI tools in providing improved
digitally-enabled therapeutics and interventions. These initiatives demonstrate our strategy of starting
at extreme ends of the mental health spectrum and working our way in toward the middle, blending
interventional tools as we go. Through this strategy, we aim to cover the entire life-course and spectrum
of acuity. Along the way, we are incorporating further care and ecosystem partners, including additional
tertiary care partners, primary care providers, allied health, community organizations, peer supporters,
workplace partners, and educational partners. We hope to communicate our learnings to others with a
similar mission. We also hope to learn from others who are on a similar journey.

Ethical Statement
The authors received IRB approval for the HOPE-S digital phenotyping study described in the first section.
The mindline.sg and Let’s Talk services both have Terms of Use (available on their websites) that indicate
that usage data may be collected and used for research purposes and for service improvement. The authors
declare no conflicts of interest.

Responses to Reviewer Comments
We thank the reviewers for their comments. In this published manuscript, we have included succinct
references
and further technical details on the machine learning and AI models, including quantitative experimental
results. We have added in precise definitions of mental wellness and serious mental illnesses in a new
Introduction section. We have made it clear that we are describing results from implementation in the
Abstract and Introduction. These changes address all requests by the reviewers.

References

 [1] D. Robinson, M. G. Woerner, J. M. J. Alvir, R. Bilder, R. Goldman, S. Geisler, A. Koreen,
     B. Sheitman, M. Chakos, D. Mayerhoff, J. A. Lieberman, Predictors of Relapse Following Response From
     a First Episode of Schizophrenia or Schizoaffective Disorder, Archives of General Psychiatry 56
     (1999) 241. URL: http://archpsyc.jamanetwork.com/article.aspx?doi=10.1001/archpsyc.56.3.241.
     doi:10.1001/archpsyc.56.3.241.
 [2] J. Torous, M. V. Kiang, J. Lorme, J.-P. Onnela, New Tools for New Research in Psychiatry: A Scalable
     and Customizable Platform to Empower Data Driven Smartphone Research, JMIR Mental Health 3 (2016)
     e16. URL: http://mental.jmir.org/2016/2/e16/. doi:10.2196/mental.5165.
 [3] N. A. Abdul Rashid, W. Martanto, Z. Yang, X. Wang, C. Heaukulani, N. Vouk, T. Buddhika, Y. Wei,
     S. Verma, C. Tang, R. J. T. Morris, J. Lee, Evaluating the utility of digital phenotyping to predict
     health outcomes in schizophrenia: protocol for the HOPE-S observational study, BMJ Open 11 (2021)
     e046552. URL: https://bmjopen.bmj.com/lookup/doi/10.1136/bmjopen-2020-046552.
     doi:10.1136/bmjopen-2020-046552.
 [4] X. Wang, N. Vouk, C. Heaukulani, T. Buddhika, W. Martanto, J. Lee, R. J. Morris, HOPES: An
     Integrative Digital Phenotyping Platform for Data Collection, Monitoring, and Machine Learning,
     Journal of Medical Internet Research 23 (2021) e23984. URL: https://www.jmir.org/2021/3/e23984.
     doi:10.2196/23984.
 [5] Z. Yang, C. Heaukulani, A. Sim, T. Buddhika, N. A. Abdul Rashid, X. Wang, S. Zheng, Y. F. Quek,
     S. Basu, K. W. Lee, C. Tang, S. Verma, R. J. Morris, Utility of wrist wearable and smartphone-based
     digital phenotyping in psychosis, Submitted (2023).
 [6] W. Martanto, Y. Y. Koh, Z. Yang, C. Heaukulani, X. Wang, N. A. Abdul Rashid, A. Sim, S. Zheng,
     C. Tang, S. Verma, R. J. Morris, J. Lee, Association between wrist wearable digital markers and
     clinical status in Schizophrenia, General Hospital Psychiatry 70 (2021) 134–136.
     URL: https://linkinghub.elsevier.com/retrieve/pii/S0163834321000098.
     doi:10.1016/j.genhosppsych.2021.01.003.
 [7] P. Henson, R. D’Mello, A. Vaidyam, M. Keshavan, J. Torous, Anomaly detection to predict relapse risk
     in schizophrenia, Translational Psychiatry 11 (2021) 28.
     URL: https://www.nature.com/articles/s41398-020-01123-7. doi:10.1038/s41398-020-01123-7.
 [8] D. Ben-Zeev, R. Brian, R. Wang, W. Wang, A. T. Campbell, M. S. H. Aung, M. Merrill, V. W. S. Tseng,
     T. Choudhury, M. Hauser, J. M. Kane, E. A. Scherer, CrossCheck: Integrating self-report, behavioral
     sensing, and smartphone use to identify digital indicators of psychotic relapse, Psychiatric
     Rehabilitation Journal 40 (2017) 266–275. URL: http://doi.apa.org/getdoi.cfm?doi=10.1037/prj0000243.
     doi:10.1037/prj0000243.
 [9] A. Cohen, J. A. Naslund, S. Chang, S. Nagendra, A. Bhan, A. Rozatkar, J. Thirthalli, A. Bondre,
     D. Tugnawat, P. V. Reddy, S. Dutt, S. Choudhary, P. K. Chand, V. Patel, M. Keshavan, D. Joshi,
     U. M. Mehta, J. Torous, Relapse prediction in schizophrenia with smartphone digital phenotyping
     during COVID-19: a prospective, three-site, two-country, longitudinal study, Schizophrenia 9 (2023)
     6. URL: https://www.nature.com/articles/s41537-023-00332-5. doi:10.1038/s41537-023-00332-5.
[10] M. Dieterich, C. B. Irving, H. Bergman, M. A. Khokhar, B. Park, M. Marshall, Intensive case
     management for severe mental illness, Cochrane Database of Systematic Reviews 2017 (2017).
     URL: http://doi.wiley.com/10.1002/14651858.CD007906.pub3. doi:10.1002/14651858.CD007906.pub3.
[11] J. H. Weng, Y. Hu, C. Heaukulani, C. Tan, J. K. Chang, Y. S. Phang, P. Rajendram, W. M. Tan,
     W. C. Loke, R. J. T. Morris, Mental Wellness Self-care in Singapore with mindline.sg: A Framework for
     the Development of a Digital Mental Health Platform for Behaviour Change (Preprint), preprint, JMIR
     Preprints, 2023. URL: http://preprints.jmir.org/preprint/45761. doi:10.2196/preprints.45761.
[12] S. Yoon, H. Goh, X. C. Low, J. H. Weng, C. Heaukulani, Perceived Usability, User Preferences and
     Impact of a Workplace Digital Mental Wellness Platform “Mindline at Work”: A Mixed Methods Study,
     preprint, SSRN, 2023. URL: https://www.ssrn.com/abstract=4579836. doi:10.2139/ssrn.4579836.
[13] C. Beatty, T. Malik, S. Meheli, C. Sinha, Evaluating the Therapeutic Alliance With a Free-Text CBT
     Conversational Agent (Wysa): A Mixed-Methods Study, Frontiers in Digital Health 4 (2022) 847991.
     URL: https://www.frontiersin.org/articles/10.3389/fdgth.2022.847991/full.
     doi:10.3389/fdgth.2022.847991.
[14] B. Inkster, S. Sarda, V. Subramanian, An Empathy-Driven, Conversational Artificial Intelligence
     Agent (Wysa) for Digital Mental Well-Being: Real-World Data Evaluation Mixed-Methods Study, JMIR
     mHealth and uHealth 6 (2018) e12106. URL: http://mhealth.jmir.org/2018/11/e12106/.
     doi:10.2196/12106.
[15] S. H. X. Lu, H. A. Assudani, T. R. R. Kwek, S. W. H. Ng, T. E. L. Teoh, G. C. Y. Tan, A Randomised
     Controlled Trial of Clinician-Guided Internet-Based Cognitive Behavioural Therapy for Depressed
     Patients in Singapore, Frontiers in Psychology 12 (2021) 668384.
     URL: https://www.frontiersin.org/articles/10.3389/fpsyg.2021.668384/full.
     doi:10.3389/fpsyg.2021.668384.
[16] Y. S. Phang, C. Heaukulani, W. Martanto, R. Morris, M. M. Tong, R. Ho, Perceptions of a Digital
     Mental Health Platform Among Participants With Depressive Disorder, Anxiety Disorder, and Other
     Clinically Diagnosed Mental Disorders in Singapore: Usability and Acceptability Study, JMIR Human
     Factors 10 (2023) e42167. URL: https://humanfactors.jmir.org/2023/1/e42167. doi:10.2196/42167.
[17] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih,
     T. Rocktäschel, S. Riedel, D. Kiela, Retrieval-Augmented Generation for Knowledge-Intensive NLP
     Tasks, Advances in Neural Information Processing Systems (2020).
     URL: https://arxiv.org/abs/2005.11401. doi:10.48550/ARXIV.2005.11401.

</pre>