Deploying AI Methods for Mental Health in Singapore: From Mental Wellness to Serious Mental Health Conditions

Creighton Heaukulani¹,∗, Ye Sheng Phang¹, Janice Huiqin Weng¹, Jimmy Lee²,³ and Robert J. T. Morris¹,⁴

¹ MOH Office for Healthcare Transformation, Singapore
² Institute of Mental Health, Singapore
³ Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore
⁴ Yong Loo Lin School of Medicine, National University of Singapore, Singapore

Abstract

We describe our results from the implementation of machine learning and AI methods in three digital health initiatives serving individuals across the mental health spectrum in Singapore. The first initiative is Project HOPES, which we launched in 2019 for patients with serious mental illnesses. Originating as an observational study on digital phenotypes (collected via smartphones and wrist wearables) of 100 patients with schizophrenia, the tool has now been introduced as a service within a tertiary setting and has expanded to include patients with depression. The strategy dynamically prioritizes patients for early review by care coordinators according to need, which may avoid hospitalizations. Machine learning is used to predict clinical status, i.e., symptoms and functioning, and to predict relapses and other adverse clinical events; the latter can be done at 92% sensitivity and 90% specificity with the available digital biomarkers. The second initiative we describe is mindline.sg (www.mindline.sg), a platform for mental wellness in the general population that we created in 2020. Through a public-facing website, we deliver over 800 resources including wellness education, clinically validated self-assessments and triaging, and interactive resources, including an AI chatbot. Launched at the height of the COVID-19 pandemic, with all its attendant stresses, the platform had been visited by between 10% and 20% of the national population by the end of 2023.
The third initiative we describe is Let’s Talk (https://letstalk.mindline.sg), an online peer-support mental health network, which was co-created with youth advocates. The need for this platform was discovered through extensive studies with youth who expressed a desire for human-based support beyond the proliferation of digital solutions. In its first year, the site was visited by over 80,000 unique users. Trained moderators review content on the site for safety and accuracy, and qualified therapists provide professional support through the free and anonymous Ask-a-Therapist service. To scale this service with a growing user base, we have been trialing the use of generative models to aid our therapists in finding relevant resources according to a user’s need and to encourage empathetic writing.

Keywords
digital phenotyping, schizophrenia, depression, mental wellness, AI chatbots, large language models, digital health

Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024, Vancouver, BC, Canada
∗ Corresponding author.
creighton.heaukulani@moht.com.sg (C. Heaukulani); yesheng.phang@moht.com.sg (Y. S. Phang); janice.weng@moht.com.sg (J. H. Weng); jimmy_lee@imh.com.sg (J. Lee); robert.morris@moht.com.sg (R. J. T. Morris)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

Mental health conditions comprise one of the largest burdens of disease worldwide, especially when measured in Years Lived with Disability (YLDs). Stigma is prevalent in many countries and cultures, which discourages help-seeking. At the MOH Office for Healthcare Transformation and the Institute of Mental Health in Singapore, we adopt a population health approach starting from each end of the mental health spectrum and “working our way in”. On the one hand, we serve the needs of patients with serious mental illnesses, which are defined by the US National Institute of Mental Health (NIMH) as “mental, behavioral, or emotional disorder[s] resulting in serious functional impairment, which substantially interferes with or limits one or more major life activities.”¹ On the other hand, we serve the mental wellness needs of the population through mental health promotion, where a local agency, the Singapore Association for Mental Health, defines mental wellness as “a positive state of mental health [that] is more than the absence of mental illness,” in which an individual is “able to think, feel and act in ways that create a positive impact on your physical and social well-being.”² Our solutions include digital health tools that increase the continuity of care, shifting care from the hospital/clinic into the home and from healthcare providers into the hands of individuals. In this article, we describe our results from the implementation of machine learning and AI in three digital health initiatives deployed in our strategy, together with the challenges we continually confront as we deploy these tools across the mental health ecosystem in Singapore.

¹ https://www.nimh.nih.gov/health/statistics/mental-illness
² https://www.samhealth.org.sg/understanding-mental-health/what-is-mental-wellness/

1.1. Serious Mental Illnesses: Digital Phenotyping and AI in Schizophrenia and Depression

Patients in Singapore with schizophrenia are usually treated in the specialist or hospital setting. After discharge to the community, it is not uncommon to see relapses. Around 80% suffer at least one relapse within five years of initial remission [1], which often results in emergency room visits or re-hospitalizations. Relapse is extremely disruptive to a patient’s life and rehospitalization incurs large costs. Relapsing patients often exhibit psychotic symptoms such as hallucinations, delusions, or disordered thinking. They might also display changes in sleep behaviors and mood, social withdrawal, and disorganized behaviors. An emerging strategy to detect such changes involves digital phenotyping, defined as the “moment-by-moment quantification of the individual-level human phenotype in situ using data from personal digital devices” [2]. In 2019, we began the HOPE-S observational study [3], for which we developed the HOPES digital phenotyping platform described in detail by Wang et al. [4]. As of the end of 2023, the platform has been in continuous operation with patients and clinicians for over four years. Events are currently recorded from the user’s smartphone including mobility (derived from obfuscated GPS coordinates), tapping speed on the keyboard, ambient light, screen time, accelerometry, and sociability indices (derived from calls, SMS and WhatsApp calls and messaging). Events are also captured from a wrist wearable device measuring heart rate, heart rate variability, activity (through step counts), and sleep (including staging and efficiency).

Over the course of the study, we continuously collected digital phenotyping data from 100 patients with schizophrenia (each patient was followed up for a six-month period), with clinical assessments performed every six weeks to measure symptoms and functioning. The total data collected throughout the trial consists of over 220 million events. We found generally high compliance in wearing of the wrist device (91% of all possible data was successfully collected in the week following enrolment), which required patients to wear the device at all times, including to sleep, and successful data collection from the smartphone (82% of all possible data was collected), which only required patients to not close the App in the background [5]. We note, however, that this high compliance rate was likely aided by a modest inconvenience fee that was provided to the trial patients. Some patients, to whom the study was offered, declined to participate for reasons including privacy or intrusiveness, leaving us with the interesting challenge as to how we can ameliorate and obviate these concerns in the future. In our initial analyses associating clinical status with digital markers, we found that the following (and other) digital measures were predictive of poor symptoms and functioning: irregular sleep habits (including increased time spent awake in bed and in light stage sleep), decreased steps and GPS mobility, decreased text messages sent, slowed tapping speed, and increased heart rate while asleep, among others [5, 6].

We have also investigated the use of machine learning to predict adverse clinical events, including relapse of psychosis. For these purposes, we defined a relapse as a rehospitalization due to psychosis symptoms or a significant deterioration in clinical status defined using a validated clinician-assessed scale measuring general psychopathology symptoms. Other clinical events include emergency room visits, readmissions due to reasons other than psychosis, and unscheduled clinical visits. For these predictive models, we explored both unsupervised learning approaches to anomaly detection, including time series smoothing and forecasting methods (including the method originally used by Henson et al. [7]) and isolation forests, as well as supervised learning approaches utilizing generalized linear models, random forests, and gradient boosting trees. The models first establish a patient’s baseline (on the multivariate data) and attempt to detect deviations from that baseline at the individual level. Retrospective analyses from our observational trial indicate that we can detect adverse clinical events. The model that performed best varied by situation. For example, when using all digital measures (constituting a very high-dimensional and not very interpretable feature space), the gradient boosting tree performed best. But when we restricted the dataset to the one or two most interpretable features from each sensor (which are studied by clinical staff to explain the model), the isolation forest always performed best. The best model to use therefore depends on the operating mode, and ultimately will be guided by clinical requirements.

Here, we briefly report indicative performance metrics from the gradient boosting tree model. A more thorough account of the methods and predictive results will appear in upcoming publications. A classification setup is used, as studied by Ben-Zeev et al. [8], and sensitivity, specificity, and the harmonic mean score (between sensitivity and specificity) are reported. In our clinical setting, we are willing to accept a reasonable number of false alarms (i.e., lower algorithm specificity) to keep sensitivity high. A false positive alarm may result in the care coordinator giving the participant a call to check in or sending them an inquiring and supportive text message. We therefore give more emphasis to sensitivity, i.e., being able to sense a deterioration in patient health, even if mild. This is acceptable for our clinical partners who envision a shift from the traditional reactive model of care to a proactive one where early detection and intervention might provide extra support and therefore prevent adverse clinical events. We therefore explored optimization of the following weighted harmonic mean score:

\[
H_\beta = (1 + \beta^2)\,\frac{\text{sensitivity} \times \text{specificity}}{(\beta^2 \times \text{specificity}) + \text{sensitivity}}
\]

which has the interpretation that sensitivity is $\beta$ times as important as specificity (where we note that $\beta$ can be greater than or less than one depending on whether you value sensitivity or specificity more). This measure is directly analogous to the weighted $F$-score used in classification. In the experiments that follow, we report $H_2$, where sensitivity is considered twice as important as specificity.

Figure 1: Boxplots of performance metrics for two popular machine learning models on predicting adverse clinical events in schizophrenia, including relapses. [Plot title: android_complete feature model optimized for hm_beta2; y-axis: score; x-axis: metric (sensitivity, specificity, harmonic_mean(b=2)); legend (ml_model): RandomForest-100, GradBoostTree-100.]

Boxplots of $H_2$ over ten 85%/15% training/testing dataset splits for two machine learning models are shown in fig. 1. The median score (over the test sets) using the gradient boosting tree is $H_2$ = 91.0% (91.5% sensitivity, 89.6% specificity). Note that the displayed sensitivity and specificity scores are the median scores for these metrics over the test sets; they do not correspond to the components used to compute the reported $H_2$ score. This predictive power is high but may be difficult to achieve in a clinical service (versus a controlled study) where compliance to wearable and smartphone data collection without financial incentives could be more challenging. The study by Cohen et al. [9] indicates that this real-world compliance could drop to 50%. Missing and delayed data uploads are common in clinical practice. Experiments on subsets of the features suggest that performance could drop to $H_2$ = 85.6% (87.4% sensitivity, 78.7% specificity) with a sparse model containing a subset of measures that are relatively easier to collect (steps, heart rate, accelerometer, screen time, taps in Apps, and tapping speed).

Figure 2: Ranking by feature importance weight in the gradient boosting tree model for the prototypical measures from each sensor. [Bar chart; x-axis: relative feature importance weight (0.00 to 0.30); features as listed on the plot: tapping_speed_ms, hometime_hrs, num_text_msg_sent, dist_travelled_kms, sleep_efficiency_score, total_steps, accelerometry_mean, awake_in_bed_hrs, rem_sleep_hrs, sleep_total_hrs, HR_bpm_resting, HRV_resting, screen_time_hrs, ambient_light_lumens.]

To explore those digital measures that appear most important for prediction, we fit the gradient boosting tree model on the prototypical measures from each feature and display the “relative feature importance weights” (the impurity-based feature importance measures implemented by most packages) in fig. 2. In this experiment, we see that tapping speed is considered the most important feature, followed by GPS-based and wrist device measures, including distance travelled, time at home, sleep efficiency score, and number of steps. This example only considers a small subset of the features, however, and we do see such rankings change depending on the experimental setup and the feature set provided.

Having successfully completed the above study, we are now piloting the use of this model as a preventative service at the Institute of Mental Health (a large mental healthcare facility) in Singapore. This service is now being extended to serve both patients with schizophrenia and mood disorders including major depression. For this new HOPES Clinical Service, patient-facing components for the smartphone App have been developed to supplement what was purely passive monitoring in the previous observational trial. Supported on both Android and iOS, patients may now interact with the App through Ecological Momentary Assessments (EMAs), wherein they may answer some questions as to how they are feeling at the time of a prompt, and they may additionally record what factors or stressors might be contributing to those feelings. Such data collection is sometimes referred to as active monitoring. The EMA responses are transmitted to the care coordinators to aid in care and supplement the digital phenotyping data, which together surface patients on a clinical dashboard. Our strategy is to have clinicians in a care monitoring center regularly observe the anomaly detection signals on the dashboard, and with the help of digital tools that can explore a patient’s data the clinical staff may decide to take further interventions, including earlier clinical review for a patient.

Figure 3: The clinical care coordinator dashboard in the HOPES service for serious mental illnesses surfaces patients according to R/Y/G triage status, as determined by clinician-defined rules on the digital phenotyping data. Machine learning algorithms may only ever increase the severity of a patient’s status. [Diagram: digital phenotyping and EMA signals (EMAs, sleep, activity, heart rate, sociability indices, mobility patterns, tapping speed, screen time) feed clinician-defined algorithms and machine learning routines.]

The clinical dashboard is designed to achieve the effectiveness of Intensive Case Management [10] in the presence of large caseloads. Cost-effective staffing dictates that the caseload of care coordinators (caring for patients that have returned to the community) is prioritized by a positional ordering that is determined by a Red/Yellow/Green (R/Y/G) status, shown in fig. 3. A Green (G) status indicates normal behavior and requires no action. A Yellow (Y) status indicates moderate deviation of behavior from normal (or possibly oncoming and escalating symptoms) and triggers automated interventions (described later). A Red (R) status indicates persistent and significant deviation from normal and requires review by the clinical team. To ensure patient safety, it is important that this prioritization be effective and based on an explainable clinical rationale. Hence a rule-based and fully explainable set of criteria has been developed (working closely with the clinicians who oversee the welfare of the patients), which determines whether a patient is flagged Y or R or left in the G state. This explainable property is also desirable to assist a care coordinator when they call the patient: they may explain what they have noticed in the digital markers (e.g., “you don’t seem to have been very active lately” or “it seems you haven’t been sleeping well”).

Patients with detected needs of lower acuity (i.e., those in the Y state) are supported with automated interventions consisting of digital therapeutics delivered through the HOPES App. These interventions, based on the EMAs as well as the passive data, are timely and individualized: for example, sleep exercises are delivered when patients have poor sleep for two nights or more, and mindfulness exercises are delivered when patients indicate low mood on the EMAs. Such interventions are known as Ecological Momentary Interventions (EMIs), and some of these digital therapeutic exercises are inspired by cognitive behavioral therapy (CBT).

1.2. Our value proposition and strategy

Digital phenotyping enables the continual sensing of needs, both to drive automated EMIs and to trigger stepped-up human-based care. The previous standard of care had no such real-time sensing capabilities and has been entirely reactive and episodic. This strategy strengthens the connectivity between care team and patient and augments the mission to increase continuity of care and extend care beyond the clinic into the community. We initially focused on schizophrenia because we anticipated quite strong indicative behavior-based signals. We are now scaling both “up” and “out” by moving into other diagnosed conditions such as depression and into adjacent populations, such as well populations, and into physical health conditions.

1.3. The role of AI and machine learning

Machine learning is used to predict a patient’s clinical status based on the multivariate set of digital markers, i.e., potential predictors of a patient’s current symptom severity and functioning. This may potentially avert the need to come in for clinical visits as is currently required under standard care. Machine learning is also used in the longitudinal prediction of relapses and may raise the color coding of severity and prioritization of the patient for attention by the care coordinator. We also note that AI is used in a self-contained way in some of the EMI tools in the form of a mental wellness chatbot, which we will describe in the next section on our digital mental wellness platform, mindline.sg.

The machine learning models utilized for clinical event prediction include traditional time series smoothing, isolation forests, generalized linear models, random forests, and gradient boosting trees. The methods used for discovering associations between clinical scales and digital markers were multiple linear regression and multilevel/hierarchical linear regression. A detailed description of these methods is beyond the scope of this paper and will appear in upcoming publications.

Explainable signals are important for the clinical service. The multivariate nature of our digital phenotyping data aids clinicians in the interpretation of alerts, for which we provide digital tools to allow clinicians to keep track of patient data. It is important to note that our machine learning-based relapse prediction algorithm is only permitted to increase patient severity (from G to Y or from Y to R), which may prompt attention from a care coordinator (see fig. 3). The AI algorithm may never relegate a patient to lower acuity. This is not only for safety, but also aligns with Singapore’s national regulatory guidelines for predictive models in clinical service. As further experience is gained in the detection and management of patients with digital phenotyping, we may be able to move to a wider use of AI algorithms and to allow them to play a more definitive role in the selection of patients for clinical review. As new regulatory guidelines emerge and are navigated, pivots to our design and strategy may occur.

1.4. What’s next?

So far, the bulk of our experience has been with schizophrenia. Our service, however, has expanded to include patients with depression and takes a transdiagnostic approach, which may justify further research trials to develop and refine algorithms for depression in the local context. Additionally, an upcoming research study will evaluate our clinical service. Finally, we are currently expanding our digital phenotyping strategy to mental wellness. In this way, our digital phenotyping and machine learning tools are moving “inward” toward mild severity and well-populations.

2. Mental Wellness in the Community: Tools for the Well and Mild-to-Moderate Needs

Many countries and cultures face persistent challenges to mental health awareness and promotion including stigma toward mental disorders, reluctance to seek help, low mental health literacy, a lack of trained mental health personnel, and underdeveloped mental healthcare ecosystems. The COVID-19 pandemic created a surge in mental healthcare needs, demanding a new impetus to address these shortcomings. The pandemic also accelerated the adoption and acceptance of digital health solutions, creating a new opportunity for innovative approaches to address mental healthcare needs.

In June 2020, we launched a Web App, mindline.sg (www.mindline.sg), serving as a digital mental health resource website that has grown to include over 800 curated resources, a clinically-validated self-assessment tool for depression and anxiety, and a fully integrated AI chatbot developed for mental health applications from Wysa³, a leading partner in the field. More recently, we added curricular and structured learning materials with Intellect⁴, another leading partner in digital mental health. The landing page and a therapeutic exercise with the chatbot are shown in fig. 4. The self-assessment tool is composed of the conventional scales of GAD-7 and PHQ-9 and triages users into well, mild, moderate, and crisis levels. The triage status allows us to customize content and recommend appropriate therapeutic exercises. Tailored products for youth and working adults are also provided to the public. Business-to-Business (B2B) customizations and engagements serve a range of ecosystem partners, including workplace partners and educational institutions. Resources on common risk factors (such as financial, employment and caregiver stress) are included, addressing a broad spectrum of determinants of mental health. The platform was developed to be anonymous and to contain authoritative and localised content for various levels of distress, with a focus on wellness and mild needs. Moderate-to-severe needs are primarily served by detection (through the self-assessment triage and AI chatbot) and referral to professional support, which includes counselling centres and emergency 24/7 services, all according to a clinician-designed protocol.
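
The triage step described above can be illustrated with a minimal sketch. The paper does not state the cut-offs or the combination rule used on mindline.sg, so everything below is an assumption for illustration: the thresholds (5, 10, 15) are the commonly cited severity bands for PHQ-9 and GAD-7 totals, their mapping onto the four levels (well, mild, moderate, crisis) is hypothetical, and taking the more severe of the two scales is a hypothetical rule. The names `Triage`, `CUTOFFS`, and `triage` are invented for this example and do not describe the platform's clinician-designed protocol.

```python
from enum import IntEnum


class Triage(IntEnum):
    """Four triage levels named in the text, ordered by severity."""
    WELL = 0
    MILD = 1
    MODERATE = 2
    CRISIS = 3


# Hypothetical cut-offs on the total score (assumption, not from the paper):
# 0-4 well, 5-9 mild, 10-14 moderate, 15+ crisis.
CUTOFFS = ((15, Triage.CRISIS), (10, Triage.MODERATE), (5, Triage.MILD))


def total_score(items, n_items):
    """Sum questionnaire items after checking count and range (each item is 0-3)."""
    if len(items) != n_items or any(i not in (0, 1, 2, 3) for i in items):
        raise ValueError(f"expected {n_items} items, each scored 0-3")
    return sum(items)


def level(total):
    """Map a total score onto a triage level using the assumed cut-offs."""
    for threshold, lvl in CUTOFFS:
        if total >= threshold:
            return lvl
    return Triage.WELL


def triage(phq9_items, gad7_items):
    """Return the more severe of the PHQ-9 (9 items) and GAD-7 (7 items) levels."""
    return max(level(total_score(phq9_items, 9)),
               level(total_score(gad7_items, 7)))


if __name__ == "__main__":
    # PHQ-9 total 5 (mild) and GAD-7 total 12 (moderate) give the more severe level.
    print(triage([1, 1, 1, 1, 1, 0, 0, 0, 0], [2, 2, 2, 2, 2, 1, 1]).name)  # MODERATE
```

Ordering the levels as an `IntEnum` makes "take the more severe result" a plain `max`, which keeps the rule easy to read and audit; a real protocol would likely add item-level safety checks that are not shown here.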
The platform has shown remarkable uptake. The site has been visited by between 10% and 20% of the (targeted) national population. The variability of this estimate is due to the anonymity feature of the site (we can only detect cookie IDs), which is a key feature enabling barrier-free access. If unique users visit from multiple machines or browsers, we may record them more than once. Indeed, it is a learning from our implementation that anonymity does limit our ability to evaluate the platform. Another learning has been that successfully scaling the platform required expansive and sustained digital marketing efforts, as well as strategic ecosystem partnerships through the B2B products, and investment in partnership with educational institutions and healthcare providers. About 60% of our user acquisitions come from ads that we post on social media; the next most frequent acquisitions are from search, direct entry of the URL, or use of a QR code from our flyers and posters. Referrals also occur from B2B partner sites. The most popular resources used on the platform are sleep aids. “Mood check-ins” and the self-assessment tool are also popular. The distribution of moods includes “tired”, “unmotivated”, “anxious”, “positive”, “frustrated” and “sad” (in order of decreasing frequency). Scores for GAD-7 and PHQ-9 show a moderate number in a state of crisis; however, we believe that this frequency is affected by anonymous users trying out different answers to the questions to trigger different triage levels and seeing what resources are offered, mainly out of curiosity. We do not view this kind of usage negatively, as we feel that it is important for users to be educated as to what resources are available in times of need, either for themselves or a friend or relative. We have published these results in both process and impact evaluation studies [11, 12].

³ www.wysa.com
⁴ https://intellect.co

Figure 4: (Left) The mindline.sg landing page with the AI chatbot and self-assessment triage tools in the top two panels. (Middle and Right) The Wysa AI chatbot directs a user to an exercise inspired by cognitive behavioral therapy.

2.1. Our value proposition and strategy

The goal of the mindline.sg platform is to empower individuals in the community to take charge of their own mental health and to provide them the tools they need to offer basic support (“first aid”) to themselves and those around them, all through the ease and convenience of a barrier-free digital solution. This aligns with our strategy to improve population health through digital tools that enable self-empowerment and self-management. Such a strategy transfers some of the care from the system onto the individual and their significant others and moves some care from the clinical setting into the community and the home.

2.2. The role of AI and machine learning

A natural language processing (NLP)-based chatbot from Wysa is deployed to engage, triage, chat with, and direct the users to a range of relaxation, mindfulness, and CBT-inspired exercises. The Wysa chatbot is designed by a team of psychologists actively involved in patient counseling and has been subjected to numerous studies evaluating effectiveness [13, 14]⁵. Beyond the chatbot, however, we have limited data collected on the site (again, due to the anonymity feature), which limits any machine learning and AI efforts. A mobile App is presently being developed to give users an alternative option, which may be able to better leverage AI to improve user experience and benefit.

⁵ https://www.wysa.com/clinical-evidence

2.3. What’s next?

The present wellness tool is mainly designed to serve those who are well or have mild conditions, with referral to human-based resources for moderate and crisis cases. We are also exploring the incorporation of clinical adjunct tools such as validated and localized internet CBT (iCBT) tools [15], which can be used by mental health-trained primary care providers. The mobile App will enable longitudinal data tracking and clinical management, which in turn may provide new opportunities for AI and machine learning in the service. We have already conducted a feasibility study of the tool in this population [16].

3. Digitally-Enabled Peer and Professional Mental Wellness Support for Youth

After extensive workshops, focus group discussions, and co-design sessions with youth (including those with lived experience), we discovered a desire for social support and meaningful human interactions delivered in a safe online environment. In particular, human-based support is now specifically sought amongst the proliferation of purely digital self-management solutions. We therefore co-created with youth advocates an online peer support network called Let’s Talk (https://letstalk.mindline.sg). The platform soft-launched in October 2022 and has been piloting since. By the end of December 2023, the site had received over 80,000 unique visitors (as measured by Google Analytics). This peer-support network’s value proposition, over other platforms such as Reddit, includes close oversight and management by trained moderators and professional therapists. The moderators and therapists maintain a constructive and supportive atmosphere in the forum; other comparable forums have suffered from trolling, toxicity, spam, and scams. It is noted that some “medically themed” forums are overly commercial or are used by practitioners to advertise their own practices. Let’s Talk also provides an Ask-a-Therapist service where users can pose a question to a panel of qualified professional therapists whom we have engaged; their response is asynchronous but usually occurs within 24 hours. The therapists follow a protocol defined by a Clinical Advisory Panel to deal with pressing needs and crisis conditions. As of December 2023, from among the over 80,000 visitors to the site, over 6,000 have registered an anonymous user account (which is required to post content but is not required to access the forum and read posts). There have been over 2,800 posts, and over 370 Ask-a-Therapist questions have been answered. As the platform’s user base scales, we anticipate challenges in moderating the platform and responding to questions in a timely manner while maintaining quality. We have therefore started trialing the use of large language models (LLMs) to assist our staff therapists in searching for relevant content from a trusted knowledge base (mindline.sg) based on the user’s need (inferred from the posted question). We used GPT-3.5 from OpenAI, which we fine-tuned using over 300 question-answer pairs from the Ask-a-Therapist service. Retrieval-augmented generation (RAG) [17] is used to produce the most suitable resources from the knowledge base, which is indexed from all resources in mindline.sg. While the therapist may copy-and-paste recommended resources (and their descriptions) from this tool, they remain fully responsible for the content in their reply. The LLM assistant has helped our staff therapists with close to 30 responses so far, where 88% of those responses have been rated as helpful by the therapist. In fig. 5, we show information on the Ask-a-Therapist service and screenshots on how our therapists use the LLM assistant.

3.1. Our value proposition and strategy

Youth mindline is the tailored youth product within the mindline.sg platform described in the previous section, and it serves as a companion site to Let’s Talk. Where youth mindline enables self-management and self-empowerment through digital-only “self-help” tools, Let’s Talk offers a purely human-based form of therapy and engagement. We believe the platform may also address determinants of mental health based on sociability and that users offering support to others has benefits for the helper and the helped. The strategy provides a low-barrier means of accessing professional support and provides this support (which is at the individual level) at scale.

3.2. The role of AI and machine learning

The LLM assistant exemplifies one of our strategies for generative models in healthcare in which an AI agent acts as an assistant to care providers. As we scale, we envision that such agents can dramatically reduce the time to search for a pool of potentially optimal therapies for a patient, client, or user’s exhibited needs at the appropriate time, saving the clinician or care provider time. A sufficiently large pool of resources, as is the case in mindline.sg, can guard against repetition or a narrow set of recommended resources.

Figure 5: (Left) The Ask-a-Therapist service on the Let’s Talk digital peer support forum for youth. (Middle) The Telegram interface for the LLM therapist assistant used by Let’s Talk staff therapists to retrieve therapeutic content relevant for a post. (Right) Feedback is collected from therapists, possibly enabling future AI improvement efforts.

3.3. What’s next?

Future plans for AI and machine learning in Let’s Talk include using LLMs to review therapist responses to check for or encourage a more empathetic tone. We may also use LLMs to train our peer supporter volunteers to a standardized level. Finally, LLMs or other NLP techniques can be used to continually assess content on the site to detect toxicity, spam, and misinformation.

4. Conclusion

The initiatives described have shown significant uptake by patients under care or users among our population. They confirm the promise of usefulness of digital and AI tools in providing improved digitally-enabled therapeutics and interventions. These initiatives demonstrate our strategy of starting at extreme ends of the mental health spectrum and working our way in toward the middle, blending interventional tools as we go. Through this strategy, we aim to cover the entire life-course and spectrum of acuity. Along the way, we are incorporating further care and ecosystem partners, including additional tertiary care partners, primary care providers, allied health, community organizations, peer supporters, workplace partners, and educational partners. We hope to communicate our learnings to others with a similar mission. We also hope to learn from others who are on a similar journey.

Ethical Statement

The authors received IRB approval for the HOPE-S digital phenotyping study described in the first section. The mindline.sg and Let’s Talk services both have Terms of Use (available on their websites) that indicate that usage data may be collected and used for research purposes and for service improvement. The authors declare no conflicts of interest.

Responses to Reviewer Comments

We thank the reviewers for their comments. In this published manuscript, we have included succinct references and further technical details on the machine learning and AI models, including quantitative experimental results. We have added in precise definitions of mental wellness and serious mental illnesses in a new Introduction section. We have made it clear that we are describing results from implementation in the Abstract and Introduction. These changes address all requests by the reviewers.

[7] P. Henson, R. D’Mello, A. Vaidyam, M. Keshavan, J. Torous, Anomaly detection to predict relapse risk in schizophrenia, Translational Psychiatry 11 (2021) 28. URL: https://www.nature.com/articles/s41398-020-01123-7. doi:10.1038/s41398-020-01123-7.
[8] D. Ben-Zeev, R. Brian, R. Wang, W. Wang, A. T. Campbell, M. S. H. Aung, M. Merrill, V. W. S. Tseng, T. Choudhury, M. Hauser, J. M. Kane, E. A.
Scherer, References CrossCheck: Integrating self-report, behavioral sensing, and smartphone use to identify digital in- [1] D. Robinson, M. G. Woerner, J. M. J. Alvir, R. Bilder, dicators of psychotic relapse., Psychiatric Rehabil- R. Goldman, S. Geisler, A. Koreen, B. Sheitman, itation Journal 40 (2017) 266–275. URL: http://doi. M. Chakos, D. Mayerhoff, J. A. Lieberman, Pre- apa.org/getdoi.cfm?doi=10.1037/prj0000243. doi:10. dictors of Relapse Following Response From a First 1037/prj0000243 . Episode of Schizophrenia or Schizoaffective Dis- [9] A. Cohen, J. A. Naslund, S. Chang, S. Nagendra, order, Archives of General Psychiatry 56 (1999) A. Bhan, A. Rozatkar, J. Thirthalli, A. Bondre, 241. URL: http://archpsyc.jamanetwork.com/article. D. Tugnawat, P. V. Reddy, S. Dutt, S. Choudhary, aspx?doi=10.1001/archpsyc.56.3.241. doi:10.1001/ P. K. Chand, V. Patel, M. Keshavan, D. Joshi, archpsyc.56.3.241 . U. M. Mehta, J. Torous, Relapse prediction [2] J. Torous, M. V. Kiang, J. Lorme, J.-P. Onnela, New in schizophrenia with smartphone digital Tools for New Research in Psychiatry: A Scal- phenotyping during COVID-19: a prospec- able and Customizable Platform to Empower Data tive, three-site, two-country, longitudinal Driven Smartphone Research, JMIR Mental Health study, Schizophrenia 9 (2023) 6. URL: https: 3 (2016) e16. URL: http://mental.jmir.org/2016/2/ //www.nature.com/articles/s41537-023-00332-5. e16/. doi:10.2196/mental.5165 . doi:10.1038/s41537- 023- 00332- 5 . [3] N. A. Abdul Rashid, W. Martanto, Z. Yang, X. Wang, [10] M. Dieterich, C. B. Irving, H. Bergman, C. Heaukulani, N. Vouk, T. Buddhika, Y. Wei, M. A. Khokhar, B. Park, M. Marshall, In- S. Verma, C. Tang, R. J. T. Morris, J. Lee, Eval- tensive case management for severe men- uating the utility of digital phenotyping to pre- tal illness, Cochrane Database of Sys- dict health outcomes in schizophrenia: protocol tematic Reviews 2017 (2017). 
URL: http: for the HOPE-S observational study, BMJ Open //doi.wiley.com/10.1002/14651858.CD007906.pub3. 11 (2021) e046552. URL: https://bmjopen.bmj.com/ doi:10.1002/14651858.CD007906.pub3 . lookup/doi/10.1136/bmjopen-2020-046552. doi:10. [11] J. H. Weng, Y. Hu, C. Heaukulani, C. Tan, J. K. 1136/bmjopen- 2020- 046552 . Chang, Y. S. Phang, P. Rajendram, W. M. Tan, W. C. [4] X. Wang, N. Vouk, C. Heaukulani, T. Buddhika, Loke, R. J. T. Morris, Mental Wellness Self-care in W. Martanto, J. Lee, R. J. Morris, HOPES: An Inte- Singapore with mindline.sg: A Framework for the grative Digital Phenotyping Platform for Data Col- Development of a Digital Mental Health Platform lection, Monitoring, and Machine Learning, Journal for Behaviour Change (Preprint), preprint, JMIR of Medical Internet Research 23 (2021) e23984. URL: Preprints, 2023. URL: http://preprints.jmir.org/ https://www.jmir.org/2021/3/e23984. doi:10.2196/ preprint/45761. doi:10.2196/preprints.45761 . 23984 . [12] S. Yoon, H. Goh, X. C. Low, J. H. Weng, C. Heauku- [5] Z. Yang, C. Heaukulani, A. Sim, T. Buddhika, N. A. lani, Perceived Usability, User Preferences and Im- Abdul Rashid, X. Wang, S. Zheng, Y. F. Quek, pact of a Workplace Digital Mental Wellness Plat- S. Basu, K. W. Lee, C. Tang, S. Verma, R. J. Mor- form “Mindline at Work”: A Mixed Methods Study, ris, Utility of wrist wearable and smartphone-based preprint, SSRN, 2023. URL: https://www.ssrn.com/ digital phenotyping in psychosis, Submitted (2023). abstract=4579836. doi:10.2139/ssrn.4579836 . [6] W. Martanto, Y. Y. Koh, Z. Yang, C. Heaukulani, [13] C. Beatty, T. Malik, S. Meheli, C. Sinha, Evaluat- X. Wang, N. A. Abdul Rashid, A. Sim, S. Zheng, ing the Therapeutic Alliance With a Free-Text CBT C. Tang, S. Verma, R. J. Morris, J. Lee, Association Conversational Agent (Wysa): A Mixed-Methods between wrist wearable digital markers and clinical Study, Frontiers in Digital Health 4 (2022) 847991. 
status in Schizophrenia, General Hospital Psychi- URL: https://www.frontiersin.org/articles/10.3389/ atry 70 (2021) 134–136. URL: https://linkinghub. fdgth.2022.847991/full. doi:10.3389/fdgth.2022. elsevier.com/retrieve/pii/S0163834321000098. 847991 . doi:10.1016/j.genhosppsych.2021.01.003 . [14] B. Inkster, S. Sarda, V. Subramanian, An Empathy- Driven, Conversational Artificial Intelligence Agent (Wysa) for Digital Mental Well-Being: Real-World Data Evaluation Mixed-Methods Study, JMIR mHealth and uHealth 6 (2018) e12106. URL: http: //mhealth.jmir.org/2018/11/e12106/. doi:10.2196/ 12106 . [15] S. H. X. Lu, H. A. Assudani, T. R. R. Kwek, S. W. H. Ng, T. E. L. Teoh, G. C. Y. Tan, A Randomised Controlled Trial of Clinician-Guided Internet-Based Cognitive Behavioural Therapy for Depressed Pa- tients in Singapore, Frontiers in Psychology 12 (2021) 668384. URL: https://www.frontiersin. org/articles/10.3389/fpsyg.2021.668384/full. doi:10. 3389/fpsyg.2021.668384 . [16] Y. S. Phang, C. Heaukulani, W. Martanto, R. Morris, M. M. Tong, R. Ho, Perceptions of a Digital Mental Health Platform Among Participants With Depres- sive Disorder, Anxiety Disorder, and Other Clin- ically Diagnosed Mental Disorders in Singapore: Usability and Acceptability Study, JMIR Human Fac- tors 10 (2023) e42167. URL: https://humanfactors. jmir.org/2023/1/e42167. doi:10.2196/42167 . [17] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, S. Riedel, D. Kiela, Retrieval-Augmented Generation for Knowledge- Intensive NLP Tasks, Advances in Neu- ral Information Processing Systems (2020). URL: https://arxiv.org/abs/2005.11401. doi:10.48550/ARXIV.2005.11401 , publisher: arXiv Version Number: 4.