ML4CMH: First Workshop on Machine Learning for Cognitive and Mental Health⋆

Marija Stanojevic1,∗
1 Cambridge Cognition, Toronto, ON, Canada

Abstract
With a mental health crisis magnified by COVID-19 and a growing elderly population (10.7% of the population aged over 65 is diagnosed with Alzheimer’s disease and 18% is diagnosed with mild cognitive impairment (MCI)), there is an immediate need to develop systems that can better understand and characterize cognitive and mental health (CMH) by tracking various biomarkers from functional magnetic resonance imaging (fMRI), electroencephalogram (EEG), speech, electronic health record (EHR), movement, cognitive surveys, wearable devices, structured, genomic, and epigenomic data. One of the core technical opportunities for accelerating the computational analysis of CMH lies in multimodal (MM) ML: learning representations that model the heterogeneity and interconnections between diverse input signals. MM learning is particularly important in CMH, primarily because labels are noisy and surveys are inherently subjective. Using multiple signals and modalities offers a potential way to overcome these challenges. In addition, the CMH research community needs more data sharing and closer collaboration. Tackling the multifaceted challenges posed by cognitive and mental health disorders requires a collective effort to facilitate access to high-quality datasets and to promote collaborative initiatives. By promoting transparency and facilitating the exchange of insights and methodologies, we can accelerate progress and drive innovation in CMH research. This workshop serves as a platform for fostering such collaboration, inviting participants to contribute their expertise and insights towards the shared goal of advancing our understanding and treatment of cognitive and mental health disorders.
Together, through open dialogue and shared resources, we can chart a path towards a brighter future for individuals affected by CMH conditions.

Keywords
Mental health crisis, Cognitive health, Biomarkers, Multimodal Learning, Deep learning, Multilingual clinical data

⋆ Machine Learning for Cognitive and Mental Health Workshop (ML4CMH), AAAI 2024, Vancouver, BC, Canada, https://winterlightlabs.github.io/ml4cmh2024/
∗ Corresponding author: mstanojevic118@gmail.com (M. Stanojevic)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073

1. Introduction

Recently, major progress has been made in pre-trained deep and MM learning from text, speech, images, video, signals, and structured data [1, 2, 3, 4], and there has also been initial success towards using deep learning and MM streams to improve prediction of patient status or response to treatment in CMH applications [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]. However, there remain computational and theoretical challenges that need to be solved in machine learning for CMH, spanning

1. collecting and sharing quality data for moderate and severe patients,
2. learning from many diverse and understudied signals,
3. theoretically understanding the natural way of modality connections and interactions in MM learning,
4. real-world deployment concerns such as safety, robustness, interpretability, and collaboration with various stakeholders, and
5. extending models to low-resource and multilingual environments.

This workshop has three primary goals:

1. bring together experts from multiple disciplines working on ML and CMH to learn from each other,
2. encourage the development of shared goals and approaches across these communities, and
3. stimulate creation of better MM technologies for real-world CMH impact.

To achieve these goals, this workshop includes a diverse lineup of invited speakers across fields associated with ML and CMH, hosting experts from computer vision (CV), natural language processing (NLP), MM learning, signal processing, human-computer interaction, neuroscience, psychiatry, and psychology. To encourage discussion and further collaboration toward the advancement of ML for CMH, the workshop combines invited talks, contributed papers and posters, and a panel discussion. In addition, the organizers hosted a mentorship program with the help of mentors from the program committee, similar to the mentorship program of ACL-SRW (https://acl2023-srw.github.io/), in order to increase reach and to help researchers from across the world who are new to this field improve the quality of their papers before the submission deadline.

This workshop contributes to the diversity of the field and increases collaboration between machine learning, psychiatry, psychology, and neuroscience researchers. It encourages collaboration to solve critical CMH tasks and create new datasets and resources to foster CMH research. In addition, it encourages multilingual and multimodal research. The organizers made an effort to invite keynote speakers, panelists, and program committee members from diverse backgrounds, involving both academia and industry. Specifically, the organizers made concerted efforts to involve underrepresented groups, so the speakers include LGBTQ people, and 50% of them are women. Moreover, the program committee comprises researchers from 12 countries across 5 continents.

2. Workshop Structure

The workshop will take place at Vancouver Convention Centre - West Building, Room 205, on February 26th, 2024. It features six keynote speakers, oral sessions, poster sessions, a panel discussion, and a networking lunch. From 20 submitted papers, six were selected for oral and poster presentation and an additional nine papers were selected for poster presentation only. The acceptance rate was therefore 75%. See the detailed schedule in Table 1. Further details about the workshop can be accessed at https://winterlightlabs.github.io/ml4cmh2024/.

| Session type | Speaker | Time | Title |
|---|---|---|---|
| Welcome Note | Dr. Marija Stanojevic | 9:00 - 9:05 am | - |
| Keynote 1 | Prof. Peter Foltz | 9:05 - 9:35 am | Three Challenges to AI-Based Measurement of Mental State and Cognitive Function |
| Keynote 2 | Dr. Sunny Tang | 9:35 - 10:05 am | Windows on Psychosis: The Interplay Among Speech, Language, Cognition and Clinical Symptoms |
| Keynote 3 | Dr. Paola Pedrelli | 10:05 - 10:35 am | Harmony in Minds: Unleashing the Potential of Interdisciplinary Collaboration in Computer Science and Psychiatry for AI-Powered Mental Health Innovations |
| Poster Session | See below | 10:35 - 11:00 am | - |
| Oral Session 1 | See below | 11:00 - 11:20 am | Knowledge-enhanced Memory Model for Emotional Support Conversation |
| Oral Session 1 | See below | 11:20 - 11:40 am | Learning to Generate Context-Sensitive Backchannel Smiles for Embodied AI Agents with Applications in Mental Health Dialogues |
| Oral Session 1 | See below | 11:40 am - 12:00 pm | A Pretrained Language Model for Mental Health Risk Detection |
| Lunch | - | 12:00 - 1:15 pm | - |
| Keynote 4 | Dr. Guillermo Cecchi | 1:15 - 1:45 pm | Machine Learning Challenges for Large Longitudinal Clinical Trials in Mental Health |
| Keynote 5 | Prof. Robert JT Morris | 1:45 - 2:15 pm | Safe Deployment of AI Methods for Mental Health: From Mental Wellness to Serious Mental Conditions |
| Keynote 6 | Prof. Irina Rish | 2:15 - 2:45 pm | AI 4 Psychology and Psychology 4 AI: Towards Better Alignment Among Humans and Machines |
| Oral Session 2 | See below | 2:45 - 3:00 pm | PMC: Paired Multi-Contrast MRI Dataset at 1.5T and 3T for Supervised Image2Image Translation |
| Oral Session 2 | See below | 3:00 - 3:15 pm | Dance of the Neurons: Unraveling Sex from Brain Signals |
| Oral Session 2 | See below | 3:15 - 3:30 pm | Mental Health Stigma across Diverse Genders in Generative Large Language Models |
| Poster Session | See below | 3:30 - 4:00 pm | - |
| Panel | See below | 4:00 - 5:00 pm | Future Directions and Biggest Obstacles |

Table 1
A Full Day Workshop - Schedule

3. Keynote Speakers

1. Peter Foltz (https://scholar.google.com/citations?user=UwQSEOkAAAAJ), University of Colorado, Boulder, Professor, Cognitive Science & Computational Psychiatry
2. Irina Rish (https://scholar.google.com/citations?user=Avse5gIAAAAJ), University of Montreal, MILA, CIFAR, Professor, ML for Neuroscience
3. Guillermo Cecchi (https://scholar.google.com/citations?user=pQZaTGEAAAAJ), IBM, Principal Researcher, Computational Psychiatry & Neuroimaging
4. Paola Pedrelli (https://scholar.google.com/citations?user=E_Ug5tsAAAAJ), Harvard Medical School, Assistant Professor, ML for Psychology
5. Robert JT Morris (https://scholar.google.com/citations?user=QLaCxaoAAAAJ), National University of Singapore, Singapore MOH Office for Healthcare Transformation, Professor, Digital Mental Health
6. Sunny X. Tang (https://scholar.google.com/citations?user=ar-oFSwAAAAJ), Northwell Health, Assistant Professor, ML for Psychiatry

4. Panel Speakers

1. Peter Foltz (https://scholar.google.com/citations?user=UwQSEOkAAAAJ), University of Colorado, Boulder, Professor, Cognitive Science & Computational Psychiatry
2. Paola Pedrelli (https://scholar.google.com/citations?user=E_Ug5tsAAAAJ), Harvard Medical School, Assistant Professor, ML for Psychology
3. Frank Rudzicz (https://scholar.google.ca/citations?user=elXOB1sAAAAJ), Dalhousie University, Vector Institute, CIFAR, Associate Professor, ML for Healthcare
4. Jekaterina Novikova (https://scholar.google.com/citations?user=C75JskwAAAAJ), Winterlight Labs, ML Director, NLP & Speech, ML for CMH
5. Vikram Ramanarayanan (https://scholar.google.com/citations?user=mUm8U2IAAAAJ), Modality.AI, CSO, Speech & Image Processing for CMH
6. Xiaoxiao Li (https://scholar.google.com/citations?user=sdENOQ4AAAAJ), University of British Columbia, Trustworthy AI

Organizers

Organization Team

Marija Stanojevic (https://scholar.google.com/citations?user=pAyfhIkAAAAJ), Ph.D., is an Applied Machine Learning Scientist at Winterlight Labs. She focuses on representation learning, multimodal, multilingual, and transfer learning for cognitive and mental health. She was a virtual chair of ICLR 2021 and ICML 2021 and main organizer of the 9th Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL 2022). General Chair.

Elizabeth Shriberg (https://scholar.google.com/citations?user=nRZJYPIAAAAJ), Ph.D., specializes in the computational modeling of speech and language. She is currently CSO at Ellipsis Health, a start-up developing speech-based mental health screening technologies for clinical applications. She previously held Senior Principal Scientist roles at Amazon, SRI International, and Microsoft. She is a Fellow of ISCA (https://www.isca-speech.org/iscaweb/), SRI (https://www.sri.com/about-us/), and AAIA (https://www.aaia-ai.org/), and has over 300 publications and patents in speech technology and related fields. Speaker & Panel Chair.

Paul Pu Liang (https://scholar.google.com/citations?user=pKf5LtQAAAAJ) is a PhD student at CMU. He researches foundations of multimodal machine learning with applications in socially intelligent AI, understanding human and machine intelligence, natural language processing, healthcare, and education. He organized workshops on multimodal learning at ACL 2018, ACL 2020, NeurIPS 2020, NAACL 2021, and NAACL 2022, and was a workflow chair for ICML 2019. Program Co-chair.

Jelena Curcic (https://scholar.google.com/citations?user=Se8a2b8AAAAJ), Ph.D., is a Senior Data Scientist at Novartis Institutes for Biomedical Research with expertise in the development, deployment, and advanced analytics of digital endpoints and biomarkers in the neuroscience disease area. Her topics of interest are cognition and neuropsychiatric symptoms in neurodegenerative and mood disorders. Publication Chair.

Zining Zhu (https://scholar.google.ca/citations?user=Xr_hCJMAAAAJ) is an Assistant Professor at Stevens Institute of Technology. He is interested in building interpretable and trustworthy systems with deep neural networks. His research applies developments in deep neural network (DNN)-based systems to the detection of cognitive impairments using data from multiple modalities. Mentorship Chair.

Malikeh Ehghaghi (https://scholar.google.com/citations?user=les29Z8AAAAJ) is a machine learning research scientist at Arcee.ai. She graduated with a Master of Science in Applied Computing from the University of Toronto. She has over 4 years of research experience in applied data science and machine learning, and is particularly interested in natural language processing, speech processing, multimodal machine learning for health, and interpretability. Program Co-chair.

Ali Akram (https://www.akramsystems.com/) is a Machine Learning Engineer at Cambridge Cognition and graduated from the Systems Design Engineering program at the University of Waterloo. He is interested in the efficient orchestration of machine learning models and in applications of multimodal machine learning that leverage speech as the modality of choice. Technical Chair.

5. Program Committee

1) Brandon M Booth, University of Colorado;
2) Kathleen C. Fraser, National Research Council Canada;
3) Wilson Y. Lee, HubSpot;
4) Ashutosh Modi, Indian Institute of Technology Kanpur;
5) Albert Ali Salah, Utrecht University;
6) Roland Goecke, University of Canberra;
7) Andreas Triantafyllopoulos, University of Augsburg;
8) Daniele Riboni, University of Cagliari;
9) Korbinian Riedhammer, Technische Hochschule Nürnberg;
10) Paula A. Perez-Toro, Friedrich-Alexander-Universität;
11) Torsten Wörtwein, Carnegie Mellon University;
12) Loukas Ilias, National Technical University of Greece;
13) Arun Das, University of Pittsburgh Medical Center;
14) Jingqi Chen, Fudan University;
15) Eloy Geenjaar, Georgia Institute of Technology;
16) Samina Khalid, Mirpur University of Science and Technology;
17) Minyechil Alehegn, Mizan-Tepi University;
18) Vidya Venkiteswaran, Google;
19) Akshata Kishore Moharir, Microsoft;
20) Nikhil Khani, YouTube;
21) Divij Gupta, Vector Institute.

6. Acknowledgement

We would like to thank the following people for their help and support during workshop preparation: 1) Aparna Balagopalan, PhD Student at MIT; 2) Thomas Hartvigsen, PhD, Assistant Professor at University of Virginia; and 3) William Jarrold, Trade Desk.

We would like to express our sincere gratitude to the companies Winterlight Labs (https://winterlightlabs.com/), Canada, and Cambridge Cognition (https://cambridgecognition.com/), UK, for their generous support and contribution to the success of this event. Their support and partnership have been instrumental in making this event possible.

References

[1] A. Baevski, W.-N. Hsu, Q. Xu, A. Babu, J. Gu, M. Auli, Data2vec: A general framework for self-supervised learning in speech, vision and language, in: International Conference on Machine Learning, PMLR, 2022, pp. 1298–1312.
[2] R. Girdhar, A. El-Nouby, Z. Liu, M. Singh, K. V. Alwala, A. Joulin, I. Misra, Imagebind: One embedding space to bind them all, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 15180–15190.
[3] OpenAI, GPT-4 technical report, arXiv preprint arXiv:2303.08774 (2023).
[4] C. Akkus, L. Chu, V. Djakovic, S. Jauch-Walser, P. Koch, G. Loss, C. Marquardt, M. Moldovan, N. Sauter, M. Schneider, et al., Multimodal deep learning, arXiv preprint arXiv:2301.04856 (2023).
[5] J. Yoon, C. Kang, S. Kim, J. Han, D-vlog: Multimodal vlog dataset for depression detection, in: Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, 2022, pp. 12226–12234.
[6] S. Qiu, M. I. Miller, P. S. Joshi, J. C. Lee, C. Xue, Y. Ni, Y. Wang, I. De Anda-Duran, P. H. Hwang, J. A. Cramer, et al., Multimodal deep learning for Alzheimer’s disease dementia assessment, Nature Communications 13 (2022) 3404.
[7] S. Fara, S. Goria, E. Molimpakis, N. Cummins, Speech and the n-back task as a lens into depression. How combining both may allow us to isolate different core symptoms of depression, arXiv preprint arXiv:2204.00088 (2022).
[8] M. Chatzianastasis, L. Ilias, D. Askounis, M. Vazirgiannis, Neural architecture search with multimodal fusion methods for diagnosing dementia, in: ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2023, pp. 1–5.
[9] B. Diep, M. Stanojevic, J. Novikova, Multi-modal deep learning system for depression and anxiety detection, arXiv preprint arXiv:2212.14490 (2022).
[10] M. Ehghaghi, F. Rudzicz, J. Novikova, Data-driven approach to differentiating between depression and dementia from noisy speech and language data, arXiv preprint arXiv:2210.03303 (2022).
[11] M. Golovanevsky, C. Eickhoff, R. Singh, Multimodal attention-based deep learning for Alzheimer’s disease diagnosis, Journal of the American Medical Informatics Association 29 (2022) 2014–2022.
[12] L. Ilias, D. Askounis, Multimodal deep learning models for detecting dementia from speech and transcripts, Frontiers in Aging Neuroscience 14 (2022).
[13] Y. Guo, C. Zhu, S. Hao, R. Hong, Automatic depression detection via learning and fusing features from visual cues, IEEE Transactions on Computational Social Systems (2022).
[14] P.-C. Wei, K. Peng, A. Roitberg, K. Yang, J. Zhang, R. Stiefelhagen, Multi-modal depression estimation based on sub-attentional fusion, in: Computer Vision–ECCV 2022 Workshops: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part VI, Springer, 2023, pp. 623–639.
[15] A.-M. Bucur, A. Cosma, P. Rosso, L. P. Dinu, It’s just a matter of time: Detecting depression with time-enriched multimodal transformers, Advances in Information Retrieval. ECIR 2023. Lecture Notes in Computer Science (2023) 200–215.
[16] D. M. Jacobs, M. Sano, G. Dooneief, K. Marder, K. L. Bell, Y. Stern, Neuropsychological detection and characterization of preclinical Alzheimer’s disease, Neurology 45 (1995) 957–962.

Table of Contents

Oral Presentations

| Paper Title | Authors |
|---|---|
| [Long] Knowledge-enhanced Memory Model for Emotional Support Conversation | Mengzhao Jia, Qianglong Chen, Liqiang Jing, Dawei Fu, Renyu Li |
| [Long] Learning to Generate Context-Sensitive Backchannel Smiles for Embodied AI Agents with Applications in Mental Health Dialogues | Maneesh Bilalpur, Mert Inan, Dorsa Zeinali, Jeffrey F. Cohn, Malihe Alikhani |
| [Short] A Pretrained Language Model for Mental Health Risk Detection | Diego Maupomé, Fanny Rancourt, Raouf Belbahar, Marie-Jean Meurs |
| [Short] PMC: Paired Multi-Contrast MRI Dataset at 1.5T and 3T for Supervised Image2Image Translation | Fatemeh Bagheri, Kamil Uludag |
| [Short] Dance of the Neurons: Unraveling Sex from Brain Signals | Mohammad Javad Darvishi Bayazi, Mohammad Sajjad Ghaemi, Jocelyn Faubert, Irina Rish |
| [Abstract] Mental Health Stigma across Diverse Generative Large Language Models - Abstract | Lucille Njoo, Lee Janzen-Morel, Inna Wanyin Lin, Yulia Tsvetkov |

Poster Presentations

| Paper Title | Authors |
|---|---|
| [Long] ConversationMoC: Encoding Conversational Dynamics using Multiplex Network for Identifying Moment of Change in Mood and Mental Health Classification | Loitongbam Gyanendro Singh, Stuart E. Middleton, Tayyaba Azim, Elena Nichele, Pinyi Lyu, Santiago De Ossorno Garcia |
| [Short] A Privacy-Preserving Unsupervised Speaker Disentanglement Method for Depression Detection from Speech | Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan |
| [Long] Ordinal Scale Evaluation of Smiling Intensity using Comparison-Based Network | Kei Shimonishi, Kazuaki Kondo, Hirotada Ueda, Yuichi Nakamura |
| [Long] Natural Language Explanations for Suicide Risk Classification Using Large Language Models | William Stern, Seng Jhing Goh, Nasheen Nur, Patrick J Aragon, Thomas Mercer, Siddhartha Bhattacharyya, Chiradeep Sen, Van Minh Nguyen |
| [Long] Deploying AI Methods for Mental Health in Singapore: From Mental Wellness to Serious Mental Health Conditions | Creighton Heaukulani, Ye Sheng Phang, Janice Huiqin Weng, Jimmy Lee, Robert J.T. Morris |
| [Short] Investigating Bias in Affective State Detection Using Eye Biometrics | Yuxin Zhi, Bilal Taha, Dimitrios Hatzinakos |
| [Long] Towards Remote Differential Diagnosis of Mental and Neurological Disorders using Automatically Extracted Speech and Facial Features | Vanessa Richter, Michael Neumann, Vikram Ramanarayanan |
| [Short] Prediction of Relapse in Adolescent Depression using Fusion of Video and Speech Data | Christopher Lucasius, Mai Ali, Marco Battaglia, John Strauss, Peter Szatmari, Deepa Kundur |
| [Long] Toward A Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency | Pavlos Constas, Vikram Rawal, Matthew Honorio Oliveira, Andreas Constas, Aditya Khan, Kaison Cheung, Najma Sultani, Carrie Chen, Micol Altomare, Michael Akzam, Jiacheng Chen, Vhea He, Lauren Altomare, Heraa Murqi, Asad Khan, Nimit Amikumar Bhanshali, Youssef Rachad, Michael Guerzhoy |