<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Language Technologies</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Alicia Pérez</string-name>
          <email>alicia.perez@ehu.eus</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Maite Oronoz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Juan Martinez-Romo</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lourdes Araujo</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>HiTZ Basque Center for Language Technologies - Ixa (UPV/EHU)</institution>
          ,
          <addr-line>Manuel Lardizabal 1, 20018 Donostia</addr-line>
          ,
          <country>Spain</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <abstract>
        <p>This project highlights the importance of analyzing social networks in the digital transition that the administration must undergo in order to adapt to new sources of information. The proposed mental health observatory could help hospitals to: receive a global image of social perception and its changing dynamics as a source of social information; manage their human resources by anticipating work peaks based on the analysis of trends found in social networks; consult demographic profiles; obtain an analysis of temporal or seasonal characteristics; and discover relationships. The project makes technological contributions by generating and making available to the community (i) linguistic resources and (ii) tools to carry out natural language understanding adapted to Spanish and to the terminology of these networks. The main challenge of the project is to convert unstructured textual information into interpretable knowledge and story-telling.</p>
      </abstract>
      <kwd-group>
        <kwd>Mental health</kwd>
        <kwd>social networks</kwd>
        <kwd>health care institutions</kwd>
        <kwd>natural language processing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>According to the World Health Organization, 80 people die by suicide every hour, and suicide is among the 10 most common causes of death globally. In Spain, in 2022, 2,015 people died due to suicide and self-harm injuries while 761 were victims of traffic accidents; that is, last year the number of suicides almost tripled that of victims of traffic accidents [1]. The correct identification of signals that could indicate suicide risk remains a core, yet unresolved, challenge. Risk factors connected to suicide ideation or commitment include mental health disorders, loneliness or lack of connection with significant others, depression, bullying, consumption of substances, lack of adherence to treatment, self-harm, etc. There are documented differences in the population not only by health condition but also by socio-economic factors [2], job occupation, age (young [3] and elderly people [4]), gender and other social and demographic factors.</p>
      <p>In the foreword of the “Global Governance Toolkit for Digital Mental Health” document launched by the World Economic Forum and Deloitte [5], it is stated that “Disruptive technology in mental health provides an opportunity to create breakthrough solutions that improve mental health and well-being outcomes on a greater scale than ever before”. It is not new that artificial intelligence leverages valuable knowledge from collective behaviors, as is the case of the social observatories in politics [6]. Our hypothesis rests on the fact that collective emotional and behavioral reflections in social media entail inherent psycho-linguistic features connected to general well-being. In our case, we will pay special attention to risk factors such as self-harm, loneliness, depression and anxiety. Moreover, changes in these features might reveal changes that can light pathways to aid healthcare practitioners in envisaging a point-in-time social picture related to mental illnesses.</p>
      <p>The works of Fine et al. [7] and Moradian et al. [8] are proofs of concept of how publicly available social media data might be used to assess population-level mental health. The analysis of data outside the health care system, and in particular social media data, may lead to the development of automatic systems for assessing suicide risk based on what people share on their social networks. This kind of information allows looking into a person’s life on a more frequent, daily basis, so that the potential risk of committing suicide can be monitored much more efficiently [9]. We find, however, a gap for this type of approach involving deep natural language understanding in Spanish.</p>
      <p>The Obser-Menh project (http://nlp.uned.es/obser-menh-project) involves advanced computational semantic technologies and makes a technological contribution by generating tools to carry out natural language understanding adapted to Spanish and to the terminology of these networks, with the added value of rendering unstructured textual information into interpretable knowledge.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Objectives</title>
      <p>In brief, a review of the antecedents outlined the importance of detecting negation, pronouns, variations in the emotional state, abstract adjectives, social factors, etc. Thus, tools should be adapted to Spanish in this domain. All in all, we find that there is a gap in Spanish psycho-linguistic analysis. We miss annotated data and also the application of medical entity recognition (MER) prior to polarity discovery approaches. Having detected the strengths and weaknesses of the related work, we focus on the gaps, and thus define the objectives. General objectives include:</p>
      <p>• Development of tools for information retrieval in Spanish focused on the analysis of data from social networks related to mental health for the early detection of changing trends in psycho-linguistic characteristics.</p>
      <p>• Analysis of existing public data on mental health, development of annotation schemes for the problems addressed in the sub-projects (including gender data).</p>
      <p>• Creation and adaptation of NLP tools to the domain of mental health in social media.</p>
      <p>• Generation of demographic profiles focusing on the psycho-linguistic characteristics expressed in the texts and inherent to potential mental health risks for the benefit of individuals, taking the gender dimension into account.</p>
      <p>• Analysis of the relationships among the different problems addressed in the sub-projects that, through artificial intelligence, provide dynamism for the benefit of better healthcare.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Novelty and added value</title>
      <p>A careful review of the antecedents disclosed that eHealth in social media has been addressed in English. Our project will bridge the gap by adapting deep natural language understanding tools to Spanish, and has an added value for alternative Spanish variants (e.g. Argentina, Chile, Mexico, etc.), with the particular analysis of each variant by means of the demographic-profile analysis. Social media data is the core of our project, on which the proposed artificial intelligence approaches rely. Annotating data aided by human experts provides extra possibilities for the algorithms to learn in semi-supervised fashion or following few-shot learning schemes. However, general-purpose tools to process natural language are still to be adapted to the nuances of the language employed in social media. A contribution of this project rests on the retraining or adaptation of tools to analyse and process social-media texts. Specific lexica will be created to suit domain-adapted language models that are employed by deep neural language understanding approaches such as BERT and related transformers. These tools are crucial to gain insights beyond static linguistic lists and represent a step ahead in semantic relatedness.</p>
      <p>Indeed, deep language understanding is what fosters the novelty of our approach: an observatory displaying psycho-linguistic features, revealing chronological dynamics in the embedded topics, and discovering latent demographic profiles of anxiety, loneliness, depression, self-harm risk and suicide ideation. The novelty rests not only on the information extracted from unstructured media but also on the purpose itself, which aims at providing insights from language to experts in Mental eHealth and bridging the gap between users and specialised health facilities and associations. In fact, the project shall contribute with suited negation and speculation detection tools, medical entity recognition and emotional language discovery. The valuable resources developed in this project shall be reported and released to the research community.</p>
      <p>In our project, the added value in terms of return to society rests on the formulation of the time component within an observatory, in order to keep track of and, particularly, focus on changes across time in social perception, emotions and behaviour latent in language, and on building demographic profiles of latent linguistic features. Moreover, we envisage rendering unstructured linguistic data into human-friendly graphical data to aid decision support by health practitioners.</p>
      <p>The expected findings would be in the research line of computational social psychology, particularly in psycholinguistics. In brief, user-generated content is a valuable means of collective intelligence and subjective perception of reality; it is dynamic and fast. Connective action and post sharing lead to under-used, complex though valuable big data in terms of natural language.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Methodological approach</title>
      <p>The methodology proposed for Obser-Menh is broadly depicted in Figure 1. In brief, in order to cope with the aforementioned goals, the design encompasses the following modules and tasks:</p>
      <p>• Data: this module involves two important tasks:</p>
      <p>1. Crawling a large dataset from social networks (e.g. Twitter) in Spanish.</p>
      <p>2. Health-expert-aided manual and semi-automatic annotation of relevant linguistic cues, e.g. drugs, key phrases, specific lexicons (e.g. chapter F of ICD-10 or, alternatively, SNOMED CT) and gender indicators.</p>
      <p>[...] changing emotions and behaviors potentially connected with substantial changes in mental health conditions.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Research groups</title>
      <p>The Obser-Menh project is arranged as two coordinated sub-projects, GELP and LOTU, involving the following groups:</p>
      <p>• NLP&amp;IR group at UNED (http://nlp.uned.es/): it has a long trajectory in Intelligent Access to Information and Knowledge Representation. The group has several research lines within the clinical domain and demographic cue extraction from text (of particular relevance in the segmentation study underlying mental health). The NLP&amp;IR group at UNED is the leader of the aforementioned GELP sub-project.</p>
      <p>• The psychiatry service of the Hospital Clínico San Carlos in Madrid cooperates with the NLP&amp;IR group at UNED in the GELP project.</p>
      <p>• HiTZ (http://www.hitz.eus): HiTZ is a reference center on Language Technologies aiming at the promotion of research, training, technological transfer and innovation in Artificial Intelligence focused on language and speech technologies. The researchers from HiTZ involved in this project all belong to the Ixa research group (http://www.ixa.eus), a consolidated group with more than 35 years of existence, in which a research line started more than 12 years ago is devoted to NLP in the clinical domain. HiTZ is the leader of the LOTU sub-project.</p>
      <p>• Osakidetza, the Public Health System of the Basque Country, and, specifically, experts within the Network of Mental Health in Biscay involved in BioCruces (the Health Research Center of Biscay), cooperate with HiTZ in the LOTU project.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>The creation of the mental health observatory presented in this project could help hospitals to: (1) receive a global image of the social perception and its changing dynamics as a source of social information, (2) manage their human resources by anticipating work peaks based on the analysis of trends found in social networks, (3) consult demographic profiles, (4) obtain an analysis of temporal or seasonal characteristics, and (5) infer latent relationships between different problems.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Acknowledgements</title>
      <p>OBSER-MENH, with sub-projects GELP (TED2021-130398B-C21) and LOTU (TED2021-130398B-C22), is funded by MCIN/AEI/10.13039/501100011033 and by the European Union “NextGenerationEU”/PRTR. In addition, this work was partially funded by the Spanish Ministry of Science and Innovation (DOTT-HEALTH/PAT-MED PID2019-106942RB-C31 and INDICA-MED PID2019-106942RB-C32); by the Basque Government (IXA IT-157022); and by EXTEPA within Misiones Euskampus 2.0.</p>
      <p>∗Corresponding author. ORCID: 0000-0003-2638-9598 (A. Pérez); 0000-0001-9097-6047.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="ref1">
        <label>[1]</label>
        <mixed-citation>Instituto Nacional de Estadística, Defunciones por causa de muerte, https://www.ine.es/jaxi/Tabla.htm?tpx=55863&amp;L=0, 2022. Accessed: 2023-03-06.</mixed-citation>
      </ref>
      <ref id="ref2">
        <label>[2]</label>
        <mixed-citation>F. Ferretti, A. Coluccia, Socio-economic factors and suicide rates in European Union countries, Legal Medicine 11 (2009) S92–S94.</mixed-citation>
      </ref>
      <ref id="ref3">
        <label>[3]</label>
        <mixed-citation>D. Wasserman, Q. Cheng, G.-X. Jiang, Global suicide rates among young people aged 15-19, World Psychiatry 4 (2005) 114.</mixed-citation>
      </ref>
      <ref id="ref4">
        <label>[4]</label>
        <mixed-citation>M. Waern, E. Rubenowitz, B. Runeson, I. Skoog, K. Wilhelmson, P. Allebeck, Burden of illness and suicide in elderly people: case-control study, BMJ 324 (2002) 1355.</mixed-citation>
      </ref>
      <ref id="ref5">
        <label>[5]</label>
        <mixed-citation>World Economic Forum, Deloitte, Global governance toolkit for digital mental health: Building trust in disruptive technology for mental health, 2021. URL: https://www3.weforum.org/docs/WEF_Global_Governance_Toolkit_for_Digital_Mental_Health_2021.pdf.</mixed-citation>
      </ref>
      <ref id="ref6">
        <label>[6]</label>
        <mixed-citation>S. Caton, M. Hall, C. Weinhardt, How do politicians use Facebook? An applied social observatory, Big Data &amp; Society 2 (2015) 2053951715612822.</mixed-citation>
      </ref>
      <ref id="ref7">
        <label>[7]</label>
        <mixed-citation>A. Fine, P. Crutchley, J. Blase, J. Carroll, G. Coppersmith, Assessing population-level symptoms of anxiety, depression, and suicide risk in real time using NLP applied to social media data, in: Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science, 2020, pp. 50–54.</mixed-citation>
      </ref>
      <ref id="ref8">
        <label>[8]</label>
        <mixed-citation>H. Moradian, M. A. Lau, A. Miki, E. D. Klonsky, A. L. Chapman, Identifying suicide ideation in mental health application posts: A random forest algorithm, Death Studies (2022) 1–9.</mixed-citation>
      </ref>
      <ref id="ref9">
        <label>[9]</label>
        <mixed-citation>P. Resnik, A. Foreman, M. Kuchuk, K. Musacchio Schafer, B. Pinkham, Naturally occurring language as a source of evidence in suicide prevention, Suicide and Life-Threatening Behavior 51 (2021) 88–96.</mixed-citation>
      </ref>
      <ref id="ref10">
        <label>[10]</label>
        <mixed-citation>M. K. Nock, E. M. Kleiman, M. Abraham, K. H. Bentley, D. A. Brent, R. J. Buonopane, F. Castro-Ramirez, C. B. Cha, W. Dempsey, J. Draper, et al., Consensus statement on ethical &amp; safety practices for conducting digital monitoring studies with people at risk of suicide and related behaviors, Psychiatric Research and Clinical Practice 3 (2021) 57–66.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>