<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>June</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Unveiling Strategic Research Priorities: A Terminological Analysis of the ARCHE SRIA Using Word Rain</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Elisa Squadrito</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesca Frontini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Vania Virgili</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Monica Monachini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Institute of Computational Linguistics “A. Zampolli”, National Research Council (CNR-ILC)</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Institute of Heritage Science, National Research Council (CNR-ISPC)</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Macerata</institution>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>1</volume>
      <fpage>9</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>This paper presents an experiment conducted through a collaboration between the CLARIN and E-RIHS Research Infrastructures to analyse the Strategic Research and Innovation Agenda (SRIA) of the Alliance for Research on Cultural Heritage in Europe (ARCHE). Using the Word Rain tool, semantically structured word clouds were generated to uncover key domain conceptualizations within the SRIA preparatory documents. The experiment aims to reveal the terminology that shapes the research coverage of the agenda and the conceptual framework that guides future initiatives in the field. Through this approach, we highlight the utility of corpus-based analysis in enhancing strategic policy development.</p>
      </abstract>
      <kwd-group>
        <kwd>Cultural Heritage</kwd>
        <kwd>ARCHE</kwd>
        <kwd>Distant Reading</kwd>
        <kwd>Terminology</kwd>
        <kwd>Word Cloud</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
    </sec>
    <sec id="sec-2">
      <title>2. The ARCHE Project and SRIA Development</title>
      <p>
        The socioeconomic, technological, and environmental challenges that our society has faced in recent
years have made it necessary to redefine classical paradigms for Cultural Heritage management,
protection, and restoration. To this end, the Alliance for Research on Cultural Heritage in Europe<sup>2</sup> (ARCHE)
was established. Funded by Horizon Europe, the project revolves around the creation of a holistic
network of stakeholders in the Cultural Heritage domain, including researchers, heritage professionals,
organizations, and institutional bodies. Operating at the academic, governmental, and public levels,
ARCHE aims to address current gaps in Heritage Science research and to develop multidisciplinary
and sustainable approaches to innovation in the field. The primary tool identified for achieving these
objectives is the drafting of a SRIA. Building on the JPI CH 2020 SRIA<sup>3</sup>, the forthcoming
agenda will translate the Alliance’s vision into tangible and actionable goals, thus
serving as a driving force of innovation in the domain. More specifically, it will act as a “roadmap
with research priorities that will form the basis of calls for projects and other activities due to start in
2026, within the European Partnership for Resilient Cultural Heritage (RCH)”<sup>4</sup>. Started in 2024, the
development of the ARCHE SRIA was rooted in a mapping and assessment phase aimed at providing a
comprehensive initial overview of innovative research areas in the Cultural Heritage domain. Building
on this foundation, all stakeholders involved in the ARCHE project played an active role in drafting the
agenda through multiple consultations in the form of workshops and surveys. The direct involvement
of experts was crucial to capturing the perspectives of the many communities of practice targeted
by the policy document, as well as ensuring its accuracy and precision. It was nevertheless believed
that adopting a distant reading approach[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] to explore the specialized documents resulting from these
consultations could provide valuable insight into both linguistic and extralinguistic issues. Examining
the terminology of documents produced by ARCHE stakeholders from a broader perspective proved
useful for identifying recurrent patterns, prevalent concepts, and how they clustered.
      </p>
      <p>This paper presents a distant reading experiment conducted on a corpus of documents drafted within
the ARCHE project, including the Key Messages and Preliminary Findings that will form the basis of
the future SRIA. The tool selected for the analysis was Word Rain<sup>5</sup>, an advanced data visualization
tool capable of generating semantically structured word clouds. Given the terminologically rich nature
of such specialized texts, the rationale for the experiment was that examining how terms cluster and
appear in the word cloud could help uncover recurrent themes and reveal differences in their distribution
between early and later project documents.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Methodology</title>
      <p>This section gives a brief overview of Word Rain, the tool chosen for the analysis, and of its key
functionalities. It also compares the tool with classic word cloud visualisation tools. Finally, it presents
the reasons for choosing Word Rain in the context of this study and the opportunities it offers, together
with the limitations of its application.</p>
      <sec id="sec-3-1">
        <title>3.1. Introducing Word Rain</title>
        <p>
          Word Rain[
          <xref ref-type="bibr" rid="ref4">4</xref>
          ] is an advanced data visualisation tool that generates a semantically structured word cloud,
referred to as "word rain". It enables users to visualise word distributions within a text, adjusting the
size and placement of terms based on their frequency and semantic relevance. Word Rain was developed
through a collaboration between the Centre for Digital Humanities and Social Sciences at Uppsala
(CDHU), the National Language Bank of Sweden/CLARIN Knowledge Centre for the Languages of
Sweden (SWELANG) and the iVis group at Linköping University. The tool is available as a web-based
application, and an open-source version is also available on GitHub<sup>6</sup> for those interested in customising it or
integrating it into other projects.
        </p>
        <fn-group>
          <fn id="fn2"><label>2</label><p>https://www.heritageresearch-hub.eu/arche-home/about-arche/</p></fn>
          <fn id="fn3"><label>3</label><p>https://www.heritageresearch-hub.eu/strategic-research-and-innovation-agenda-2020-sria/</p></fn>
          <fn id="fn4"><label>4</label><p>For further information, see https://www.heritageresearch-hub.eu/event/arche-2nd-stakeholders-workshop-to-take-place-in-florence-on-septemb</p></fn>
          <fn id="fn5"><label>5</label><p>Accessible at: https://wordrain.isof.se/</p></fn>
        </fn-group>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Word Rain compared to Classic Word Clouds</title>
        <p>
          Word clouds are popular tools for visualising text content in an immediate and intuitive manner, due
to their compact and static layout. However, traditional word clouds typically feature tightly packed
words without a semantically motivated positioning, and prominence is solely indicated by font size,
based on the word frequency within the text. Word Rain draws from classic word clouds but features a
few salient innovations. As highlighted by its creators, it enhances the traditional word cloud model by
incorporating a distributional semantics-based approach, reduced to one dimension, to position words
along a semantically meaningful x-axis[
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. Colour-coded bars further enhance the visual grouping
of related terms. While font size continues to indicate prominence, additional indicators, such as bar
height and vertical positioning along the y-axis, allow for a more nuanced interpretation of the data.
Less prominent words are positioned lower on the y-axis, creating a sense of "falling," hence the term
"word rain."
        </p>
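        <p>The idea of reducing distributional vectors to a single dimension can be illustrated with a minimal sketch. It is not Word Rain's implementation: the toy three-dimensional vectors are invented for illustration, the function name project_to_x_axis is hypothetical, and a projection onto the first principal component (computed by power iteration) stands in for whichever dimensionality reduction the tool actually applies.</p>
        <preformat>
```python
import math


def project_to_x_axis(vectors, iterations=200):
    """Project each word vector onto the first principal component.

    vectors: dict mapping each word to a list of floats (same length for all).
    Returns a dict mapping each word to a single x coordinate.
    """
    words = list(vectors)
    dim = len(vectors[words[0]])
    # Centre the data so the projection captures variance rather than the mean.
    mean = [sum(vectors[w][i] for w in words) / len(words) for i in range(dim)]
    centred = {w: [vectors[w][i] - mean[i] for i in range(dim)] for w in words}

    # Power iteration on the (unnormalised) covariance matrix of the vectors.
    direction = [1.0] * dim
    for _ in range(iterations):
        scores = {w: sum(c * d for c, d in zip(centred[w], direction)) for w in words}
        new_dir = [sum(scores[w] * centred[w][i] for w in words) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in new_dir))
        if norm == 0.0:
            break
        direction = [x / norm for x in new_dir]

    return {w: sum(c * d for c, d in zip(centred[w], direction)) for w in words}


# Invented toy vectors: two climate-related and two heritage-related words.
toy_vectors = {
    "climate": [0.9, 0.1, 0.0],
    "warming": [0.8, 0.2, 0.1],
    "heritage": [0.1, 0.9, 0.0],
    "museum": [0.0, 0.8, 0.1],
}
x_positions = project_to_x_axis(toy_vectors)
ordered_words = sorted(x_positions, key=x_positions.get)
```
        </preformat>
        <p>Sorting the words by the resulting coordinate places related words next to each other, which is the property that a semantically meaningful x-axis relies on.</p>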
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Word Rain within the ARCHE project: opportunities and limitations</title>
        <p>Because of their ability to give an intuitive and immediate snapshot of a text or a collection of texts,
word cloud visualisations are frequently incorporated into mainstream corpus management software, such
as Sketch Engine and Voyant Tools. However, such simplicity and intuitiveness come at the expense
of depth and granularity in the type of information provided. A similar situation can be observed
with wordlists. Such instruments are frequently used by corpus linguists and
terminologists to evaluate the fitness of a corpus and obtain an initial picture of it. Nevertheless, they are
not exhaustive enough for a comprehensive terminological analysis. The same limitations apply
to Word Rain. However, because of its distributional semantics-based approach, the tool adds a layer
of complexity to classic word clouds. One of the main reasons for which it was selected, besides
keyword and n-gram extraction, was indeed its ability to visually group related terms into easily
readable semantic clusters. These clusters can both give a quick view of the narratives in the corpus and
serve as indicators of the main topics covered in the documents, thus providing guidance on how to
proceed with the terminological analysis. For example, the extracted terminology can later be
analysed, subdivided, and compared according to the thematic areas highlighted by the Word
Rain visualisations. In doing so, imbalances in representation might be detected and explained.</p>
        <p>Word Rain, despite its enhanced features, is still a word cloud. However, alongside
its semantically motivated visualisations, its ease of use, including for non-experts, is the second main reason
for which it was selected for this analysis. The main goal of this study was to allow experts in the
Cultural Heritage domain involved in the preparation of the ARCHE SRIA to learn how to make
sense of their specialised documents, without requiring any prior background in linguistics. Although
the Word Rain analysis was performed by CLARIN experts and then validated by ARCHE experts, its
intuitive character allowed domain experts to follow its reasoning and, hopefully, replicate it in the
future. In short, findings from the Word Rain can 1) provide an initial picture of the documents at hand
by unveiling possible hidden narratives; 2) highlight keyword clusters that might guide further term
extraction and analysis of the corpus; and 3) help inform the experts involved in writing the ARCHE SRIA
during the preparation of the agenda.</p>
        <fn-group>
          <fn id="fn6"><label>6</label><p>Accessible at: https://github.com/CDHUppsala/word-rain</p></fn>
        </fn-group>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Word Rain Key Functionalities</title>
        <p>Word Rain allows its visualisations to be easily customised according to one’s goals and needs. Its
key functionalities<sup>7</sup> can be listed as follows:</p>
        <list list-type="order">
          <list-item><p>Term Frequency-Inverse Document Frequency (TF-IDF): Word Rain allows users to visualise the
distribution of words in a text by adjusting word size and placement based on frequency and
semantic relevance, using techniques such as TF-IDF.</p></list-item>
          <list-item><p>N-gram extraction: This parameter allows users to extract and visualise common n-grams, either
in addition to or instead of individual words. As specified by the developers of the tool, this feature
works best for languages where compounds are constructed as a more common word followed by a
specifying word.</p></list-item>
          <list-item><p>Background corpus selection: Users can upload a background corpus, refining TF-IDF values
and creating more nuanced relevance for terms in the main document. The tool is handy for
exploring and comparing corpora, as it can visualise language patterns and shifts within datasets,
such as climate change reports or other specialised texts.</p></list-item>
          <list-item><p>Customised visualisation: Users can modify settings, including font size and word vertical position,
to optimise readability. It is also possible to regulate the maximum font size and decide how
to arrange words when they overlap on the vertical axis. These settings allow users to control the
visualisation’s airiness or density, depending on the desired appearance.</p></list-item>
        </list>
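        <p>To make the interplay between TF-IDF weighting, n-gram extraction and a background corpus more concrete, the following is a minimal sketch. It uses a textbook TF-IDF formulation with a smoothed IDF, which may differ from the weighting implemented in Word Rain; the function names and the toy sentences are illustrative assumptions, and sentence boundaries are ignored for brevity.</p>
        <preformat>
```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def terms(text, use_bigrams=True):
    """Return unigrams and, optionally, adjacent-word bigrams."""
    tokens = tokenize(text)
    result = list(tokens)
    if use_bigrams:
        result += [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return result


def tf_idf(target, background_docs):
    """Score the terms of `target` against a list of background documents."""
    counts = Counter(terms(target))
    total = sum(counts.values())
    # The target itself counts as one document in the collection.
    doc_sets = [set(terms(doc)) for doc in background_docs] + [set(counts)]
    n_docs = len(doc_sets)
    scores = {}
    for term, count in counts.items():
        doc_freq = sum(1 for doc in doc_sets if term in doc)
        idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1  # smoothed IDF
        scores[term] = (count / total) * idf
    return scores


# Invented toy texts, not taken from the ARCHE documents.
target = "Climate change threatens cultural heritage. Climate change demands adaptation."
background = [
    "Cultural heritage includes monuments and museums.",
    "Museums preserve cultural heritage for the public.",
]
scores = tf_idf(target, background)
top_terms = sorted(scores, key=scores.get, reverse=True)[:5]
```
        </preformat>
        <p>In this toy setting, terms that also occur in the background documents (such as cultural heritage) receive a lower weight than terms specific to the target text (such as climate change), which is the effect a background corpus is meant to produce.</p>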
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Analysis and Results</title>
      <p>Word Rain was chosen to visually analyse, extract meaningful information from, and compare the
following documents:</p>
      <list list-type="order">
        <list-item><p>ARCHE D2.1 Future Trends on Cultural Heritage (Foresight Analysis);</p></list-item>
        <list-item><p>ARCHE D2.4 Vision and Mission;</p></list-item>
        <list-item><p>ARCHE D2.5 SRIA Key Messages and Preliminary Findings;</p></list-item>
        <list-item><p>WGs forms on ARCHE SRIA.</p></list-item>
      </list>
      <p>The main objectives of the experiment were to: a) identify similarities and differences between the
four documents’ discourse; b) determine whether any themes originally identified in the WGs forms are
missing or appear less emphasised in the SRIA Key Messages and Preliminary Findings; c) examine
how keywords clustered. To get an initial general picture of the recurring themes in the reports, it was
decided to generate four word rains, one per document. Generating four visualisations made it
possible not only to look at each document individually and understand its structure but also to
compare the documents with one another, trace shifts in argumentation, and analyse themes
that were not covered homogeneously.</p>
      <sec id="sec-4-1">
        <title>4.1. Preprocessing of data: conversion and cleaning of the reports</title>
        <p>In order to generate the four word rains, the source reports were converted from the .pdf format to
the .txt format and cleaned of any unnecessary elements that could introduce noise into the word rain
visualisation. The noise was removed in two phases:</p>
        <list list-type="order">
          <list-item><p>Documents were first cleaned in a standardised manner, following general rules commonly used
for cleaning data in textual corpora before processing.</p></list-item>
          <list-item><p>A second cleanup was carried out after generating four mock-up word rains, one per
document. Words of no interest that frequently appeared in the clouds were removed from the
documents, such as: https, www, .com, homepage, org, pdf, core theme, title, keyword, ch, hub.</p></list-item>
        </list>
        <fn-group>
          <fn id="fn7"><label>7</label><p>For an in-depth discussion of the tool and its key functionalities, please refer to Skeppstedt, Maria, et al. "From word clouds to Word Rain: Revisiting the classic word cloud to visualize climate change texts." Information Visualization (2024): 14738716241236188.</p></fn>
        </fn-group>
        <p>For instance, “core theme” frequently appeared in the word rains but held no meaning in this type of
analysis. For this experiment, the Word Rain web application was used. However, for a more fine-grained
analysis, it would be possible to generate the clouds in a coding environment, allowing the use of one
or more stop-word lists. This would eliminate the need to modify the data.</p>
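        <p>The stop-word approach mentioned above can be sketched as follows. This is a generic illustration rather than the Word Rain API: the function name remove_stop_terms is hypothetical, the sample sentence and its URL are invented, and the list reuses the noise terms reported for the second cleaning phase (with ".com" reduced to "com" because tokens are alphabetic). Filtering at token level leaves the source files untouched.</p>
        <preformat>
```python
import re

# Noise terms reported for the second cleaning phase of the experiment.
NOISE_TERMS = ["https", "www", "com", "homepage", "org", "pdf",
               "core theme", "title", "keyword", "ch", "hub"]


def remove_stop_terms(text, stop_terms):
    """Return the tokens of `text` with stop words and stop phrases removed."""
    tokens = re.findall(r"[a-z]+", text.lower())
    phrases = [term.split() for term in stop_terms if " " in term]
    single_words = {term for term in stop_terms if " " not in term}
    kept = []
    i = 0
    while i != len(tokens):
        # Skip a multiword stop phrase if one starts at position i.
        match = next((p for p in phrases if tokens[i:i + len(p)] == p), None)
        if match is not None:
            i += len(match)
        elif tokens[i] in single_words:
            i += 1
        else:
            kept.append(tokens[i])
            i += 1
    return kept


# Invented sample sentence; the URL is a placeholder.
sample = "Core theme: heritage preservation, see https://www.example.org/report.pdf"
cleaned = remove_stop_terms(sample, NOISE_TERMS)
```
        </preformat>
        <p>Keeping the noise terms in a list of this kind makes the cleaning step repeatable and easy to revise after consultation with domain experts.</p>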
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Processing of data: generating word rains</title>
        <p>Once the documents were cleaned, word rains were generated using the web application. The
parameters selected for plotting the clouds were the following:</p>
        <list list-type="order">
          <list-item><p>Language selected: English</p></list-item>
          <list-item><p>Word count: 300</p></list-item>
          <list-item><p>Frequency: TF-IDF</p></list-item>
          <list-item><p>Word combinations: extraction of n-grams</p></list-item>
          <list-item><p>Background corpus: none</p></list-item>
          <list-item><p>Word size fall-off: 0.7</p></list-item>
          <list-item><p>Bar height: 40</p></list-item>
        </list>
        <p>The rationale for choosing a word count of 300 instead of the 600 option was to create an airy and easily
interpretable word cloud. Similarly, a fall-off of 0.7 and a bar height of 40% prevent the vertical y-axis from
becoming clogged and difficult to read. No background corpus was selected at this stage.
This choice was motivated by the goal of the analysis, which was to compare only term occurrences
contained in the four ARCHE reports. Figures 1 to 4 display the Word Rain visualizations corresponding
to the four ARCHE documents processed.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Commenting on the word rains: a few considerations</title>
        <p>By uploading the ARCHE documents to the tool in one session, it was possible to generate comparable
word clouds. The first thing that catches the eye is the Cultural Heritage thematic cluster, to the far right
of the infographic. The largest terms, hence the most prominent ones, are as expected
fundamental, such as heritage, cultural and cultural heritage. Although this information is not
of particular interest, it was decided not to exclude “heritage” and “cultural” from the word cloud,
since excluding these terms would have affected all n-grams containing “heritage”. In the adjacent section,
a few meaningful terms start to emerge. What is interesting here is the comparison of word clusters
for the SRIA preliminary findings and the WGs forms. Both word rains share the themes of
cultural resilient, natural heritage, and heritage preservation. The word indigenous appears in both, whereas
tourism appears only in the WGs forms. Notably, tourism does not appear in any other word cluster
along the horizontal axis of the SRIA graphic, and might thus indicate a topic that
was overlooked.</p>
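        <p>The kind of comparison described above, spotting terms that surface in one visualisation but not in another, can also be checked programmatically. The sketch below is an illustration under stated assumptions: the two term lists are small hand-picked samples drawn from the terms discussed in this section, not the full output of the tool, and the function name missing_terms is hypothetical.</p>
        <preformat>
```python
def missing_terms(source_terms, target_terms):
    """Return the terms found in source_terms but absent from target_terms."""
    target = {term.lower() for term in target_terms}
    source = {term.lower() for term in source_terms}
    return sorted(term for term in source if term not in target)


# Illustrative samples only; not the full term lists of the two documents.
wg_forms_terms = ["natural heritage", "heritage preservation", "indigenous", "tourism"]
sria_terms = ["natural heritage", "heritage preservation", "indigenous"]

only_in_wg_forms = missing_terms(wg_forms_terms, sria_terms)
```
        </preformat>
        <p>A list obtained in this way is only a starting point: whether a missing term reflects a genuine gap still has to be judged by domain experts.</p>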
        <p>Another point worth further exploration is the representation of gender issues within the reports.
While gender appears eight times in the word cloud for the WG forms, both as a standalone term
and within n-grams like gender perspective, gender balance, and gender research, it is notably absent from the
other visualisations. A quick review of the preliminary SRIA document reveals only two occurrences of
gender, which is consistent with its lack of visibility in the word cloud.</p>
        <p>Climate is a fundamental theme in all reports, appearing at least once in every word cloud, with
increasing frequency from the Vision and Mission document to the preliminary SRIA. In the latter,
the word cluster related to climate change occupies a significant portion of the entire vector graphic.
Next to climate change and climate education are n-grams such as risk reduction, resilience, temperature,
warming, mitigation and adaptation. Interestingly, the concept of climate change is semantically
close on the x-axis to that of societal resilience and public awareness (right) and to the cluster related to
community and public goods (left). At the core of the SRIA we find geological, stratigraphy, greenhouse,
emissions, temperature, environmental change, mitigation, adaptation, risk research, and risk reduction.
These findings align closely with the report, reflecting its preliminary focus on a comprehensive and
multidisciplinary approach to climate change. Terms like stratigraphy resonate with the SRIA’s emphasis
on understanding climate change within a long-term geological and human context, acknowledging
humanity’s impact on Earth’s systems. Key scientific terms such as greenhouse, temperature, and
emissions directly address the SRIA’s priorities around the core mechanisms driving climate change,
while terms like environmental change, mitigation, and adaptation highlight the agenda’s commitment
to developing strategies that respond to these shifts. The inclusion of risk research and risk reduction
underscores the SRIA’s proactive approach to managing climate-related risks to society and ecosystems.
Together, these findings mirror the SRIA’s focus on bridging scientific understanding with practical actions
in multiple fields to effectively address climate change challenges.</p>
        <p>Within the WGs forms word rain, climate change appears as a theme, albeit less prominently.
Notably, this cluster emphasises societal and political participation in climate change discourse, along
with heritage preservation. Terms such as policy linkages, governance, management, innovation, and
participatory suggest an emphasis on community involvement and strategic governance in addressing
climate issues. Additional topics that surface here include citizen science, potential migration, gender
research, gender perspectives, and circular economy. The distinct presence of circular economy, which is absent
from the other word clouds, highlights a potential area for inclusion in the final SRIA, underscoring
sustainable resource use in response to climate challenges. Similarly, citizen science, found only in the
WG forms word cloud, highlights the role of societal engagement and collective knowledge building
in climate action. Potential migration also emerges as a significant factor, recognizing migration as
a likely impact of climate change on both cultural heritage and society, warranting deeper analysis.
These findings collectively suggest important dimensions of climate change that could further enrich
the SRIA’s focus.</p>
        <p>In all four word clouds, greenish clusters highlight themes related to Europe and ARCHE.
As expected, ARCHE is a prominent keyword across most word clouds, except for the one focused on
Future Trends. Since it does not contribute to the identification of salient collocations and is not part of
any multiword expression, adding it to a stop-word list could be beneficial. Other recurring keywords
include European Commission, European Union, United Nations, Horizon Europe, and heritage research-EU
(with hub intentionally removed). It is unclear whether the presence of proper names might be useful
to ARCHE experts or is just a source of noise. Still within the same cluster, the ARCHE SRIA’s commitment to
climate themes reappears in the SRIA word cloud, with keywords such as climate neutrality, Green Deal,
and adaptive release emphasising this ongoing priority.</p>
      </sec>
      <sec id="sec-4-4">
        <title>4.4. Validating results: the ARCHE experts’ assessment</title>
        <p>Once the Word Rain visualisations were generated, they were first interpreted by CLARIN experts and
subsequently evaluated in consultation with three E-RIHS experts involved in the ARCHE consortium.
Their feedback was particularly useful for 1) identifying lexical noise; 2) excluding terms that, although
frequent, were not relevant to the domain; and 3) contextualising why such terms were considered
misleading or insignificant in this specific context. For instance, the term Anthropocene emerged
prominently in the visualisation, initially suggesting thematic relevance. However, experts clarified that
its occurrence was related to ongoing debates surrounding the formal recognition of the Anthropocene
as a geological epoch, an issue that had recently been resolved with a negative verdict. As a result, the
term was deemed irrelevant to the agenda’s core research concerns. Following this first consultation,
the results obtained were shared with the wider consortium of experts involved in shaping the agenda,
allowing them to be involved in the process and provide their feedback. The present terminological
experiment was further assessed in comparison to another analysis, which is beyond the scope of this
paper, that focused on the responses to a survey launched by the ARCHE consortium to the broader
Cultural Heritage community.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions and Future Steps</title>
      <p>This experiment demonstrated how looking at the keywords and n-grams of a specialized corpus can provide
valuable insights for extralinguistic analysis. By offering an immediate and accessible way to examine
policy documents and interpret their discourse, word rains enabled ARCHE experts to assess their
work and reflect on the conceptualizations produced earlier. However, this analysis should be viewed
as a preliminary experiment, which requires further research and refinement. Consulting with ARCHE
to filter out stop words in a more systematic fashion might be very useful in observing any
shifts in the overall thematic structure. To achieve this, running a customised experiment within
a vector-based environment using the Word Rain code available on GitHub appears to be the most suitable approach.
Pairing the results obtained with insights from other analysis tools, such as Sketch Engine, and using a
general language reference corpus for Automatic Term Extraction (ATE), will be necessary to conduct
an extensive terminological analysis.</p>
      <p>Ultimately, the ARCHE corpus will be integrated into a larger
corpus of European Cultural Heritage and Climate Change policy reports. The ARCHE documents
used for this analysis comprise a variety of textual types, including a foresight analysis, working group
discussions on thematically relevant topics, and a formalised output summarising the “preliminary”
findings intended to shape the final Strategic Research and Innovation Agenda (SRIA) for Cultural
Heritage in the coming years. Although these documents do not conform to conventional policy
formats, such as green papers, white papers, or policy briefs, they nonetheless fall within the category
of strategic documentation, as they aim to guide and structure collaborative research and action within
the domain to which they pertain. Their inclusion in the corpus is therefore justified, particularly
within a designated subcorpus focused on documents produced by projects addressing the intersection
of Cultural Heritage and Climate Change. This body of grey literature, although often overlooked in
mainstream analyses, is notably rich in domain-specific terminology and conceptual formulations. The
practice of annotating these documents according to project, document type, and communicative intent
aligns with the broader objective of enriching the corpus. This, in turn, contributes directly to the
overarching aim of developing a transdisciplinary glossary for the Cultural Heritage domain, one
that reflects both the expert language and the strategic direction of the field.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Acknowledgements</title>
      <p>This work is partly supported by the H2IOSC Project - Humanities and cultural Heritage Italian Open
Science Cloud funded by the European Union NextGenerationEU - National Recovery and Resilience
Plan (NRRP) - Mission 4 “Education and Research” Component 2 “From research to business” Investment
3.1 “Fund for the realization of an integrated system of research and innovation infrastructures” Action
3.1.1 “Creation of new research infrastructures strengthening of existing ones and their networking
for Scientific Excellence under Horizon Europe” - Project code IR0000029 – CUP B63C22000730005.
Implementing Entity CNR.</p>
      <p>The authors express their sincere gratitude to the Alliance for Research on Cultural Heritage in
Europe (ARCHE) project, funded by the European Union under the Horizon Europe programme
(Grant Agreement No. 101060054). This research would not have been possible without the valuable
collaboration of experts from the Alliance, whose willingness to share data and validate results has
been crucial.</p>
      <p>During the preparation of this work, the author(s) used Chat-GPT-4 in order to: perform grammar and
spelling check. After using this tool, the author(s) reviewed and edited the content as needed and take
full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>E.</given-names>
            <surname>Cano Díaz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Castillejo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ramírez Barat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sanz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Martín Gil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Bueso</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sarró</surname>
          </string-name>
          ,
          <article-title>The European Research Infrastructure for Heritage Science (E-RIHS): an infrastructure for an interdisciplinary scientific domain</article-title>
          ,
          <source>Unpublished</source>
          (
          <year>2019</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>de Jong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Van Uytvanck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Frontini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>van den Bosch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Fišer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Witt</surname>
          </string-name>
          ,
          <article-title>Language matters. The European research infrastructure CLARIN, today and tomorrow</article-title>
          , in:
          <string-name>
            <given-names>D.</given-names>
            <surname>Fišer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Witt</surname>
          </string-name>
          (Eds.),
          <source>CLARIN. The Infrastructure for Language Resources</source>
          , volume
          <volume>1</volume>
          of Digital Linguistics, De Gruyter, Berlin, Boston,
          <year>2022</year>
          , pp.
          <fpage>31</fpage>
          -
          <lpage>58</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1515/9783110767377-002</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Jänicke</surname>
          </string-name>
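          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Franzini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. F.</given-names>
            <surname>Cheema</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Scheuermann</surname>
          </string-name>
          ,
          <article-title>On close and distant reading in digital humanities: A survey and future challenges</article-title>
          , in:
          <source>EuroVis (STARs)</source>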
          ,
          <year>2015</year>
          , pp.
          <fpage>83</fpage>
          -
          <lpage>103</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahltorp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Skeppstedt</surname>
          </string-name>
          ,
          <article-title>Word rain as a service</article-title>
          , in:
          <source>CLARIN Annual Conference</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>22</fpage>
          -
          <lpage>25</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Skeppstedt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ahltorp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Johansson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Velupillai</surname>
          </string-name>
          ,
          <article-title>From word clouds to word rain: Revisiting the classic word cloud to visualize climate change texts</article-title>
          ,
          <source>Information Visualization</source>
          (
          <year>2024</year>
          ). doi:
          <pub-id pub-id-type="doi">10.1177/14738716241236188</pub-id>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>