<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>German Conference on Artificial Intelligence</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Introduction to the Third Workshop on Humanities-Centred Artificial Intelligence</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Sylvia Melzer</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hagen Peukert</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stefan Thiemann</string-name>
        </contrib>
        <aff>Universität Hamburg</aff>
        <aff>Universität zu Lübeck</aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>26</volume>
      <issue>2023</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Since Humanities-Centred Artificial Intelligence (CHAI) was suggested as an emerging paradigm [1], research in this area has revealed methodological advancements and clear directions towards machine training. This implies a methodological focus on the subject of the Humanities, which has indeed been maintained throughout the current and past workshops. It also entails the underlying question of how the machine can help to approach research questions in the Humanities more efficiently, or to devise new ones that could not be pursued otherwise. Yet, this year, a broad societal discussion on conversational agents and their consequences for society heralds a second strand of research in this area that is implicitly given in the CHAI roadmap but has never been actively pursued: ideology. This classic theme of the Humanities pushes the human being, with its properties, behavior, and needs, to the forefront, whereas the methodology follows suit. These two perspectives, ideology and method, do not contradict but complement each other. They may set forth a positive view of technology and, presumably, positive net effects on the human being. Chances are high that scientific endeavors in Humanities-Centred AI will shift in part more decidedly towards the tenets of strong AI. An ideological perspective would embrace ethical issues such as the privacy and accountability paradox driven by the inherent inscrutability of deep nets [2], but also the evaluation of fact and fake, as well as other information content brought about by bottom-up processes of new media. In a Habermasian understanding [3, 4], the structural change of the public sphere no longer converges on an agreement reached by discourse [5]. It is completely open which truth and commitment can be achieved in public. Equally relevant to the focus of ideological thinking in Humanities-Centred AI could be the consequences of speech and language technologies, e.g. conversational agents, for the labor market, education, or cultural industries such as (script) writers and media production. While the practical AI applications that we observe in our daily lives are developed with a methodological focus in the first place, the consequences of having them will be reflected in the ideological perspective of Humanities-Centred AI.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>This ideological perspective of Humanities-Centred AI will, in turn, require other methods.
To illustrate, advances in modeling natural language have evoked a range of applications that
change the way in which text production is taught and examined at schools and universities.
These changes now require new technological means of observation and of checking for plagiarism.
Thus, some circular feedback loops emanate from the initial advancement. This, however, is not
supposed to mean downsizing the focus on methods. In fact, once a self-reinforcing cycle has
started, continuous improvement of the methods seems to be the best way to flatten the
spiral over time, as is the case for other technological innovations. Consequently, CHAI also
presents seven papers centering on methodological solutions to diverse challenges across
the Humanities. In the rapidly evolving research and technology landscape, the contributions
in this volume highlight the latest advances and the challenges of establishing approaches from
computer science in the Humanities. The contributions cover a wide range of topics.</p>
      <p>The first regular paper addresses the challenges of reusing data in research archives, even
when following established guidelines. Innovative solutions are proposed to make research
data more accessible and user-friendly.</p>
      <p>
        The second regular paper introduces FrESH, an approach for enhancing Subjective Content
Descriptions (SCDs) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] in a model using human feedback. It focuses on improving a model's
accuracy by incorporating feedback without the need for complete retraining.
      </p>
      <p>
        The third regular paper explores the application of latent Dirichlet allocation (LDA) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] in
uncovering hidden thematic structures within a specific domain: academic journals focused
on modern and ancient manuscripts. While LDA is commonly used for various types of text
data, its behavior in highly domain-specific corpora is less well understood. The paper discusses
the insights gained from applying LDA to this specialized corpus, shedding light on steps specific
to dealing with domain-specific data.
      </p>
      <p>
        The fourth regular paper discusses the challenges faced by humanities scholars when
fine-tuning Large Language Models (LLMs) [
        <xref ref-type="bibr" rid="ref10 ref8 ref9">8, 9, 10</xref>
        ], such as BERT [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], for domain-specific tasks
with limited training data. It emphasizes the increasing availability of research data in
information systems as a valuable resource for fine-tuning these models. The paper presents a novel
method for fine-tuning BERT models on-demand, using training data from pre-modern Arabic
as an example. In addition, the paper presents the development of a Humanities Aligned Chatbot
that utilizes the fine-tuned model to make LLMs more accessible in humanities research.
      </p>
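      <p>The general fine-tuning step can be sketched as follows with the Hugging Face transformers library; a tiny randomly initialized BERT configuration and toy data stand in for the paper's pre-trained checkpoint and pre-modern Arabic training data, which are not reproduced here.</p>
      <preformat>
```python
# Minimal fine-tuning sketch (assumption, not the paper's pipeline):
# a few gradient steps on a BERT-style sequence classifier.
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny configuration so the sketch runs without downloading weights;
# in practice one would load a pre-trained domain checkpoint instead.
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config)

input_ids = torch.randint(0, 100, (4, 8))  # 4 toy "sentences", 8 tokens each
labels = torch.tensor([0, 1, 0, 1])

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
model.train()
for _ in range(3):  # a few gradient steps on the toy batch
    out = model(input_ids=input_ids, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```
      </preformat>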
      <p>The fifth regular paper proposes the use of a federated cross-domain information system
to supplement missing research data in humanities projects. It demonstrates how an
indexing approach can be integrated into a federated information system for efficient federated
information retrieval, addressing challenges in presenting search results from diverse
information sources. In addition, the paper discusses how users can interact with the system
using natural language queries that GPT-4 translates into SQL queries. The result
is a cross-domain information system that facilitates comprehensive research in the
humanities by bringing together multiple sources of information and enabling efficient, federated
information retrieval.</p>
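      <p>The query-translation idea can be sketched as follows; the schema, prompt wording, and the injected <monospace>ask_llm</monospace> callable are hypothetical placeholders for the system's actual GPT-4 integration, which the paper itself describes.</p>
      <preformat>
```python
# Minimal sketch (not the paper's system): wrapping an LLM call that
# turns a natural language question into SQL over a known schema.
def build_sql_prompt(schema: str, question: str) -> str:
    """Compose a prompt instructing the model to answer with SQL only."""
    return (
        "Given the database schema:\n"
        f"{schema}\n"
        f"Translate this question into a single SQL query: {question}\n"
        "Answer with SQL only."
    )

def nl_to_sql(question: str, schema: str, ask_llm) -> str:
    """Send the prompt to the injected LLM callable and return its SQL."""
    return ask_llm(build_sql_prompt(schema, question)).strip()

# Usage with a stubbed model response instead of a live API call:
schema = "CREATE TABLE manuscripts(id INT, title TEXT, century INT)"
stub = lambda prompt: "SELECT title FROM manuscripts WHERE century = 12;"
print(nl_to_sql("Which manuscripts are from the 12th century?", schema, stub))
```
      </preformat>
      <p>Injecting the LLM client as a callable keeps the translation logic testable without network access.</p>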
      <p>The first short paper presents a new approach to assessing responsible AI that combines the
results of a literature review with an evaluation framework. It provides an overview of the
responsible use of AI, presents evaluation metrics tailored to humanities data, and introduces
VERIFAI, an example implementation of the evaluation framework.</p>
      <p>The second short paper explores how recent advancements in research are enabling the
analysis of different modalities in historical artefacts. This work discusses the potential applications
of vision-language models in the context of historical research.</p>
      <p>In essence, these seven contributions collectively highlight ongoing efforts to utilise advanced
technologies, improve data-driven research and bridge the gap between AI capabilities and
domain-specific needs. They offer promising solutions for more efficient, accurate and accessible
research in the humanities domain.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] <string-name><given-names>R.</given-names> <surname>Möller</surname></string-name>, <article-title>Humanities-Centred Artificial Intelligence (CHAI) as an Emerging Paradigm</article-title>, De Gruyter, Berlin, Boston, <year>2021</year>, pp. <fpage>245</fpage>-<lpage>266</lpage>. URL: https://doi.org/10.1515/9783110753301-013. doi:10.1515/9783110753301-013.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] <string-name><given-names>H.</given-names> <surname>Peukert</surname></string-name>, <article-title>Inscrutability versus Privacy and Automation versus Labor in Human-Centered AI: Approaching Ethical Paradoxes and Directions for Research</article-title>, in: M. Ganzha, L. Maciaszek, M. Paprzycki, D. Ślęzak (Eds.), <source>Proceedings of the 18th Conference on Computer Science and Intelligence Systems</source>, volume <volume>35</volume> of Annals of Computer Science and Information Systems, IEEE, <year>2023</year>, pp. <fpage>1101</fpage>-<lpage>1105</lpage>. URL: http://dx.doi.org/10.15439/2023B7504. doi:10.15439/2023B7504.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] <string-name><given-names>J.</given-names> <surname>Habermas</surname></string-name>, <source>Strukturwandel der Öffentlichkeit</source>, Suhrkamp, Frankfurt am Main, <year>1962</year>.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] <string-name><given-names>J.</given-names> <surname>Habermas</surname></string-name>, <article-title>Überlegungen und Hypothesen zu einem erneuten Strukturwandel der politischen Öffentlichkeit</article-title>, <source>Sonderband Leviathan</source>, 1 ed., Nomos, Baden-Baden, <year>2021</year>, pp. <fpage>470</fpage>-<lpage>500</lpage>.</mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>[5] <string-name><given-names>J.</given-names> <surname>Habermas</surname></string-name>, <source>Der philosophische Diskurs der Moderne. Zwölf Vorlesungen</source>, Suhrkamp, Frankfurt am Main, <year>1981</year>.</mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>[6] <string-name><given-names>F.</given-names> <surname>Kuhr</surname></string-name>, <string-name><given-names>T.</given-names> <surname>Braun</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Bender</surname></string-name>, <string-name><given-names>R.</given-names> <surname>Möller</surname></string-name>, <article-title>To Extend or not to Extend? Context-specific Corpus Enrichment</article-title>, in: <source>Proceedings of AI 2019: Advances in Artificial Intelligence</source>, volume <volume>11919</volume> of Lecture Notes in Computer Science, Springer, <year>2019</year>, pp. <fpage>357</fpage>-<lpage>368</lpage>. URL: https://doi.org/10.1007/978-3-030-35288-2_29.</mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>[7] <string-name><given-names>D. M.</given-names> <surname>Blei</surname></string-name>, <string-name><given-names>A. Y.</given-names> <surname>Ng</surname></string-name>, <string-name><given-names>M. I.</given-names> <surname>Jordan</surname></string-name>, <article-title>Latent Dirichlet allocation</article-title>, <source>J. Mach. Learn. Res.</source> <volume>3</volume> (<year>2003</year>) <fpage>993</fpage>-<lpage>1022</lpage>.</mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>[8] <string-name><given-names>A.</given-names> <surname>Vaswani</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Shazeer</surname></string-name>, <string-name><given-names>N.</given-names> <surname>Parmar</surname></string-name>, <string-name><given-names>J.</given-names> <surname>Uszkoreit</surname></string-name>, <string-name><given-names>L.</given-names> <surname>Jones</surname></string-name>, <string-name><given-names>A. N.</given-names> <surname>Gomez</surname></string-name>, <string-name><given-names>Ł.</given-names> <surname>Kaiser</surname></string-name>, <string-name><given-names>I.</given-names> <surname>Polosukhin</surname></string-name>, <article-title>Attention is all you need</article-title>, in: I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, R. Garnett (Eds.), <source>Advances in Neural Information Processing Systems</source>, volume <volume>30</volume>, Curran Associates, Inc., <year>2017</year>. URL: https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf.</mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>[9] <string-name><given-names>C.</given-names> <surname>Lei</surname></string-name>, <article-title>Unsupervised Learning: Word Vector</article-title>, Springer Singapore, Singapore, <year>2021</year>, pp. <fpage>95</fpage>-<lpage>149</lpage>. URL: https://doi.org/10.1007/978-981-16-2233-5_7. doi:10.1007/978-981-16-2233-5_7.</mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>[10] <string-name><given-names>D.</given-names> <surname>Rothman</surname></string-name>, <source>Transformers for Natural Language Processing: Build innovative deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, RoBERTa, and more</source>, Packt Publishing, <year>2021</year>. URL: https://books.google.de/books?id=Cr0YEAAAQBAJ.</mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>[11] <string-name><given-names>J.</given-names> <surname>Devlin</surname></string-name>, <string-name><given-names>M.</given-names> <surname>Chang</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Lee</surname></string-name>, <string-name><given-names>K.</given-names> <surname>Toutanova</surname></string-name>, <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>, CoRR abs/1810.04805 (<year>2018</year>). URL: http://arxiv.org/abs/1810.04805. arXiv:1810.04805.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>