<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>D. M. Blei, A. Y. Ng, M. I. Jordan, Latent dirichlet allocation, J. Mach. Learn. Res.</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1016/j.acalib.2014.05.016</article-id>
      <title-group>
        <article-title>Humanities-Centered AI: From Machine Learning to Machine Training</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ralf Möller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Universität zu Lübeck, Institute of Information Systems</institution>
          ,
          <addr-line>Ratzeburger Allee 160, 23562 Lübeck</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2014</year>
      </pub-date>
      <volume>3</volume>
      <issue>2003</issue>
      <abstract>
        <p>In this essay it is argued that machine learning controlled by IT specialists must be replaced with machine training, such that domain experts, e.g., humanities scholars, can control the learning process themselves. Furthermore, it is argued that training a machine for a particular task must also have a positive impact on related but different tasks. Only then can one speak of “true learning” or machine education, an area that is investigated in the emerging field of Humanities-Centred Artificial Intelligence.</p>
      </abstract>
      <kwd-group>
        <kwd>Humanities</kwd>
        <kwd>AI</kwd>
        <kwd>Machine Learning</kwd>
        <kwd>Machine Training</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. The Claim</title>
      <p>
        Machine learning is a hot topic these days, for good reasons [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Given one half of a large set
of training data from a specific application context, computer scientists with a data science
background can select model classes and then run algorithms, first, to automatically find
appropriate (multiscale) encodings of data, and second, to automatically determine huge sets
of model class parameters to derive a specific model or a specific program from training data.
With automatic tests to check the performance of learning outcomes on the other half of the
training data and possibly learning iterations, the learning result can be further optimized, i.e.,
so-called hyperparameter values can be suitably fixed or alternate components can be selected
for a model class. With the idea of reinforcement learning [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], learning can also be carried out
automatically at runtime, albeit within certain limits because the learning space needs to be
designed at setup time. These well-known learning approaches dominate artificial intelligence (AI)
because, with just training and test data provided by domain experts, computer scientists on
their own can successfully build models (or programs) for use by domain experts in specific
applications. In contrast, the modeling approach to specify domain knowledge [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]
requires domain experts to learn and use appropriate formal modeling languages, and usually
also intensive cooperation with computer scientists is required to build appropriate models (or
programs) that can be used successfully as part of applications.
      </p>
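      <p>The workflow described above (fit on one half of the data, test on the other half, and fix hyperparameter values accordingly) can be sketched as follows. This is only a minimal illustration under stated assumptions: the toy one-dimensional data, the k-nearest-neighbour model class, and the names knn_predict, accuracy and select_k are hypothetical and not taken from any cited work.</p>
      <preformat preformat-type="code">
```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, held_out, k):
    """Share of held-out points whose label is predicted correctly."""
    hits = sum(1 for x, label in held_out if knn_predict(train, x, k) == label)
    return hits / len(held_out)

def select_k(data, candidates):
    """Fit on one half of the data and score every candidate k on the other half."""
    train, held_out = data[0::2], data[1::2]
    scores = {k: accuracy(train, held_out, k) for k in candidates}
    best_k = max(candidates, key=lambda k: scores[k])
    return best_k, scores

# Hypothetical toy data standing in for what a domain expert would provide:
# (measurement, label) pairs.
data = [(0.1, "a"), (0.2, "a"), (0.3, "a"), (0.4, "a"), (0.6, "b"),
        (0.7, "b"), (0.8, "b"), (0.9, "b"), (1.0, "b"), (1.1, "b")]
best_k, scores = select_k(data, candidates=[1, 3])
```
      </preformat>
      <p>Note that every step of this loop, from choosing the model class to choosing the candidate values of k, is a decision taken by the computer scientist; the domain expert only supplies the data.</p>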
      <p>While the “computer scientists work alone” learning perspective has its merits, it fails if, first,
a very useful declarative model for a certain problem context is already well-established in
the domain (but possibly unknown to computer scientists) and reconstructing the model by
learning will produce inferior results, or, second, the final goal of the learning process is not
known in advance, i.e., a system is to be constructed that is initially configured using machine
learning, and then used to carry out certain tasks, with subsequent adaptations required by
the application context. Adaptations are usually not foreseeable in the beginning such that
it would hardly be possible to set up a respective reinforcement mechanism at system design
time. While it might be possible to incrementally specify sets of new training data and then
restart machine learning processes, computer scientists remain in the loop all the time this
way. Computer science expertise is a scarce resource, however. Thus, we argue that domain
experts need to be enabled to control the learning process themselves. Most domain experts
will, however, not be IT experts, and it is considered doubtful that often-discussed concepts for
ensuring (or improving) data literacy for domain experts will ever work in practice.</p>
      <p>The claim is that next-generation AI has the enormous potential to overcome today’s
computer-science perspective of system design by machine learning (including adaptation
by reinforcement) and should move towards a much more powerful paradigm of systems being
trained by domain experts in a targeted way. While initial machine learning, possibly with
incremental improvements by reinforcement w.r.t. a spectrum that is foreseen at design time
(the current AI perspective), is indeed possible and useful, the domain-expert training perspective is
required to cope with adaptation requirements occurring in almost all serious applications of
intelligent systems in real world contexts.</p>
    </sec>
    <sec id="sec-2">
      <title>2. From Machine Learning to Machine Training</title>
      <p>Let us consider an information retrieval (IR) scenario in a humanities research context. Current
IR systems are not tailored towards a specific domain, which is good on the one hand as the
engines are indeed quite versatile then. On the other hand, the catch is that current IR engines
can hardly be tailored by their users, e.g., humanities scholars, to fulfill specific needs.
Next-generation AI should enable scholars, as examples of non-IT-specialists, to train intelligent
systems on the job.</p>
      <p>
        Providing a set of documents as a reference library and, in addition, a query string, will
allow the intelligent agent [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] behind a search engine, first, to focus on documents in the
respective repository that match the search string and, second, to relate documents to similar
or complementary reference library documents by exploiting, e.g., topic models [5]. Given
reference library documents selected beforehand by a scholar, the agent is trained w.r.t.
topic models and, possibly, w.r.t. interesting datasets that are referred to in the documents of
the reference library. Thus, not only documents can be found, but also related datasets, be
they similar or complementary. If the reference library is extended (or adapted), the agent can
explain the corresponding changes in result sets relative to the results of previous queries.
This way, the trainer, namely the humanities scholar, can directly see the effect of providing a
reference library. In the case of humanities research, this kind of training to build topic models
will enable the search agent underneath to also compare heterogeneous datasets based on the
association of the datasets with the papers that discuss or describe them.
      </p>
      <p>In general, we assume that the agent manages to exploit the directives of humanities scholars
by updating an internal model ℳ (a search string and a topic model in the example above). With
new documents provided as an extension to the reference library, the model ℳ is converted
into a new model ℳ′. With the updated ℳ′, the notion of relatedness between documents
is adapted. While this is an important effect, it is merely a local effect, namely an effect on
the effectiveness of topic-based information retrieval. Local effects on a single model define
the standard learning mode on which many, if not all, artificial intelligence learning scenarios
are based. Transforming ℳ into ℳ′ via machine training can, however, hardly be seen as
education because the learning effect is visible only for the model of a very specific task
that the agent carries out. The question is: How can updates to the reference library and the
corresponding improvements in IR via an updated model ℳ′ be exploited to positively affect
the fulfillment of other goals the agent might be given, that is, other goals alongside IR goals?
Conversely, if other models not related to information retrieval are updated, this should
also have effects on information retrieval. Only if the update of a goal-specific model ℳ to ℳ′
has (significant) effects on other goals is machine training effective
in the long run, and only then can we talk about education. Thus, being educated requires that
training w.r.t. a specific task (or goal) also has some (positive) impact on other tasks (or goals)
of the agent, which is the intrinsic idea of education. Learning resulting in education is called
true learning. The question is: how can this be achieved?
      </p>
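      <p>The local effect of extending the reference library can be illustrated with a small sketch. It rests on loud assumptions: a bag-of-words profile with cosine similarity is used as a simple stand-in for a topic model, the threshold value is arbitrary, and the functions train, retrieve and explain_change as well as the three-document repository are hypothetical. The sketch shows how a model ℳ built from a reference library is replaced by ℳ′ when the library grows, and how the agent could report the resulting change in the result set to the scholar.</p>
      <preformat preformat-type="code">
```python
import math
from collections import Counter

def bag_of_words(text):
    """Word counts of a text (stand-in for a topic representation)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def train(reference_library):
    """Build the library profile, i.e. the trained part of the model."""
    profile = Counter()
    for text in reference_library:
        profile.update(bag_of_words(text))
    return profile

def retrieve(query, profile, repository, threshold=0.3):
    """Documents that contain the query string and are related to the library."""
    return {
        doc_id
        for doc_id, text in repository.items()
        if query.lower() in text.lower()
        and cosine(bag_of_words(text), profile) >= threshold
    }

def explain_change(before, after):
    """Report how the result set changed after retraining."""
    return {"added": sorted(after - before), "removed": sorted(before - after)}

repository = {
    "d1": "medieval manuscript trade routes",
    "d2": "manuscript digitisation pipeline software",
    "d3": "trade statistics of modern ports",
}
library = ["medieval trade and manuscript culture"]
extended_library = library + ["software pipeline for manuscript digitisation"]

before = retrieve("manuscript", train(library), repository)          # model M
after = retrieve("manuscript", train(extended_library), repository)  # model M'
change = explain_change(before, after)
```
      </preformat>
      <p>In this sketch the scholar trains the agent solely by curating the reference library; the effect stays local, however, because only the retrieval model itself is updated.</p>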
      <p>To solve the tasks assigned to it, an agent uses algorithms for solving problems defined on
a model to be used for a task. To be more precise, the problems are usually defined w.r.t. the
interpretation ℐ(ℳ) of the model ℳ. The function ℐ defines the semantics of the model, and
we write ℳ<sup>ℐ</sup> instead of ℐ(ℳ) for brevity. Please note that the semantics given by ℐ is just
used to define computational problems; there is no need to compute the value ℐ(ℳ), which
might even be infinite. A small example is appropriate here.</p>
      <p>Let us assume that ℳ<sub>1</sub> is a search string plus a topic model as indicated above. Both are
used to specify an information need in an information retrieval task T<sub>1</sub>. In this case, ℳ<sub>1</sub><sup>ℐ</sup> is
the set of documents from a repository that match the search string and the topic model in ℳ<sub>1</sub>.
An (inference) problem is to check whether a given document d found in the repository is in
ℳ<sub>1</sub><sup>ℐ</sup> (relevance decision problem d ∈ ℳ<sub>1</sub><sup>ℐ</sup>). Solving the relevance decision problem for
all documents d in a repository and returning all d for which the answer is positive is called the
information retrieval problem. The information retrieval problem is used to formalize task T<sub>1</sub> (we neglect ranking here).</p>
      <p>Let us further assume that for (or while) carrying out another task T<sub>2</sub>, new knowledge about
synonyms for words is made available to the agent via training, i.e., another model ℳ<sub>2</sub> is
extended (or adapted) to obtain ℳ′<sub>2</sub>. If the effect of training an agent w.r.t. a task T<sub>2</sub> by
transforming ℳ<sub>2</sub> into ℳ′<sub>2</sub> is to be called education, we must make sure that the change
to ℳ<sub>2</sub> by learning is also effective for the problem solving used for carrying out other tasks,
say information retrieval T<sub>1</sub>. We now assume that formal problems used to solve T<sub>1</sub> are defined
w.r.t. ℳ<sub>1</sub><sup>ℐ</sup> as indicated above. Since ℳ<sub>1</sub><sup>ℐ</sup> is not changed by updating ℳ<sub>2</sub> alone, for the change of ℳ<sub>2</sub> into ℳ′<sub>2</sub> to be
effective on T<sub>1</sub>, the change must indeed have an influence on the interpretation ℐ used to define
the information retrieval problem. In contrast to standard learning processes that transform
ℳ<sub>2</sub> into ℳ′<sub>2</sub>, education processes should not only transform ℳ<sub>2</sub> into ℳ′<sub>2</sub> but should also
transform ℐ into ℐ′. Now, if ℐ was transformed into ℐ′, this would mean that knowledge about
synonyms acquired for task T<sub>2</sub> is used for the information retrieval task T<sub>1</sub>. Indeed, when
ℐ′ is now used in problems for T<sub>1</sub>, we still have ℳ<sub>1</sub> but need to deal with ℳ<sub>1</sub><sup>ℐ′</sup>, which possibly denotes
a larger (or adapted) set of documents. Changing ℳ<sub>2</sub> into ℳ′<sub>2</sub> can be seen as education when,
as a by-product, ℐ is transformed into ℐ′ and then ℐ′ is used further on to adapt the semantics
of the unchanged model ℳ<sub>1</sub> used in the IR task T<sub>1</sub>. This is what we have in mind when talking
about educating by training and true learning. Algorithms for solving the information retrieval
problem based on ℐ′ need to be automatically adapted to realize ℐ′, and it is far from
clear how this can be accomplished.</p>
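      <p>The intended interplay between the two tasks can be made concrete with a deliberately simplified sketch. All names here (make_interpretation, retrieve, the toy repository) are hypothetical, the retrieval model is reduced to a tuple of search terms, and the dependency of the interpretation on the synonym model is wired in by hand, which is precisely the step for which an automatic solution is still open. The sketch only shows what the desired behaviour looks like: the retrieval model stays unchanged, the synonym model is updated, and the denotation of the retrieval model changes as a by-product. Relevance is decided per document, so the denotation itself is never materialised.</p>
      <preformat preformat-type="code">
```python
def make_interpretation(synonym_model):
    """Build an interpretation that reads search terms modulo known synonyms."""
    def interpretation(search_terms):
        def is_relevant(document):
            # Relevance decision problem: is the document in the denotation?
            words = set(document.lower().split())
            return all(
                words.intersection({term} | synonym_model.get(term, set()))
                for term in search_terms
            )
        return is_relevant
    return interpretation

def retrieve(repository, search_terms, interpretation):
    """Information retrieval problem: all documents decided to be relevant."""
    is_relevant = interpretation(search_terms)
    return [doc_id for doc_id, text in repository.items() if is_relevant(text)]

repository = {
    "d1": "a medieval manuscript from luebeck",
    "d2": "a medieval codex from cologne",
    "d3": "modern port statistics",
}
m1 = ("medieval", "manuscript")        # retrieval model, never modified
m2 = {}                                # synonym model before training
m2_prime = {"manuscript": {"codex"}}   # synonym model after training on the other task

before = retrieve(repository, m1, make_interpretation(m2))        # uses I
after = retrieve(repository, m1, make_interpretation(m2_prime))   # uses I'
```
      </preformat>
      <p>With the updated synonym model, the same search terms denote a larger set of documents, although neither the search terms nor the retrieval code were touched.</p>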
    </sec>
    <sec id="sec-3">
      <title>3. True Learning in the Humanities</title>
      <p>In the use of digital libraries, where humanities scholars often have to choose from a variety
of search criteria, problems arise not only in users’ choice of effective criteria; users
are also characterized by different personas [6]. That is, the same information need exists for
the different personas, but the model interpretation ℳ<sup>ℐ</sup> varies and leads to different results
for classical approaches such as retrieving the goal-specific model ℳ′ from ℳ, although
the results should be the same here. So far, attempts to identify the interpretation needs of
humanities scholars via personas have relied on approaches such as
information seeking [7]. However, as argued above, this is a form of machine training that takes
the previously identified personas as input.</p>
      <p>True learning is, for example, to identify the personas based on ℐ, and then transform ℐ into ℐ′
to identify the information need of the humanities scholars. Another approach for true learning
in the Humanities is, for example, to identify the humanities scholars’ context, which could be
represented as ℐ initially. In addition, ℐ and the search string ℳ′ must be treated differently in
the field of the Humanities than in classical information retrieval approaches because the human
understanding of a term differs from the treatment of the same term by the computer [7].</p>
      <p>We argue, however, that artificial intelligence must evolve in the direction illustrated above
to support true learning while still being beneficial [8], a direction that is investigated in the
new field of Humanities-Centred Artificial Intelligence (CHAI).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A. V.</given-names>
            <surname>Joshi</surname>
          </string-name>
          ,
          <source>Machine Learning and Artificial Intelligence</source>
          , Springer,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>R.</given-names>
            <surname>Sutton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Barto</surname>
          </string-name>
          ,
          <source>Reinforcement Learning: An Introduction (2nd Ed.)</source>
          , MIT Press,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>C. M.</given-names>
            <surname>Bishop</surname>
          </string-name>
          ,
          <article-title>Model-based machine learning</article-title>
          ,
          <volume>371</volume>
          (
          <issue>1984</issue>
          ) (
          <year>2012</year>
          ). doi:
          <pub-id pub-id-type="doi">10.1098/rsta.2012.0222</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Russell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Norvig</surname>
          </string-name>
          ,
          <source>Artificial Intelligence: A Modern Approach (4th Edition)</source>
          , Pearson,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>