<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Preface to the Understanding Literature References in Academic Full Text workshop at JCDL 2022</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Anastasiia Iurshina</string-name>
          <email>Anastasiia.Iurshina@ipvs.uni-stuttgart.de</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Muhammad Ahsan Shahid</string-name>
          <email>Ahsan.Shahid@gesis.org</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tobias Backes</string-name>
          <email>Tobias.Backes@gesis.org</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Philipp Mayr</string-name>
          <email>Philipp.Mayr@gesis.org</email>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Steffen Staab</string-name>
          <email>Steffen.Staab@ipvs.uni-stuttgart.de</email>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2022</year>
      </pub-date>
      <abstract>
        <p>This preface describes the Understanding Literature References in Academic Full Text (ULITE) workshop. ULITE was held as a virtual event on June 24, 2022, co-located with the Joint Conference on Digital Libraries (JCDL) 2022. The goal of the ULITE workshop at JCDL 2022 is to engage communities interested in the broad topic of literature reference understanding and automatic processing of scientific full-text publications. The workshop focuses on working with open infrastructures and tools and on offering the extracted information as open data for reuse. Our aim is to expose people from one community to the work of the respective other community and to foster fruitful interaction across communities.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>CEUR Workshop Proceedings</p>
    </sec>
    <sec id="sec-2">
      <title>2.1. Keynote</title>
      <sec id="sec-2-1">
        <title>We had one keynote speaker:</title>
        <p>Silvio Peroni (University of Bologna, Italy), “OpenCitations: a short introduction”. In
this paper, Silvio gave a brief history of open citations, their main characteristics, and their
use in the context of OpenCitations, a scholarly infrastructure organisation dedicated to open
scholarship and the publication of open bibliographic and citation data using Semantic Web
technologies.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2.2. Research papers</title>
      <sec id="sec-3-1">
        <title>Four research papers were presented at ULITE.</title>
        <p>• Frederik Arnold and Robert Jäschke: A Game with Complex Rules: Literature References in Literary Studies</p>
        <p>• Christian Boulanger and Anastasiia Iurshina: Extracting bibliographic references from footnotes with EXcite-docker</p>
        <p>• Bastian Birkeneder, Philipp Aufenvenne, Christian Haase, Philipp Mayr and Malte Steinbrink: Extracting literature references in German Speaking Geography – the GEOcite project</p>
        <p>• Tarek Saier, Meng Luan and Michael Färber: A Blocking-Based Approach to Enhance Large-Scale Reference Linking</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>2.3. Invited talks</title>
      <p>Four invited talks were given; papers were submitted for two of them:</p>
      <p>• Arcangelo Massari and Ivan Heibi: How to structure citations data and bibliographic metadata in the OpenCitations accepted format</p>
      <p>• Silvia Eunice Gutiérrez De la Torre, Julián Equihua, Andreas Niekler and Manuel Burghardt: Into the bibliography jungle: using random forests to predict dissertations’ reference section</p>
      <sec id="sec-4-1">
        <title>The two talks without papers:</title>
        <p>• Bikash Joshi: “Inline Citation Extraction from Scientific Manuscripts”</p>
        <p>• Swati Sanagar: “Finest Tool for Bibliography Reference Matching to Article and Deduplication”</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>3. Workshop outcome</title>
      <p>The main outcome of the joint discussion among participants is the decision to join forces
in creating a multi-domain gold standard dataset for literature reference extraction and
segmentation. The lack of annotated data is clearly one of the most serious limitations
on progress in automatic reference extraction and segmentation. Because annotating
data is a very time-consuming and laborious process, it is difficult for a single team to
obtain enough data. However, by combining several smaller datasets, we can create one of
substantial size. In addition to size, since the participants come from very different domains
(law, literature, geography, etc.), the format of the annotated articles would be very diverse.</p>
    </sec>
  </body>
</article>