<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>5th Workshop on Patent Text Mining and Semantic Technologies (PatentSemTech)</article-title>
        <subtitle>collocated with the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval</subtitle>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Ralf Krestel</string-name>
          <xref ref-type="aff" rid="aff5">5</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hidir Aras</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Linda Andersson</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Florina Piroi</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Allan Hanbury</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Dean Alderucci</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff6">6</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Artificial Researcher IT GmbH</institution>
          ,
          <addr-line>Taubstummengasse 11 (i2c), 1040, Wien</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Center for AI and Patent Analysis, Carnegie Mellon University</institution>
          ,
          <addr-line>5000 Forbes Avenue, Pittsburgh, PA 15213</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>FIZ Karlsruhe - Leibniz Institute for Information Infrastructure</institution>
          ,
          <addr-line>Hermann-von-Helmholtz-Platz 1, 76344 Eggenstein-Leopoldshafen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Institute of Information Systems Engineering</institution>
          ,
          <addr-line>TU Wien, Favoritenstr. 9-11/194-04, Vienna</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>RSA FG Studio Data Science</institution>
          ,
          <addr-line>Thurngasse 8/16, 1090, Vienna</addr-line>
          ,
          <country country="AT">Austria</country>
        </aff>
        <aff id="aff5">
          <label>5</label>
          <institution>ZBW - Leibniz Information Centre for Economics &amp; Kiel University</institution>
          ,
          <addr-line>Düsternbrooker Weg 120, 24105, Kiel</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
        <aff id="aff6">
          <label>6</label>
          Ralf Krestel (ZBW - Leibniz Information Centre for Economics &amp; Kiel University, Germany) • Hidir Aras (FIZ Karlsruhe, Germany) • Linda Andersson (Artificial Researcher IT GmbH, Vienna, Austria) • Florina Piroi (TU Wien &amp; Data Science Studio, Vienna, Austria) • Allan Hanbury (TU Wien, Austria) • Dean Alderucci (Carnegie Mellon University, Pittsburgh, USA)
        </aff>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Preface</title>
      <p>The fifth edition (PatentSemTech2024) of the workshop series Patent Text Mining and Semantic
Technologies was held as a full-day event in conjunction with the SIGIR 2024 conference. As
in the previous editions, the workshop focused on new developments and research in patent
retrieval and patent analytics. An important focus of the workshop was the adaptation
of existing deep learning models, e.g. large language models, to the patent domain, which covers
diverse scientific subject areas such as chemistry and pharmacology. In general, patent data
is more difficult to analyse than corpora comprising other text genres. Despite these
challenges, patent data offers a richness of facets that can be exploited
with text mining and semantic analysis methods: (1) Patents constitute a huge corpus of
scientific-technical documents for a variety of technological domains. (2) They are rich in
available metadata such as spatial data, bibliographic data, classifications, and temporal data.
(3) They describe essential scientific-technical knowledge, including solutions for real-world
applications. (4) They complement the scientific literature with knowledge, e.g. chemical and
physical properties or bio-science knowledge for drug-target interaction, that appears first in
patents and is mostly not published elsewhere. With the PatentSemTech2024 workshop we continued
our series of workshops launched in 2019, aiming to establish a long-term collaboration and a
two-way communication channel between the IP industry and academia from relevant fields.
Therefore, the 5th PatentSemTech workshop was organized as a full-day event with 10 research
papers, accepted after peer review out of 17 submissions. Six long papers
were given as oral presentations, while four short papers were presented as posters. In addition,
Matthew Wahlrab, CEO of RapidAlpha, gave a keynote speech on "Unlocking Strategic Growth:
The Role of AI Technology in Intellectual Property". In an open discussion on "How to transform
research insights into products?", the workshop participants exchanged ideas and reported their
experience with applying AI in the patent domain. The workshop closed with Linda Andersson
looking back at five successful PatentSemTech workshops and at how the field has developed over
these years.</p>
      <sec id="sec-1-1">
        <title>Germany, Austria, USA, July 2024</title>
        <p>Ralf Krestel, Hidir Aras, Linda Andersson, Florina Piroi, Allan Hanbury, Dean Alderucci</p>
        <p>Organizers</p>
      </sec>
      <sec id="sec-1-2">
        <title>Program Committee</title>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>Website</title>
      <p>More information on the topics, schedule, and further developments of the PatentSemTech
workshop is available on the website: http://ifs.tuwien.ac.at/patentsemtech/</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>