=Paper=
{{Paper
|id=Vol-3630/preface
|storemode=property
|title=None
|pdfUrl=https://ceur-ws.org/Vol-3630/preface.pdf
|volume=Vol-3630
}}
==None==
2023 LWDA CONFERENCE
Lernen, Wissen, Daten, Analysen
(LWDA) Conference Proceedings
LWDA’23
October 09-11, 2023
Marburg, Germany
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
Editors:
Michael Leyer
Philipps-University of Marburg, Germany/Queensland University of Technology, Australia
Johannes Wichmann
Philipps-University of Marburg, Germany
© 2023 for the individual papers by the papers’ authors. Copying permitted for private and academic purposes. This volume
is published and copyrighted by its editors.
Preface
The LWDA 2023 conference provides a joint forum for experienced and young researchers to
share insights into recent trends, technologies and applications and to promote interaction in the
research field of big data and beyond.
The acronym LWDA expands in German to “Lernen. Wissen. Daten. Analysen.” (English:
“Learning. Knowledge. Data. Analytics.”). Recent research in the field is presented and discussed
from the viewpoint of machine learning, data mining, knowledge extraction,
knowledge management, information retrieval, personalization, database management, information
systems, big data management and big data analytics, to name a few.
The LWDA conference series comprises the workshops BIA, DB, FGWM, IR and KDML,
which are organized by the respective special interest groups within the German Computer
Science Society:
– FG-BIA 2023 – Business Intelligence & Analytics
– FG-DB 2023 – Database Systems
– FG-FGWM 2023 – Knowledge Management
– FG-IR 2023 – Information Retrieval
– FG-KDML 2023 – Knowledge Discovery, Data Mining and Machine Learning
The papers published in the LWDA 2023 proceedings have been selected by independent program
committees from the respective fields. The program consists of four invited keynotes
and two joint research sessions as well as the community meetings of the special interest
groups. In addition to these joint sessions, there are five parallel research sessions for each of
the workshops focusing on more specific topics. A joint poster session gives all presenters the
opportunity to discuss their work in a broader context. This year’s social program includes a
city tour on the second evening for further interaction.
Our distinguished keynote speakers are:
– Prof. Dr. Hannes Mühleisen – Radboud Universiteit Nijmegen
– Prof. Dr. Erhard Rahm – University of Leipzig
– Prof. Dr. Michael Granitzer – University of Passau
– Dr. Dietrich Alexander Herberg
The working group for Digitization & Process Management at the Philipps-University of
Marburg is proud to host the LWDA 2023 conference. For the technical program, the organizers
would like to thank the workshop chairs and their program committees for their hard work,
as well as the keynote speakers for their inspiring talks. We hope the participants will remember
the conference as an inspiring event with fruitful discussions and that readers will enjoy
studying the scientific contributions in this proceedings volume. The proceedings are published with
CEUR and can be found here: http://ceur-ws.org/Vol-1917
Marburg, Germany, October 2023
Conference Organization
General Chair
Michael Leyer Philipps-University of Marburg/Queensland University of Technology
Program Chairs
Tanja Auge University of Regensburg
Henning Baars University of Stuttgart
Andreas Henrich Otto-Friedrich-University of Bamberg
Thomas Mandl University of Hildesheim
Thorsten Papenbrock Philipps-University of Marburg
Pascal Reuss University of Hildesheim
Jakob Schönborn University of Hildesheim
Helge Spieker Simula Research Laboratory
Felix Stamm Rheinisch-Westfälische Technische Hochschule Aachen
Program Committee
Bernhard Seeger Philipps-University of Marburg
Hazar Harmouch Hasso-Plattner-Institute
Uta Störl Fernuniversität Hagen
Stefan Schulte TUHH Institute for Data Engineering
Fabian Panse Hasso-Plattner-Institute
Benjamin Hättasch TU Darmstadt
Annett Ungethüm TU Dresden
Marina Tropmann-Frick HAW Hamburg
Hannes Grunert University of Rostock
Carsten Felden TU Bergakademie Freiberg
Ralf Finger Information Works
Sebastian Olbrich Deloitte
Martin Atzmueller University of Osnabrück
Christian Bauckhage Fraunhofer IAIS / University of Bonn
Ulf Brefeld Leuphana University of Lüneburg
Mirko Bunse TU Dortmund
Dennis Groß Radboud University
Andreas Hotho University of Würzburg
Eyke Hüllermeier LMU Munich
Robert Jäschke HU Berlin
Christian Kühnert Fraunhofer IOSB
Thomas Liebig TU Dortmund
Thomas Seidl LMU Munich
Pascal Welke TU Wien
Stefan Wrobel Fraunhofer IAIS / University of Bonn
Kerstin Bach Norwegian University of Science and Technology
Joachim Baumeister University of Würzburg
Ralph Bergmann University of Trier
Viktor Eisenstadt University of Applied Sciences of Hanover
Jörg Cassens University of Hildesheim
Lisa Grumbach University of Trier
Andrea Kohlhase University of Applied Sciences Neu-Ulm
Michael Kohlhase University of Nuremberg-Erlangen
Ulrich Reimer University of Applied Sciences of St. Gallen
Christian Severin Sauer University of Hildesheim
Christian Zeyen DFKI
Andreas Korger denkbares
Johannes Wichmann Philipps-University of Marburg
Mirjam Minor University of Frankfurt
Carsten Wenzel University of Hildesheim
Anna Faust HU Berlin
Klaus Berberich University of Applied Sciences of Saarbrücken
Norbert Fuhr University of Duisburg-Essen
Ralf Krestel Christian-Albrechts-University of Kiel
Christin Katharina Kreutz Technical Applied University of Cologne
Jochen L. Leidner Applied University of Coburg
Dirk Lewandowski HAW Hamburg
Philipp Schaer Technical Applied University of Cologne
Ralf Schenkel University of Trier
Table of Contents
Risk Identification of Data Science Projects: A Literature Review
Maike Holtkemper, Maria Potanin, Alexander Oberst and Christian Beecks
Designing an Analytical Control Chart System with ML-predicted Quality Characteristics
Till Carlo Schelhorn, Jonas Gunklach and Alexander Maedche
Exploiting Foundation Models for Spoken Language Identification
Benedikt Augenstein and Darjan Salaj
Accelerating literature screening for systematic literature reviews with Large Language Models – development, application, and first evaluation of a solution
Paul Herbst and Henning Baars
Datengenossenschaften als Datentreuhänder – Eine qualitative Analyse von Pilotprojekten
Maximilian Werling, Patrick Weber and Heiner Lasi
Governance of Artificial Intelligence – A Framework Towards Ethical AI Applications
Jens F. Lachenmaier, Maximilian Werling and Dominik Morar
Integrating Machine Learning into SQL with Exasol
Christoph Großmann and Johannes Schildgen
Database and Workflow Optimizations for Spatial-Geometric Queries in GeoMine
Martin Poppinga, Joel Graef, Konrad Diedrich, Matthias Rarey and Norbert Ritter
SKYSHARK: A Benchmark with Real-world Data for Line-rate Stream Processing with FPGAs
Maximilian Langohr, Tim Vogler and Klaus Meyer-Wegener
SuMExplorer: Summarisation-based Frequent Subgraph Mining for Visual Exploratory Subgraph Searching
Chimi Wangmo and Lena Wiese
Enhancing Data Acquisition and Fault Analysis for Large-Scale Facilities: A Case Study on the Laser-Based Synchronization System at the European X-Ray Free-Electron Laser
Arne Grünhagen, Maximilian Schütte, Annika Eichler, Marina Tropmann-Frick and Görschwin Fey
Heterogeneity in NoSQL Databases – Challenges of Handling schema-less Data
Mark Lukas Möller, Dominique Hausler, Sebastian Strasser, Tanja Auge and Meike Klettke
Pythagoras: Semantic Type Detection of Numerical Data Using Graph Neural Networks
Sven Langenecker, Christoph Sturm, Christian Schalles and Carsten Binnig
Patient trajectory visualization for FHIR healthcare data: A use case on melanoma patients
Meijie Li, Wolfgang Galetzka, Bahadir Eryilmaz, Georg Christian Lodde, Elisabeth Livingstone, Jörg Schlötterer and Christin Seifert
CLEARNESS: Coreference Resolution for Generating and Ranking Arguments Extracted from Debate Portals for Queries
Johannes Weidmann, Lorik Dumani and Ralf Schenkel
The Information Retrieval Experiment Platform
Maik Fröbe, Jan Heinrich Reimer, Sean MacAvaney, Niklas Deckers, Janek Bevendorff, Benno Stein, Matthias Hagen and Martin Potthast
Applied Face Recognition in the Humanities
Martin Bullin and Andreas Henrich
Automatic Classification of Portraits: Application of Transformer and CNN Based Models for an Art Historic Dataset
Sebastian Diem and Thomas Mandl
Comparative Survey of German Hate Speech Datasets: Background, Characteristics and Biases
Markus Bertram, Johannes Schäfer and Thomas Mandl
Preliminary Results of a Scientometric Analysis of the German Information Retrieval Community 2020-2023
Philipp Schaer, Svetlana Myshkina and Jüri Keller
A Testbed for Dual-Entity Knowledge Panels
Leon Martin and Andreas Henrich
Vertical Search Scenarios within a Digital Study Planning Assistant
Tobias Hirmer, Michaela Ochs and Andreas Henrich
Integrating BDI Agents with the MATSim Traffic Simulation for Autonomous Mobility on Demand
Marcel Mauri, Ömer Ibrahim Erduran, Thu Pham Dieu Anh and Mirjam Minor
KIRETT: Knowledge-Graph-Based Smart Treatment Assistant for Intelligent Rescue Operations
Mubaris Nadeem, Johannes Zenkert, Lisa Bender, Christian Weber and Madjid Fathi
SKOS-Utils: Developing and Checking SKOS Knowledge Graphs (Tool Presentation)
Joachim Baumeister and Valentin Roß
Case-Based Sample Generation using Multi-Armed Bandits
Andreas Korger and Joachim Baumeister
Combining Information Retrieval and Large Language Models for a Chatbot that Generates Reliable, Natural-style Answers
Andreas Lommatzsch, Brandon Llanque, Vinay Srinath Rosenberg, Syed Ali Murad Tahir, Hristo Dimitrov Boyadzhiev and Maurice Walny
The Data Dilemma: Google Analytics’ Untapped Potential and Web Data Literacy
Tom Alby
Bridging the Gap: Examining the trust dimensions of smart contracts using supply chain applications
Wieland Müller and Michael Leyer
A Feature-wise Comparative Assessment of the CBR-based Methodologies FLEA and SEASALT
Viktor Eisenstadt, Jessica Bielski, Christoph Langenhan, Klaus-Dieter Althoff and Andreas Dengel
Comparative Analysis of Text-Based CBR Algorithms for Cybercrime Profiling Investigations
Marc Krüger
Cover Song Identification in Practice with Multimodal Co-Training
Simon Hachmeier and Robert Jäschke
Higher-Order DeepTrails: Unified Approach to *Trails
Tobias Koopmann, Jan Pfister, André Markus, Astrid Carolus, Carolin Wienrich and Andreas Hotho
Fast k-Nearest-Neighbor-Consistent Clustering
Lars Lenssen, Niklas Strahmann and Erich Schubert
Preprocessing Ground-Based Hyperspectral Image Data for Improving CNN-based Classification
Andreas Schliebitz, Heiko Tapken and Martin Atzmueller
Free-Energy Advantage Functions for Policy Transfer to Noisy Environments with Safety Constraints
Pierre Haritz and Thomas Liebig
Automatic Speech Detection on a Smart Beehive’s Raspberry Pi
Pascal Janetzky, Philip Lissmann, Andreas Hotho and Anna Krause
Comparing Humans and Algorithms in Feature Ranking: A Case-Study in the Medical Domain
Jonas Hanselle, Jaroslaw Kornowicz, Stefan Heid, Kirsten Thommes and Eyke Hüllermeier
Biomedical Event Extraction with Generative Language Models
Fabio Barth, Leon Weber-Genzel and Ulf Leser
Liquor-HGNN: A heterogeneous graph neural network for leakage detection in water distribution networks
Melanie Schaller, Michael Steininger, Andrzej Dulny, Daniel Schlör and Andreas Hotho
A Document Tagging Support System for Nursing Care Experts
Beat Tödtli, Sebastian Müller, Melanie Rickenmann, Janine Vetsch and Simon Haug
Efficient Light Source Placement using Quantum Computing
Sascha Mücke and Thore Gerlach
Contextual Preselection Methods in Pool-based Realtime Algorithm Configuration
Jasmin Brandt, Elias Schede, Shivam Sharma, Viktor Bengs, Eyke Hüllermeier and Kevin Tierney
A Few Models to Rule Them All: Aggregating Machine Learning Models
Florian Siepe, Phillip Wenig and Thorsten Papenbrock
Applicability of Models Trained on Generated Clinical German Datasets on Out-domain Data
Oğuz Şerbetçi and Ulf Leser