<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Lisa Zimmermann</string-name>
          <email>lisa.zimmermann@unisg.ch</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Process Mining, Question Design, Question Refinement, End-User Support</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of St. Gallen</institution>
          ,
          <addr-line>St Gallen</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>This paper introduces the Process Mining Question Forge (PMQF), a tool that supports the design and refinement of questions for process analysis projects. Motivated by the observation that formulating well-defined questions is essential in process analysis, PMQF addresses challenges such as difficulty in designing appropriate questions and issues arising from poorly defined ones, aiming to improve the overall effectiveness of process analysis projects. In particular, it guides users in viewing, selecting, and refining example questions for their own use cases.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        2024.
analysts confirmed that question formulation is a significant challenge in PM projects, often
arising when analysts work with questions that are unclear, overly specific, or too broad (as in
the second example above) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ].
      </p>
      <p>
        In this work, we address this problem and introduce the Process Mining Question Forge
(PMQF), a tool that implements guidance for the crucial task of question design in PM projects.
We developed PMQF based on findings from [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], which indicate that (experienced) analysts may
rely on domain knowledge or analysis templates when confronted with a project that starts
without clear questions. In practice, less experienced analysts in particular lack this knowledge, and
templates are not always available. Therefore, PMQF leverages categorized example questions
and a classification schema to help users design their own questions. PMQF can be set
up with any custom set of questions and a corresponding categorization schema. On top of these
resources, it guides users through designing their questions in a structured way.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2. Tool Description and Features</title>
      <p>PMQF has been developed as a web application in Python using Flask<sup>1</sup>. At setup, it
expects a categorized set of analysis questions and the corresponding classification schema
as input. The source code of PMQF can be found at https://github.com/promise-ics-hsg/demoApplication-PMQF,
and a deployed version can be accessed via http://130.82.168.60:5000/.
In the deployed version, PMQF runs on an exemplary set of 405 categorized analysis questions
that we gathered from diverse sources, such as the BPI Challenges<sup>2</sup> and case studies, and on a
classification schema that classifies questions across six dimensions. We provide a video
demonstrating how the tool can be used: https://drive.switch.ch/index.php/s/Y9cW0Nk3DEmdVOR.
PMQF supports users in (i) retrieving an overview of questions and their classification, (ii)
designing new questions, and (iii) clarifying and refining existing questions.</p>
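      <p>The expected input can be pictured with a small data model. The following is a minimal sketch under stated assumptions and is not taken from the PMQF code base: the names SCHEMA, Question, and validate, as well as the dimension and category labels, are hypothetical. It only illustrates what "a categorized set of questions plus a classification schema" could look like and how the two can be checked against each other.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass, field

# Hypothetical schema: dimension name -> allowed categories.
# The deployed PMQF schema has six dimensions; their names are not reproduced here.
SCHEMA = {
    "perspective": {"control-flow", "time", "resource"},
    "goal": {"describe", "compare", "explain"},
}


@dataclass
class Question:
    """One example question with one category per schema dimension (assumed layout)."""
    text: str
    categories: dict = field(default_factory=dict)  # dimension -> category


def validate(question, schema):
    """Return a list of problems; an empty list means the question fits the schema."""
    problems = []
    for dimension, allowed in schema.items():
        category = question.categories.get(dimension)
        if category is None:
            problems.append(f"missing dimension: {dimension}")
        elif category not in allowed:
            problems.append(f"unknown category {category!r} for {dimension}")
    for dimension in question.categories:
        if dimension not in schema:
            problems.append(f"unknown dimension: {dimension}")
    return problems
```
]]></preformat>
      <p>Running such a check once at setup would surface questions whose categorization does not match the schema before they reach the user interface.</p>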
      <sec id="sec-3-1">
        <title>2.1. Keyword Search</title>
        <p>PMQF features a keyword search that lets users efficiently locate relevant
analysis questions. Users enter keywords related to their areas of interest, and the tool returns
a list of questions that match these search criteria. To this end, we integrated the computation of
synonyms based on WordNet<sup>3</sup> (using the NLTK library<sup>4</sup>). The keyword search is intended for
project teams, learners, or teachers who are interested in a specific PM concept and want
an overview of the kinds of questions they could ask about it.</p>
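        <p>The idea of a synonym-expanded keyword search can be sketched as follows. This is an illustration only, not PMQF's implementation: PMQF derives synonyms from WordNet via NLTK, whereas this sketch substitutes a tiny hand-written table (SYNONYMS, hypothetical) so that it runs without downloading a corpus. The function names expand and search are likewise assumptions.</p>
        <preformat><![CDATA[
```python
import re

# Hypothetical stand-in for WordNet lookups (PMQF itself uses WordNet via NLTK).
SYNONYMS = {
    "delay": {"delay", "hold", "wait", "postponement"},
    "cost": {"cost", "price", "expense"},
}


def expand(keyword):
    """Return the lower-cased keyword together with its known synonyms."""
    keyword = keyword.lower()
    return SYNONYMS.get(keyword, set()) | {keyword}


def search(questions, keywords):
    """Return the questions that contain any keyword or synonym as a whole word."""
    terms = set()
    for keyword in keywords:
        terms |= expand(keyword)
    hits = []
    for question in questions:
        words = set(re.findall(r"[a-z]+", question.lower()))
        if words & terms:
            hits.append(question)
    return hits
```
]]></preformat>
        <p>With this table, a search for "delay" also returns a question such as "Where do cases wait the longest?", which a literal string match would miss.</p>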
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Question Design</title>
        <p>The question design feature supports users in formulating analysis questions through three
phases. It is particularly suited for less experienced analysts or project teams without extensive
expertise who will benefit from guidance on how to iteratively identify areas of interest.</p>
        <p><sup>1</sup> https://flask.palletsprojects.com/en/3.0.x/;
<sup>2</sup> https://www.tf-pm.org/competitions-awards/bpi-challenge;
<sup>3</sup> https://wordnet.princeton.edu/;
<sup>4</sup> https://www.nltk.org/</p>
        <p>[Figure 1: (a) Phase 1; (b) Phase 3]</p>
        <p>Category Selection: In the first phase, users are directed to select categories based on the
dimensions of the classification schema (Fig. 1a). For each dimension, we suggest a guiding
question that helps them to choose categories that are aligned with their project goals. Definitions
for all categories are displayed when hovering over the buttons.</p>
        <p>Question Filtering: After selecting categories, PMQF filters the available questions accordingly
and displays the resulting set. Users are asked to review the questions and select and save those
they identify as relevant for further investigation.</p>
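        <p>The filtering step can be sketched as below. The matching semantics are an assumption, since the text does not specify them: here, categories selected within one dimension are combined with OR, different dimensions are combined with AND, and a dimension without any selection imposes no constraint. The function name, the data layout, and the example questions and labels are hypothetical.</p>
        <preformat><![CDATA[
```python
# Hypothetical example data: (question text, {dimension: category}).
QUESTIONS = [
    ("How long do cases wait before approval?", {"perspective": "time", "goal": "describe"}),
    ("Why do some cases take longer than others?", {"perspective": "time", "goal": "explain"}),
    ("Who performs the approval step?", {"perspective": "resource", "goal": "describe"}),
]


def filter_questions(questions, selection):
    """Keep questions that match the selection in every constrained dimension.

    selection maps each dimension to the set of selected categories;
    an empty set means the dimension is not constrained.
    """
    result = []
    for text, categories in questions:
        if all(
            not chosen or categories.get(dimension) in chosen
            for dimension, chosen in selection.items()
        ):
            result.append(text)
    return result
```
]]></preformat>
        <p>Other semantics (for example, ranking questions by the number of matching dimensions) would be equally plausible; the sketch only fixes one option to make the step concrete.</p>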
        <p>Question Customization: In the last phase, users conduct their final review by discarding or
reformulating questions to fit their domain-specific terminology (Fig. 1b). We assume that the
reformulation maintains the original question categorization. Additionally, PMQF generates a
heatmap to visualize the range of selected questions across the classification schema.</p>
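        <p>The data behind such a heatmap is a count of selected questions per dimension and category. The following sketch shows one way to compute and print these counts; it is not PMQF's implementation, and the function names, the schema layout, and the plain-text rendering are assumptions made for illustration.</p>
        <preformat><![CDATA[
```python
from collections import Counter


def coverage_counts(selected):
    """Count selected questions per (dimension, category) pair.

    selected is a list of {dimension: category} dicts, one per question.
    """
    counts = Counter()
    for categories in selected:
        for dimension, category in categories.items():
            counts[(dimension, category)] += 1
    return counts


def render(counts, schema):
    """Render the counts as plain-text rows, one row per dimension.

    schema maps each dimension to an ordered list of its categories;
    categories without any selected question are shown with a count of 0.
    """
    lines = []
    for dimension, categories in schema.items():
        cells = [f"{category}={counts[(dimension, category)]}" for category in categories]
        lines.append(f"{dimension}: " + ", ".join(cells))
    return "\n".join(lines)
```
]]></preformat>
        <p>Cells with a count of zero point to categories that the current selection does not cover, which is the kind of gap that can motivate a return to the category selection.</p>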
        <p>After customizing the questions, users can either go back to the category selection (e.g., when
they have identified the need to cover further categories) or end the question design by exporting
the selected and reformulated set of questions.</p>
      </sec>
      <sec id="sec-3-3">
        <title>2.3. Question Refinement</title>
        <p>The question refinement feature supports users in formulating concrete and understandable
questions based on broad ideas. Users begin by entering their initial questions into an input form.
After saving, their input appears on a second screen for reflection, where they are prompted
to categorize the questions according to the existing categorization schema. This helps users
refine their ideas to fit the categories and formulate them as direct questions.</p>
        <p>We find the question refinement especially beneficial for project teams, as it allows them to discuss
and reach consensus on question formulation and categorization. Refined questions can be exported,
and PMQF stores a copy of each export, enabling administrators to review and potentially
add novel questions to the set of exemplary questions.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Evaluation and Maturity of the Tool</title>
      <p>
        The three core features of PMQF run without any known errors. Additionally, we evaluated
the tool in two sessions with (1) a leading international commercial vehicle manufacturer from
Germany that has initial PM experience from analyzing one of its core manufacturing
processes, and (2) a public sector organization with no prior PM experience that is exploring the value
of integrating PM into its new BPM initiative. During the sessions, two representatives from
each organization used the tool, aiming to design new analysis questions for their ongoing
or planned PM projects. In both cases, the users were able to navigate the tool and use the
features as expected. As part of the evaluation, the participants filled out a questionnaire based on the Technology
Acceptance Model (TAM) [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Results are provided in Tab. 1. On average, the four participants
rated the usefulness at 2.46 and the perceived ease of use at 2.33 on a 7-point Likert
scale. However, they also pointed out that the usefulness largely depends on the quality and
scope of the provided set of questions and the classification schema. For the evaluation, we
used the question set and schema that are also available in the deployed version of PMQF.
      </p>
      <p>Both organizations successfully derived a set of relevant analysis questions for
their projects, which they planned to use further. Qualitative feedback addressed smaller aspects
such as the use of colors, an option to store all reformulations at once, and saving the selection of
categories for better traceability of the results. This feedback is already implemented in the
current version of PMQF. Additionally, participants suggested enhancing the tool by adding
more questions for specific domains and including guidelines for analyzing the questions.</p>
      <table-wrap>
        <label>Table 1</label>
        <caption>
          <p>TAM items (average ratings per item are provided in brackets)</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th>Construct</th>
              <th>TAM items</th>
              <th>Avg. rating</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Usefulness</td>
              <td>Using PMQF would enable me to accomplish the design of analysis questions more quickly. (3.00); Using PMQF would improve my performance in designing analysis questions. (2.75); Using PMQF would increase my productivity in designing analysis questions. (3.00); Using PMQF would enhance my effectiveness in PM project planning and question design. (2.25); Using PMQF would make it easier to design PM analysis questions. (2.00); I would find PMQF useful for designing PM analysis questions. (1.75)</td>
              <td>2.46</td>
            </tr>
            <tr>
              <td>Ease of Use</td>
              <td>Learning to operate PMQF would be easy for me. (2.25); I would find it easy to get PMQF to do what I want it to do. (2.25); My interaction with PMQF would be clear and understandable. (2.50); I would find PMQF to be flexible to interact with. (2.75); It would be easy for me to become skillful at using PMQF. (2.25); I would find PMQF easy to use. (2.00)</td>
              <td>2.33</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
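      <p>The construct averages in Tab. 1 follow directly from the six item ratings per construct. The short check below recomputes them from the reported item averages; the variable and function names are ours, and the only assumption is that each construct average is the unweighted mean of its items.</p>
      <preformat><![CDATA[
```python
from statistics import mean

# Item averages as reported in Tab. 1, in the order listed there.
usefulness = [3.00, 2.75, 3.00, 2.25, 2.00, 1.75]
ease_of_use = [2.25, 2.25, 2.50, 2.75, 2.25, 2.00]


def construct_average(items):
    """Unweighted mean of the item averages, formatted to two decimals."""
    return f"{mean(items):.2f}"


print("Usefulness:", construct_average(usefulness))    # 14.75 / 6
print("Ease of Use:", construct_average(ease_of_use))  # 14.00 / 6
```
]]></preformat>
      <p>Both recomputed values agree with the reported averages of 2.46 and 2.33.</p>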
    </sec>
    <sec id="sec-5">
      <title>4. Conclusion and Outlook</title>
      <p>To the best of our knowledge, PMQF is the first tool providing practical support for analysis question
design and refinement in PM. As such, it contributes to research by suggesting a standardized approach
to these two tasks, which can enable higher consistency and comprehensiveness across projects.
We believe that in the future, this can lead to more comparable and reliable project outcomes.</p>
      <p>However, in its current version, PMQF depends on the availability of a set of categorized
example questions and the respective classification schema. The deployed version we provide
is optimized for one specific schema and runs on top of a collection of 405 analysis questions.
Local installations can be adapted to custom input during setup.</p>
      <p>
        In the future, we aim to provide a stable, universal categorization schema applicable to all
PM domains. Additionally, PMQF could be further advanced in several directions:
        <list list-type="order">
          <list-item>
            <p>Integration of large language models (LLMs) to enhance the question design and
question refinement features.</p>
          </list-item>
          <list-item>
            <p>Integration of analysis guidance by linking questions to relevant analysis techniques
and providing hints on how to answer them (e.g., supported by GenAI [
              <xref ref-type="bibr" rid="ref8">8</xref>
              ] or integrated in
PM tools).</p>
          </list-item>
          <list-item>
            <p>Collection of community feedback in the form of ratings for questions or indications
of whether questions were answerable and valuable to project teams in practice. Such
information would help identify what constitutes a good analysis question and which
types of questions are most frequently addressed in projects.</p>
          </list-item>
        </list>
      </p>
      <p>By integrating insights from the community and refining our approach with advanced
technological capabilities, PMQF may be able to offer even more sophisticated and tailored
support functions in the future.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work was funded by the Swiss National Science Foundation as part of the ProMiSE project
under Grant No.: 200021_197032. I express my gratitude to my colleagues and the participants
of the evaluation for taking the time to test the tool and providing their ideas and feedback.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>M.</given-names>
            <surname>Dumas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>La Rosa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mendling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Reijers</surname>
          </string-name>
          , Fundamentals of Business Process Management, Springer,
          <year>2018</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>F.</given-names>
            <surname>Emamjome</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Andrews</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>ter Hofstede</surname>
          </string-name>
          ,
          <article-title>A case study lens on process mining in practice</article-title>
          ,
          <source>in: On the Move to Meaningful Internet Systems: OTM 2019 Conferences</source>
          , Springer,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>van Eck</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>PM<sup>2</sup>: A process mining project methodology</article-title>
          ,
          <source>in: Advanced Information Systems Engineering: CAiSE 2015</source>
          , Springer,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>A.</given-names>
            <surname>Mamudu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Bandara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. T.</given-names>
            <surname>Wynn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Leemans</surname>
          </string-name>
          ,
          <article-title>A process mining success factors model</article-title>
          ,
          <source>in: International Conference on Business Process Management</source>
          , Springer,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Zerbato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Koorn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Beerepoot</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Reijers</surname>
          </string-name>
          ,
          <article-title>On the origin of questions in process mining projects</article-title>
          ,
          <source>in: EDOC 2022</source>
          , Springer,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>L.</given-names>
            <surname>Zimmermann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Zerbato</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Weber</surname>
          </string-name>
          ,
          <article-title>What makes life for process mining analysts difficult? A reflection of challenges</article-title>
          ,
          <source>Software and Systems Modeling</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>F. D.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <article-title>Perceived usefulness, perceived ease of use, and user acceptance of information technology</article-title>
          ,
          <source>MIS Quarterly</source>
          (
          <year>1989</year>
          )
          <fpage>319</fpage>
          -
          <lpage>340</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>A.</given-names>
            <surname>Berti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Abstractions, scenarios, and prompt definitions for process mining with LLMs: a case study</article-title>
          ,
          <source>in: International Conference on Business Process Management</source>
          , Springer,
          <year>2023</year>
          , pp.
          <fpage>427</fpage>
          -
          <lpage>439</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>