<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">The Process Mining Question Forge</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Lisa</forename><surname>Zimmermann</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">University of St. Gallen</orgName>
								<address>
									<settlement>St Gallen</settlement>
									<country key="CH">Switzerland</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">The Process Mining Question Forge</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">0D25FF9375F662D1C4843F650575895F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T17:02+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Process Mining</term>
					<term>Question Design</term>
					<term>Question Refinement</term>
					<term>End-User Support</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>This paper introduces the Process Mining Question Forge (PMQF), a tool that supports the design and refinement of questions for process analysis projects. Motivated by the observation that formulating well-defined questions is essential in process analysis, PMQF addresses challenges such as difficulty in designing appropriate questions and issues arising from poorly defined ones, aiming to improve the overall effectiveness of process analysis projects. In particular, it guides users in viewing, selecting, and refining example questions for their own use cases.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>When analyzing processes, organizations increasingly resort to process mining (PM) techniques and tools <ref type="bibr" target="#b0">[1]</ref>. Several methodologies and case studies are available to guide and inspire the planning and implementation of such projects <ref type="bibr" target="#b1">[2]</ref>. What they have in common is an initial phase in which questions are formulated to guide subsequent project phases, such as data extraction, preparation, or analysis <ref type="bibr" target="#b2">[3]</ref>. For example, a question such as "What are the most common paths taken by cases in the process?" indicates an interest in the process flow. It requires the analyst to compute variants based on the available process data, identify the most frequent ones, and describe their execution. In contrast, the question "Are there any interesting patterns in the process?" is much broader. Without a keyword like "variant" or "path", it is not restricted to process flow patterns, and analysts might also investigate other perspectives, such as time or resources. Additionally, "interesting" might not be equivalent to "most common", as it could refer to edge cases that are less relevant in terms of frequency. While someone familiar with PM techniques can likely envision the analysis for the first question, the second question requires more exploration and may depend on the available data and the analyst's experience. A less experienced project stakeholder interested in control-flow patterns might be disappointed by the outcome if they formulate a question similar to the second example.</p><p>Arguably, designing questions is an essential step in setting up PM initiatives, as questions elicit requirements and direct the analysis. Research confirms that the execution of this step is a relevant success factor for projects <ref type="bibr" target="#b3">[4]</ref>.
However, formulating questions is not trivial, and analysts lack support from tools and methods <ref type="bibr" target="#b4">[5]</ref>. An interview study conducted with 40 analysts confirmed that question formulation is a significant challenge in PM projects, often arising when analysts work with questions that are unclear, overly specific, or too broad (as in the second example above) <ref type="bibr" target="#b5">[6]</ref>.</p><p>In this work, we address this problem and introduce the Process Mining Question Forge (PMQF), a tool that implements guidance for the crucial task of question design in PM projects. We developed PMQF based on findings from <ref type="bibr" target="#b4">[5]</ref>, which highlight that (experienced) analysts might rely on domain knowledge or analysis templates when confronted with a project that starts without clear questions. In practice, less experienced analysts in particular lack this knowledge, and templates are not always available. Therefore, PMQF leverages categorized example questions and a classification schema to help users design their own questions. PMQF can be set up with any custom set of questions and a corresponding categorization schema. On top of these resources, it guides users to design their questions in a structured way.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Tool Description and Features</head><p>PMQF has been developed as a web application in Python using Flask<ref type="foot" target="#foot_0">1</ref>. When it is set up, it expects a categorized set of analysis questions and the corresponding classification schema as input. The source code of PMQF can be found at https://github.com/promise-ics-hsg/demoApplication-PMQF, and a deployed version can be accessed via http://130.82.168.60:5000/. In the deployed version, PMQF runs on an exemplary set of 405 categorized analysis questions that we gathered from diverse sources, such as the BPI Challenges<ref type="foot" target="#foot_1">2</ref> or case studies, and a classification schema that classifies questions across six dimensions. We provide a video demonstrating how the tool can be used: https://drive.switch.ch/index.php/s/Y9cW0Nk3DEmdVOR. PMQF supports users in (i) retrieving an overview of questions and their classification, (ii) designing new questions, and (iii) clarifying and refining existing questions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">Keyword Search</head><p>PMQF features a keyword search that lets users efficiently locate relevant analysis questions. Users can enter keywords related to their areas of interest, and the tool returns a list of questions that match these search criteria. To support this, we integrated the computation of synonyms based on WordNet<ref type="foot" target="#foot_2">3</ref> (using the NLTK library<ref type="foot" target="#foot_3">4</ref>). The keyword search is intended for project teams, learners, or teachers who are interested in a specific PM concept and want an overview of the kinds of questions they could ask about it.</p></div>
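<div xmlns="http://www.tei-c.org/ns/1.0"><p>The synonym-aware matching described above can be illustrated with a minimal Python sketch. It is not taken from the PMQF source code: the function names, the example questions, and the small synonym table are hypothetical, and the table only stands in for a WordNet lookup so that the sketch runs without additional downloads. With NLTK and its WordNet corpus installed, the lemma names of the synsets returned by nltk.corpus.wordnet.synsets(word) could serve as the synonym source instead.</p>
<p>
```python
import re

# Hypothetical stand-in for a WordNet lookup (assumption, not PMQF code).
TOY_SYNONYMS = {
    "path": {"route", "trace"},
    "duration": {"length", "time"},
}

# Hypothetical example questions used only for illustration.
EXAMPLE_QUESTIONS = [
    "Which route do most cases take through the process?",
    "What is the duration of each case?",
    "Which resources are involved in rework?",
]


def toy_synonyms(word):
    """Return the toy synonyms of a word (empty set if none are known)."""
    return set(TOY_SYNONYMS.get(word, ()))


def tokenize(text):
    """Lowercase a text and split it into a set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_search(keyword, questions, synonyms=toy_synonyms):
    """Return the questions containing the keyword or one of its synonyms."""
    key = keyword.lower()
    terms = {key} | synonyms(key)
    # Exact token matching only; stemming (e.g., plural forms) is omitted.
    return [q for q in questions if not terms.isdisjoint(tokenize(q))]
```
</p>
<p>With these toy entries, searching for "path" returns only the question about the route that cases take, even though the word "path" itself does not occur in it. Passing the synonym lookup as a parameter keeps the matching logic independent of the lexical resource that is used.</p></div>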
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Question Design</head><p>The question design feature supports users in formulating analysis questions through three phases. It is particularly suited for less experienced analysts or project teams without extensive expertise who will benefit from guidance on how to iteratively identify areas of interest.</p><p>Category Selection: In the first phase, users are directed to select categories based on the dimensions of the classification schema (Fig. <ref type="figure" target="#fig_2">1a</ref>). For each dimension, we suggest a guiding question that helps them choose categories that are aligned with their project goals. Definitions for all categories are displayed when hovering over the buttons.</p><p>Question Filtering: After selecting categories, PMQF filters the available questions accordingly and displays the resulting set. Users are asked to review the questions and to select and save those they identify as relevant for further investigation.</p><p>Question Customization: In the last phase, users conduct their final review by discarding or reformulating questions to fit their domain-specific terminology (Fig. <ref type="figure" target="#fig_2">1b</ref>). We assume that the reformulation maintains the original question categorization. Additionally, PMQF generates a heatmap to visualize the range of selected questions across the classification schema.</p><p>After customizing the questions, users can either go back to the category selection (e.g., when they identified the need to cover further categories) or end the question design by exporting the identified and reformulated set of questions.</p></div>
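<div xmlns="http://www.tei-c.org/ns/1.0"><p>The filtering phase and the input for the heatmap can be sketched as follows. This is a simplified illustration rather than the PMQF implementation: the dimension and category names are hypothetical, each question is assumed to carry exactly one category per dimension, and a question is assumed to be kept when its category is among the selected ones in every dimension for which the user made a selection.</p>
<p>
```python
from collections import Counter

# Hypothetical categorized questions; dimensions and categories are
# illustrative and do not reproduce the schema shipped with PMQF.
QUESTIONS = [
    {"text": "What are the most common paths taken by cases in the process?",
     "categories": {"perspective": "control-flow", "scope": "process"}},
    {"text": "How long do cases wait between activities?",
     "categories": {"perspective": "time", "scope": "activity"}},
    {"text": "Which activities are most often repeated?",
     "categories": {"perspective": "control-flow", "scope": "activity"}},
]


def filter_questions(questions, selection):
    """Keep questions matching the selected categories in every dimension.

    `selection` maps a dimension to the set of categories chosen by the
    user; dimensions that are missing or empty are treated as unconstrained.
    """
    kept = []
    for question in questions:
        cats = question["categories"]
        if all(cats.get(dim) in chosen
               for dim, chosen in selection.items() if chosen):
            kept.append(question)
    return kept


def coverage_counts(questions):
    """Count questions per (dimension, category) pair, e.g. for a heatmap."""
    counts = Counter()
    for question in questions:
        for dim, cat in question["categories"].items():
            counts[(dim, cat)] += 1
    return counts
```
</p>
<p>For example, selecting the hypothetical category "control-flow" in the dimension "perspective" keeps the first and third question. The counts returned by coverage_counts for that selection could then be rendered as one heatmap cell per dimension and category.</p></div>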
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.3.">Question Refinement</head><p>The question refinement feature supports users in turning broad ideas into concrete and understandable questions. Users begin by entering their initial questions into an input form. After saving, their input appears on a second screen for reflection, where they are prompted to categorize the questions according to the existing categorization schema. This helps users refine their ideas to fit the categories and formulate them as direct questions.</p><p>We find question refinement especially beneficial for project teams, as it allows them to discuss and reach consensus on question formulation and categorization. Refined questions can be exported, and PMQF stores a copy of the same export, enabling administrators to review and potentially add novel questions to the set of exemplary questions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Evaluation and Maturity of the Tool</head><p>The three core features of PMQF run without any known errors. Additionally, we evaluated the tool in two sessions with (1) a leading international commercial vehicle manufacturer from Germany that had gained initial PM experience by analyzing one of its core manufacturing processes, and (2) a public sector organization with no prior PM experience that was exploring the value of integrating PM into its new BPM initiative. During the sessions, two representatives from each organization used the tool to design new analysis questions for their ongoing or planned PM projects. In both cases, the users were able to navigate the tool and use the features as expected. As part of the evaluation, the participants filled out a questionnaire based on the Technology Acceptance Model (TAM) <ref type="bibr" target="#b6">[7]</ref>. Results are provided in Tab. 1. On average, the four participants rated the perceived usefulness at 2.46 and the perceived ease of use at 2.33 on a 7-point Likert scale (1 = extremely likely, 7 = extremely unlikely). However, they also pointed out that the usefulness largely depends on the quality and scope of the provided set of questions and the classification schema. For the evaluation, we used the question set and schema that are also available in the deployed version of PMQF.</p><p>Both organizations were able to derive a set of relevant analysis questions for their projects, which they planned to use further. Qualitative feedback addressed smaller aspects such as the use of colors, options to store reformulations at once, and saving the selection of categories for better traceability of the results. This feedback has already been implemented in the current version of PMQF. Additionally, participants suggested enhancing the tool by adding more questions for specific domains and including guidelines for analyzing the questions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Average responses to the TAM on a 7-point Likert scale ranging from extremely likely (1) to extremely unlikely (7).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Conclusion and Outlook</head><p>To the best of our knowledge, PMQF is the first tool providing practical support for analysis question design and refinement in PM. As such, it contributes to research by suggesting a standardization for these two tasks and thus enables higher consistency and comprehensiveness across projects. We believe that in the future, this can lead to more comparable and reliable project outcomes. However, in its current version, PMQF depends on the availability of a set of categorized example questions and the corresponding classification schema. The deployed version we provide is optimized for one specific schema and runs on top of a collection of 405 analysis questions. Local installations can be adapted to custom input during setup.</p><p>In the future, we aim to provide a stable, universal categorization schema applicable to all PM domains. Additionally, PMQF could be further advanced in several directions:</p><p>1. Integration of large language models (LLMs) to enhance the question design and question refinement features.</p><p>2. Integration of analysis guidance by linking questions to relevant analysis techniques and providing hints on how to answer them (e.g., supported by GenAI <ref type="bibr" target="#b7">[8]</ref> or integrated in PM tools).</p><p>3. Requesting community feedback in the form of ratings for questions or indications on whether questions were answerable and valuable to project teams in practice. Such information would help identify what constitutes a good analysis question and which types of questions are most frequently addressed in projects.</p><p>By integrating insights from the community and refining our approach with advanced technological capabilities, PMQF may be able to offer even more sophisticated and tailored support functions in the future.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head></head><label></label><figDesc>Proceedings of the Best BPM Dissertation Award, Doctoral Consortium, and Demonstrations &amp; Resources Forum co-located with the 22nd International Conference on Business Process Management (BPM 2024), Krakow, Poland, September 1st to 6th, 2024. Email: lisa.zimmermann@unisg.ch (L. Zimmermann). ORCID: 0000-0002-6149-7060 (L. Zimmermann).</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: Interface for phases 1 and 3 of the question design feature in PMQF</figDesc><graphic coords="3,298.88,84.19,179.17,178.83" type="bitmap" /></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">https://flask.palletsprojects.com/en/3.0.x/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://www.tf-pm.org/competitions-awards/bpi-challenge</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://wordnet.princeton.edu/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="4" xml:id="foot_3">https://www.nltk.org/</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work was funded by the Swiss National Science Foundation as part of the ProMiSE project under Grant No.: 200021_197032. I express my gratitude to my colleagues and the participants of the evaluation for taking the time to test the tool and providing their ideas and feedback.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Dumas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>La Rosa</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mendling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Reijers</surname></persName>
		</author>
		<title level="m">Fundamentals of Business Process Management</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<analytic>
		<title level="a" type="main">A case study lens on process mining in practice</title>
		<author>
			<persName><forename type="first">F</forename><surname>Emamjome</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Andrews</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>ter Hofstede</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">On the Move to Meaningful Internet Systems: OTM 2019 Conferences</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">PM2: A process mining project methodology</title>
		<author>
			<persName><forename type="first">M</forename><surname>Van Eck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Leemans</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Advanced Information Systems Engineering: CAiSE 2015</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2015">2015</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">A process mining success factors model</title>
		<author>
			<persName><forename type="first">A</forename><surname>Mamudu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Bandara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Wynn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">J</forename><surname>Leemans</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Business Process Management</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">On the origin of questions in process mining projects</title>
		<author>
			<persName><forename type="first">F</forename><surname>Zerbato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Koorn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Beerepoot</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Weber</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Reijers</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">EDOC 2022</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">What makes life for process mining analysts difficult? A reflection of challenges</title>
		<author>
			<persName><forename type="first">L</forename><surname>Zimmermann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Zerbato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Weber</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Software and Systems Modeling</title>
		<imprint>
			<biblScope unit="page" from="1" to="29" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Perceived usefulness, perceived ease of use, and user acceptance of information technology</title>
		<author>
			<persName><forename type="first">F</forename><forename type="middle">D</forename><surname>Davis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">MIS Quarterly</title>
		<imprint>
			<biblScope unit="page" from="319" to="340" />
			<date type="published" when="1989">1989</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">Abstractions, scenarios, and prompt definitions for process mining with LLMs: a case study</title>
		<author>
			<persName><forename type="first">A</forename><surname>Berti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Schuster</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Business Process Management</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2023">2023</date>
			<biblScope unit="page" from="427" to="439" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
