<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">Analyzing Manufacturing Process By Enabling Process Mining on Sensor Data</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Dina</forename><surname>Bayomie</surname></persName>
							<affiliation key="aff0">
								<orgName type="institution">Vienna University of Economics and Business (WU)</orgName>
								<address>
									<settlement>Vienna</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="department">Austrian Center for Digital Production</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kate</forename><surname>Revoredo</surname></persName>
							<email>kate.revoredo@hu-berlin.de</email>
							<affiliation key="aff2">
								<orgName type="institution">Humboldt University</orgName>
								<address>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Stefan</forename><surname>Bachhofner</surname></persName>
							<email>stefan.bachhofner@wu.ac.at</email>
							<affiliation key="aff0">
								<orgName type="institution">Vienna University of Economics and Business (WU)</orgName>
								<address>
									<settlement>Vienna</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kabul</forename><surname>Kurniawan</surname></persName>
							<email>kabul.kurniawan@wu.ac.at</email>
							<affiliation key="aff0">
								<orgName type="institution">Vienna University of Economics and Business (WU)</orgName>
								<address>
									<settlement>Vienna</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
							<affiliation key="aff3">
								<orgName type="department">Austrian Center for Digital Production</orgName>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Elmar</forename><surname>Kiesling</surname></persName>
							<email>elmar.kiesling@ai.wu.ac.at</email>
							<affiliation key="aff0">
								<orgName type="institution">Vienna University of Economics and Business (WU)</orgName>
								<address>
									<settlement>Vienna</settlement>
									<country key="AT">Austria</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Jan</forename><surname>Mendling</surname></persName>
							<email>jan.mendling@hu-berlin.de</email>
							<affiliation key="aff2">
								<orgName type="institution">Humboldt University</orgName>
								<address>
									<settlement>Berlin</settlement>
									<country key="DE">Germany</country>
								</address>
							</affiliation>
						</author>
						<author>
							<affiliation key="aff1">
								<orgName type="institution">Cairo University</orgName>
								<address>
									<settlement>Cairo</settlement>
									<country key="EG">Egypt</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">Analyzing Manufacturing Process By Enabling Process Mining on Sensor Data</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">7B4E1930F26B165A396AEAE81C3591B2</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-03-23T23:10+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Sensor data</term>
					<term>Event log creation</term>
					<term>Process mining</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Typical manufacturing processes involve various machines, each of which may be equipped with a variety of sensors. Digital twins can be used to model how the machines operate and to support analysts in identifying issues and potential improvements in the process. For a complete view of the status of a machine, however, models need to be enriched to identify patterns over changes in the measurements of sensors and correlations between these sensors. Process mining techniques could be usefully applied in this context, given that they provide descriptive analyses to explain and simulate physical objects based on event logs storing multi-perspective data about the process. However, although sensors generate a vast amount of data about the status of machines on the production floor, this data cannot be directly used by process mining techniques. To tackle this issue, we introduce a method that creates a custom event log from sensor data based on the process analyst's interests. To this end, we propose different encodings for the sensor data. An exploratory experiment using real-life data from an industrial partner shows the effectiveness of our approach.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>In the manufacturing industry, Digital Twins (DTs) <ref type="bibr" target="#b0">[1]</ref> play increasingly important roles as the Industry 4.0 vision becomes reality. DTs facilitate the virtualization of physical objects to support analysts with an overview of the reality and a visualization of potential improvements and possible issues to be mitigated. An example of a physical object in a manufacturing process is an injection molding machine used to produce plastic car parts. As Figure <ref type="figure" target="#fig_0">1</ref> depicts, this machine can be equipped with numerous sensors. In this case, a model that focuses only on how the machine operates is not sufficient for a full understanding of its status. It is therefore beneficial to enrich the DT model with patterns of changes in the sensor measurements or correlations between these sensors, which may provide a more complete view of the machine status.</p><p>Process mining techniques <ref type="bibr" target="#b1">[2]</ref> are able to provide descriptive analyses to explain and simulate physical objects. To this end, they rely on event logs that store multi-perspective data about the process. Sensors generate vast amounts of data about the status of machines on the production floor. However, this data cannot be directly used by process mining techniques, given that sensor measurements are structured as time series, whereas event logs store the occurrence of discrete events of a process over time.</p><p>In this paper, we introduce a method that creates a custom event log from sensor data based on the process analyst's interests. To this end, we explore multiple options to encode sensor measurements as process events and a set of techniques to group these events into cases, i.e., sequences of correlated events. We conduct exploratory experiments using a real-life use case from our industrial partner Farplas. The experiments use specific encodings of the sensor data into an event log, which allow the use of process mining techniques for root-cause analysis and pattern discovery. The results show the effectiveness of our approach.</p><p>The remainder of this paper is organized as follows. We discuss prior work in Section 2, describe our method in Section 3, and evaluate our method and discuss the findings in Section 4. Finally, we conclude our work and provide some future research directions in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related work</head><p>Various approaches to leverage process mining to support the analysis and modeling of real machines running on the shopfloor have been introduced in the context of Industry 4.0 <ref type="bibr" target="#b2">[3]</ref>. In <ref type="bibr" target="#b3">[4]</ref>, an approach to integrate Digital Shadows to analyze shopfloor-level manufacturing processes is proposed. The different materials and sub-parts of a product are considered for the analysis. The integration of process events with structural data allows process mining techniques to learn an enriched process model. In this work, we consider a single machine with various sensors and focus on the change in the measurements of these sensors to define process events. Our goal is to understand the correlation among these process events and how their interactions affect the output of the machine.</p><p>The work in [5] addresses the scenario of using Digital Twins of organizations for business process improvement. To this end, process mining techniques are used to evaluate violations of constraints and produce required actions. A digital twin interface model is presented to make the current state of the business processes transparent, allowing the process analyst to visualize potential improvements to the business process.</p><p>In <ref type="bibr" target="#b5">[6]</ref>, an approach to transform sensor data into an event log is presented. The approach maps sensor measurements taken when users interact with smart products to human activities and groups them into cases. The goal is to use the generated event log to discover models of human behavior. To this end, sensor data is encoded into an event log by segmenting the sensor measurements using a fixed time window. Then the segments or groups of segments (when common characteristics are identified) are labeled as activities. The labeling process is performed manually by domain experts. Activities are grouped into cases based on the goal of the research, which is the relation of the sensor data with the user interaction. Therefore, each interaction with a user defines a new case. In the current work, we explore different ways of segmenting the sensor measurements and of creating cases.</p><p>In <ref type="bibr" target="#b6">[7]</ref>, a method for analyzing time series data to be used in decision points in a process model is proposed. The approach learns a process model from an event log and, if the model contains decisions based on numeric attributes, applies time series analysis at the corresponding decision points. In our work, we do not focus on decisions made based on the values of sensor data. Instead, we focus on investigating the changes in the sensor data to support the understanding of the process outcome.</p><p>In <ref type="bibr" target="#b7">[8]</ref>, a method for finding the interaction between sensor data and process knowledge is used to construct an event log. The sensor data used was the location of a specific object (e.g., the location of a patient in a hospital). In the current work, we focus on encoding different sensor data as an event log.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Method</head><p>In this section, we describe our method for creating an event log from sensor data, which we henceforth call EL-SD. It is inspired by the data science model <ref type="bibr" target="#b8">[9]</ref>.</p><p>Figure <ref type="figure" target="#fig_1">2</ref> illustrates the EL-SD process; the main data input is the sensor data, and the output is an event log. To create the event log, EL-SD relies on two main steps: (i) a pre-processing step in which the data is cleaned and an appropriate subset is selected, and (ii) an event log building step based on the process analyst's interest and the pre-processed data.</p><p>As shown in Figure <ref type="figure" target="#fig_0">1</ref>, different types of sensors are associated with machines on the production floor. Each of these sensors provides time series data about the machine's status at a given time. Table <ref type="table" target="#tab_0">1</ref> provides an example of the extracted entries of sensor data over time. The sensor data are raw data that track the machine status over time. Note that the "row #" column is a table pointer and does not denote the entry id. For example, in the second row, the time of the sensor reading is "2021-01-04 17:57:00", the temperature reading is 300.5, the pressure sensor reading is 222, the location sensor reading is L1, the volume sensor reading is 34.1, the time sensor reading is 35.0, and the speed sensor reading is 100.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Data Preprocessing</head><p>The first step is a preprocessing step that prepares the data in terms of data quality and data selection.</p><p>Data cleaning. Time series data are essential in industry, where all kinds of sensor devices continuously capture data from the industrial environment. The time series data collected is typically large and affected by the limited reliability of sensor devices <ref type="bibr" target="#b9">[10]</ref>. Consequently, data cleaning is an essential step <ref type="bibr" target="#b10">[11]</ref> before subsequent analysis. Key quality issues with the sensor data are (i) unannotated data, i.e., the extracted data is not properly related to sensor metadata (e.g., sensor name, type, etc.), and (ii) missing (null) values due to incomplete sensor readings. There are several techniques to handle such issues (cf. <ref type="bibr" target="#b10">[11]</ref>).</p><p>Data selection. Due to the large number of sensors that production machines are typically instrumented with, the extracted sensor data is typically enormous. To perform meaningful analyses, process analysts need to select the data relevant for their analysis. The selection step may consist of slicing the data based on a time window, selecting specific sensors to analyze, or both. For example, suppose the analyst wants to understand the relation between temperature, pressure, and speed of the product produced on a machine. In that case, they can select the entries related to the respective sensors.</p></div>
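<div xmlns="http://www.tei-c.org/ns/1.0"><p>The two preprocessing steps can be sketched as follows. This is only a minimal illustration, not the authors' implementation: it assumes that cleaning simply drops entries with missing values (one of several possible techniques), and the function names <code>clean</code> and <code>select</code>, the dictionary layout, and the missing pressure value are hypothetical. The remaining values follow Table 1.</p><p><code><![CDATA[
```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def clean(rows):
    """Drop entries that contain a missing (None) sensor value.

    Assumption: dropping is only one simple cleaning strategy.
    """
    return [r for r in rows if all(v is not None for v in r.values())]


def select(rows, sensors, start, end):
    """Keep entries inside [start, end] and only the chosen sensor columns."""
    lo, hi = datetime.strptime(start, FMT), datetime.strptime(end, FMT)
    selected = []
    for r in rows:
        ts = datetime.strptime(r["Date"], FMT)
        if lo <= ts <= hi:
            selected.append({"Date": r["Date"], **{s: r[s] for s in sensors}})
    return selected


rows = [
    {"Date": "2021-01-04 17:55:30", "Temp": 300.0, "Pressure": 222, "Speed": 100},
    # Hypothetical gap: the None below is added only to demonstrate cleaning.
    {"Date": "2021-01-04 17:57:00", "Temp": 300.5, "Pressure": None, "Speed": 100},
    {"Date": "2021-01-04 17:58:30", "Temp": 300.6, "Pressure": 221, "Speed": 90},
]

selected = select(clean(rows), ["Temp", "Speed"],
                  "2021-01-04 17:55:00", "2021-01-04 18:00:00")
```
]]></code></p><p>Cleaning before selecting keeps the sketch simple. Selecting first would retain entries whose only missing value belongs to a sensor that is not part of the analysis.</p></div>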
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Event Log Construction</head><p>The second step of the EL-SD process consists in building an event log from the pre-processed time series sensor data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Event log</head><p>In this section, we define the output of our method in terms of events, cases, and the constructed event log. Table <ref type="table" target="#tab_2">2</ref> shows a possible event log 𝐿 1 containing three cases generated from the sensor data in Table <ref type="table" target="#tab_0">1</ref>. The first case 𝜎 1 = ⟨𝑒 1 , 𝑒 2 , 𝑒 3 ⟩ has three events; 𝑒 1 is characterized by four attributes that describe the state of the system. For instance, 𝑒 1 .Activity represents "Increase in Temp" from 𝑒 1 .pvalue = 300 to 𝑒 1 .nvalue = 300.5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 1 (Event</head><p>Based on Definition 3, an event log is defined through events that are grouped into cases. Analysts need to define what the events and cases represent. Our method provides several possible encodings for both the events and cases to build the event log. Inspired by <ref type="bibr" target="#b11">[12]</ref>, we use data objects for database schema to define different forms of cases and generate various event logs. In the following subsections, we explain different ways of defining the events and cases over sensor data.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Event encoding</head><p>There are various options for defining what constitutes an event in the context of sensor data that can help process analysts. As per Definition 1, an event describes the status of the system execution; the analyst can therefore determine how to represent this status. We propose five options to define an event, each of which mainly determines how the event activity is expressed.</p><p>The first and naïve option is that every entry is an event representing the system status at this point. The event represents one activity ("Record status"). This encoding is useful for analyzing the data perspective, e.g., with root-cause analysis techniques, because all sensor readings become event data in the log. Table <ref type="table" target="#tab_3">3</ref> shows an example of encoding the events in which an event represents the activity "Record status", and all the sensor data are attributes that describe this status.</p><p>The second option encodes the events to represent changes in the system over each sensor between two successive readings of sensor data. Thus, we would have event activities of "change in sensor x" for each sensor that changed over the readings. Also, we maintain the status of the changes over the sensor data, whether it is increasing or decreasing. This encoding helps to analyze the detailed changing patterns over sensor data readings. Table <ref type="table" target="#tab_4">4</ref> shows an example of encoding the events in which an event represents the change over the sensor data, and the attributes describe this change in terms of the previous value, i.e., the "pValue" attribute, the new value, i.e., the "nValue" attribute, and a status attribute that indicates whether the sensor reading is increasing or decreasing. Notice that we do not state any change status for non-numerical sensors such as the location (Loc) sensor.</p><p>The third option encodes the events similarly to the previous encoding. However, an event represents the change over each sensor between two successive readings of sensor data only when the change exceeds a given threshold. Thus, we would have event activities of "change in sensor x" for each sensor within the readings such that the change over the two successive readings exceeds the analyst's threshold for this sensor. Also, we maintain the previous and new values of the exceeding change and the status of the changes over the sensor data, whether increasing or decreasing. This encoding helps to analyze interesting changing points within the sensor data readings based on the analyst's thresholds. Table <ref type="table" target="#tab_5">5</ref> shows an example of encoding the events based on a change threshold. The number of events generated using the threshold is smaller than that generated using the second encoding in Table <ref type="table" target="#tab_4">4</ref> over the same sensor readings. Therefore, analysts can focus on changing points instead of an excessive number of granular changes.</p><p>The fourth option encodes the events to represent an aggregated overview of numerical sensor readings over a given time window specified by the analyst. There are two possibilities for event activities. First, we may conceive event activities as "changing in sensor behavior on average x" for each sensor that changed over the readings during the time window. Also, we maintain the minimum, average, and maximum values of the changes. This encoding can be combined with the third encoding option by allowing the analyst to provide change thresholds for the sensors. Table <ref type="table" target="#tab_7">6</ref> shows an example of encoding the average changes of each numerical sensor over a one-day time window.</p><p>The second possibility is similar to the first encoding option, setting the event activity to "Record the aggregate status". Then, we take the average of the sensor readings during the time window for each numerical sensor and consider them as event data attributes. Table <ref type="table" target="#tab_8">7</ref> shows an example of encoding the average values of each numerical sensor over a one-day time window.</p><p>The idea of this encoding is to provide an aggregated view of the enormous amount of sensor data. Moreover, it allows the analyst to see the changes over the aggregated values.</p><p>The fifth option encodes the events to represent the change in the system over each sensor between successive readings of sensor data. Unlike the second encoding, however, it has three different event activities that depend on the changing behavior: (i) if the sensor data is non-numerical, then the event activity is "Change in sensor x"; (ii) if it is numerical and the change is increasing, then the event activity is "Increase in sensor x"; and (iii) if it is numerical and decreasing, then the event activity is "Decrease in sensor x". Also, we maintain the previous and new values of the changes over the sensor readings. This encoding can be combined with the third encoding option by allowing the analyst to provide change thresholds for the sensors.</p><p>As shown in Table <ref type="table" target="#tab_9">8</ref>, the first event indicates an increase in temperature sensor reading from 300 to 300.5.</p></div>
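<div xmlns="http://www.tei-c.org/ns/1.0"><p>The change-based encodings (the second, third, and fifth options) can be sketched with a single function. This is a minimal sketch under stated assumptions, not the authors' prototype: the function name <code>encode_changes</code>, its parameters, and the threshold values are hypothetical, and a change counts as exceeding a threshold when its absolute difference is strictly greater than it. The two readings are the first two rows of Table 1.</p><p><code><![CDATA[
```python
def encode_changes(rows, thresholds=None, directional=False):
    """Encode changes between successive readings as events.

    thresholds:  optional per-sensor minimum absolute change (third option).
    directional: use "Increase in"/"Decrease in" activities (fifth option)
                 instead of "Change in" plus a status attribute (second option).
    """
    thresholds = thresholds or {}
    events = []
    for prev, curr in zip(rows, rows[1:]):
        for sensor, new in curr.items():
            old = prev[sensor]
            if sensor == "Date" or new == old:
                continue
            numeric = isinstance(new, (int, float)) and isinstance(old, (int, float))
            # Skip numeric changes that do not exceed the sensor's threshold.
            if numeric and abs(new - old) <= thresholds.get(sensor, 0):
                continue
            if directional and numeric:
                prefix = "Increase in " if new > old else "Decrease in "
            else:
                prefix = "Change in "
            event = {"Activity": prefix + sensor, "Timestamp": curr["Date"],
                     "pvalue": old, "nvalue": new}
            # The status attribute is only kept for numeric, non-directional events.
            if numeric and not directional:
                event["status"] = "+" if new > old else "-"
            events.append(event)
    return events


rows = [
    {"Date": "2021-01-04 17:55:30", "Temp": 300.0, "Pressure": 222,
     "Loc": "L1", "Volume": 34.0, "Time": 34.0, "Speed": 100},
    {"Date": "2021-01-04 17:57:00", "Temp": 300.5, "Pressure": 222,
     "Loc": "L1", "Volume": 34.1, "Time": 35.0, "Speed": 100},
]

detailed = encode_changes(rows)                      # second option
directed = encode_changes(rows, directional=True)    # fifth option
# Hypothetical analyst thresholds for the third option.
filtered = encode_changes(rows, thresholds={"Temp": 0.2, "Volume": 0.5, "Time": 0.5})
```
]]></code></p><p>With these inputs, <code>detailed</code> contains the three events of Table 4, while <code>filtered</code> drops the small change in Volume, which illustrates why the threshold encoding yields fewer events.</p></div>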
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Case encoding</head><p>After defining the event notation, we need to define the case notation to group the events and generate the logs for further analysis. There are several ways to group the events to form a case. We present two possible case encodings that can be combined with any of the event encodings presented in the previous section. The first option groups the events that represent the change over two successive sensor readings. As shown in Table <ref type="table" target="#tab_2">2</ref>, the events are grouped based on the change over the successive data entries.</p><p>The second option groups the events based on a time window so that all events that occur within the same time window belong to the same case. For example, all events that occur on the same day in Table <ref type="table" target="#tab_7">6</ref> have the same case id, so there will be two cases over these events.</p><p>Generating an event log allows process analysts to use different process analytics techniques that improve the understanding and explainability of the manufacturing process status.</p></div>
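<div xmlns="http://www.tei-c.org/ns/1.0"><p>Both case encodings can be illustrated with a short grouping function. The sketch below is an assumption-laden illustration rather than the prototype's code: the name <code>assign_cases</code> and its <code>by</code> parameter are hypothetical, events of the same pair of successive readings are recognized by their shared timestamp, and the time window is fixed to one calendar day. The sample events are taken from Table 2.</p><p><code><![CDATA[
```python
def assign_cases(events, by="reading"):
    """Return copies of the events with a "Case Id" attribute.

    by="reading": events derived from the same pair of successive readings
                  (identified here by a shared timestamp) form one case.
    by="day":     events on the same calendar day form one case.
    """
    case_ids = {}
    result = []
    for event in events:
        ts = event["Timestamp"]
        key = ts if by == "reading" else ts[:10]  # "YYYY-MM-DD" prefix
        # Case ids are numbered in order of first appearance, starting at 1.
        case_id = case_ids.setdefault(key, len(case_ids) + 1)
        result.append({"Case Id": case_id, **event})
    return result


events = [
    {"Activity": "Increase in Temp", "Timestamp": "2021-01-04 17:57:00"},
    {"Activity": "Increase in Volume", "Timestamp": "2021-01-04 17:57:00"},
    {"Activity": "Increase in Temp", "Timestamp": "2021-01-04 17:58:30"},
    {"Activity": "Increase in Temp", "Timestamp": "2021-01-05 18:00:00"},
]

per_reading = [e["Case Id"] for e in assign_cases(events, by="reading")]
per_day = [e["Case Id"] for e in assign_cases(events, by="day")]
```
]]></code></p><p>Because only four events are listed here, the case ids are consecutive; in Table 2 the last case carries the id 300 because the intermediate readings are elided.</p></div>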
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Evaluation</head><p>Figure <ref type="figure" target="#fig_2">3</ref> shows the evaluation process we followed to conduct two exploratory experiments based on a prototypical implementation of our approach<ref type="foot" target="#foot_0">1</ref>. To evaluate the usefulness of the EL-SD method, we applied process analytics tools to the generated logs. Specifically, we used Disco<ref type="foot" target="#foot_1">2</ref> and EL-RM<ref type="foot" target="#foot_2">3</ref> <ref type="bibr" target="#b12">[13]</ref>.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Dataset</head><p>We used one dataset from Farplas, our industry partner in the Teaming.AI project. Farplas is a full system solutions partner for the automotive industry. Farplas researches, develops, and manufactures superior automotive polymer systems, provides innovative solutions, and implements state-of-the-art technologies. We use their sensor data that describe the status of the production floor machines for polymer systems production, such as temperature, pressure, and volume. We conducted two exploratory experiments with different analysis objectives to explore the usefulness of our method and the effectiveness of creating event logs from sensor data. Each experiment addresses an analysis scenario over the data to explore the benefits of having a specific event log that meets the process analyst's interests.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Scenario 1: Change Pattern Analysis</head><p>The first experiment explores the sensor data using process discovery techniques to understand the changing patterns over the sensors. We generated two event logs using different event and case encodings. Then, we used Disco to discover the process model over these logs.</p><p>Figure <ref type="figure" target="#fig_4">4</ref> shows the process models discovered from the two generated event logs. Figure <ref type="figure" target="#fig_4">4a</ref> illustrates the most frequent changing patterns, given that a case contains the events generated over changes between two successive entries. Using this encoding, the analyst can easily identify the sensors that change most frequently, i.e., at almost every reading. That helps to understand the physical state of the manufacturing machines over every reading by providing a virtual model that represents real-time changes over the sensors.</p><p>Figure <ref type="figure" target="#fig_4">4b</ref> depicts the changing patterns over 22 days, given that each day represents a case. An event describes the changes between two successive entries with three possible activities that clarify the changes over the sensor readings (see the fifth encoding option in Section 3.2.2). Using this encoding, the analyst can quickly grasp which sensors change most regularly over the days. Also, the analyst can see the changing behavior in terms of increasing or decreasing sensor readings and how they affect each other. That helps to understand the physical interaction between the different environmental parameters, e.g., temperature, pressure, and speed, of the manufacturing machines over the day by providing a virtual model that represents the daily behavior of the sensors.</p><p>Using the process models, analysts can investigate various patterns of the sensors. Being able to use different encodings allows them to explore the sensor data from multiple perspectives.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Scenario 2: Root-Cause Analysis</head><p>The second experiment performs a root-cause analysis to understand the relationship between the sensor data and the quality detection data. We generated one event log in which a case contains the events that occurred on the same day. We encoded the events following the first encoding in Section 3.2.2, in which each entry represents an event, in order to analyze the sensor data from a data perspective. Then, we discovered the association rules over this log using EL-RM <ref type="bibr" target="#b12">[13]</ref>. We focus on the association rules that concern the quality detection data. Therefore, we select only the rules that include the not-ok quality detection data as a consequence of the rule. Following that, EL-RM discovered 20 association rules with a confidence of 0.9 that show a possible association between the changes over the sensors and the not-ok quality detection data. Using this encoding allows the analyst to conduct a root-cause analysis and gain insights into the influence of the machine status, as captured by the sensor data, on the quality detection control.</p></div>
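<div xmlns="http://www.tei-c.org/ns/1.0"><p>To make the notion of rule confidence concrete, the following sketch computes the standard confidence of an association rule A ⇒ B, i.e., the share of cases containing A that also contain B. It does not reproduce EL-RM or its interface: the function name <code>confidence</code>, the item labels, and the toy cases are all made up for illustration and are not taken from the Farplas data.</p><p><code><![CDATA[
```python
def confidence(transactions, antecedent, consequent):
    """conf(A => B) = #(transactions with A and B) / #(transactions with A)."""
    a = set(antecedent)
    ab = a | set(consequent)
    n_a = sum(1 for t in transactions if a.issubset(t))
    n_ab = sum(1 for t in transactions if ab.issubset(t))
    return n_ab / n_a if n_a else 0.0


# Toy, made-up cases: each set lists the items observed in one case.
cases = [
    {"Temp=high", "Quality=not-ok"},
    {"Temp=high", "Quality=not-ok"},
    {"Temp=high", "Quality=ok"},
    {"Temp=low", "Quality=ok"},
]

conf = confidence(cases, {"Temp=high"}, {"Quality=not-ok"})
```
]]></code></p><p>In this toy example, two of the three cases with a high temperature are not-ok, so the rule has a confidence of two thirds. A rule with a confidence of 0.9 would hold in nine out of ten cases that match its antecedent.</p></div>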
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.4.">Discussion</head><p>As shown in both exploratory experiments, generating an event log from the sensor data supports creating a virtual model to understand the manufacturing of physical objects. Moreover, it allows analysts to explore the sensor data using process analysis techniques, which offer a perspective beyond time-series analysis. Also, it enriches the event model of Teaming.AI <ref type="bibr" target="#b13">[14]</ref>, which helps to understand and support the AI agent.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>In this paper, we proposed a method (EL-SD) for creating an event log from sensor data. Our method consists of two steps: a pre-processing step that allows analysts to prepare the data by applying data cleaning and selection techniques, and a main step that creates the log based on the event notation and case notation specified by the analyst. We provide five event encoding options and two case encoding options. The results of our exploratory case scenarios show the potential of the method to investigate the sensor data from different perspectives and provide new insights into the production floor. Moreover, it constructs a virtual model that can contribute to accurate digital twins that capture the dynamic behavior of manufacturing machines. As future work, we will investigate different ways to define the event activities such that they reflect the production process.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Injection Molding Machine instrumented with sensors.</figDesc><graphic coords="2,130.96,84.19,333.38,174.14" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: Creating an event log from sensor data (EL-SD)</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Evaluation process</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head></head><label></label><figDesc>(a) Cases per changes over detailed change events (b) Cases per day over detailed changes with different activities events</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Changing pattern analysis: process model</figDesc><graphic coords="10,108.88,84.19,187.51,215.72" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 1</head><label>1</label><figDesc>Sample of sensor data over the ending of two working days</figDesc><table><row><cell>row #</cell><cell>Date</cell><cell>Temp</cell><cell>Pressure</cell><cell>Loc</cell><cell>Volume</cell><cell>Time</cell><cell>Speed</cell></row><row><cell>1</cell><cell>2021-01-04 17:55:30</cell><cell>300.0</cell><cell>222</cell><cell>L1</cell><cell>34.0</cell><cell>34.0</cell><cell>100</cell></row><row><cell>2</cell><cell>2021-01-04 17:57:00</cell><cell>300.5</cell><cell>222</cell><cell>L1</cell><cell>34.1</cell><cell>35.0</cell><cell>100</cell></row><row><cell>3</cell><cell>2021-01-04 17:58:30</cell><cell>300.6</cell><cell>221</cell><cell>L25</cell><cell>34.3</cell><cell>35.5</cell><cell>90</cell></row><row><cell>4</cell><cell>2021-01-04 18:00:00</cell><cell>301.9</cell><cell>220</cell><cell>L20</cell><cell>34.5</cell><cell>36.0</cell><cell>110</cell></row><row><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell></row><row><cell>5</cell><cell>2021-01-05 17:55:30</cell><cell>303.0</cell><cell>219</cell><cell>L10</cell><cell>34.6</cell><cell>36.5</cell><cell>120</cell></row><row><cell>6</cell><cell>2021-01-05 17:57:00</cell><cell>303.5</cell><cell>219</cell><cell>L1</cell><cell>34.9</cell><cell>37.0</cell><cell>100</cell></row><row><cell>7</cell><cell>2021-01-05 17:58:30</cell><cell>304.8</cell><cell>220</cell><cell>L10</cell><cell>35.0</cell><cell>36.5</cell><cell>120</cell></row><row><cell>8</cell><cell>2021-01-05 18:00:00</cell><cell>304.9</cell><cell>219</cell><cell>L1</cell><cell>35.0</cell><cell>38.0</cell><cell>100</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Table 2</head><label>2</label><figDesc>A sample event log representing the sensor data in Table 1</figDesc><table><row><cell>Case Id</cell><cell>Activity</cell><cell>Timestamp</cell><cell>pvalue</cell><cell>nvalue</cell></row><row><cell>1</cell><cell>Increase in Temp</cell><cell>2021-01-04 17:57:00</cell><cell>300</cell><cell>300.5</cell></row><row><cell>1</cell><cell>Increase in Volume</cell><cell>2021-01-04 17:57:00</cell><cell>34</cell><cell>34.1</cell></row><row><cell>1</cell><cell>Increase in Time</cell><cell>2021-01-04 17:57:00</cell><cell>34</cell><cell>35</cell></row><row><cell>2</cell><cell>Increase in Temp</cell><cell>2021-01-04 17:58:30</cell><cell>300.5</cell><cell>300.6</cell></row><row><cell>2</cell><cell>Decrease in Pressure</cell><cell>2021-01-04 17:58:30</cell><cell>222</cell><cell>221</cell></row><row><cell>2</cell><cell>Change in Loc</cell><cell>2021-01-04 17:58:30</cell><cell>L1</cell><cell>L25</cell></row><row><cell>2</cell><cell>Increase in Volume</cell><cell>2021-01-04 17:58:30</cell><cell>34.1</cell><cell>34.3</cell></row><row><cell>2</cell><cell>Increase in Time</cell><cell>2021-01-04 17:58:30</cell><cell>35</cell><cell>35.5</cell></row><row><cell>2</cell><cell>Decrease in speed</cell><cell>2021-01-04 17:58:30</cell><cell>100</cell><cell>90</cell></row><row><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell><cell>⋮</cell></row><row><cell>300</cell><cell>Increase in Temp</cell><cell>2021-01-05 18:00:00</cell><cell>304.8</cell><cell>304.9</cell></row><row><cell>300</cell><cell>Decrease in Pressure</cell><cell>2021-01-05 18:00:00</cell><cell>220</cell><cell>219</cell></row><row><cell>300</cell><cell>Change in Loc</cell><cell>2021-01-05 18:00:00</cell><cell>L10</cell><cell>L1</cell></row><row><cell>300</cell><cell>Increase in Time</cell><cell>2021-01-05 18:00:00</cell><cell>36.5</cell><cell>38</cell></row><row><cell>300</cell><cell>Decrease in speed</cell><cell>2021-01-05 18:00:00</cell><cell>120</cell><cell>100</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head>Table 3</head><label>3</label><figDesc>Entry encoding for the sensor data readings of the first four rows in Table 1</figDesc><table><row><cell>Activity</cell><cell>Timestamp</cell><cell>Temp</cell><cell>Pressure</cell><cell>Loc</cell><cell>Volume</cell><cell>Time</cell><cell>Speed</cell></row><row><cell>Record status</cell><cell>2021-01-04 17:55:30</cell><cell>300.0</cell><cell>222</cell><cell>L1</cell><cell>34.0</cell><cell>34.0</cell><cell>100</cell></row><row><cell>Record status</cell><cell>2021-01-04 17:57:00</cell><cell>300.5</cell><cell>222</cell><cell>L1</cell><cell>34.1</cell><cell>35.0</cell><cell>100</cell></row><row><cell>Record status</cell><cell>2021-01-04 17:58:30</cell><cell>300.6</cell><cell>221</cell><cell>L25</cell><cell>34.3</cell><cell>35.5</cell><cell>90</cell></row><row><cell>Record status</cell><cell>2021-01-04 18:00:00</cell><cell>301.9</cell><cell>220</cell><cell>L20</cell><cell>34.5</cell><cell>36.0</cell><cell>110</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_4"><head>Table 4</head><label>4</label><figDesc>Detailed change encoding for the sensor data readings of the first three rows in Table 1</figDesc><table><row><cell>Activity</cell><cell>Timestamp</cell><cell>pvalue</cell><cell>nvalue</cell><cell>status</cell></row><row><cell>Change in Temp</cell><cell>2021-01-04 17:57:00</cell><cell>300</cell><cell>300.5</cell><cell>+</cell></row><row><cell>Change in Volume</cell><cell>2021-01-04 17:57:00</cell><cell>34</cell><cell>34.1</cell><cell>+</cell></row><row><cell>Change in Time</cell><cell>2021-01-04 17:57:00</cell><cell>34</cell><cell>35</cell><cell>+</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_5"><head>Table 5</head><label>5</label><figDesc>Threshold change encoding for the sensor data readings of the first three rows in</figDesc><table /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_6"><head>Table 1</head><label>1</label><figDesc></figDesc><table><row><cell>, given a</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_7"><head>Table 6</head><label>6</label><figDesc>Aggregated change view of sensor data in Table 1</figDesc><table><row><cell>Activity</cell><cell>Timestamp</cell><cell>minimum</cell><cell>average</cell><cell>maximum</cell></row><row><cell>Change in Temp</cell><cell>2021-01-04</cell><cell>300</cell><cell>300.75</cell><cell>301.9</cell></row><row><cell>Change in Pressure</cell><cell>2021-01-04</cell><cell>220</cell><cell>221.25</cell><cell>222</cell></row><row><cell>Change in Volume</cell><cell>2021-01-04</cell><cell>34</cell><cell>34.2</cell><cell>34.5</cell></row><row><cell>Change in Time</cell><cell>2021-01-04</cell><cell>34</cell><cell>35.125</cell><cell>36</cell></row><row><cell>Change in speed</cell><cell>2021-01-04</cell><cell>90</cell><cell>100</cell><cell>110</cell></row><row><cell>Change in Temp</cell><cell>2021-01-05</cell><cell>303</cell><cell>304.05</cell><cell>304.9</cell></row><row><cell>Change in Pressure</cell><cell>2021-01-05</cell><cell>219</cell><cell>219.25</cell><cell>220</cell></row><row><cell>Change in Volume</cell><cell>2021-01-05</cell><cell>34.6</cell><cell>34.875</cell><cell>35</cell></row><row><cell>Change in Time</cell><cell>2021-01-05</cell><cell>36.5</cell><cell>37</cell><cell>38</cell></row><row><cell>Change in speed</cell><cell>2021-01-05</cell><cell>100</cell><cell>100</cell><cell>110</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_8"><head>Table 7</head><label>7</label><figDesc>Aggregated entry view of sensor data in Table 1</figDesc><table><row><cell>Activity</cell><cell>Timestamp</cell><cell>Temp</cell><cell>Pressure</cell><cell>Volume</cell><cell>Time</cell><cell>Speed</cell></row><row><cell>Record status</cell><cell>2021-01-04</cell><cell>300.75</cell><cell>221.25</cell><cell>34.2</cell><cell>35.125</cell><cell>100</cell></row><row><cell>Record status</cell><cell>2021-01-05</cell><cell>304.05</cell><cell>219.25</cell><cell>34.875</cell><cell>37</cell><cell>110</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_9"><head>Table 8</head><label>8</label><figDesc>Various activity types over the changes of the 2nd and 3rd rows in Table 1</figDesc><table><row><cell>Activity</cell><cell>Timestamp</cell><cell>pvalue</cell><cell>nvalue</cell></row><row><cell>Increase in Temp</cell><cell>2021-01-04 17:58:30</cell><cell>300.5</cell><cell>300.6</cell></row><row><cell>Decrease in Pressure</cell><cell>2021-01-04 17:58:30</cell><cell>222</cell><cell>221</cell></row><row><cell>Change in Loc</cell><cell>2021-01-04 17:58:30</cell><cell>L1</cell><cell>L25</cell></row><row><cell>Increase in Volume</cell><cell>2021-01-04 17:58:30</cell><cell>34.1</cell><cell>34.3</cell></row><row><cell>Increase in Time</cell><cell>2021-01-04 17:58:30</cell><cell>35</cell><cell>35.5</cell></row><row><cell>Decrease in speed</cell><cell>2021-01-04 17:58:30</cell><cell>100</cell><cell>90</cell></row></table></figure>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="1" xml:id="foot_0">Build event log https://github.com/DinaBayomie/GenerateEventLogFromSensorData</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="2" xml:id="foot_1">https://www.fluxicon.com/disco/</note>
			<note xmlns="http://www.tei-c.org/ns/1.0" place="foot" n="3" xml:id="foot_2">https://github.com/DinaBayomie/EL-RM</note>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>This work received funding from the Teaming.AI project in the European Union's Horizon 2020 research and innovation program under grant agreement No 95740. The work of J. Mendling was supported by the Einstein Foundation Berlin.</p></div>
			</div>

			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">Digital twin paradigm: A systematic literature review</title>
		<author>
			<persName><forename type="first">C</forename><surname>Semeraro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lezoche</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Panetto</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Dassisti</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Comput. Ind.</title>
		<imprint>
			<biblScope unit="volume">130</biblScope>
			<biblScope unit="page">103469</biblScope>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
		<title level="m">Process Mining - Data Science in Action</title>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
	<note>Second Edition</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Application fields and research gaps of process mining in manufacturing companies</title>
		<author>
			<persName><forename type="first">S</forename><surname>Dreher</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Reimann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gröger</surname></persName>
		</author>
		<idno type="DOI">10.18420/inf2020_55</idno>
	</analytic>
	<monogr>
		<title level="m">INFORMATIK 2020</title>
				<editor>
			<persName><forename type="first">R</forename><forename type="middle">H</forename><surname>Reussner</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">A</forename><surname>Koziolek</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">R</forename><surname>Heinrich</surname></persName>
		</editor>
		<meeting><address><addrLine>Bonn</addrLine></address></meeting>
		<imprint>
			<publisher>Gesellschaft für Informatik</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="621" to="634" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<persName><forename type="first">T</forename><surname>Brockhoff</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">U</forename><surname>Seran</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Van Der Aalst</surname></persName>
		</author>
		<title level="m">Modeling digital shadows in manufacturing by using process mining</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note>Modellierung 2022 Satellite Events</note>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Realizing a digital twin of an organization using action-oriented process mining</title>
		<author>
			<persName><forename type="first">G</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ICPM</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2021">2021</date>
			<biblScope unit="page" from="104" to="111" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Enabling process mining on sensor data from smart products</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">L</forename><surname>Van Eck</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Sidorova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M P</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE Tenth International Conference on Research Challenges in Information Science (RCIS)</title>
				<imprint>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">A method for analyzing time series data in process mining: Application and extension of decision point analysis</title>
		<author>
			<persName><forename type="first">R</forename><surname>Dunkl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Rinderle-Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Grossmann</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">Anton</forename><surname>Fröschl</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Information Systems Engineering in Complex Environments</title>
				<editor>
			<persName><forename type="first">S</forename><surname>Nurcan</surname></persName>
		</editor>
		<editor>
			<persName><forename type="first">E</forename><surname>Pimenidis</surname></persName>
		</editor>
		<meeting><address><addrLine>Cham</addrLine></address></meeting>
		<imprint>
			<publisher>Springer International Publishing</publisher>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="68" to="84" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">The ROAD from sensor data to process instances via interaction mining</title>
		<author>
			<persName><forename type="first">A</forename><surname>Senderovich</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Rogge-Solti</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mendling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Mandelbaum</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CAiSE</title>
		<title level="s">Lecture Notes in Computer Science</title>
		<imprint>
			<publisher>Springer</publisher>
			<biblScope unit="volume">9694</biblScope>
			<date type="published" when="2016">2016</date>
			<biblScope unit="page" from="257" to="273" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Data quality in ETL process: A preliminary study</title>
		<author>
			<persName><forename type="first">M</forename><surname>Souibgui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Atigui</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Zammali</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Cherfi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">B</forename><surname>Yahia</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Procedia Computer Science</title>
		<imprint>
			<biblScope unit="volume">159</biblScope>
			<biblScope unit="page" from="676" to="687" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Declarative support for sensor data cleaning</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">R</forename><surname>Jeffery</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Alonso</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Franklin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Hong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Widom</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">International Conference on Pervasive Computing</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2006">2006</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Time series data cleaning: A survey</title>
		<author>
			<persName><forename type="first">X</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<biblScope unit="volume">8</biblScope>
			<biblScope unit="page" from="1866" to="1881" />
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Case notion discovery and recommendation: automated event log building on databases</title>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">G L</forename><surname>De Murillas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">A</forename><surname>Reijers</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><forename type="middle">M</forename><surname>Van Der Aalst</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Knowledge and Information Systems</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Multi-perspective process analysis: Mining the association between control flow and data objects</title>
		<author>
			<persName><forename type="first">D</forename><surname>Bayomie</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Revoredo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Mendling</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">CAiSE</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="72" to="89" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Teaming.AI: enabling human-AI teaming intelligence in manufacturing</title>
		<author>
			<persName><forename type="first">T</forename><surname>Hoch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Heinzl</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Czech</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Khan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Waibel</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bachhofner</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Kiesling</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Moser</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">11th International Conference on Interoperability for Enterprise Systems and Applications</title>
				<imprint>
			<date type="published" when="2022">2022</date>
			<biblScope unit="volume">3214</biblScope>
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
