<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Human-Machine Interface</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Mladen Šverko</string-name>
          <email>mladen.sverko@fer.hr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tihana Galinac Grbac</string-name>
          <email>tihana.galinac@unipu.hr</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Faculty of Electrical Engineering and Computing, University of Zagreb</institution>
          ,
          <addr-line>Unska 3, Zagreb 10000</addr-line>
          ,
          <country country="HR">Croatia</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Juraj Dobrila University of Pula</institution>
          ,
          <addr-line>Zagrebačka 30, Pula 52100</addr-line>
          ,
          <country country="HR">Croatia</country>
        </aff>
      </contrib-group>
      <fpage>150</fpage>
      <lpage>161</lpage>
      <abstract>
        <p>This paper investigates the benefits of utilizing proven data mining techniques for data preprocessing, with the objective of enhancing data visualization in the context of industrial control systems (ICS). In particular, we address a human-machine interface (HMI) as the key component of supervisory control and data acquisition (SCADA) systems that provide crucial insight into a controlled process, thus posing a critical point of potential data misrepresentation. This is particularly emphasized in the ever-increasing data quantity generated on the factory floor under the Industry 4.0 paradigm. Furthermore, we discuss how this approach can impact data quality during the data collection phase, consequently influencing subsequent data mining stages. To illustrate this approach, we present an example related to the graphical representation of data in the HMI for the tension control process within the steel manufacturing industry. The novelty of this paper lies in exploring the application of data preprocessing techniques in the domain of data presentation at the data acquisition and immediate process control level, prior to data storage in databases, and forming of data lakes and data warehouses.</p>
      </abstract>
      <kwd-group>
        <kwd>Visualization</kwd>
        <kwd>data mining</kwd>
        <kwd>preprocessing</kwd>
        <kwd>human-machine interface</kwd>
        <kwd>HMI</kwd>
        <kwd>process control</kwd>
        <kwd>industrial automation</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The field of human-machine interface (HMI) design has undergone a gradual transformation
influenced by Industry 4.0, resulting in a shift towards incorporating disruptive and open
technologies. This evolution has introduced new design approaches such as human-centric,
model-driven, data-driven, task-driven, agile, and contextual design [1, 2]. These approaches are
tailored to the unique requirements of each project, client, and industry standards. Aiming to
generate more flexible, resilient, adaptive, and efficient solutions for industrial control systems
(ICS), they consequently increase the complexity in the development and commissioning phases.
Moreover, the constant reduction in time-to-market for new development tools and products
imposes greater demands on expertise for system integrators and solution providers.
Consequently, development teams tend to specialize in narrow fields and specific areas, potentially
leading to a loss of awareness by individual team members regarding the overall solution’s key
performance values and how individual project segments influence them.</p>
      <p>Although it may initially appear as a minor problem with minimal impact on the user
experience and HMI functionality, the negative consequences become more pronounced due
to the increased complexity of HMI, the volume of data involved, and the discrepancies that
arise between real-world data from the production floor and the system’s design throughout
the HMI lifecycle.</p>
      <p>To effectively tackle these challenges at the HMI level, it is crucial to employ methods that
leverage domain knowledge and ensure enhanced data quality and reliability. One approach to
achieve this is through the application of data preprocessing techniques in data mining that
incorporate rule-based mechanisms. By revising data elements such as real-time signals from
field devices, alarm and event definitions, equipment faults, and status messages, and applying
data preprocessing, the HMI can transform and present data in a meaningful way, improving its
usability and enhancing the user experience.</p>
      <p>The rest of the paper is structured as follows. In Section 2 we present the methodology, i.e.
the research selection strategy, and briefly elaborate on the significance of data context. Section
3 elaborates on SCADA-based HMI in industrial data analytics. Here, we particularly address
works emphasizing the significance of preprocessing and data quality at the lower levels of the
data processing pipeline and draw a parallel to the HMI in terms of potential benefit to data
presentation and visualization. In Section 4, we provide specific points of concern in the
domain of HMI data presentation affected by process data quality. Furthermore, we discuss how
domain knowledge makes the result of data preprocessing at the HMI level differ from that of
data preprocessing implemented on aggregated data further along the data processing pipeline.
In Section 5 we address related works, and finally, Section 6 provides a conclusion on the topic
in question.</p>
      <p>The main contribution of this paper is to introduce and explore the application of data mining,
specifically data preprocessing, in the domain of real-time process controls. This domain differs
from traditional data mining approaches used in industrial environments, as it concentrates on
levels considerably lower than those usually considered for data mining, which are typically
regarded solely as data sources. We additionally contribute by identifying critical points of
concern related to real-time data presentation and visualization at the HMI level, where these
issues can be effectively addressed through rule-based techniques in data
preprocessing, which fall under the umbrella of data mining. By doing so, our work
contributes to the broader field of human-machine interaction and context awareness.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Methodology</title>
      <p>
        There are multiple aspects and views addressing the impact of data quality on data presentation
and visualization. From the HMI standpoint, this can be addressed through areas such as graphic
design standards, a human-centered approach, heterogeneous data, and cognitive limitations, to
name a few. At the top level, in terms of data quality impact, the question is twofold: (1) how
data are presented, and (2) how data are perceived. Although the latter is a consequence of the
former, both aspects ultimately affect users’ situational awareness through their interpretation of the given
context.
      </p>
      <p>Acknowledging the above aspects, in this section we first briefly address the influence of the
demands posed on HMI operators on understanding the data context, extending to situational
awareness. In continuation, we focus on the research relevant to the core topic of data mining in
the industrial environment that defines preprocessing-related tasks of significance for enhancing
data quality for the purpose of data presentation at the HMI level.</p>
      <sec id="sec-2-1">
        <title>2.1. Significance of data context</title>
        <p>Recognizing the importance of big data impact on user interface and the overwhelming quantity
of data presented on the HMI screens, together with increasing process control complexity and
in combination with operators’ cognitive limitations, multiple studies are dedicated to the field
of situational awareness that is crucial on the plant floor.</p>
        <p>In this respect, Singh, Vajirkar, and Lee [3] recognize the increasing volume of data and
entities that evolve over time, concluding that, due to the dynamic environment, data must
also be interpreted accordingly, i.e. contextualized according to the current situation. In the
domain of manufacturing, a context-aware control system is proposed to help operators cope
with the challenges of monitoring several devices simultaneously and achieve timely reaction
[2]. At the HMI level, a survey covering context-aware inference, conducted by Salam et
al. [4], singled out automatic engagement inference as one of the tasks required to develop
a successful human-centered HMI. Although the majority of HMI applications employed across
industries do not have the means to implement such techniques, considering the close
relationship between engagement inference and understanding the context of observed entities,
this work additionally emphasizes the importance of data quality perceived in real time. In the
context of Industry 4.0, and extending to the Industry 5.0 human-centric approach, several
works that addressed human-machine interaction [5, 1] pinpointed adaptive human-machine
interfaces as crucial in context-aware technologies. All the above works share common ground
in closely linking context and users’ behavior and choices to data accuracy and adequate data
presentation techniques.</p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Data preprocessing in industrial environment</title>
        <p>Advanced analytics implementing machine learning and data mining is widely used across
industries, with the manufacturing industry being an early adopter. Numerous studies
focus on domains such as energy consumption, process optimization, product
quality, and predictive maintenance. In this respect, data preprocessing has been well-defined
and established in practice.</p>
        <p>However, if HMI is introduced as an element of interest, the works available in scientific
databases decrease rapidly in number and, more importantly, in relevance. The search string
“((HMI or human-machine interface) and preprocessing)”, applied to title, abstract,
and keywords, returned a total of 71 papers in the Scopus database. Although this may seem
a substantial number of works for such a narrow field, the encompassed research papers are
barely related to manufacturing or the industry sector at all. The HMI is mainly addressed
in terms of user interaction such as emotion, gesture, and speech recognition, and/or in the
domain of medicine and bio-medicine.</p>
        <p>Expanding the search to SCADA systems (which inherently encompass the broader portion
of ICS that may implicitly relate to HMI) resulted in 473 research papers, but this did not lead
to an increase in relevance, as it shifted the focus to the levels above the HMI, i.e. to the
standard fields of data mining implementation such as predictive maintenance, fault diagnostics,
product quality, anomaly detection, and cybersecurity, which perform data preprocessing on
the data already available in data lakes and warehouses.</p>
        <p>
          Narrowing down the filtered research papers for relevance, we considered those papers that
address data preprocessing techniques and data mining in ways that are applicable to the topic
at hand and that meet the following criteria: (1) Applicable at the HMI level, without requiring
additional components within the ICS, except for those serving as data sources. (2) Applicable
in real time with respect to execution of the HMI runtime layer. (3) Potentially implemented as
rule-based. (4) Implementable throughout the entire HMI lifecycle, i.e. scalable according to
potential ICS expansion and SCADA system modifications (expanding field devices, additional
signals, changes in tag naming conventions, graphical representation).
        </p>
        <p>With such an approach, we have extracted a total of 15 works that we have found significant
and directly contributing to this research.</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. SCADA-based HMI in industrial Data Analytics</title>
      <p>This section elaborates on the HMI role and position within the Data mining process
implemented in the industrial domain. We have addressed the relevant papers that refer to data
mining implementation and are related to the production floor data preprocessing, thus
providing grounds for targeted implementation of data preprocessing at the HMI level.</p>
      <p>Fig.1 depicts the first three layers of the standard automation pyramid, based on the ISA-95
model of functional hierarchy: (L0) factory floor, i.e. field devices, (L1) process layer, and (L2) data
acquisition layer, i.e. SCADA. Although the interlayer communication is significantly disrupted
by the introduction of IIoT and smart field devices, in a major part of the manufacturing facilities
data flow from the production floor (runtime data layer in the figure) reaches the SCADA layer
through various PLC devices in charge of the direct process control (data acquisition layer in
the figure). An exception to this is a set of unstructured data (static data in the figure) whose
context is addressed across the layers but is not exchanged in real time. In this respect, raw
data generated on the factory floor take various forms and are stored in databases across the Data
visualization &amp; Storage layer, i.e. the SCADA system, which is positioned as a major source of process
data for the Data aggregation and/or data preprocessing layer. In this context, HMI stations
are merely another source of production data recorded in real time, with the addition of
limited historical data, such as alarms, events, reports, and HMI system logs, that are stored at
the HMI level and potentially forwarded to the data historian as well.</p>
      <p>[Figure 1. Layer labels recovered from the figure, top to bottom: Advanced analytics and data
presentation; Data aggregation and/or preprocessing; Data visualization &amp; Storage; Data acquisition;
Runtime data (primary sources). An additional label reads: Preprocessing.]</p>
      <sec id="sec-3-1">
        <title>3.1. User’s perspective</title>
        <p>From the user’s point of view, this is sufficient for the data scientist and business intelligence roles
dealing with advanced analytics performed on aggregated data in data lakes and warehouses.
However, operators who are in charge of process control benefit less from such data processing
pipelines. The main task of the operator is to supervise and control the ongoing industrial
process, and the role of the HMI is to provide reliable real-time data that enables the operator to be
aware of all the aspects of the controlled process. If data preprocessing tasks were implemented
on the raw data as early as possible, i.e. at the lowest three levels in the picture, then the HMI
could benefit from increased data quality by providing more accurate data visualization, thus
enhancing users’ insight into the controlled process.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Data preprocessing tasks/steps</title>
        <p>Several of the selected papers [6, 7, 8] provide a methodology of data preparation as part of their
work with a focus on data cleaning, integration, transformation, and data reduction. These four
tasks are identified as particularly important for acquired industrial data by [9]. Additionally,
Battas et al. [6], proposing a data preprocessing method for an industrial prediction process,
elaborated on each step and emphasized the importance of understanding how the data
are collected, as well as their meaning, in order to be able to use them correctly. In this sense, the
authors added data understanding as the first task prior to entering the above-defined data
preprocessing tasks. The ultimate goal of the tasks defined by the works addressed above
is to prepare the data in a manner that maximizes the effectiveness and efficiency of
subsequent data mining analysis in uncovering meaningful patterns, relationships, and insights
from the raw data.</p>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Time component and near-source data preprocessing</title>
        <p>Discussing the importance of raw data collected from the production floor, Gao et al. [10]
address digital twins and position data preprocessing in the layer above the digital twin,
i.e. when the digital representation of the process is already formed based on the data prior to
preprocessing. From the perspective of our research, a parallel can be drawn between the role of
digital twins and HMI in terms of real-time insight into the ongoing process. Although, in this
case, the discussed data preprocessing is still performed above the HMI level, it is significant
for our research because the authors also aim to provide timely feedback to users in order to
provide visual monitoring of the process. This emphasizes the importance of a time component
and feedback to the users in a way that is not addressed in usual practice, i.e. in implementing
advanced analytics with data concentrated in data lakes and warehouses where HMI is only
one of the data sources.</p>
        <p>Similarly, research discussing data preprocessing with a focus on automated guided vehicles
(AGV) [11] proposes a methodology for the aggregation of raw data received from multiple
sources that exchange data based on either a time-driven or an event-driven paradigm,
thus producing data streams that are difficult for data mining tools to analyze directly [12].</p>
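        <p>As an illustration of the aggregation problem described above, the following minimal Python sketch aligns an event-driven signal with the timestamps of a time-driven one by holding the last received event value at each sample time. It is not the methodology of [11]; the function name <monospace>align_events_to_samples</monospace>, the millisecond timestamps, and the status labels are hypothetical and chosen only for illustration.</p>
        <preformat><![CDATA[
```python
from bisect import bisect_right


def align_events_to_samples(sample_times, events, initial=None):
    """Hold the last event value at each periodic sample time.

    sample_times: timestamps of the time-driven signal.
    events: (timestamp, value) pairs of the event-driven signal, sorted by time.
    initial: value reported before the first event arrives.
    """
    event_times = [t for t, _ in events]
    aligned = []
    for t in sample_times:
        count = bisect_right(event_times, t)  # events at or before time t
        aligned.append(events[count - 1][1] if count else initial)
    return aligned


# Time-driven signal sampled every 100 ms (assumed period).
sample_times = [0, 100, 200, 300, 400]
# Event-driven status changes arriving at irregular times (assumed values).
status_events = [(50, "RUN"), (250, "FAULT")]

print(align_events_to_samples(sample_times, status_events, initial="UNKNOWN"))
# ['UNKNOWN', 'RUN', 'RUN', 'FAULT', 'FAULT']
```
]]></preformat>
        <p>Holding the last value is only one possible alignment rule; interpolation or windowed aggregation may suit other signal types better, and the choice depends on domain knowledge about each source.</p>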
        <p>Addressing practical undertakings, Wrembel [7], in a comparison of research and
industrial projects, discusses a task orchestration (reordering) technique called push down, which
implies moving the critical Extract-Transform-Load (ETL) tasks towards the beginning of data
preprocessing. This is primarily oriented to task complexity orchestration aiming to achieve
scalability and increase the performance of the overall data mining process. However, it does
emphasize the significance of the time component in data preprocessing relevant to real-time
process control.</p>
        <p>Bearing in mind the implication of task orchestration on the data mining pipeline, i.e.
compressing the hierarchical structure by lowering the data presentation layer in Fig.1, an
appropriate structure needs to be defined for a data preprocessing framework at the HMI level. Based on
the comprehensive hierarchical structure of data mining defined by Yuan [13], Fig.2 depicts the
hierarchical structure focusing on HMI as a data presentation layer, encompassing five crucial
layers: resource access layer, resource layer, data layer, data preprocessing layer, and user
application layer.</p>
        <p>The purpose of such a structure is to guarantee the overall process of data mining in terms
of resource access, ordering, data sources, implemented tasks, and data presentation, i.e. user
application layer in the figure.</p>
        <p>[Figure 2. Labels recovered from the figure, by layer, top to bottom. User application layer:
Human Machine Interface (HMI) / operator panels / mobile devices, with alarm/event summary,
historical logging, display pages, trend charts, graphic designer, reports, and visualization tools.
Data preprocessing layer: data understanding, data integration, data transformation, data cleaning,
data reduction, feature selection, and data formatting. Data layer, resource layer, and resource
access layer: maintenance, operation, security and audit logs; functional spec.; equipment config files;
PLC; IIoT smart devices; vendor web sources; field devices (sensors/actuators); high resolution
acquisition systems; alarms and events definitions; electrical, hydraulic, and water cooling drawings;
security policies; fast data acquisition; virtual environment; fieldbus protocols; remote access;
IIoT protocols; cybersecurity mechanisms.]</p>
      </sec>
      <sec id="sec-3-4">
        <title>3.4. Advanced analytic tasks applicable to HMI</title>
        <p>
          In consideration of all the above research papers addressing data preprocessing in data mining
significant for the topic in question, we have singled out the following AI analytic tasks: (1)
data gathering, (2) data presentation, (3) data analysis, (4) irregularity detection, (5) reporting,
and (6) user interaction. These analytic tasks have an impact on the HMI if adequate task
orchestration is applied following Wrembel [7], and are in line with research papers dealing with big data
in the industrial sector and process mining [8, 14].
        </p>
        <p>Table 1 provides a comparison of targeted segments by the above tasks on the AI-powered
analytics vs. HMI, thus indicating HMI segments that can potentially benefit from implementing
these tasks directly at the process control domain, i.e. focusing on the HMI as the critical
component of user interaction.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Discussion</title>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr>
              <th>Task</th>
              <th>Human-machine interface (HMI)</th>
              <th>AI-powered analytics</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>Data gathering</td>
              <td>Field devices, PLC, operator inputs, configuration files, L2 automation</td>
              <td>Field devices, equipment logs, real-time and historical databases</td>
            </tr>
            <tr>
              <td>Data presentation</td>
              <td>Representation of the controlled process and equipment status in real-time, graphical and numerical, limited historical data.</td>
              <td>Analysis presented as insights, predictions, recommendations, supporting decision-making.</td>
            </tr>
            <tr>
              <td>Data analysis</td>
              <td>Visual representation, operators monitor and analyze real-time process variables, equipment status, and trends.</td>
              <td>Advanced data analysis, uncover insights and patterns in the data.</td>
            </tr>
            <tr>
              <td>Irregularity detection</td>
              <td>Alarms and alerts based on predefined conditions enabling immediate action.</td>
              <td>Detection of anomalies, deviations, or patterns indicative of faults or abnormal behavior.</td>
            </tr>
            <tr>
              <td>Reporting</td>
              <td>Customized reports on system performance, alarms/events, production data, and relevant metrics</td>
              <td>Automated reports summarizing analysis results, predictions, or performance metrics based on historical or real-time data.</td>
            </tr>
            <tr>
              <td>User interaction</td>
              <td>Direct interaction between operators and the control system, enabling process control and monitoring based on domain knowledge.</td>
              <td>Automated decision support by analyzing data, identifying patterns, and offering recommendations to operators or control systems.</td>
            </tr>
          </tbody>
        </table>
      </table-wrap>
      <sec id="sec-4-2">
        <p>In this section, we identify the points of concern in terms of HMI data presentation and
visualization that can be addressed by data preprocessing, leveraging the findings in the
aforementioned research papers and implementing the HMI-adapted data mining hierarchical
structure depicted in Fig.2.</p>
        <p>
          In regard to data preprocessing tasks and targeted HMI segments addressed in Fig.1, we have
identified the following HMI data presentation points of concern that can benefit from the
implementation of data preprocessing in compliance with the hierarchical structure shown in
Fig.2: (1) Synoptic view variable refresh cycles – when the amount of data increases, it
is common practice to apply different refresh cycles for the visualization, which, if implemented
statically, may affect the time stamp in further data analysis. (2) Signal value range for graphic
presentation – if the value range defined by the field equipment is applied to the graphic
element, and the defined range is much larger than the range of values typically encountered
in the system, it can lead to insufficient resolution or precision on the display. (3) Poor IoT
input data quality – addressable by data preprocessing techniques implemented in the field,
such as logging practices and sensor placement [15]. (4) Missing data – although data streams
that include missing data may still be of great significance to the data mining process, if they do
not correlate with the key indicators, and do not impact output variables of interest for the
dedicated graphic display, they should be excluded from visualization. (5) Outliers – some
rarely appearing extreme values are crucial for process monitoring, and must be distinguished
from random or cyclical disturbances in the measurement. (6) Seemingly redundant data – in
terms of data redundancy, the specific situation with HMI data presentation is that these data
are often presented as valuable confirmation of information, and should not be confused with data
from multiple sources. (7) Noise – some of the data structures and techniques typically used to
represent device status, such as bit masks applied to an integer value showing variable frequency
drive status, can be interpreted as inconsistent or erroneous variation in data if not properly
defined at the HMI level.
        </p>
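        <p>Point (2) can be illustrated with a rule that derives the display range of a graphic element from recently observed values instead of the range declared by the field equipment. The Python sketch below is one possible rule under stated assumptions, not a method taken from the cited works: the function name <monospace>display_range</monospace>, the 10% margin, and the sample values are hypothetical.</p>
        <preformat><![CDATA[
```python
def display_range(recent_values, eng_min, eng_max, margin=0.1):
    """Return (low, high) display limits derived from recently observed values.

    eng_min and eng_max are the limits declared by the field equipment; the
    result never exceeds them. margin widens the observed span on both sides.
    """
    if not recent_values:
        return eng_min, eng_max  # nothing observed yet: fall back to equipment range
    low, high = min(recent_values), max(recent_values)
    span = high - low
    # For a constant signal, pad relative to its magnitude instead of a zero span.
    pad = span * margin if span > 0 else max(abs(high), 1.0) * margin
    return max(eng_min, low - pad), min(eng_max, high + pad)


# Equipment range is 0..1000, but the process normally runs near 50 (assumed values).
low, high = display_range([48.0, 52.0, 50.5, 49.2], eng_min=0.0, eng_max=1000.0)
print(f"display range: {low:.1f} .. {high:.1f}")  # display range: 47.6 .. 52.4
```
]]></preformat>
        <p>A rule of this kind interacts with point (5): rare but valid extreme values would fall outside a range derived only from typical values, so such signals need an additional rule based on domain knowledge.</p>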
        <p>Fig.3 shows signals extracted from a dataset containing real-time data of the line tension
control of a continuous annealing process line, demonstrating some of the above points. For
this purpose alone, the significance of these signals to the underlying industrial processes is
irrelevant.</p>
        <p>
          We can regard them as generic analog process values retrieved from the PLC or field devices.
In this specific example, they represent (1) strip line remaining length at cut point, (2,3) variable
frequency drive status, (4,5) strip line speed over a motorized roll, (6) tension leveler elongation,
and (7,8) line tension on exit section. It is important to note that the depicted signals present
regular values in relation to the ongoing process.
        </p>
        <p>Regarding the above-defined points of concern, peak values of signals 1, 2, and 3 can easily
be interpreted as outliers, although in this case, they provide valid information. In this respect,
signals 2 and 3 may additionally be interpreted as noise due to their seemingly random peak values, as
they are formatted as bit masks, i.e. each bit of the 16-bit word independently signals a different
status, thus contributing to a seemingly random resulting value. Signals 4 and 5 could easily be
interpreted as redundant, although they represent opposite directions in terms of device control, i.e.
one is a set value sent to the device, and the other is device feedback. Additionally, signals 7
and 8 emphasize the scaling issue since both show the same process value (line tension) but in
different units (mm and ton), which results in poor graphic visualization on the HMI screen.</p>
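        <p>The bit-mask case of signals 2 and 3 can be sketched in Python as follows. The bit layout in <monospace>STATUS_BITS</monospace> is hypothetical, since the source does not give the actual status word definition of the drive; the sketch only shows how a rule defined at the HMI level turns an apparently erratic integer into named status flags.</p>
        <preformat><![CDATA[
```python
# Hypothetical bit layout; a real drive documents its own status word.
STATUS_BITS = {
    0: "ready",
    1: "running",
    2: "fault",
    3: "warning",
    7: "at_setpoint",
}


def decode_status(word):
    """Return the names of the flags set in a 16-bit status word."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("status word must fit in 16 bits")
    # Test each known bit independently; bits without a definition are ignored.
    return [name for bit, name in sorted(STATUS_BITS.items()) if (word >> bit) & 1]


# Plotted as integers, 3 -> 131 -> 7 looks like an erratic signal;
# decoded, it is a sequence of well-defined state changes.
for word in (3, 131, 7):
    print(word, decode_status(word))
# 3 ['ready', 'running']
# 131 ['ready', 'running', 'at_setpoint']
# 7 ['ready', 'running', 'fault']
```
]]></preformat>
        <p>With such a rule in place, the raw integer value does not need to be trended at all; the HMI can present each flag as a discrete indicator.</p>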
        <p>These examples additionally emphasize the importance of understanding data prior to
engaging in data preprocessing at the HMI level, but also the potential to enhance the standard data
mining approach by defining rule-based data preprocessing at the lower levels, before data
aggregation in data lakes and warehouses.</p>
        <p>Some of the above points may seem trivial to resolve. However, for the reasons stated earlier
in the paper that affect HMI development, these points cannot be addressed until the SCADA
system is in the production phase. Only then are the above-identified points fully manifested on
the HMI screen, requiring further analysis to locate the exact problems and provide adequate
solutions that can be implemented in the form of a rule-based approach.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Relevant works</title>
      <p>A number of research papers addressed in previous sections provide insights valuable
to the topic in question. Although these insights can be transferred to alternative contexts, and
some of them affect HMI data presentation, none of them specifically address data preprocessing
at the HMI level. Approaching from a different angle, we applied a search string against
the Scopus database that combines HMI, industrial context, and the aforementioned data
preprocessing tasks significant for the topic at hand.</p>
      <p>The result, a total of eight papers over a period of ten years (2013 – 2023), shows
marginal interest in the scientific community compared to the results of the search string
previously applied in Section 2.</p>
      <p>From the method standpoint, these works are predominantly focused on artificial intelligence,
i.e. neural networks, support vector machines, and hidden Markov models. The fields of interest
are in line with works addressed in Section 2, i.e. HMI is mainly addressed in terms of user
interaction such as emotion, gesture, and speech recognition in advancing toward next-generation
non-invasive human-machine interfaces [16].</p>
      <p>Although these works do not implement methods that can be integrated at the HMI level
without considering additional computational power, i.e. extending to additional ICS
components, they do meet the remaining three criteria defined in Section 2. In this respect,
Wang et al., dealing with feature extraction and the low accuracy of multi-gesture recognition
in real-time human-computer interaction, implemented a convolutional neural network and
achieved high accuracy with a delay of less than 300 ms.</p>
      <p>Similarly, Ji et al. [17] implemented a hidden Markov model dealing with feature extraction
in the domain of a mechanical acoustic signal assisted translational model for industrial HMI.
Verified on a typical industrial HMI application, the proposed model achieved a 14.3% performance
improvement compared with traditional methods.</p>
      <p>This suggests that similar data preprocessing techniques can be implemented at the HMI level
without significantly affecting the real-time performance of the HMI runtime layer.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>The adoption of data mining techniques for data preprocessing at the SCADA-based HMI level
within the Industry 4.0 paradigm has the potential to significantly enhance data visualization
and improve data quality in process control within the manufacturing industry. By addressing
concerns such as variable refresh cycles, signal value range, poor IoT data quality, missing data,
outliers, seemingly redundant data, and noise, operators can make better-informed decisions
based on accurate and reliable information. In this respect, the application of data mining
techniques at the HMI level, prior to data storage in databases, can optimize data presentation
and visualization, potentially leading to increased operational efficiency. Furthermore, the
incorporation of domain knowledge in data preprocessing at the HMI level can yield distinct
results compared to preprocessing at higher levels of the data processing pipeline. Overall,
leveraging data preprocessing techniques in the HMI domain demonstrates promising potential
for driving data-driven decision-making and process optimization in the manufacturing industry.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>This work was supported by the Croatian Science Foundation under the project
HRZZ-IP-201904-4216.</p>
      <p>[6] I. Battas, R. Oulhiq, H. Behja, L. Deshayes, A proposed data preprocessing method for
an industrial prediction process, in: 2020 6th IEEE Congress on Information Science and
Technology (CiSt), 2020, pp. 98–103. doi:10.1109/CiSt49399.2021.9357269.</p>
      <p>[7] R. Wrembel, Data integration, cleaning, and deduplication: Research versus industrial
projects, in: E. Pardede, P. Delir Haghighi, I. Khalil, G. Kotsis (Eds.), Information Integration
and Web Intelligence, Springer Nature Switzerland, Cham, 2022, pp. 3–17.
doi:10.1007/978-3-031-21047-1_1.</p>
      <p>[8] J. Zhu, Z. Ge, Z. Song, F. Gao, Review and big data perspectives on robust data mining
approaches for industrial process modeling with outliers and missing data, Annual Reviews
in Control 46 (2018) 107–133. URL: https://www.sciencedirect.com/science/article/pii/S1367578818301056.
doi:10.1016/j.arcontrol.2018.09.003.</p>
      <p>[9] A. Kochański, Data preparation, Computer Methods in Materials Science 10 (2010) 25–29.</p>
      <p>[10] X. Gao, P. Liu, Q. Zhang, D. Gao, X. Huang, Analysis and application of manufacturing
data driven by digital twins, Journal of Physics: Conference Series 1983 (2021) 012104.
doi:10.1088/1742-6596/1983/1/012104.</p>
      <p>[11] R. Cupek, M. Drewniak, T. Steclik, Data preprocessing, aggregation and clustering for agile
manufacturing based on automated guided vehicles, in: M. Paszynski, D. Kranzlmüller,
V. V. Krzhizhanovskaya, J. J. Dongarra, P. M. Sloot (Eds.), Computational Science – ICCS
2021, Springer International Publishing, Cham, 2021, pp. 458–470.</p>
      <p>[12] X. Fei, N. Shah, N. Verba, K.-M. Chao, V. Sanchez-Anguix, J. Lewandowski, A. James,
Z. Usman, CPS data streams analytics based on machine learning for cloud and fog
computing: A survey, Future Generation Computer Systems 90 (2019) 435–450.
doi:10.1016/j.future.2018.06.042.</p>
      <p>[13] M. Yuan, K. Deng, W. Chaovalitwongse, H. Yu, Research on technologies and application
of data mining for cloud manufacturing resource services, The International Journal of
Advanced Manufacturing Technology 99 (2018) 1061–1075. URL: https://doi.org/10.1007/s00170-016-9661-6.
doi:10.1007/s00170-016-9661-6.</p>
      <p>[14] D. Stefanovic, D. Dakic, B. Stevanov, T. Lolic, Process mining in manufacturing: Goals,
techniques and applications, in: B. Lalic, V. Majstorovic, U. Marjanovic, G. von Cieminski,
D. Romero (Eds.), Advances in Production Management Systems. The Path to Digital
Transformation and Innovation of Production Management Systems, Springer International
Publishing, Cham, 2020, pp. 54–62.</p>
      <p>[15] Y. Bertrand, R. Van Belle, J. De Weerdt, E. Serral, Defining data quality issues in process
mining with IoT data, in: M. Montali, A. Senderovich, M. Weidlich (Eds.), Process Mining
Workshops, Springer Nature Switzerland, Cham, 2023, pp. 422–434.</p>
      <p>[16] S. Wang, L. Huang, D. Jiang, Y. Sun, G. Jiang, J. Li, C. Zou, H. Fan, Y. Xie, H. Xiong, B. Chen,
Improved multi-stream convolutional block attention module for sEMG-based gesture
recognition, Frontiers in Bioengineering and Biotechnology 10 (2022).
doi:10.3389/fbioe.2022.909023.</p>
      <p>[17] Z. Ji, C. Chen, J. He, X. Guan, Mechanical acoustic signal assisted translational model
for industrial human-machine interaction, in: 2019 IEEE Global Conference on Signal
and Information Processing (GlobalSIP), 2019, pp. 1–5.
doi:10.1109/GlobalSIP45357.2019.8969468.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>P.</given-names>
            <surname>Leal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. N.</given-names>
            <surname>Madeira</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Romão</surname>
          </string-name>
          ,
          <article-title>Model-driven framework for human machine interaction design in industry 4.0</article-title>
          , in: D. Lamas, F. Loizides, L. Nacke, H. Petrie, M. Winckler, P. Zaphiris (Eds.),
          <source>Human-Computer Interaction - INTERACT 2019</source>
          , Springer International Publishing, Cham,
          <year>2019</year>
          , pp.
          <fpage>644</fpage>
          -
          <lpage>648</lpage>
          . doi:10.1007/978-3-030-29390-1_54.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nassehi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ni</surname>
          </string-name>
          ,
          <article-title>Context-aware manufacturing system design using machine learning</article-title>
          ,
          <source>Journal of Manufacturing Systems</source>
          <volume>65</volume>
          (
          <year>2022</year>
          )
          <fpage>59</fpage>
          -
          <lpage>69</lpage>
          . doi:10.1016/j.jmsy.2022.08.012.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Vajirkar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lee</surname>
          </string-name>
          ,
          <article-title>Context-based data mining using ontologies</article-title>
          , in: I.-Y. Song, S. W. Liddle, T.-W. Ling, P. Scheuermann (Eds.),
          <source>Conceptual Modeling - ER 2003</source>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2003</year>
          , pp.
          <fpage>405</fpage>
          -
          <lpage>418</lpage>
          . doi:10.1007/978-3-540-39648-2_32.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>H.</given-names>
            <surname>Salam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Celiktutan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Gunes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chetouani</surname>
          </string-name>
          ,
          <article-title>Automatic context-aware inference of engagement in HMI: A survey</article-title>
          ,
          <source>IEEE Transactions on Affective Computing</source>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>20</lpage>
          . doi:10.1109/TAFFC.2023.3278707.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Salima</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>M'Hammed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Messaadia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Benslimane</surname>
          </string-name>
          ,
          <article-title>Context aware human machine interface for decision support</article-title>
          , in:
          <source>2023 International Conference On Cyber Management And Engineering (CyMaEn)</source>
          ,
          <year>2023</year>
          , pp.
          <fpage>143</fpage>
          -
          <lpage>147</lpage>
          . doi:10.1109/CyMaEn57228.2023.10051078.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>