=Paper=
{{Paper
|id=Vol-3762/598
|storemode=property
|title=Data & AI for Industrial Application
|pdfUrl=https://ceur-ws.org/Vol-3762/598.pdf
|volume=Vol-3762
|authors=Antimo Angelino
|dblpUrl=https://dblp.org/rec/conf/ital-ia/Angelino24
}}
==Data & AI for Industrial Application==
Data & AI for Industrial Application
Antimo Angelino
MBDA Italia SpA, via Carlo Calosi, Bacoli, 80070, Italy
Abstract
The use of Artificial Intelligence in industry can recover efficiency in industrial
processes (for example by reducing scrap and rework rates and improving throughput), and this can bring
competitive advantages. Nevertheless, to deploy artificial intelligence projects correctly,
connectivity and quality data are needed. Both are enabling factors for AI projects, and
industries must put in place processes to achieve them before starting the AI journey.
Keywords
Artificial Intelligence, Data Analytics, Data Quality, Data Strategy, Industrial Network
1. Where AI comes from
Artificial Intelligence (AI) is something wider and older than the hype of recent years. Its birth can be dated back to 1943, when McCulloch and Pitts introduced the concept of artificial neurons for the first time. The concept was then taken up by Rosenblatt, who in 1958 presented the first artificial neural network: the perceptron. In between, in 1950, Alan Turing (the father of computer science) introduced the concept of an intelligent machine.
In its life AI has undergone various ups and downs. Its alternating fortunes have always been linked to successes in real use cases or to the emergence of favorable technological conditions, while periods of abandonment were caused by the failure of projects that were too ambitious or not yet technologically mature. Since the mid-2000s there has been a rediscovery of AI thanks to the rise of a branch that is well suited to solving predictive problems: machine learning (ML).
Figure 1 shows the relationship between AI, ML, DL (deep learning) and GEN_AI (generative AI), which we could briefly define as:
• AI: any technique that makes computers capable of imitating human behavior; among these, the most emblematic are:
o expert systems: capable of simulating deductive logical reasoning;
o fuzzy logic: capable of introducing uncertainty management into logical reasoning;
o genetic algorithms: which, by imitating natural selection, are able to search for the optimal solution to a given problem;
o artificial neural networks: systems that simulate the neural networks of our brain and are able to learn from data and extrapolate behaviors and information;
• ML: specific AI techniques that make computers capable of learning;
• DL: a subset of ML techniques specifically based on deep (or multilayer) neural networks, suitable for solving computer vision, image recognition and signal processing problems;
• GEN_AI: a subset of DL that uses NLP (natural language processing) techniques to process text and predict sentences starting from an input (prompt).
Ital-IA 2024: 4th National Conference on Artificial Intelligence, organized by CINI, May 29-30, 2024, Naples, Italy
∗ Corresponding author: antimo.angelino@mbda.it (A. Angelino)
© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (ceur-ws.org), ISSN 1613-0073
Figure 1: relationship between AI and its major frameworks
2. What "Industrial Domain" means
With the term "industrial domain" we mean all the processes involved in the manufacturing, maintenance and quality activities of industries, where industry starts when raw materials arrive and finishes when the manufactured item is delivered, thus excluding supply chain and customer support.
The data involved in the industrial domain are generated by manufacturing and test machines and by IIoT (industrial internet of things) sensors; specific data contained in the MRP (manufacturing resource planning), MES (manufacturing execution system) and QMS (quality management system) are also involved.
Figure 2: what is Industrial Domain
Not only the sources but also the types and formats of these data are very different. Data coming from machines and sensors (normally located on the shop floor) are raw data that do not meet the minimum requirements for trustworthiness and quality, so they need to be pre-processed before being used. Data coming from the main information systems are normally trusted data, because processes are in place to control and validate them before they are loaded into the systems.
Moreover, collecting the data is very complex because of the different nature of their sources. The main information systems are normally available over the company enterprise network, while machines and sensors are often isolated and, when connected, are attached to a special network (normally called edge network). Because of cyber security risks, the two types of network cannot be connected directly. Therefore, to collect all the data it is necessary to put in place an industrial network according to the ISA/IEC 62443 standard (also known as ISA-99), which addresses the security of industrial automation and control systems and complements ISA-95, the earlier standard for integrating enterprise and control systems.
Figure 3: industrial network schema
3. AI in the Industrial Domain
However, the rise of ML was not the only factor that triggered the rediscovery of AI in the new millennium; rather, there was a mix of accelerating factors such as:
1. The exponential increase in data availability, thanks to the internet, connectivity systems (both wired and mobile) and intelligent sensors (commonly called IoT = Internet of Things);
2. The possibility of collecting data in real time, thus providing an instant representation of reality;
3. The constant increase in computing power combined with the miniaturization of devices available at ever-decreasing costs (at least until the pre-Covid period).
All this has placed data at the center of decision-making strategies, effectively evolving decision-making models from knowledge-based models (often empirical and built with experience) to data-driven models. A first effect induced by this epochal change is that while knowledge-based models were (and still are) used mainly for descriptive analysis (i.e. describing an event that occurred), data-driven models can be used for predictive analysis (i.e. predicting an event before it happens).
Therefore, if data constitute the fuel of new decision-making models, data analytics and advanced data analytics techniques constitute their engine. In particular, while data analytics techniques are essentially based on the most common statistical formulas and are used to describe an event and diagnose its cause, the advanced ones are based on AI (mainly ML) algorithms and are used to predict an event and prescribe actions to influence it. Figure 4 (Gartner 2012) shows the evolution of data analytics in four phases: descriptive, diagnostic, predictive and prescriptive, depicted in a Cartesian plane whose axes represent the value (returned by the analysis) and the difficulty (of the analysis itself).
Figure 4: Gartner data analysis phases
Figure 5 illustrates how the adoption of data-driven decision-making models combined with advanced analysis techniques enables another epochal change in business processes: the anticipation of corrective actions. This means moving from a reactive approach, i.e. chasing the problem after the event, to a predictive one, i.e. anticipating the problem before the event happens.
Figure 5: timeline of data analysis
Predictive and prescriptive analyses have been used for years in various sectors: from finance to risk management, from marketing to communication and even politics, always with the aim of predicting events and (trying to) influence them through targeted actions.
In the industrial domain, the principal application is the prediction of failures, both on the product and on the machinery, and the prescription of actions aimed at ensuring that they do not occur. These applications have a direct impact on the efficiency and effectiveness of industrial processes, and consequently on the competitiveness and cost reduction of companies. Various use cases have been implemented in industries, which have given rise to different methodologies:
• failure prediction (prediction of product failures),
• predictive quality (prediction of product quality),
• predictive maintenance (prediction of machinery failures).
The adoption of predictive systems in the industrial world, although pushed by managers who see the potential benefits, often encounters the reluctance of technicians who, accustomed to analyses based on empirical experience, have difficulty accepting the predictions made by a heuristic model using ML algorithms. Therefore, a fundamental step for a predictive model to be accepted (and consequently then correctly used) is to demonstrate its reliability, i.e. that its predictions and prescriptions are true. To do this the model validation phase is fundamental; it takes place after the training and testing phases. This phase is carried out using real data, i.e. historical data recorded in the company and referring to real events: the predictive model, once trained and tested, is asked to provide a prediction starting from known data; the model's prediction is acquired and compared with what really happened.
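The comparison between the model's predictions and what really happened can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the historical values, the function names `rmse`, `confusion_matrix` and `accuracy`, and the "ok"/"failure" labels are hypothetical and do not come from a real plant. The sketch uses the two validation metrics discussed in this section, a root mean squared error for predicted values and a confusion matrix for predicted classes.

```python
import math
from collections import Counter


def rmse(actual, predicted):
    """Root mean squared error between recorded and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def confusion_matrix(actual, predicted, positive):
    """Count true/false positives and negatives for one class of interest."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


def accuracy(matrix):
    """Share of correct classifications over all classified events."""
    return (matrix["TP"] + matrix["TN"]) / sum(matrix.values())


# Hypothetical historical data: values measured in the company ...
measured_temperature = [70.0, 72.0, 68.0, 71.0]
# ... and the values the trained model predicted for the same events.
predicted_temperature = [71.0, 72.0, 66.0, 71.0]

# Hypothetical recorded outcomes and predicted outcomes per machine cycle.
real_outcome = ["ok", "failure", "ok", "ok", "failure", "ok"]
predicted_outcome = ["ok", "failure", "ok", "failure", "failure", "ok"]

error = rmse(measured_temperature, predicted_temperature)
matrix = confusion_matrix(real_outcome, predicted_outcome, positive="failure")
print(f"RMSE: {error:.3f}")
print(f"Confusion matrix: {matrix}, accuracy: {accuracy(matrix):.2f}")
```

With these made-up values the classifier reaches an accuracy of about 0.83, so it would not yet be accepted against a 95% threshold; this is the kind of evidence the validation phase is meant to surface.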
To give some more elements, it is necessary to divide the predictive models according to their application: regression models and classification models. The former predict values, while the latter predict the class of an event. A classic explanatory example is weather forecasting: a regression model predicts the temperature value, a classification model whether the day will be sunny, cloudy or rainy. The most commonly used metrics for validating these models are:
1. RMSE (Root Mean Squared Error) for regression models
2. Confusion Matrix for classification models
The first metric consists of calculating the square root of the mean squared error between the values predicted by the model and the actual values recorded in the company. An acceptable error, expressed relative to the actual values, is less than 3%. The confusion matrix, on the other hand, is a slightly more complex metric, based on the combination of the correct and incorrect classifications made by the predictive model compared to the real ones. Table 1 illustrates this in a very intuitive way. The parameters derived from the confusion matrix (such as accuracy) should have a value greater than 95%.
Table 1: Confusion Matrix
4. Data Strategy
We can use a SWOT matrix to summarize the use of AI in the industrial domain.
Table 2: SWOT matrix for AI adoption
It is thus clear that managing data is the most critical part of deploying an AI project. In particular, in the industrial domain there are completely different data sources and data formats.
First it is necessary to identify all data sources and define how to connect them, so as to allow data collection and ensure that data can be retrieved continuously and automatically. After that it is necessary to identify the data format of each data source, so as to label the data and create a data catalog. It is important to note that not all the data generated are in a format "ready to use" for an AI model. Often it is necessary to pre-process the data with specific actions (e.g. cleaning, pruning, filtering, augmenting, etc.) to transform them from raw data to quality data. In the industrial domain in particular, all the data generated by machines and sensors are normally raw data. It is therefore important in this phase to define clearly which pre-processing activities are necessary to transform raw data into quality data. International standards are available and can be used as reference:
1. ISO/IEC 8183:2023 - Data Life Cycle Framework
2. ISO/IEC 42001:2023 - Artificial intelligence Management System
3. ISO/IEC DIS 5259-1 - Data quality for analytics and machine learning
In conclusion, companies need to put in place a data strategy to manage their data correctly, setting up specific processes with dedicated role profiles. An appropriate data strategy will protect companies against the threat of collecting large amounts of data that are not useful for an AI model.
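As an illustration of the pre-processing actions mentioned in this section (cleaning and filtering), the following sketch turns a handful of raw sensor readings into a cleaned series. Everything in it is an assumption made for the example: the `Reading` record, the sensor name, the plausible range and the sample values are hypothetical, and a real pipeline would follow the rules agreed in the company's data strategy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reading:
    sensor_id: str            # hypothetical identifier of the data source
    timestamp: int            # seconds since the start of acquisition
    value: Optional[float]    # None models a missing sample


def clean(readings, low, high):
    """Return readings sorted by time, without missing, duplicated
    or out-of-range samples."""
    seen = set()
    cleaned = []
    for r in sorted(readings, key=lambda r: r.timestamp):
        key = (r.sensor_id, r.timestamp)
        if r.value is None or key in seen:
            continue  # cleaning: missing value or duplicated record
        if not (low <= r.value <= high):
            continue  # filtering: physically implausible value
        seen.add(key)
        cleaned.append(r)
    return cleaned


raw = [
    Reading("temp-01", 0, 70.2),
    Reading("temp-01", 1, None),     # sensor dropout
    Reading("temp-01", 2, 70.4),
    Reading("temp-01", 2, 70.4),     # duplicated transmission
    Reading("temp-01", 3, 999.0),    # spike outside the plausible range
    Reading("temp-01", 4, 70.1),
]

quality = clean(raw, low=-40.0, high=150.0)
print([r.value for r in quality])
```

Keeping the accepted range as explicit parameters, rather than hard-coding it, is a design choice of this sketch: such limits depend on the machine and sensor and would normally be recorded alongside the data source in the data catalog.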
Table 3 shows the principal activities of a Data Strategy, indicating the specific role profiles (data steward and data engineer) to be set up in the business divisions. Normally a data architect role is set up in the information technology division to manage the data catalog and data storage policies.
Table 3: main principles of a data strategy
5. Safeguards and Ethics
A problem that often emerges and leads models to make incorrect predictions is the bias (conditioning) of the data. A fundamental requirement for a model to make reliable predictions is that the data with which it is trained and tested are representative of the event it is meant to predict. If the data only partially represent the event, or represent a distorted view of it, then even if the predictive model passes the training and testing phases it will make incorrect predictions.
Furthermore, by their intrinsic nature, it is difficult to verify analytically the behavior of AI-based predictive models. Therefore, to mitigate the effects of an incorrect prediction, the concept of risk management was introduced, which provides control mechanisms on decisions taken following predictions, with the intention of limiting potential induced errors. The practice is now an international standard, described in the reference document ISO/IEC 23894. The recent EU AI Act is also based on risk management. Figure 6 shows the level of risk based on the implications of the AI model. For example, an autonomous system that classifies spam email is considered low risk, while a system used for social scoring poses an unacceptable risk.
Figure 6: EU AI Act risk management
The problems inherent in the possibility of incorrect behavior of AI-based predictive models, added to the difficulty of analytical and timely feedback, lead to a much broader reflection that introduces the topic of ethics in the use of AI: what use should we make of AI? How much decision-making autonomy can we leave to it? On the first question, a conscious use of AI can help automate tasks that are currently manual and repetitive, giving a notable boost to industrial processes. The second question is much more complex: can AI-based systems be autonomous in making decisions and carrying out actions, or should they only suggest a decision (or action) which the human must then validate and execute? In this case we talk about the dualism "autonomous systems vs human-in-the-loop systems". The former are systems in which AI is given the opportunity to make decisions and carry out actions; in the latter, the AI systems process the information and then suggest a decision or action that the human being must validate. The analysis is not trivial: autonomous systems have been discussed for years and various experiments and research have been conducted in many fields of application, but to date no system has been fully convincing.
The latest research frontier in this field is called explainable artificial intelligence (XAI) and was introduced by Michael Van Lent in 2004. Its aim is to make clear the activities carried out by an AI system during its training, testing and execution phases.
Figure 7: explanation AI model
6. AI & Quantum Computing
An important step for AI systems is quantum machine learning (QML): it is the
combination of machine learning and quantum computing, with the aim of running machine learning models on quantum computers. There is a lot of research in this field in the academic world, while the first proof-of-concept applications are emerging in industrial companies.
Nevertheless, there is still a lot of work to do: quantum computers for industrial applications will probably be available within 5-7 years, and not all classical machine learning models have yet been transformed into quantum ones.
Figure 8: quantum machine learning models
7. Conclusion
In conclusion, AI and its growing applications in industries can represent an opportunity to achieve otherwise unattainable targets. Nevertheless, we must know and govern AI in order to use it correctly and exploit all its potential in the way most suited to the application context and to the way industries operate.
References
[1] W. McCulloch, W. Pitts. "A Logical Calculus of the Ideas Immanent in Nervous Activity." Bulletin of Mathematical Biophysics 5 (1943): 115-133.
[2] F. Rosenblatt. "The perceptron: a probabilistic model for information storage and organization in the brain." Cornell Aeronautical Laboratory, Psychological Review 65.6 (1958): 386-408.
[3] A. M. Turing. "Computing machinery and intelligence." Mind 59 (1950): 433-460.
[4] A. Angelino. "L'Intelligenza Artificiale." Ingegneri Napoli 1 (2013): 3-10.
[5] A. Angelino. "L'Intelligenza Artificiale per i processi industriali." Sistemi & Impresa 4 (2022): 28-32.
[6] A. Angelino. "La convergenza tra IT ed OT." Sistemi & Impresa 2 (2022): 36-39.
[7] E. Rich, K. Knight, Intelligenza Artificiale, seconda edizione, McGraw Hill Italia, 1994.
[8] R. H. Nielsen, Neurocomputing, Addison Wesley, 1991.
[9] S. J. Russell, P. Norvig, Intelligenza Artificiale, terza edizione, Pearson, Vol. 1 e 2, 2010.
[10] M. Van Lent, An explainable artificial intelligence system for small-unit tactical behavior, in: Proceedings of the Nineteenth National Conference on Artificial Intelligence, Sixteenth Conference on Innovative Applications of Artificial Intelligence, July 25-29, 2004, San Jose, California, USA.
[11] M. Schuld, I. Sinayskiy, F. Petruccione. "An introduction to quantum machine learning." Contemporary Physics 56 (2014): 172-185.
A. Online Resources
A complete explanation of the confusion matrix is at: https://www.andreainini.com/ai/machine-learning/matrice-di-confusione
The official web page of the EU AI Act is: https://artificialintelligenceact.eu/the-act/
The US Department of Defense official web page for Explainable AI is: https://www.darpa.mil/program/explainable-artificial-intelligence