<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>On Identifying Repeated Patterns of OT Attacks with LOGistICS</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stefano Bistarelli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Emanuele Bosimini</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Santini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Mathematics and Computer Science, University of Perugia</institution>
          ,
          <addr-line>Perugia</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>ITASEC'22: Italian Conference on Cybersecurity</institution>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Aruba S.p.A.</institution>
          ,
          <addr-line>Ponte San Pietro (Bergamo)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>LOGistICS is a novel monitoring framework for investigating the security of industrial PLC systems. Its architecture includes diverse processing components and probes with different tasks. The topic of this study is reconstructing cyber-attacks against industrial devices in their entirety, and also identifying the actors. Our solution is suitable for research and production environments, allowing for more interaction with an attacker while remaining undetectable, as we will see from the behaviour of the actors. To achieve this aim, we exposed LOGistICS to the Internet with a module that we developed to examine suspicious activities, sequencing the operations carried out by the same attacker over time and thereby identifying its pattern. We found that the Cyber Kill Chain model is not always respected. We discuss these anomalies and provide our insights.</p>
      </abstract>
      <kwd-group>
        <kwd>Operational Technology</kwd>
        <kwd>Honeypot</kwd>
        <kwd>Attack behavior</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        ware used for this purpose was Industroyer2, which targets the controller hardware that manages
the flow of water, use of cleaning agents and other embedded machines that keep water systems
running efficiently [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. One could resort to the protection measures adopted in
conventional ICT systems, such as a firewall or an Intrusion Detection System. However, this introduces
design-related safety issues. An ICT system prioritizes confidentiality and data
integrity, whereas ICS are designed to ensure reliability and availability, even at the cost of
jeopardizing the confidentiality and integrity of data. Further common goals in ICS are related to the safety
of operators and the environment and, of course, productivity. An example of such different security
concerns with respect to ICT systems is the Modbus protocol [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], a long-lived and
reliable protocol, which nonetheless comes without authentication mechanisms. Nowadays the
devices and protocols adopted in an ICS are used in nearly every industrial sector and critical
infrastructure, such as the manufacturing, transportation, energy, and water treatment industries. Given
these peculiarities and the importance of monitoring security in such applications, in
this paper we present the enhanced implementation of LOGistICS, an emulation and monitoring
system for OT [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], and the experimental analysis conducted through the behavioral module. In
particular, we focus on the benefits of using our solution to identify the attitude of the actors attacking
industrial protocols, such as S7Comm from Siemens [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] and Modbus from Schneider Electric [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ],
which are also used extensively in home and building automation. The framework we propose is
composed of i) a highly customizable honeypot that emulates the two aforementioned ICS services,
ii) a sniffer that records traffic, and iii) a monitoring node, which is capable of analyzing and
plotting recorded data. A fourth component, the assessment node, carries out malicious activities against the honeypots;
it is a test node rather than a proper part of the monitoring platform. What is
described in the following falls under the umbrella of deception technology, which is an emerging
category of cybersecurity defense. Deception technology automates the creation of traps (decoys),
and can detect, analyze, and defend against zero-day and advanced attacks, often in real time.
      </p>
      <p>
        Honeypots [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] are physical or virtual systems used as “baits” to decoy the attackers. Their
goal is to expose only the vulnerable interface of a full service (e.g. SSH), without implementing
all of its logic and features. Honeypots can be used to attract and record attacks, allowing the
underlying patterns to be identified [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Honeypots are commonly characterized by their level of interaction.
Low-interaction honeypots give limited responses: they are mostly employed for
statistical analysis, and they are good enough to spot spikes in the number of requests, such as
those caused by autonomous malware. Medium-interaction honeypots offer the attacker a higher
level of interaction than low-interaction ones: their purpose is to generate
appropriate responses in the hope of eliciting follow-up attacks. High-interaction honeypots,
on the other hand, expose more resources than the previous ones and capture the largest possible amount
of data, including comprehensive attack logs. LOGistICS is light, stealthy, isolated
through containerization technologies, platform independent, and easy to deploy, with a fair degree
of interaction. Our framework leverages two libraries that allow communication via the S7Comm
and Modbus protocols: Snap7<sup>3</sup> and Pymodbus<sup>4</sup>, respectively. In order to deceive search
engines, we made some low-level changes to make the behavior of these honeypots more faithful to
a real Programmable Logic Controller (PLC). We prove its camouflage ability with reconnaissance
and honeypot detection tools.
      </p>
      <sec id="sec-1-1">
        <title>3 Snap7: snap7.sourceforge.net. 4 Pymodbus: pymodbus.readthedocs.io/en/latest.</title>
        <p>[Table 1: comparison of ICS honeypots. Rows: CryPLH, HoneyPLC, Gaspot, HoneyVP, S7CommTrace, Conpot, LOGistICS.]</p>
        <p>
          Detection systems likely use unique characteristics of specific
honeypots in order to identify them, such as the property-value pairs of default honeypot
configurations. The majority of the honeypots discussed in Section 2 have a low level of interactivity.
Quantitative traffic analysis can benefit from this level of interaction. However, we cannot rely on
this approach to analyze the state of the art of attacks. According to Shodan,<sup>5</sup> the most common
mistake of researchers deploying honeypots is to use a default configuration without applying
changes [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ]. All default configurations return the same banner, including the same PLC names, serial
numbers, etc. Around 30% of Siemens S7 PLCs connected to the Internet have the same serial
number, which corresponds to the default serial number of a Conpot instance.<sup>6</sup> Moreover, in the
literature the only ICS honeypot that can be extended in a user-friendly way is Conpot,<sup>7</sup> via XML.
        </p>
        <sec id="sec-1-1-1">
          <title>1.1. Paper structure</title>
          <p>
            Some initial results have been presented in [
            <xref ref-type="bibr" rid="ref4">4</xref>
            ]; this paper extends them by proposing a refined
architecture (see Figure 1) and describing the behavioral pattern of attackers. The remainder
of the paper is organized as follows: Section 2 summarizes the related work in the literature
and gives a minimal background to the reader. Section 3 reports necessary background notions
about the two protocols at the core of LOGistICS: Modbus and S7Comm. Section 4 describes
the implementation of the system, while Section 5 reports some tests we performed to evaluate
LOGistICS. Section 6 presents the behavior of attackers, that is, the sequence of commands they
executed on the medium-interaction honeypots. Finally, in Section 7 we conclude the paper
with final thoughts and several ideas about possible future work.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Gaspot is a project conducted in the Trend Micro research center,<sup>8</sup> and was later presented at the
Black Hat 2015 conference [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].<sup>9</sup> This low-interaction honeypot is designed to emulate the behavior
of the Guardian AST, a remote monitoring system for gas tanks. It is written in Python and some
options can be modified, such as the tank name and volume. CryPLH simulates a Siemens Simatic
S7-300 PLC. It emulates HTTP, HTTPS, S7Comm and SNMP services.
      </p>
      <sec id="sec-2-1">
        <title>5 Shodan: shodan.io.</title>
        <p>6 The complete guide to Shodan: ia800705.us.archive.org/17/items/shodan-book-extras/shodan/shodan.pdf.
7 Conpot: conpot.org.
8 Trend Micro: https://www.trendmicro.com.
9 Black Hat: https://blackhat.com.</p>
        <p>
          Furthermore, the TCP/IP stack is simulated via the Linux kernel [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. The authors identify CryPLH as a high-interaction
honeypot. As suggested in [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ], the attacker can never fully interact with the emulated PLC. The
authors set the maximum level of security by rejecting any authentication attempt. As a result,
[
          <xref ref-type="bibr" rid="ref11">11</xref>
          ] and [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] downgrade it to a low-interaction honeypot. HoneyVP is a HoneyPLC-inspired
hybrid honeypot that redirects inbound industrial traffic from the cloud to real Siemens PLCs [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ].
Conpot is an open-source low-interaction honeypot that emulates a wide range of OT and IT
protocols. It emulates a Siemens S7-200 PLC by default [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ], but it also includes some other
profiles that can be customized. Morales et al. developed HoneyPLC, an ICS honeypot based
on Honeyd [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. It emulates three protocols supported by Siemens PLCs: S7Comm, SNMP and
HTTP. HoneyPLC provides multiple PLC profiles and implements the capture of Ladder Logic
programs. S7CommTrace is a honeypot developed by Feng Xiao et al. that emulates an S7-300
PLC [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ]. It is extensible with a user template that defines the PLC fingerprint. Our solution, in
addition to providing a medium-interaction honeypot capable of emulating two widely used ICS
services, is cross-platform and provides a user-friendly graphical interface for PLC deployment.
Moreover, LOGistICS supports 7 types of PLC out of the box. Based on the features and the
limitations of the previous contributions, in Table 1 we report a comparison of ICS honeypots.
        </p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Modbus and S7Comm</title>
      <p>In this section we introduce the necessary background notions about Modbus and S7Comm,
which are the two protocols monitored by LOGistICS. We mostly focus on the functions that
can be invoked on a PLC, which will be exploited during attacks (see Section 5 and Section 6).</p>
      <sec id="sec-3-1">
        <title>3.1. Modbus</title>
        <p>
          Modbus is a protocol developed by Modicon [
          <xref ref-type="bibr" rid="ref16">16</xref>
          ] in 1979. Modicon was subsequently
acquired by Schneider Electric. In 2004 the rights to the Modbus protocol were transferred from
Schneider Electric to the Modbus Organization. This transition made Modbus
an open protocol, enabling its adoption on a wide range of devices (from intelligent sensors
to PLCs) and allowing it to ride the wave of the IoT [
          <xref ref-type="bibr" rid="ref17">17</xref>
          ]. Modbus is implemented with a master/slave
architecture, where the entities can perform simultaneous requests. We can divide Modbus messages into
four categories [
          <xref ref-type="bibr" rid="ref18">18</xref>
          ]. The request message is how the master initiates the transmission, by sending
it to the slave. The response message is how the slave replies to the master with
the requested data. Once the master receives a response, it sends a confirmation message to the queried
slave. An indication message is then generated by the slave to confirm receipt of the master's
request. Modbus operates at the application level, and for this reason it is interoperable with the
underlying levels by adapting the Application Data Unit (ADU) structure. The Modbus Protocol Data Unit (PDU) consists
of the function code and the data field. The ADU is made up of the PDU plus the Address and
Error Check fields. In order to transport the PDU over TCP/IP (port 502), a header is added to identify
the ADU: the Modbus Application (MBAP) header [
          <xref ref-type="bibr" rid="ref19">19</xref>
          ]. This 7-byte header allows packets to be
identified in case of fragmentation. Modbus TCP [
          <xref ref-type="bibr" rid="ref3">3</xref>
          ] does not include the CRC expected in the
ADU, as integrity is checked by the underlying levels. The data and function code fields are part of
the PDU, and they are present in every variant of the architecture. The data field contains all the
information the slave needs in order to execute a request and to respond adequately to the master.
        </p>
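        <p>As a minimal sketch of the framing described above (not the LOGistICS implementation), the following Python fragment splits a Modbus TCP ADU into the 7-byte MBAP header and the PDU. The field layout (transaction identifier, protocol identifier, length, unit identifier) is assumed from the public Modbus TCP specification, and the names parse_modbus_tcp and ModbusTcpFrame, as well as the sample request, are hypothetical.</p>
        <preformat>
```python
import struct
from typing import NamedTuple

MBAP_FORMAT = ">HHHB"  # big-endian: transaction id, protocol id, length, unit id
MBAP_SIZE = struct.calcsize(MBAP_FORMAT)  # 7 bytes


class ModbusTcpFrame(NamedTuple):
    transaction_id: int
    protocol_id: int
    length: int
    unit_id: int
    function_code: int
    data: bytes


def parse_modbus_tcp(frame: bytes) -> ModbusTcpFrame:
    """Split a Modbus TCP ADU into its MBAP header and PDU (function code + data)."""
    if MBAP_SIZE + 1 > len(frame):
        raise ValueError("frame too short for an MBAP header and a function code")
    transaction_id, protocol_id, length, unit_id = struct.unpack_from(MBAP_FORMAT, frame)
    pdu = frame[MBAP_SIZE:]
    # The MBAP length field counts the unit id byte plus the PDU bytes.
    if length != len(pdu) + 1:
        raise ValueError("MBAP length field does not match the frame size")
    return ModbusTcpFrame(transaction_id, protocol_id, length, unit_id, pdu[0], pdu[1:])


# Hypothetical request: Read Holding Registers (function code 3), address 0, quantity 2.
request = bytes.fromhex("000100000006" "01" "03" "00000002")
parsed = parse_modbus_tcp(request)
print(parsed.function_code, parsed.data.hex())
```
        </preformat>
        <p>Note that no CRC is parsed, consistent with the observation that Modbus TCP omits the error check field of the ADU.</p>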
        <p>According to the Schneider Electric documentation, the data field has a variable length depending on
the type of function invoked. The function code represents the core meaning of the message: its
encoding denotes the action to be performed. Function codes are of three types: public, user-defined
and reserved. Public function codes are documented and tested; this type also includes codes that
are reserved for future developments. User-defined function codes are those not covered
by the protocol specification, which the user can implement from scratch. Reserved function
codes are implemented by companies and are not available for public use. The Modbus protocol
provides a one-byte function code field capable of representing 255 possible functions: value
“0” is not valid, values from 1 to 127 are effectively used, and values between 128 and 255 are used
for exception function codes in response messages. Table 2 shows a list of Modbus function codes.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. S7Comm</title>
        <p>S7Comm is a proprietary protocol developed by Siemens. In industry it is used for diagnostic
and PLC programming purposes. S7Comm works via TCP on port 102 and relies on the
Connection-Oriented Transport Protocol (COTP) and the Transport Packet (TPKT) service [20]. The packet
is encapsulated with the COTP header in order to be ISO-on-TCP compliant (RFC 1006) [21].</p>
        <p>The S7 PDU consists of three core components: header, parameters and data, where the latter
component is optional. S7Comm is a command-oriented protocol, in the sense that each
transmission contains a command or a response to it. The protocol documentation is not publicly available. For
this reason, projects such as the open-source communication libraries Snap7 [22] and Libnodave [23]
have come to light. It is possible to transfer user programs, or individual portions of them, to a PLC. To
interface data with programs like Siemens Step7,<sup>10</sup> PLCs use distinct parts, called blocks [24].
As introduced before, S7Comm is the payload of a COTP packet, and it is distinguished by the
identifier 0x32 (also known as the Magic Byte). Then we find the Message Type, Data Unit, Reserved,
Parameter length and Data length fields, and finally the error class and code. The Message Type, often identified as
ROSCTR, can take four values. Job request is the request sent by the master. Ack signal
is the acknowledgment sent by the slave, omitting the data field. Ack data is the acknowledgment
sent by the slave with the optional data field, containing the reply to the Job. User data is the extension of the
original protocol containing ID requests/responses. For message types such as Job and Ack data,
the optional parameter that follows is the function code. The remaining fields may vary depending
on this parameter, which decides the purpose of the communication. The Setup Communication
function is used to establish the first S7Comm connection, and provides information on Ack
queues (i.e., the number of parallel jobs without Ack) and the maximum length of the PDU. The Read
and Write functions are self-explanatory, and are used to read and write one or more variables on the PLC.
Functions such as Request Download, Download Block, Download End, Start Upload, Upload and
Upload End are used to perform block download and upload operations. The PLC Stop function
stops the execution of the program running on the PLC, and Program Invocation (PI) manages the
execution of such a program in a broader way (it also includes the PLC Stop functionality). The CPU
service is used to access fields such as the protection levels of the CPU and the SZL partial lists.</p>
        <p>10 Step7: cache.industry.siemens.com/dl/files/056/18652056/att_70831/v1/S7prv54_i.pdf.</p>
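        <p>The encapsulation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: S7Comm is not publicly documented, so the byte offsets and all numeric codes other than 0x32, 0x05 and 0x29 (which appear in this paper) are assumed from publicly available protocol dissectors; the function parse_s7comm and the sample packet are hypothetical.</p>
        <preformat>
```python
import struct

# Numeric codes are assumptions taken from public protocol dissectors,
# except 0x32, 0x05 and 0x29, which are given in the text.
ROSCTR_NAMES = {0x01: "Job", 0x02: "Ack", 0x03: "Ack data", 0x07: "User data"}
FUNCTION_NAMES = {
    0xF0: "Setup Communication",
    0x04: "Read",
    0x05: "Write",
    0x1A: "Request Download",
    0x1B: "Download Block",
    0x1C: "Download End",
    0x1D: "Start Upload",
    0x1E: "Upload",
    0x1F: "Upload End",
    0x28: "Program Invocation",
    0x29: "PLC Stop",
}


def parse_s7comm(packet: bytes) -> dict:
    """Decode the TPKT, COTP and S7 headers of one ISO-on-TCP packet."""
    if 7 > len(packet):
        raise ValueError("packet too short")
    version, _reserved, tpkt_length = struct.unpack_from(">BBH", packet, 0)
    if version != 3 or tpkt_length != len(packet):
        raise ValueError("not a complete TPKT packet")
    cotp_length = packet[4]  # COTP length indicator, not counting itself
    s7 = packet[5 + cotp_length:]
    if 10 > len(s7) or s7[0] != 0x32:
        raise ValueError("missing S7Comm protocol identifier 0x32")
    rosctr = s7[1]
    param_length, data_length = struct.unpack_from(">HH", s7, 6)
    # Ack and Ack data carry two extra header bytes: error class and error code.
    header_length = 12 if rosctr in (0x02, 0x03) else 10
    function = None
    if rosctr in (0x01, 0x03) and param_length > 0 and len(s7) > header_length:
        code = s7[header_length]
        function = FUNCTION_NAMES.get(code, hex(code))
    return {
        "message_type": ROSCTR_NAMES.get(rosctr, hex(rosctr)),
        "parameter_length": param_length,
        "data_length": data_length,
        "function": function,
    }


# Hypothetical Setup Communication job: TPKT header, COTP data header, S7 PDU.
setup = bytes.fromhex("03000019" "02f080" "32010000000000080000" "f0000001000101e0")
result = parse_s7comm(setup)
print(result)
```
        </preformat>
        <p>The function code is decoded only for Job and Ack data messages, in line with the description above; User data messages use a different parameter layout that this sketch does not cover.</p>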
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Implementation</title>
      <p>
        In this section we describe the components of the LOGistICS framework. In Section 4.1 we
describe the overall experimental setup. In Section 4.2 we describe the sniffer we
implemented. We mainly focus on its versatility, and on its ability to listen only to the traffic of
interest. In Section 4.3 we describe the implementation of the Modbus service based on Pymodbus,
also explaining how we enriched the functionality of the emulated service. Compared to the
honeypots seen in Section 2, we guarantee the user the ability to customize the PLC template
while still ensuring considerable camouflage capabilities [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. In Section 4.4 we describe the
implementation of the S7Comm service based on Snap7. We discuss the client-side implementation
that we developed in order to test the solution locally. We present a novel tool that deploys our
implementation of the PLC in a simple way. Finally, in Section 4.5 we describe the implementation
of the behavioral module, which is able to reconstruct the command sequence of an attacker.
      </p>
      <sec id="sec-4-1">
        <title>4.1. Architecture</title>
        <p>
          The framework, whose architecture is represented in Figure 1, can be summarized by four main
components that are logically and physically separated: honeypot node, sniffing node, assessment node
and monitoring node [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ].
        </p>
        <p>
          The assessment node is equipped with Kali Linux. The purpose of this node is to evaluate in
depth the interaction with the honeypot node with respect to the state of the art of ICS Red Team
tools available in the Kali Linux distribution. The honeypot node exposes the Modbus and S7Comm
services. In order to ensure isolation and portability, we distribute these services via container
technology. For clarity, the only node actually exposed is the honeypot, as we can see in the figure,
and the ICS activities that we report in Section 5 and in Section 6 are external and real. The sniffing
node, on which we launched an instance of our sniffer, is at the same site as the honeypot. This
sniffer continuously and exclusively listens to the exposed services, and generates files in pcap format.
We use an intrusion detection system that we configured in passive mode so as to
generate, starting from the captured packets, industrial events in human-readable format [
          <xref ref-type="bibr" rid="ref4">4</xref>
          ]. Finally, the
monitoring node stores the events obtained from the sniffing node. The monitoring node contains an
ELK stack<sup>11</sup> instance. ELK is a group of open-source products designed to help users take data from
any type of source and in any format, and search, analyze, and visualize that data (also in real time).
        </p>
        <p>11 The Elastic Stack: https://www.elastic.co/what-is/elk-stack.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.4. S7Comm emulation</title>
        <p>LOGistICS supports three types of Siemens PLCs: S7-300, S7-400 and S7-1200. To allow a
user-friendly selection of the honeypot, we implemented a GUI, shown in Figure 2, capable
of containerizing the selected class of PLC to be emulated and quickly starting it.</p>
        <p>In the aforementioned Siemens PLCs, data blocks are not protected by default and can be accessed
or written by users [25]. To test the degree of interaction before the phase of exposure to the
Internet (see Section 5), we implemented a C++ master that executes requests to the considered
PLCs. Snap7 provides a sample master that is able to query a physical PLC, for example by
retrieving the PLC version and the list of data blocks. We enhanced the master in order
to enable interaction with the password-protection mode of a PLC. We implemented a
module in our Snap7 master that returns three values concerning the CPU protection level,
defined as Byte in the firmware. The array of Byte SZL W_16_0232 index W_16_0004 contains
the protection levels and the position of the operating mode selectors. By exchanging messages
between our own honeypot and the master, we were able to evaluate the security of the PLC, and
thus modify the honeypot answers in order to make them authentic, i.e., more similar to those of a real PLC.</p>
      </sec>
      <sec id="sec-4-3">
        <title>4.5. Behavioral Module</title>
        <p>This module, implemented on the sniffing node, reconstructs in chronological order the sequence of industrial commands
invoked by an attacker, who is identified through the IP address and
autonomous system number (ASN) fields. The researcher can choose the time window,
reducing or enlarging it depending on the type of analysis to conduct. In addition to attack
patterns, the module is well suited for analyzing sequence similarities [26].</p>
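        <p>The reconstruction performed by this module can be illustrated with a short Python sketch. It is not the code of the behavioral module: the Event record, the command_sequences function and the sample events (documentation IP addresses and documentation-range ASNs) are assumptions made for the example.</p>
        <preformat>
```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import NamedTuple


class Event(NamedTuple):
    timestamp: datetime
    src_ip: str
    asn: int
    command: str


def command_sequences(events, window_start, window_length):
    """Return, per (IP, ASN) actor, the commands seen in the window in time order."""
    window_end = window_start + window_length
    per_actor = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp):
        # Keep only the events that fall inside the chosen time window.
        if window_end > event.timestamp >= window_start:
            per_actor[(event.src_ip, event.asn)].append(event.command)
    return dict(per_actor)


# Hypothetical events, deliberately out of order to show the chronological sort.
t0 = datetime(2022, 1, 1, 12, 0, 0)
events = [
    Event(t0 + timedelta(seconds=30), "192.0.2.10", 64500, "Read Holding Registers"),
    Event(t0, "192.0.2.10", 64500, "Read Device Identification"),
    Event(t0 + timedelta(seconds=5), "198.51.100.7", 64501, "Setup Communication"),
    Event(t0 + timedelta(days=2), "192.0.2.10", 64500, "Write Multiple Registers"),
]
sequences = command_sequences(events, t0, timedelta(days=1))
print(sequences)
```
        </preformat>
        <p>Changing window_length corresponds to reducing or enlarging the time window: with a one-day window the write request issued two days later is excluded from the first actor's sequence.</p>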
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Experiments</title>
      <p>
        In order to evaluate LOGistICS, we exposed it to the Internet for almost two months. The
most frequent Modbus event we recorded was Read Device Identification, invoked using function
code 43. This encapsulation mechanism allows predefined and reserved fields to be transferred: reserved
function codes are implemented by companies and are not available for public use. In addition to
being a field that is commonly read by search engines, it also corresponds to the first stage (i.e.,
reconnaissance) of the ICS kill chain [27]. This stage is usually conducted both actively, using
reconnaissance tools, and passively, through search engines such as Shodan or other services that allow
the targeted resource to be investigated without interacting directly with it. We were also able to detect
stealth scans: using Wireshark, we identified this kind of activity by following the
SYN → SYN/ACK → RST pattern [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. This type of request was always performed by scanners, and
we therefore believe it is a benign request. We logged unauthorized read requests to the holding
registers, which store 16-bit readable and writable configuration values. There are also exceptions
related to the reading of input registers (i.e., 16-bit values representing measurements and
statuses). We believe that a host tried to access an undefined input, i.e., a register that does not exist.
The same holds true for Write Multiple Registers exceptions. We detected many requests
for writing to multiple registers, where unauthenticated hosts attempted to corrupt the honeypot,
or were looking for some undefined behavior of our configuration. Regarding S7Comm events, we
collected a large number of Setup Communication requests, used to establish communication with
the S7Comm layer. Being able to query different fields in Read SZL requests, the attacker could
then be encouraged to execute further function codes: a medium-interaction honeypot
entices possible attackers. In fact, we were able to capture a PLC Stop (0x29) request and a Write (0x05)
variable request on Data Block 1, which respectively stop the CPU and create a boolean variable.
      </p>
      <p>
        Remark 1. We found that the peak of recorded activities was mostly due to TCP SYN scans.
This was discovered thanks to the separation of logs by protocol type: in the log related to the
application layer there was no trace of the type of request performed. However, by cross-referencing these data
with the activities recorded at the presentation layer (ISO-COTP), we realized that
a high number of half-open connections<sup>13</sup> were recorded from the same actor, a host with unknown DNS
whose address is assigned to IP Volume inc. (ASN 202425). This host carried out this activity without actively
obtaining information about our instance. We believe that the host already knew our configuration
thanks to search engines like Censys.
      </p>
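      <p>The SYN → SYN/ACK → RST pattern used to recognize stealth scans can be checked mechanically once the TCP flags of each conversation are available. The Python sketch below is a simplified illustration, not our Wireshark workflow: it assumes the flags have already been extracted as text labels, and it keys conversations by client address and server port only (a real capture would also use the client port). The function find_half_open_scans and the sample capture are hypothetical.</p>
      <preformat>
```python
from collections import defaultdict

STEALTH_PATTERN = ("SYN", "SYN/ACK", "RST")


def find_half_open_scans(packets):
    """Return the (client, port) conversations whose flags are exactly SYN, SYN/ACK, RST.

    packets is an iterable of (client_ip, server_port, flags) tuples in capture order,
    where flags labels the TCP flags of one packet of the conversation.
    """
    conversations = defaultdict(list)
    for client_ip, server_port, flags in packets:
        conversations[(client_ip, server_port)].append(flags)
    return {key for key, seen in conversations.items() if tuple(seen) == STEALTH_PATTERN}


# Hypothetical capture: a stealth scan on port 102 and a completed handshake on port 502.
capture = [
    ("192.0.2.10", 102, "SYN"),
    ("198.51.100.7", 502, "SYN"),
    ("192.0.2.10", 102, "SYN/ACK"),
    ("198.51.100.7", 502, "SYN/ACK"),
    ("192.0.2.10", 102, "RST"),
    ("198.51.100.7", 502, "ACK"),
]
scanners = find_half_open_scans(capture)
print(scanners)
```
      </preformat>
      <p>Only the conversation that is reset immediately after the SYN/ACK is reported; the one that completes the handshake with an ACK is not.</p>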
      <p>During the classification of industrial traffic, we found that there were many more potentially
harmful Modbus events (274) than S7Comm ones (6), excluding the TCP SYN scans described in
Remark 1. For the Modbus service, we categorized requests such as “Report Slave
ID”, “Device ID” and “Half-Open Scan” as benign. However, we consider read requests other than
those listed above to be disruptive, such as reads of coils and registers, which may contain confidential
information. Concerning S7Comm, we considered read requests as not
malicious, since they are mostly performed by periodic scanners, with the exception of those that
concern the CPU protection level. Any S7Comm write request, including exceptions, was
treated as malicious, regardless of the service. The high number of Modbus requests classified as
malicious can be explained by the fact that i) the interaction with a Modbus PLC, by design, can
involve multiple coils and registers, and ii) Modbus is now an open standard (unlike S7Comm),
which makes it easier to write a script that brute-forces all the registers.</p>
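      <p>The classification rules stated above can be summarized as a small decision function. The Python sketch below is an illustration only: the event labels are hypothetical strings, Modbus write requests are assumed to be malicious like the non-identification reads, and S7Comm events that are neither reads nor writes are left unclassified because the rules above do not cover them.</p>
      <preformat>
```python
BENIGN_MODBUS_EVENTS = {"Report Slave ID", "Read Device Identification", "Half-Open Scan"}


def classify(protocol: str, event: str) -> str:
    """Label an event 'benign' or 'malicious' following the rules stated in the text."""
    name = event.lower()
    if protocol == "modbus":
        # Identification requests and half-open scans are benign; other requests are not.
        return "benign" if event in BENIGN_MODBUS_EVENTS else "malicious"
    if protocol == "s7comm":
        if name.startswith("write"):
            return "malicious"
        if name.startswith("read"):
            # Reads are attributed to periodic scanners unless they target the CPU protection level.
            return "malicious" if "protection level" in name else "benign"
        return "unclassified"  # not covered by the rules in the text
    raise ValueError(f"unknown protocol: {protocol}")


# Hypothetical event labels.
for protocol, event in [
    ("modbus", "Report Slave ID"),
    ("modbus", "Read Coils"),
    ("s7comm", "Read SZL"),
    ("s7comm", "Read CPU protection level"),
    ("s7comm", "Write Data Block"),
]:
    print(protocol, event, classify(protocol, event))
```
      </preformat>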
    </sec>
    <sec id="sec-6">
      <title>6. Behavior of Attackers</title>
      <p>In order to characterize the actors, we outline their sequences of commands (i.e., the host behavior)
grouped by IP address. For each host, the sequence of commands executed was captured only
once, and there were no recurring activities from the same IP during the high-intensity periods.
We discovered this by grouping events initially by a certain time slot, and subsequently by
day, noting that the sequence of commands remained unchanged. We believe this is due to
the duration of the experiment, as in general port scanners repeat the same activities on
a monthly basis. We represented the encoded sequence of commands for hosts related to Modbus
requests. We consider 5 of these command patterns as relevant, and we show them in Table 3,
where the superscript on each command represents how many times it has been invoked. Each
of these patterns (P) was logged on a different day, as reported in Table 3.</p>
      <p>13 The lack of synchronization could be due to malicious intent, such as the TCP SYN Flood attack.</p>
      <p>[Table 3: relevant Modbus command patterns by host. ASN and organization: 55536 Pacswitch; 9009 M247; 29073 IP Volume; 10439 CARI.net; 10439 CARI.net.]</p>
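      <p>The superscript notation used for the patterns of Table 3 can be sketched as a run-length encoding of the command sequence. The following Python fragment is a hypothetical illustration: it assumes that the superscript counts consecutive repetitions of a command, and the abbreviations RSI, RDI and RHR are invented for the example.</p>
      <preformat>
```python
from itertools import groupby


def encode_pattern(commands):
    """Collapse consecutive repetitions into (command, count) pairs."""
    return [(command, len(list(run))) for command, run in groupby(commands)]


def format_pattern(commands):
    """Render a pattern as text, writing the repetition count after a caret."""
    parts = []
    for command, count in encode_pattern(commands):
        parts.append(f"{command}^{count}" if count > 1 else command)
    return " ".join(parts)


# Hypothetical abbreviations: RSI = Report Slave ID, RDI = Read Device Identification,
# RHR = Read Holding Registers.
sequence = ["RSI", "RDI", "RHR", "RHR", "RHR"]
pattern = format_pattern(sequence)
print(pattern)
```
      </preformat>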
      <p>We can clearly distinguish the recurring scanners from the harassing actors during the initial
phase of communication. The actors belonging to the CARI.net and IP Volume organizations
have an almost identical behavior, and start their activity by making Report Slave ID and Read
Device Identification requests. One might discard the first two commands, since a couple of
read requests (slave and device information) are typically needed to know how to compromise the device.
However, as we can see in Table 3, this assumption is debatable: the M247 and Pacswitch hosts execute the Read
Device Identification request only after making an illegal read of the registers (note that
M247 performed 160 TCP SYN requests, which may indicate suspicious behavior). As for
the activities towards the S7Comm protocol, we identified in a similar way the most
interesting patterns during the periods of greatest intensity. Differently from what emerged with
Modbus activities, very similar S7Comm patterns were recorded from hosts of the same
organization. Moreover, some actors performed the exact same sequence during the period of
intense activity. In order to simplify the representation, we assume, with regard to the S7Comm
activities, that the hosts of an Autonomous System have the same behavior.</p>
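      <p>One simple way to quantify how similar two command sequences are is the matching-block ratio provided by the Python standard library. This is a sketch of one possible measure, not the similarity analysis of [26]; the function name sequence_similarity and the sample sequences are assumptions.</p>
      <preformat>
```python
from difflib import SequenceMatcher


def sequence_similarity(first, second):
    """Return a similarity ratio in [0, 1] between two command sequences."""
    return SequenceMatcher(None, first, second).ratio()


# Hypothetical S7Comm command sequences from three hosts.
host_a = ["Setup Communication", "Read SZL", "Read SZL"]
host_b = ["Setup Communication", "Read SZL", "Read SZL"]
host_c = ["Setup Communication", "Read SZL", "Write"]

same = sequence_similarity(host_a, host_b)
partial = sequence_similarity(host_a, host_c)
print(same, partial)
```
      </preformat>
      <p>Identical sequences yield a ratio of 1, while sequences that share only a prefix yield a lower value, which gives a rough criterion for grouping hosts of the same Autonomous System.</p>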
      <p>In Table 4 we represent the sequence of commands performed by the hosts of each organization, grouped
by high-intensity activity period. Regarding the S7Comm service, we were able to reconstruct
the complete communication between the actors and the honeypot, reporting in chronological
order the response to each request made. As the results show, the DigitalOcean hosts that make two
Read SZL requests followed by the Setup Communication request are recurring scanners. The
behavior of the Pacswitch host differs from that of the aforementioned hosts: instead of running a
port scan, it tries to write directly to our honeypot. The unrecognized pattern of the host belonging
to IP Volume inc. is explained in Remark 1.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Conclusion and Future Works</title>
      <p>With the goal of analyzing attacker behavior, we presented LOGistICS, a multi-component system
for safeguarding critical infrastructures. The core of the system is a new medium-interaction honeypot
that simulates Modbus and S7Comm, two of the most commonly used
protocols in industrial PLCs. This paper discusses the behavior of ICS actors when interacting
with the honeypot, which is found to be less detectable by attackers than similar proposals, and
indistinguishable from real PLCs when using automated attack tools. We tested LOGistICS
by exposing it to the Internet and found that most of the requests it received are not malicious.
However, the actors who make such requests do not all behave in the same way: their behavior
differs depending on the type of information they read. As for web crawlers, we found that
the majority of them behave in the same way towards ICS devices; they essentially differ in how
frequently they read the fields that identify the device. Moreover, we found that besides periodic
scanners (e.g. Shodan and Censys), which conclude their activity once the reconnaissance stage has
been completed, there are attackers who try to write data and control the PLC without first probing
the victim, as if they already knew the necessary information. Our intuition is that these actors
obtained this information indirectly from search engines, an activity that intrusion detection
systems would not classify as suspicious (for example, when web crawlers are whitelisted).
We collected several ideas about possible extensions as future work. For instance, we would like
to integrate LOGistICS with SIEM-based technologies such as one based on the Elastic Stack.<sup>14</sup>
The aim is to leave the honeypot running in production environments as well (e.g. nuclear power
plants), comparing the data collected there with that already obtained in research environments. In this way,
we can enhance our framework by providing a more focused real-time alerting module. In addition,
besides improving the extensibility and configurability of the honeypot, in the future we would
like to add deception-technology modules and further protocols such as i) BACnet (Building Automation
and Control Networks), which is used in ventilation, heating, access control, lighting,
air-conditioning and fire detection systems; and ii) DNP3 (Distributed Network Protocol), which
is widely adopted in the USA and Canada and is used by electric and water companies.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work has been partially supported by: project “RACRA”, funded by Ricerca di Base
2018-2019, University of Perugia; project BLOCKCHAIN4FOODCHAIN, funded by Ricerca di
Base 2020, University of Perugia; and project DopUP, REGIONE UMBRIA PSR 2014-2020.</p>
      <p><sup>14</sup> Elastic SIEM: https://www.elastic.co/siem/.</p>
      <p>[20] A. Kleinman, A. Wool, Accurate modeling of the siemens s7 scada protocol for intrusion detection and digital forensics, The Journal of Digital Forensics, Security and Law: JDFSL 9 (2014) 37.</p>
      <p>[21] M. Rose, D. Cass, ISO Transport Service on top of the TCP-Version 3, Technical Report, RFC 1006, Northrop Research and Technology Center, 1987.</p>
      <p>[22] D. Nardella, Snap7, 2018.</p>
      <p>[23] A. Sentcha, Libnodave–exchange data with siemens plc, 2015.</p>
      <p>[24] Siemens, Siemens block description, http://siemens-plc-programming.blogspot.com/p/load-memory-work-memory-blocks-in-user.html, 2010.</p>
      <p>[25] E. Biham, S. Bitan, A. Carmel, A. Dankner, U. Malin, A. Wool, Rogue7: Rogue engineering-station attacks on s7 simatic plcs, Mandalay Bay/Las Vegas (2019).</p>
      <p>[26] A. Kashtalian, T. Sochor, K-means clustering of honeynet data with unsupervised representation learning, in: IntelITSIS, 2021, pp. 439–449.</p>
      <p>[27] M. J. Assante, R. M. Lee, The industrial control system cyber kill chain, SANS Institute InfoSec Reading Room 1 (2015).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>K.</given-names>
            <surname>Stouffer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Falco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Scarfone</surname>
          </string-name>
          ,
          <article-title>Guide to industrial control systems (ics) security</article-title>
          , NIST special publication
          <volume>800</volume>
          (
          <year>2011</year>
          )
          <fpage>16</fpage>
          -
          <lpage>16</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O.</given-names>
            <surname>Analytica</surname>
          </string-name>
          ,
          <article-title>Cyberattack on ukraine's power grid underlines risk</article-title>
          ,
          <source>Emerald Expert Briefings</source>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A.</given-names>
            <surname>Swales</surname>
          </string-name>
          , et al., Open modbus/tcp specification,
          <source>Schneider Electric</source>
          <volume>29</volume>
          (
          <year>1999</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bistarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bosimini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Santini</surname>
          </string-name>
          ,
          <article-title>A medium-interaction emulation and monitoring system for operational technology</article-title>
          ,
          <source>in: ARES 2021: The 16th International Conference on Availability, Reliability and Security</source>
          , Vienna, Austria, August 17-20, 2021, ACM,
          <year>2021</year>
          , pp.
          <fpage>118:1</fpage>
          -
          <lpage>118:7</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <article-title>S7commtrace: A high interactive honeypot for industrial control system based on s7 protocol</article-title>
          ,
          <source>in: International Conference on Information and Communications Security</source>
          , Springer,
          <year>2017</year>
          , pp.
          <fpage>412</fpage>
          -
          <lpage>423</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nawrocki</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Wählisch</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T. C.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Keil</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schönfelder</surname>
          </string-name>
          ,
          <article-title>A survey on honeypot software and data analysis</article-title>
          ,
          <source>arXiv preprint arXiv:1608.06249</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bistarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Bosimini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Santini</surname>
          </string-name>
          ,
          <article-title>A report on the security of home connections with iot and docker honeypots</article-title>
          .,
          <source>in: ITASEC</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>60</fpage>
          -
          <lpage>70</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Matherly</surname>
          </string-name>
          ,
          <article-title>Complete guide to shodan</article-title>
          ,
          <source>Shodan, LLC (2016-02-25)</source>
          <volume>1</volume>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Wilhoit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hilt</surname>
          </string-name>
          ,
          <article-title>The gaspot experiment: Unexamined perils in using gas-tank-monitoring systems</article-title>
          ,
          <source>Trend Micro</source>
          <volume>6</volume>
          (
          <year>2015</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>D. I.</given-names>
            <surname>Buza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Juhász</surname>
          </string-name>
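          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Miru</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Félegyházi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Holczer</surname>
          </string-name>
          ,
          <article-title>Cryplh: Protecting smart energy systems from targeted attacks with a plc honeypot</article-title>
          ,
          <source>in: International Workshop on Smart Grid Security</source>
          , Springer,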
          <year>2014</year>
          , pp.
          <fpage>181</fpage>
          -
          <lpage>192</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lau</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Klick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Arndt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Roth</surname>
          </string-name>
          ,
          <article-title>Poster: Towards highly interactive honeypots for industrial control systems</article-title>
          ,
          <source>in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1823</fpage>
          -
          <lpage>1825</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Litchfield</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Formby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rogers</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Meliopoulos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Beyah</surname>
          </string-name>
          ,
          <article-title>Rethinking the honeypot for cyber-physical systems</article-title>
          ,
          <source>IEEE Internet Computing</source>
          <volume>20</volume>
          (
          <year>2016</year>
          )
          <fpage>9</fpage>
          -
          <lpage>17</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>J.</given-names>
            <surname>You</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Lv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Honeyvp: A cost-effective hybrid honeypot architecture for industrial control systems</article-title>
          ,
          <source>in: ICC 2021-IEEE International Conference on Communications, IEEE</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>A.</given-names>
            <surname>Jicha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Patton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Scada honeypots: An in-depth analysis of conpot</article-title>
          ,
          <source>in: 2016 IEEE conference on intelligence and security informatics (ISI)</source>
          , IEEE,
          <year>2016</year>
          , pp.
          <fpage>196</fpage>
          -
          <lpage>198</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>E.</given-names>
            <surname>López-Morales</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Rubio-Medrano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Doupé</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shoshitaishvili</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Bao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.-J.</given-names>
            <surname>Ahn</surname>
          </string-name>
          ,
          <article-title>Honeyplc: A next-generation honeypot for industrial control systems</article-title>
          ,
          <source>in: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security</source>
          ,
          <year>2020</year>
          , pp.
          <fpage>279</fpage>
          -
          <lpage>291</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <surname>I. Modicon</surname>
          </string-name>
          ,
          <article-title>Modicon modbus protocol reference guide</article-title>
          , North Andover, Massachusetts (
          <year>1996</year>
          )
          <fpage>28</fpage>
          -
          <lpage>29</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>K.</given-names>
            <surname>Ashton</surname>
          </string-name>
          , et al.,
          <article-title>That 'internet of things' thing</article-title>
          ,
          <source>RFID journal</source>
          <volume>22</volume>
          (
          <year>2009</year>
          )
          <fpage>97</fpage>
          -
          <lpage>114</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <surname>I. Modbus</surname>
          </string-name>
          ,
          <article-title>Modbus messaging on tcp/ip implementation guide v1.0a</article-title>
          , North Grafton, Massachusetts (www.modbus.org/specs.php) (
          <year>2004</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P.</given-names>
            <surname>Huitsing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chandia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Papa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Shenoi</surname>
          </string-name>
          ,
          <article-title>Attack taxonomies for the modbus protocols</article-title>
          ,
          <source>International Journal of Critical Infrastructure Protection</source>
          <volume>1</volume>
          (
          <year>2008</year>
          )
          <fpage>37</fpage>
          -
          <lpage>44</lpage>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>