<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Salerno, Italy
stefano.bistarelli@unipg.it (S. Bistarelli); chiara.luchini@collaboratori.unipg.it (C. Luchini);
ivan.mercanti@unipg.it (I. Mercanti); francesco.santini@unipg.it (F. Santini)
https://bista.sites.dmi.unipg.it/ (S. Bistarelli); https://francescosantini.sites.dmi.unipg.it/ (F. Santini)</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>A Preliminary Study on the Creation of a Covert Channel with HTTP Headers</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stefano Bistarelli</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michele Ceccarelli</string-name>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chiara Luchini</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ivan Mercanti</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Francesco Santini</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff2">
          <institution>Colacem S.p.A.</institution>
          ,
          <addr-line>Gubbio</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff0">
          <label>0</label>
          <institution>Dipartimento di Matematica e Informatica “Ulisse Dini”, Università degli Studi di Firenze, Viale Giovanni Battista Morgagni</institution>
          ,
          <addr-line>67/a, 50134 Firenze (FI)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Dipartimento di Matematica e Informatica, Università degli Studi di Perugia</institution>
          ,
          <addr-line>Via Vanvitelli 1, 06123 Perugia (PG)</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2024</year>
      </pub-date>
      <volume>000</volume>
      <fpage>0</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>Steganography conceals confidential information within seemingly innocuous data and has evolved with technological advancements. In network steganography, data is hidden in packets exchanged at different levels (e.g., Ethernet, IP, TCP). This paper considers the HTTP protocol for setting up a covert channel between two endpoints: the main motivation is that creating ad-hoc HTTP packet headers does not require superuser privileges, whereas crafting TCP segment headers, for example, does. This simplifies the execution of tools implementing the channel. Moreover, HTTP/HTTPS traffic is usually allowed to flow to and from a local network and is often not modified (unless it is automatically proxied). Therefore, we propose a detailed exploration of a covert channel protocol that modulates standard fields in the HTTP headers for unidirectional communication, i.e., from a sender to a receiver.</p>
      </abstract>
      <kwd-group>
        <kwd>Network Steganography</kwd>
        <kwd>Covert Channels</kwd>
        <kwd>HTTP</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Steganography [
        <xref ref-type="bibr" rid="ref1 ref2">1, 2</xref>
        ] is an ancient technique used for centuries to hide sensitive information
from public view in seemingly innocent material. Its development throughout time, spurred
by technological breakthroughs, has produced a variety of steganographic methods that have
increased its applicability in a wide range of fields. A pivotal moment in 2003 marked the
introduction of “network steganography” [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], often referred to as covert channels [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], emerging as
a widely implemented type in practical settings. In general terms, a covert transfer of information
always features the following elements, regardless of the specific technique: a covert sender, the entity that
sends secret information; a covert receiver, the entity that receives it; and
the covert object, the data carrier in which the covert sender hides secret information. The covert object must
be selected so that it does not represent an anomaly but, at the same time, has enough embedding
capacity. Finally, a representation specifies how secret information is embedded in the covert
object.
      </p>
      <p>Network steganography can be employed in various legitimate and potentially malicious
applications. Examples include data exfiltration in espionage and
intelligence, confidential business communication, malicious activities (malware communication
or command-and-control traffic to evade IDSs), anonymous communication and anti-censorship,
and copyright protection (embedding information to identify the origin or ownership of digital
content).</p>
      <p>The widespread usage of covert channels over time, in contexts ranging from
cybersecurity to terrorism, prompted the creation of a thorough taxonomy for the scientific
community. When the “Information Hiding Project”<sup>1</sup> was first launched, it used a
model-based classification system to classify different steganographic methods and suggested
performance assessment indices to gauge how effective they were over a wide range of covert
channels.</p>
      <p>
        Practical implementations of network steganography may involve the manipulation of various
communication protocols, such as IP [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], TCP [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], UDP [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], ICMP<sup>2</sup>, DNS [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], and HTTP [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
Researchers in this field continually work to develop new techniques that can evade monitoring
and detection systems, leading to an ongoing evolution of concealment strategies. Network
steganography poses significant challenges in the context of cybersecurity, requiring a constant
effort to develop advanced detection methods. Its growing relevance is highlighted by the
need to explore new approaches and countermeasures to protect digital networks from the use of
steganography for malicious purposes.
      </p>
      <p>Our proposal emphasises a detailed definition of an HTTP-level covert channel protocol. The
prototype’s client and server components use HTTP headers for one-way communication while
prioritising particular features. The following sections make up the structure of this paper.
In Section 2, we provide an overview of the HTTP protocol and some technical elements of
network steganography, including nomenclature and taxonomy. Section 3 presents the most
interesting works on HTTP steganography. Section 4 describes our prototype idea. We explore
our findings and possible future directions for this study in Section 5.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>This section presents an overview of network steganography and the HTTP protocol.</p>
      <sec id="sec-2-1">
        <title>2.1. Network Steganography</title>
        <p>
          The fundamental concept of network steganography is to hide messages or data within other
seemingly innocuous data, ensuring that the act of concealment does not raise suspicions.
Unlike traditional forms of steganography, which often target images or audio files, network
steganography centres on manipulating data packets, frames, or communication protocols
within a computer network. Krzysztof Szczypiorski originally presented the idea of network
steganography in 2003 [
          <xref ref-type="bibr" rid="ref10">10</xref>
          ]. Compared to other well-known methods like picture steganography,
this new kind of steganography had received little attention before its presentation. This
modern branch has seen tremendous growth regarding communication concealment in the last
few decades, bringing many innovative network steganography techniques to the scientific
community.
        </p>
        <p><sup>1</sup> https://patterns.ztt.hs-worms.de</p>
        <p><sup>2</sup> https://www.rfc-editor.org/rfc/rfc1256</p>
        <p>
          Researchers use terms such as “information hiding” and “covert channel” to refer to the
same technique, which involves concealing information in network protocols [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. Differences
between Lampson’s [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] initial definition of a covert channel and a later definition by the US
Department of Defense (US DoD) [
          <xref ref-type="bibr" rid="ref13">13</xref>
          ] have partly contributed to this. This paper refers to a
covert channel as a hidden or secret channel intended to facilitate discreet data transmission
between two peers by covertly exchanging information within a network protocol. An
overt channel, on the other hand, is a recognised channel where a sender and a receiver can legitimately
communicate information. When discussing the sender and receiver, it is important to make
clear distinctions. Specifically, an Overt Sender (OS) is the entity that transmits
data through a legitimate channel, while an Overt Receiver (OR) is the entity that receives
the data. In contrast, a Secret Sender (SS) is the entity sending data in a hidden
channel, while the Secret Receiver (SR) is the entity receiving it. It is important to note that SS
and SR may not always coincide with OS and OR, in which case the latter may be unaware that third parties
are using their communication for other operations. Illegitimate communication in a covert
channel involves two processes: embedding and extraction. The embedding process allows the
sender to conceal secret data within legitimate communication, while the extraction process
allows the receiver to retrieve such data.
        </p>
        <p>
          Traditionally, covert channels were categorised into Covert Storage Channels (CSC) and
Covert Timing Channels (CTC), although there is no fundamental distinction between them [
          <xref ref-type="bibr" rid="ref13 ref14">13,
14</xref>
          ]. Storage channels involve the sender directly or indirectly writing object values and
the receiver directly or indirectly reading them. Timing channels, on the other hand,
entail the sender signalling information by modulating resource usage (e.g., CPU usage) over
time, allowing the receiver to observe and decode the transmitted data. The methodologies
employed in the construction of covert channels are numerous and diverse. However, they may
be broadly classified into two categories: those that alter the bits of packets, thereby storing
information directly in the traffic, and those that modify the timing or behaviour of the flow,
allowing the receiver to decode covert data by observing and interpreting the traffic. Recently,
a third category, referred to as hybrid channels, has been introduced alongside storage and
timing channels [15]. These techniques combine storage and timing
methodologies.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. The HTTP Protocol Header Fields</title>
        <p>Hypertext Transfer Protocol (HTTP) is a client-server protocol used to fetch resources
such as HTML documents. It is the foundation of web data exchange: a complete document is reconstructed from
sub-documents such as text, images, videos, and scripts [16]. HTTP requests are composed of
headers<sup>3</sup> and a body. The headers convey essential information for processing the data in the body.</p>
        <sec id="sec-2-2-1">
          <title><sup>3</sup> HTTP headers: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers</title>
          <p>Clients include fields in the requests to inform the server about their handling capabilities
and preferences for receiving requested resources. For example, clients may specify their ability
to process compressed resources, preferred language, or acceptance of older resource versions.
However, it is essential to note that although clients express these preferences, servers may
disregard them or respond based on their preferences if they cannot comply.</p>
          <p>HTTP request headers (Figure 1) adhere to a consistent structure, comprising a
case-insensitive string followed by a colon (‘:’) and a value. The entire header, including the value, is
on a single line, which may be lengthy. Requests may feature diverse headers categorised into
groups [18]:
• General headers may be used in both request and response
messages but are unrelated to the content.
• Request headers, such as User-Agent or Accept, further specify or provide context to the
request. Examples include Accept-Language for language preference and Referer for
contextual information. Some, like If-None-Match, make the request conditional.
• Representation headers, including Content-Type, describe the original format of the
message data and any applied encoding. These headers are only present if the message
includes a body.</p>
          <p>Requests and responses feature standard and custom fields, encompassing proxy settings,
security configurations, and server-set parameters. Standard request fields communicate the
character sets, encodings, transformations, and languages the client can handle. They also provide
details about the request, such as date, user agent, or authentication-related data. Meanwhile,
response headers convey data specifications like length, type, encoding, or hash, with additional
fields expediting resource processing. Notably, the Set-Cookie header in responses communicates
cookies the client should set for future interactions.</p>
          <p>Non-standard headers cater to more resource-specific details, such as X-Content-Duration,
indicating the duration of audio or video content in seconds. Given the likelihood of requests
passing through various systems, including proxies, before reaching the server, these
intermediaries may modify or control parameters to optimise delivery. We now describe each
HTTP header field employed in our model:
• Accept: This field contains the MIME types accepted for the response.
• Accept-Encoding: This field contains the encoding formats accepted for
the response, which can be a single value or a list of values.
• Accept-Language: This field allows clients to indicate the language(s) in which they want to receive
the requested resource. One or more values can be specified, along with the
“;q=” parameter, which expresses a preference among several options.
• Accept-Datetime: This field contains the version date of the requested resource. Dates
are written using the standard format “&lt;day-name&gt;, &lt;day&gt; &lt;month&gt; &lt;year&gt;
&lt;hour&gt;:&lt;minute&gt;:&lt;second&gt; &lt;time-zone&gt;”.
• From: This field can contain the contact information of the person who submitted
the request, which is useful if the server needs to resolve any issues. The standard
requires this contact to be a traditional email address.
• If-Match: This field determines whether a requested resource has been altered
when using the POST method. It carries an identifier or a list of identifiers, and the request is not
handled unless one of them matches the one saved on the server. The most popular method for
generating resource identifiers is using a hash, such as SHA-256.
• Range: This field is included in the request only if the If-Match field is also present,
because, in most circumstances, the two are used together to request (or modify) a
specified portion of a resource. This field provides the two hypothetical byte ranges.
• TE: This standard field in GET requests allows clients to specify the transfer encodings
they accept. In HTTP protocol versions 2 and 3, this field is only permitted when set to
trailers<sup>4</sup>.
• User-Agent: The User-Agent value is necessary for communication and indicates the
client version making the request. The server uses this information to determine which
software it is communicating with. It helps to decide which data to handle and apply
further optimisations on top of what is supplied in the other fields.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. Related Work</title>
      <p>Application layer protocols, at the top of the ISO/OSI stack, have also been used
to build several hidden channels. At this level, protocols can be client-server
or peer-to-peer, where users share information collaboratively. The primary application-level
protocol for information transmission on the Web is HTTP. Although a more secure TLS-based
version (HTTPS) is available, almost all organisations still permit Internet surfing over HTTP.
Dyatlov et al. [19] presented storage channels that exploit the HTTP request/response header
and/or body. The number of allowed headers changes depending on the web server version,
making an accurate performance evaluation of these strategies impossible.</p>
      <p>One way to secretly transmit commands and their output over HTTP is the Reverse
WWW Shell tool [20]. Bowyer [21], on the other hand, encodes messages in URL parameters
or after GET requests and uses these hidden channels to connect with Trojans hidden
behind firewalls.</p>
      <sec id="sec-3-1">
        <title><sup>4</sup> https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/TE</title>
        <p>To build an anonymous overlay network, Bauer [22] suggests using common
online user communications, such as headers, cookies, redirects, HTML components, and
active content. The majority of these methods, along with the currently available HTTP covert
channels analyses, are documented in the work by Brown et al. [23].</p>
        <p>Heilman et al. [24] present a covert channel that achieves stealthy behaviour by leveraging the
base Linux shell and command language while relying minimally on system resources, hiding data in
the User-Agent string of the HTTP request header. The channel’s use of HTTP
enables it to blend in with network traffic and propagate over wide-area networks, possibly
expanding its reach and making it more accessible to Bash shell users.</p>
        <p>Kwecka [25] proposes a method to embed covert data into HTTP headers by leveraging
the protocol’s treatment of various amounts of whitespace as a single character. For instance,
tabulation can represent 1, while a standard space can represent 0. Additionally, the varying
capitalisation of letters can be used for secret data transfer.</p>
        <p>Ji et al. [26] introduced a technique based on the length of HTTP packets, achieving a recorded
performance of 50 bytes, which includes 20 bytes for the TCP Header, 20 bytes for the IP Header,
and 18 bytes for the Ethernet Header.</p>
        <p>Alman [27] elucidates how a connection can be established through a proxy server by
exploiting a vulnerability in the CONNECT method. Van Horenbeeck [28] developed the
Wondjina tool, enabling a client to validate its cached copy using HTTP Entity tags. Similar
concepts are applied in [29], incorporating LSB approaches on the Date and Last-Modified fields.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. A Prototype</title>
      <p>Notably, real-world implementations have been described primarily for the IP, TCP, UDP,
ICMP, DNS, and HTTP protocols. The prominence of the first three protocols is attributed to
the fact that they carry the substantial traffic generated at the upper layers of the ISO/OSI stack, enabling
broader applicability to numerous data packets. However, the root permissions required to
manipulate headers at these lower layers limit the potential scope of application.</p>
      <p>ICMP and DNS emerge as commonly used protocols, offering easily accessible payloads
and the ability to initiate conversations without requiring root privileges. Nevertheless,
their widespread use has led to the proliferation of monitoring techniques leveraging artificial
intelligence, which poses a challenge for covert channels built on them.</p>
      <p>The HTTP protocol, operating at the application level and associated with web browsing,
dominates internet traffic within contemporary computer networks. Monitoring systems that
comprehensively analyse HTTP packets are notably scarce; most focus instead on
communication characteristics. Anomalies, such as an imbalance in data transmission between
client and server during a single connection, may indicate potential covert communication.
Although HTTP communications generally do not need root access, most
implementations deviate from pure network steganography, employing tunnelling methods
within ostensibly innocuous HTTPS traffic.</p>
      <p>Our objective was to develop a protocol designed to operate seamlessly on
most computers and evade detection by large-scale systems. The choice of the application level
was deliberate, considering that no root privileges are needed and that the HTTP
protocol is in widespread use. The decision to focus on a unidirectional channel, akin to data exfiltration,
aimed to enhance the methodology’s versatility across scenarios. Nevertheless, the approach
can be easily extended to bidirectional communication, thus
broadening its potential applications. Data is embedded in HTTPS packets
through a unified solution that uses different header fields of an HTTP request.</p>
      <p>After examining the several proposals in the literature, we concluded that the email-based
approach of [30] would be the most helpful for our investigation. Its authors initially proposed embedding
information in the “Message-ID” and “Content-Type” fields. We modified the method to represent
the data according to the characteristics of the different fields and extended it
to HTTP request headers. Before discussing each component’s operation in depth, we specify
the modulation of HTTP headers for data exfiltration.</p>
      <sec id="sec-4-1">
        <title>4.1. HTTP Request Modulation</title>
        <p>There are numerous available fields for creating requests, including non-standard ones. Some
fields imply the presence or absence of others. We also consider the maximum number of bits
that could be injected into a single field and avoid fields that might be modified by intermediaries. To construct
our requests, we follow the format of requesting a portion of a previously retrieved resource using
the GET method. Section 2 lists the HTTP header fields we include in a GET request.</p>
        <p>This section describes the technique used to represent bit strings in each field. Powers of 2 are
used to facilitate the representation of all possible binary strings of a given length. The technique proposed
in this paper falls under the PS11 category, as it involves Value Modulation and preserves the
structure [31]. We describe a potential modulation of the GET request’s HTTP header fields and
specify the appropriate course of action for each. Table 2 shows the modulation of the Accept,
Accept-Encoding and Accept-Language fields, while Table 3 displays the modulation of the
Accept-Datetime, From, Range and TE fields.</p>
        <p>Accept. The Accept field specifies the MIME types accepted for the response. We focus on
the most common MIME types, such as text/plain and text/css. We consider
16 possible values, which allows hiding a sequence of up to 4 bits within this field.</p>
        <p>Accept-Encoding. Similarly, for the Accept-Encoding field, which lists the encoding
types accepted for the response, we limit the options to the eight most common types. This
allows hiding 3-bit sequences.</p>
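        <p>As an illustration of the dictionary-based modulation of Accept and Accept-Encoding, a minimal Python sketch follows. The value lists ACCEPT_VALUES and ENCODING_VALUES and the helper build_tables are hypothetical: the text fixes only the table sizes (16 and 8), not their contents or ordering, so the prototype's actual tables may differ. The types application/zip and application/rtf are left out on purpose, since Section 4.2 uses them as start and end markers.</p>
        <preformat>
```python
# Hypothetical value tables: the paper fixes only their sizes (16 and 8).
ACCEPT_VALUES = [
    "text/plain", "text/css", "text/html", "text/csv",
    "text/javascript", "text/xml", "image/png", "image/jpeg",
    "image/gif", "image/webp", "video/mp4", "video/webm",
    "audio/mpeg", "application/json", "application/pdf", "application/xml",
]
ENCODING_VALUES = ["gzip", "deflate", "compress", "br",
                   "identity", "zstd", "*", "gzip, deflate"]


def build_tables(values):
    """Return (encode, decode) dicts mapping fixed-width bit strings to values."""
    width = (len(values) - 1).bit_length()
    if 2 ** width != len(values):
        raise ValueError("table size must be a power of two")
    encode = {format(i, "0{}b".format(width)): v for i, v in enumerate(values)}
    decode = {v: bits for bits, v in encode.items()}
    return encode, decode


accept_enc, accept_dec = build_tables(ACCEPT_VALUES)
encoding_enc, encoding_dec = build_tables(ENCODING_VALUES)

# Sender side: hide the 7 bits 0110101 in two header values.
headers = {"Accept": accept_enc["0110"], "Accept-Encoding": encoding_enc["101"]}

# Receiver side: invert the lookup to recover the bits.
recovered = accept_dec[headers["Accept"]] + encoding_dec[headers["Accept-Encoding"]]
print(headers, recovered)
```
        </preformat>
        <p>The receiver only needs the inverse tables, which is why both sides must agree on the same value lists and ordering in advance.</p>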
        <p>Accept-Language. The client uses the Accept-Language field to indicate the preferred
language(s) for receiving the requested resource. In our case, we considered three languages:
Italian, American English, and British English. Italian was chosen because it is the language
of the authors of this paper, while American and British English were selected due to their
widespread usage. We also considered the preference value ‘q’ and the presence or absence
of the considered languages to expand the set of representable data. The
Accept-Language field can hide a maximum of three bits but, unlike the other fields, it can also be used for bit
sequences of shorter lengths.</p>
        <p>Accept-Datetime. This field denotes the version date of the requested resource,
using the standard date format “&lt;day-name&gt;, &lt;day&gt; &lt;month&gt; &lt;year&gt; &lt;hour&gt;:&lt;minute&gt;:&lt;second&gt; &lt;time-zone&gt;”.
We exploit all components of this date format except the first and the last.
The first is omitted because it depends on the others, and the last is disregarded
because requests from the same machine cannot differ in time zone.</p>
        <p>As with the other fields, the approach considers, for each component, the largest power of two
not exceeding its number of valid values, and assigns binary strings to that many elements. For instance, taking
February, the shortest month with 28 days, as the reference, the largest power of two less than 28 is 16 (2<sup>4</sup>). The same
logic modulates the day, hours, minutes, and seconds. The first eight (2<sup>3</sup>) months
are considered, as are the years from 1991 to 2022, resulting in 32 (2<sup>5</sup>) valid values and avoiding dates
before the internet’s inception. The day of the week is determined once the date is generated,
and the Italian CET time zone can be specified. However, these two values are disregarded
during the decoding process. In summary, it is feasible to conceal 26 bits, considering string
lengths of 4 for the day and hour, 3 for the month, and 5 for the year, minutes, and seconds.</p>
        <p>From. This field includes the contact information of the request submitter, providing valuable
details for server issue resolution. The standard mandates that this contact information adhere
to a traditional email address format.</p>
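        <p>The Accept-Datetime modulation described above can be sketched as follows. The bit order (day, month, year, hour, minute, second), the GMT zone label and the function names are assumptions of this sketch, not details confirmed by the text; only the bit widths (4, 3, 5, 4, 5, 5) and the value ranges come from the description.</p>
        <preformat>
```python
from datetime import datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug"]
# (component, bit width, offset added to the decoded integer).
# The order of the components inside the 26 bits is an assumption.
LAYOUT = [("day", 4, 1), ("month", 3, 1), ("year", 5, 1991),
          ("hour", 4, 0), ("minute", 5, 0), ("second", 5, 0)]


def encode_datetime(bits, zone="GMT"):
    """Map a 26-bit string to an Accept-Datetime value."""
    if len(bits) != 26 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 26 binary digits")
    parts, pos = {}, 0
    for name, width, offset in LAYOUT:
        parts[name] = int(bits[pos:pos + width], 2) + offset
        pos += width
    stamp = datetime(parts["year"], parts["month"], parts["day"],
                     parts["hour"], parts["minute"], parts["second"])
    # The day name is derived from the date, so it carries no secret bits.
    return "{}, {:02d} {} {} {:02d}:{:02d}:{:02d} {}".format(
        DAY_NAMES[stamp.weekday()], stamp.day, MONTHS[stamp.month - 1],
        stamp.year, stamp.hour, stamp.minute, stamp.second, zone)


def decode_datetime(value):
    """Recover the 26 bits; day name and time zone are ignored."""
    _, day, month, year, clock, _ = value.split()
    hour, minute, second = clock.split(":")
    fields = {"day": int(day), "month": MONTHS.index(month) + 1,
              "year": int(year), "hour": int(hour),
              "minute": int(minute), "second": int(second)}
    return "".join(format(fields[name] - offset, "0{}b".format(width))
                   for name, width, offset in LAYOUT)


secret = "0101" + "010" + "11000" + "0111" + "11100" + "00000"
print(encode_datetime(secret))  # Fri, 06 Mar 2015 07:28:00 GMT
```
        </preformat>
        <p>Because days stop at 16 and months at August, every 26-bit string maps to a valid calendar date, so no range check on the date itself is needed.</p>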
        <p>To generate plausible addresses, we acquire databases containing the most popular person names in the US over
recent years, merge them, and generate a dictionary comprising 32,768 names. Each entry in
this dictionary corresponds to a 15-bit binary string (2<sup>15</sup> = 32,768). Additionally, we identify eight
of the most popular email domains and assign each a 3-bit binary string.</p>
        <p>If-Match. The If-Match field is employed to ascertain whether a requested resource has
been altered when using the POST method. This field includes an identifier or a
list of identifiers, and the request is not processed unless one of them matches the identifier stored on
the server. Typically, resource identifiers are produced with a hash algorithm such as SHA-256. We
assume that SHA-256 was used for digest computation and that the request includes two hashes.
We transform each 256-bit block of the secret message into hexadecimal digits and enter them in the field.
Given that SHA-256 yields a 256-bit digest, we can hide 512 bits in this field by entering two
such values. Notably, this field and its accompanying Range field (described in the next paragraph)
are not used if fewer than 256 bits of data must be sent.</p>
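        <p>A minimal sketch of the If-Match modulation, assuming the two 256-bit blocks are written as quoted, comma-separated 64-digit hexadecimal entity tags (the usual syntax for a list of entity tags). The function names are hypothetical, and the prototype may format the field differently.</p>
        <preformat>
```python
def encode_if_match(bits):
    """Pack 512 secret bits into two quoted 64-digit hex strings."""
    if len(bits) != 512 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 512 binary digits")
    tags = []
    for start in (0, 256):
        # Each 256-bit block is rendered like a SHA-256 digest.
        digest_like = format(int(bits[start:start + 256], 2), "064x")
        tags.append('"{}"'.format(digest_like))
    return ", ".join(tags)


def decode_if_match(value):
    """Recover the 512 bits from the two entity tags."""
    tags = [tag.strip().strip('"') for tag in value.split(",")]
    return "".join(format(int(tag, 16), "0256b") for tag in tags)


secret = "01" * 256
header_value = encode_if_match(secret)
print(header_value)
```
        </preformat>
        <p>Fixed-width formatting ("064x" and "0256b") keeps leading zeros, which is what makes the round trip lossless.</p>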
        <p>Range. As previously stated, the Range field appears in the request only when the If-Match
field is included. These fields are commonly used together to request or edit a specified portion of
a resource. The Range field provides two hypothetical byte ranges. Since
a resource typically consists of many bytes, we consider it reasonable to use integers between 0 and
1023 (2<sup>10</sup> values) to define two intervals. Each number is assigned a 10-bit binary string,
so the four numbers allow hiding 40 bits inside this field. The first value is extracted from the first ten
bits, followed by the second value from the next ten bits. The first number is then added
to the second to avoid an unreasonable interval in which the end is smaller than the start. The
same approach is used to compute the second interval.</p>
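        <p>The interval construction can be sketched as below. The sketch assumes the standard "bytes=start-end, start-end" form of the Range value, whereas the example in the request-structure table writes "bytes:"; the function names are hypothetical.</p>
        <preformat>
```python
def encode_range(bits):
    """Map 40 secret bits to two byte intervals whose end is never below the start."""
    if len(bits) != 40 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 40 binary digits")
    a, b, c, d = (int(bits[i:i + 10], 2) for i in range(0, 40, 10))
    # The start is added to the second number of each pair to obtain the end.
    return "bytes={}-{}, {}-{}".format(a, a + b, c, c + d)


def decode_range(value):
    """Recover the 40 bits by undoing the addition for each interval."""
    spec = value.split("=", 1)[1]
    out = []
    for interval in spec.split(","):
        start, end = (int(n) for n in interval.strip().split("-"))
        out.append(format(start, "010b"))
        out.append(format(end - start, "010b"))
    return "".join(out)


secret = (format(5, "010b") + format(10, "010b")
          + format(1023, "010b") + format(0, "010b"))
print(encode_range(secret))  # bytes=5-15, 1023-1023
```
        </preformat>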
        <p>TE. This standard field is present in GET requests and allows the client to declare which
transfer encodings it is ready to accept. Given its fundamental role in requests and its
consistent presence, we deemed it suitable for concealing information. Given
the restricted choices, the four most commonly used values were assigned to 2-bit binary strings.</p>
        <p>User-Agent. The User-Agent field is obligatory in communications and serves to identify the
client version initiating the request. This information is important for the server to discern
the program with which it interacts, enabling it to select how to handle data and to apply
optimisations beyond those offered in other fields.</p>
        <p>In this context, modulation is eschewed, and instead, the identifying string of a Mozilla
version is entered into each request. This choice is based on Mozilla being the most prevalent
browser across various systems. Detecting requests from the same computer with multiple
User-Agents in a brief period would likely raise concerns.</p>
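        <p>As a worked check of the per-field figures given in this section, the short Python snippet below sums the stated capacities. It is plain arithmetic over the numbers in the text (with From counted as 15 + 3 bits and User-Agent as 0); it is not a total reported by the authors, and the dictionary name is hypothetical.</p>
        <preformat>
```python
# Bits hidden per field, as stated in the text of this section.
CAPACITY_BITS = {
    "Accept": 4,
    "Accept-Encoding": 3,
    "Accept-Language": 3,      # stated as a maximum
    "Accept-Datetime": 26,
    "From": 15 + 3,            # name dictionary + email domain
    "If-Match": 2 * 256,       # two SHA-256-sized values
    "Range": 4 * 10,           # four 10-bit integers
    "TE": 2,
    "User-Agent": 0,           # not modulated
}
# If-Match and Range are omitted when fewer than 256 bits must be sent.
OPTIONAL = {"If-Match", "Range"}

full = sum(CAPACITY_BITS.values())
small = sum(bits for name, bits in CAPACITY_BITS.items() if name not in OPTIONAL)
print(full, small)  # 608 56
```
        </preformat>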
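        <table-wrap>
          <caption>
            <p>Request structure</p>
          </caption>
          <table>
            <thead>
              <tr><th>Field</th><th>Example</th></tr>
            </thead>
            <tbody>
              <tr><td>Accept:</td><td>text/..., image/..., video/..., application/...</td></tr>
              <tr><td>Accept-Encoding:</td><td>gzip, deflate, compress, br, identity, *</td></tr>
              <tr><td>Accept-Language:</td><td>it-IT,it;q=0.9,en-US;q=0.8,en;q=0.7</td></tr>
              <tr><td>Accept-Datetime:</td><td>Wed, 21 Oct 2015 07:28:00 GMT</td></tr>
              <tr><td>From:</td><td>aristea@libero.it</td></tr>
              <tr><td>If-Match:</td><td>(x2) 256-bit hash</td></tr>
              <tr><td>Range:</td><td>bytes:1207-2367</td></tr>
              <tr><td>TE:</td><td>compress, deflate, gzip, trailers</td></tr>
              <tr><td>User-Agent:</td><td>Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0</td></tr>
              <tr><td>Total bits sent</td><td></td></tr>
            </tbody>
          </table>
        </table-wrap>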
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Implementation Details</title>
        <p>The proposed prototype has two primary components: a sender and a receiver. The sender takes as
input arguments the file or directory to hide inside the headers and the URL to send
requests to. The receiver is a web server listening on a specific port.</p>
        <p>The client executable, referred to as the sender, embeds a file or directory steganographically
in HTTP requests in order to exfiltrate it. Its workflow is as follows:
1. Preliminary operations: the sender prints the tool banner and generates the dictionaries
for the Range and From fields.
2. Parameters: the sender takes the URL and the path to the file or directory to be sent as
input parameters.
3. Beginning of communication: the communication begins with a request whose Accept field
is set to application/zip, with the remaining fields either left blank or, if
necessary, set to default values.
4. Sending phase: the sender reads the path, determines whether it points to a file or a
directory, and creates and sends the requests. The file is opened in binary read mode, and an
estimate of the number of requests required is printed to the terminal. The sender reads the
file in blocks of bits and builds requests from the resulting binary string. Each field is assigned
a dictionary whose keys are binary strings and whose values are the content
to be written into the request. When the length of the data read does not match the number
of bits a single request can carry, new requests are created cyclically until the entire block
has been sent. The number of requests sent is updated at each step, providing an estimate of
the completion percentage.
5. Ending phase: the sender informs the server that the exfiltration is over by sending a
request with the Accept field set to application/rtf. The total number of requests used,
kept in a variable that is updated each time a request is sent, is displayed on the screen.</p>
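        <p>The dictionary-based encoding in the sending phase can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the prototype's code: the value tables in FIELD_VALUES, the helper names build_encoder, to_bits and encode_block, and the choice of padding the last request with zero bits are all hypothetical. It prefixes each block with one length byte, in line with the receiver's use of the first byte as the block length, so a block is limited to 255 bytes here.</p>
        <preformat><![CDATA[
```python
from itertools import product

# Hypothetical value tables (not the prototype's). Each field carries
# log2(len(values)) bits, so every table length must be a power of two.
FIELD_VALUES = {
    "Accept-Encoding": ["gzip", "deflate", "br", "identity"],   # 2 bits
    "Accept-Language": ["it-IT", "en-US", "en-GB", "fr-FR",
                        "de-DE", "es-ES", "pt-PT", "nl-NL"],    # 3 bits
    "TE": ["trailers", "deflate"],                              # 1 bit
}


def build_encoder(values):
    """Map each fixed-width bit string (key) to one header value."""
    width = (len(values) - 1).bit_length()
    assert 2 ** width == len(values), "table size must be a power of two"
    keys = ("".join(bits) for bits in product("01", repeat=width))
    return width, dict(zip(keys, values))


ENCODERS = {name: build_encoder(vals) for name, vals in FIELD_VALUES.items()}
BITS_PER_REQUEST = sum(width for width, _ in ENCODERS.values())


def to_bits(data):
    """Binary string for a byte block, 8 bits per byte."""
    return "".join(f"{byte:08b}" for byte in data)


def encode_block(block):
    """Yield one header dict per request until the whole block is covered."""
    bits = to_bits(bytes([len(block)]) + block)  # first byte = block length
    # Assumption: pad with zeros so the bits split evenly across requests.
    bits += "0" * (-len(bits) % BITS_PER_REQUEST)
    for start in range(0, len(bits), BITS_PER_REQUEST):
        chunk, headers = bits[start:start + BITS_PER_REQUEST], {}
        for name, (width, table) in ENCODERS.items():
            headers[name], chunk = table[chunk[:width]], chunk[width:]
        yield headers


requests_needed = list(encode_block(b"hi"))
```
]]></preformat>
        <p>With these toy tables each request carries only six bits, so the two-byte block plus its length byte takes four requests. A real sender would additionally frame the exchange with the application/zip and application/rtf Accept values described above and pass each header dictionary to an HTTP client.</p>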
        <p>The receiver, in turn, is a web server listening on a designated port; its workflow mirrors
that of the sender.</p>
        <p>1. Preliminary operations: the server's execution starts from a main function, which prints
the banner to the terminal and creates the dictionaries related to person names and ranges.
Unlike in the dictionaries the client uses, the key is the information found in the
request and the associated value is the binary string. The port number and the name of the
file to be generated can be passed as arguments; otherwise, default values, namely port 8000
and the name output_file, are assigned automatically.
2. Extraction phase: upon receiving each GET request, the server first stores the headers in
a variable and then chooses the HTML page for the response. The receiver aims to simulate
a real web server, responding with HTML pages of varying sizes whose actual content is
irrelevant. It answers each request with a random web page and closes the communication
to avoid anomalies. After responding, the server analyses the previously saved headers:
it performs the reverse of the client's process, reading the request data and using the
dictionaries to derive binary strings, which are stored sequentially. After parsing each
request, it examines the first byte of the binary string built so far, which represents the
length of the block read and sent. If the number of remaining bits exceeds this value, the
corresponding bytes are written to the destination file. Throughout this process, the
terminal is updated with the count of received requests.
3. Ending phase: once the server receives a request with the Accept field set to application/rtf,
it writes the last bytes to the file and reports that the operations are complete.</p>
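        <p>The extraction phase can be illustrated with a similar sketch. It is a hedged Python example, not the prototype's implementation: the reverse tables in DECODERS, the Receiver class and its method names are hypothetical, a bytearray stands in for the destination file, and the sketch assumes that any padding bits left after a complete block are discarded so that the next block starts on a fresh request.</p>
        <preformat><![CDATA[
```python
# Hypothetical reverse tables (header value -> bit string); assumed for
# illustration, not the prototype's actual dictionaries.
DECODERS = {
    "Accept-Encoding": {"gzip": "00", "deflate": "01", "br": "10", "identity": "11"},
    "TE": {"trailers": "0", "deflate": "1"},
}
END_MARKER = "application/rtf"  # Accept value that closes the transfer


class Receiver:
    def __init__(self):
        self.bits = ""             # bits accumulated for the current block
        self.output = bytearray()  # stands in for the destination file
        self.requests = 0          # count of received requests
        self.done = False

    def handle(self, headers):
        """Process the saved headers of one GET request."""
        self.requests += 1
        if headers.get("Accept") == END_MARKER:
            self.done = True
            return
        for name, table in DECODERS.items():
            self.bits += table[headers[name]]
        self._flush()

    def _flush(self):
        """Write the block once its length byte and payload have arrived."""
        if len(self.bits) < 8:
            return
        length = int(self.bits[:8], 2)  # first byte = block length in bytes
        needed = 8 + 8 * length
        if len(self.bits) >= needed:
            payload = self.bits[8:needed]
            self.output += bytes(int(payload[i:i + 8], 2)
                                 for i in range(0, len(payload), 8))
            # Assumption: leftover padding is dropped and the next block
            # starts on a fresh request.
            self.bits = ""


receiver = Receiver()
for encoding, te in [("gzip", "trailers"), ("gzip", "trailers"),
                     ("deflate", "trailers"), ("br", "trailers"),
                     ("gzip", "trailers"), ("br", "trailers")]:
    receiver.handle({"Accept-Encoding": encoding, "TE": te})
receiver.handle({"Accept": END_MARKER})
```
]]></preformat>
        <p>In this example the six data requests carry the bit string 000000010100000100: a length byte of 1, the byte for the letter A, and two padding bits. The seventh request is the end marker.</p>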
        <p>Notably, because the prototype operates at the bit level, it can transmit any data stream. Table 1
shows all the header fields of one HTTP GET request issued by the sender; the file being sent requires
around eight requests, of which this is the sixth. Transmitting a file of 459 bytes produced a total of 9
requests.</p>
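        <p>As a rough capacity check on the figures above, the average payload per request can be bounded in two ways. Both readings are assumptions, since the text does not say which requests carry payload: either all nine requests carry data, or the opening and closing marker requests carry none, leaving seven. The per-block length bytes are ignored.</p>
        <preformat><![CDATA[
```latex
\[
459 \times 8 = 3672 \text{ bits}, \qquad
\frac{3672}{9} = 408 \text{ bits per request}, \qquad
\frac{3672}{7} \approx 524.6 \text{ bits per request}.
\]
```
]]></preformat>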
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusions and Future Work</title>
      <p>This paper reviewed network steganography and some related approaches in the literature by
summarising the fundamental characteristics of this research field. Steganography comprises
the science and art of hiding information transfer and storage. It is not to be confused with
cryptography: while both share the ultimate goal of protecting information, steganography
attempts to hide the information itself, making it “difficult to notice”.</p>
      <p>
        Many existing works call this type of communication a covert channel, referencing a concept
first introduced by Lampson in 1973 [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ]. The latter part of this paper outlines our proposed
implementation of a covert channel. Section 4 elucidates the rationale behind critical development
decisions and provides a comprehensive account of the implementation details. Our prototype
is rooted in HTTP/HTTPS requests, where we modulate header values to embed our targeted
information steganographically.
      </p>
      <p>Although this paper does not address it explicitly, a fundamental consideration for developing the
technique further is its behaviour in the presence of HTTP proxies along the request path. In this case,
requests for web resources are routed via the proxy server rather than directly to the destination server.
After retrieving the response from the destination server, the proxy server relays it
back to the client. HTTP proxies can modify several header fields as they process
requests and responses between clients and servers. For example, a proxy might modify the
Accept-Encoding header to perform content compression. A field rewritten in this way no longer
delivers the embedded bits to the receiver.</p>
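      <p>One way to reason about proxy interference is to compare the headers the sender emitted with those the receiver observed and keep only the fields that arrived unchanged. The Python sketch below is illustrative only: the function surviving_fields and the sample header values are hypothetical, and the received headers show one assumed proxy behaviour rather than a measured one.</p>
      <preformat><![CDATA[
```python
def surviving_fields(sent, received):
    """Return the sent header names whose values arrived unchanged."""
    # Header names are case-insensitive, so compare them lower-cased.
    arrived = {name.lower(): value.strip() for name, value in received.items()}
    return [name for name, value in sent.items()
            if arrived.get(name.lower()) == value.strip()]


sent = {
    "Accept-Encoding": "gzip, deflate",
    "Accept-Language": "it-IT,it;q=0.9",
    "TE": "trailers",
}
# Hypothetical view at the receiver: the proxy rewrote Accept-Encoding,
# did not forward TE, and added its own Via field.
received = {
    "accept-encoding": "identity",
    "accept-language": "it-IT,it;q=0.9",
    "via": "1.1 proxy.example",
}
usable = surviving_fields(sent, received)
```
]]></preformat>
      <p>Under these assumed inputs only Accept-Language would remain usable as a carrier, which shows how a proxy can shrink the per-request capacity of the channel.</p>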
      <p>Given this, we think a future direction could be to adapt channels to different contexts
by assembling a portfolio of them, for example, by combining this channel with the TCP-based one
proposed in [32] by some of the authors of this work. Instead of relying solely on one steganography
method, a general parametric framework could switch between several potential alternatives based on
information gathered on the target network. While there is still no guarantee that such a channel would
evade the defender’s countermeasures, it has two benefits over a channel that employs a single
steganography approach. The first benefit is that it drives up costs for the defender, who
probably has to deploy more tools (or, at the very least, configure the existing ones
in a more fine-grained way) to secure the network against different exfiltration strategies. The second
benefit is that as the defender’s setup gets more intricate, there is a greater chance of human
error; if even one exfiltration method works, the attacker still prevails.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>The authors are members of the Gruppo Nazionale Calcolo Scientifico-Istituto Nazionale di Alta
Matematica (GNCS-INdAM). This work has been partially supported by:
• GNCS-INdAM, CUP_E53C22001930001 and CUP_E53C23001670001;
• European Union - Next Generation EU PNRR MUR PRIN - Project J53D23007220006 EPICA: “Empowering Public Interest Communication with Argumentation”;
• University of Perugia - Fondo Ricerca di Ateneo (2020, 2021, 2022) - Projects BLOCKCHAIN4FOODCHAIN, FICO, AIDMIX, “Civil Safety and Security for Society”;
• European Union - Next Generation EU NRRP-MUR - Project J97G22000170005 VITALITY: “Innovation, digitalisation and sustainability for the diffused economy in Central Italy”;
• Piano di Sviluppo e Coesione del Ministero della Salute 2014-2020 - Project I83C22001350001 LIFE: “the itaLian system Wide Frailty nEtwork” (Linea di azione 2.1 “Creazione di una rete nazionale per le malattie ad alto impatto” - Traiettoria 2 “E-Health, diagnostica avanzata, medical devices e mini invasività”).</p>
      <p>[14] U. D. National Computer Security Center, Trusted Computer System Evaluation Criteria, Technical Report, DOD 5200.28-STD, National Computer Security Center, Dec 1985. http://csrc.nist.gov/publications/history/dod85.pdf.</p>
      <p>[15] A. Ganivev, O. Mavlonov, B. Turdibekov, et al., Improving data hiding methods in network steganography based on packet header manipulation, in: 2021 International Conference on Information Science and Communications Technologies (ICISCT), IEEE, 2021, pp. 1–5.</p>
      <p>[16] MDN web docs, An overview of HTTP, Technical Report, Mozilla, 2022. https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview.</p>
      <p>[17] J. F. Kurose, K. W. Ross, Computer Networking: A Top-Down Approach, Pearson, 2020.</p>
      <p>[18] MDN web docs, HTTP Messages, Technical Report, Mozilla, 2022. https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages.</p>
      <p>[19] A. Dyatlov, S. Castro, Exploitation of data streams authorized by a network access control system for arbitrary data transfers: Tunneling and covert channels over the HTTP protocol, available at http://dl.packetstormsecurity.net/papers/protocols/covert_paper.txt (2003).</p>
      <p>[20] V. Hauser, Placing backdoors through firewalls, WindowsSecurity.com (1999).</p>
      <p>[21] L. Bowyer, Firewall bypass via protocol steganography, Network Penetration (2002).</p>
      <p>[22] M. Bauer, New covert channels in HTTP: adding unwitting web browsers to anonymity sets, in: Proceedings of the 2003 ACM workshop on Privacy in the electronic society, 2003, pp. 72–78.</p>
      <p>[23] E. Brown, B. Yuan, D. Johnson, P. Lutz, Covert channels in the HTTP network protocol: Channel characterization and detecting man-in-the-middle attacks, Journal of Information Warfare 9 (2010) 26–38.</p>
      <p>[24] S. Heilman, J. Williams, D. Johnson, Covert channel in HTTP user-agents, in: 11th Annual Symposium on Information Assurance, ASIA’16, 2016, pp. 68–73.</p>
      <p>[25] Z. Kwecka, Application layer covert channel analysis and detection, Undergraduate Project Dissertation, Napier University (2006).</p>
      <p>[26] L. Ji, W. Jiang, B. Dai, X. Niu, A novel covert channel based on length of messages, in: 2009 International Symposium on Information Engineering and Electronic Commerce, IEEE, 2009, pp. 551–554.</p>
      <p>[27] D. Alman, HTTP tunnels through proxies, SANS Institute (2003).</p>
      <p>[28] M. Van Horenbeeck, Deception on the network: thinking differently about covert channels, Australian Information Warfare and Security Conference (2006).</p>
      <p>[29] R. Duncan, J. E. Martina, Steganographic message broadcasting using web protocols, in: Proceedings of: Simposio Brasileiro de Seguranca (SBSeg 2010), Fortaleza, Brasil, 2010, pp. 61–70.</p>
      <p>[30] A. Castiglione, A. d. Santis, U. Fiore, F. Palmieri, E-mail-based covert channels for asynchronous message steganography, in: 2011 Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, 2011, pp. 503–508. doi:10.1109/IMIS.2011.133.</p>
      <p>[31] S. Wendzel, L. Caviglione, W. Mazurczyk, A. Mileva, J. Dittmann, C. Krätzer, K. Lamshöft, C. Vielhauer, L. Hartmann, J. Keller, et al., A generic taxonomy for steganography methods, TechRxiv (2022). URL: http://dx.doi.org/10.36227/techrxiv.20215373.v2.</p>
      <p>[32] S. Bistarelli, A. Imparato, F. Santini, A TCP-based covert channel with integrity check and retransmission, in: 20th Annual International Conference on Privacy, Security and Trust,</p>
    </sec>
    <sec id="sec-7">
      <title>A. Appendix</title>
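      <p>[Table: sample dictionary entries mapping header values to bit strings, with columns Field, Value and Bit; the row layout was not recoverable. Fields named: From, Range. Values shown: dates Thu, 01 Jan 1991 00:00:00 CET · · · Wed, 05 Feb 2003 12:24:13 CET · · · Thu, 16 Aug 2022 16:32:32 CET; e-mail domains gmail.com, outlook.com, yahoo.com, proton.me, virgilio.it, libero.it, email.it, mail.com. Bit strings shown: 00000000000000000000000000 · · · 01000010110011001100001101.]</p>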
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D.</given-names>
            <surname>Kahn</surname>
          </string-name>
          ,
          <article-title>The history of steganography</article-title>
          , in: R.
          <string-name>
            <surname>Anderson</surname>
          </string-name>
          (Ed.),
          <source>Information Hiding</source>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>1996</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>5</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>O. I.</given-names>
            <surname>Abdullaziz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. T.</given-names>
            <surname>Goh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.-C.</given-names>
            <surname>Ling</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wong</surname>
          </string-name>
          ,
          <article-title>Network packet payload parity based steganography</article-title>
          ,
          <source>in: 2013 IEEE Conference on Sustainable Utilization and Development in Engineering and Technology (CSUDET)</source>
          , IEEE,
          <year>2013</year>
          , pp.
          <fpage>56</fpage>
          -
          <lpage>59</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Lubacz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Mazurczyk</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Szczypiorski</surname>
          </string-name>
          ,
          <article-title>Principles and overview of network steganography</article-title>
          ,
          <source>IEEE Communications Magazine</source>
          <volume>52</volume>
          (
          <year>2014</year>
          )
          <fpage>225</fpage>
          -
          <lpage>229</lpage>
          . doi:10.1109/MCOM.2014.6815916.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bistarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ceccarelli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Luchini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Mercanti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Santini</surname>
          </string-name>
          ,
          <article-title>A survey of steganography tools at layers 2-4 and HTTP</article-title>
          ,
          <source>in: Proceedings of the 18th International Conference on Availability, Reliability and Security</source>
          , ARES 2023, Benevento, Italy, 29 August 2023 - 1 September 2023, ACM,
          <year>2023</year>
          , pp.
          <fpage>81:1</fpage>
          -
          <lpage>81:9</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Shamir</surname>
          </string-name>
          ,
          <article-title>IP = PSPACE</article-title>
          ,
          <source>J. ACM</source>
          <volume>39</volume>
          (
          <year>1992</year>
          )
          <fpage>869</fpage>
          -
          <lpage>877</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Chase</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Gallatin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. G.</given-names>
            <surname>Yocum</surname>
          </string-name>
          ,
          <article-title>End system optimizations for high-speed TCP</article-title>
          ,
          <source>IEEE Communications Magazine</source>
          <volume>39</volume>
          (
          <year>2001</year>
          )
          <fpage>68</fpage>
          -
          <lpage>74</lpage>
          . doi:10.1109/35.917506.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>L.-A.</given-names>
            <surname>Larzon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Degermark</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pink</surname>
          </string-name>
          ,
          <article-title>UDP lite for real time multimedia applications</article-title>
          , Hewlett-Packard Laboratories
          ,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>J.</given-names>
            <surname>Jung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Sit</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Balakrishnan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Morris</surname>
          </string-name>
          ,
          <article-title>DNS performance and the effectiveness of caching</article-title>
          ,
          <source>in: Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement</source>
          , IMW '01, Association for Computing Machinery, New York, NY, USA,
          <year>2001</year>
          , p.
          <fpage>153</fpage>
          -
          <lpage>167</lpage>
          . URL: https://doi.org/10.1145/505202.505223.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>S.</given-names>
            <surname>Lederer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Müller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Timmerer</surname>
          </string-name>
          ,
          <article-title>Dynamic adaptive streaming over HTTP dataset</article-title>
          ,
          <source>in: Proceedings of the 3rd Multimedia Systems Conference, MMSys '12</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2012</year>
          , p.
          <fpage>89</fpage>
          -
          <lpage>94</lpage>
          . URL: https://doi.org/10.1145/2155555.2155570.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Szczypiorski</surname>
          </string-name>
          ,
          <article-title>Steganography in TCP/IP networks. State of the art and a proposal of a new system - HICCUPS</article-title>
          , Warsaw University of Technology, Poland Institute of Telecommunications, Warsaw, Poland (
          <year>2003</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zander</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. J.</given-names>
            <surname>Armitage</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Branch</surname>
          </string-name>
          ,
          <article-title>A survey of covert channels and countermeasures in computer network protocols</article-title>
          ,
          <source>IEEE Commun. Surv. Tutorials</source>
          <volume>9</volume>
          (
          <year>2007</year>
          )
          <fpage>44</fpage>
          -
          <lpage>57</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B. W.</given-names>
            <surname>Lampson</surname>
          </string-name>
          ,
          <article-title>A note on the confinement problem</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>16</volume>
          (
          <year>1973</year>
          )
          <fpage>613</fpage>
          -
          <lpage>615</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          C. S. C. (US),
          <article-title>Computer Security Requirements: Guidance for Applying the Department of Defense Trusted Computer System Evaluation Criteria in Specific Environments</article-title>
          , DoD Computer Security Center,
          <year>1985</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          U. D. National Computer Security Center,
          <source>Trusted Computer System Evaluation Criteria</source>
          , Technical Report, DOD 5200.28-STD, National Computer Security Center, Dec
          <year>1985</year>
          . http://csrc.nist.gov/publications/history/dod85.pdf.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>