A Preliminary Study on the Creation of a Covert
                                Channel with HTTP Headers
                                Stefano Bistarelli1 , Michele Ceccarelli3 , Chiara Luchini1,2 , Ivan Mercanti1 and
                                Francesco Santini1
                                1
                                  Dipartimento di Matematica e Informatica, Università degli Studi di Perugia, Via Vanvitelli 1, 06123 Perugia (PG), Italy
                                2
                                  Dipartimento di Matematica e Informatica “Ulisse Dini”, Università degli Studi di Firenze, Viale Giovanni Battista
                                Morgagni, 67/a, 50134 Firenze (FI), Italy
                                3
                                  Colacem S.p.A., Gubbio (PG), Italy


                                            Abstract
                                            Steganography conceals confidential information within seemingly innocuous data and has evolved with
                                            technological advancements. In network steganography, data is hidden in packets exchanged at different
                                            levels (e.g., Ethernet, IP, TCP, etc.). This paper considers the HTTP protocol for setting up a covert
                                            channel between two endpoints: the main motivation is that creating ad-hoc HTTP packet headers does
                                            not require superuser privileges, while TCP segment headers, for example, require them. This simplifies
                                            the execution of tools implementing the channel. Moreover, HTTP/HTTPS traffic is usually allowed
                                            to flow to/from a local network and is often not modified (if not automatically proxied). Therefore, we
                                            propose a detailed exploration of a covert channel protocol by modulating standard fields in the HTTP
                                            headers for unidirectional communication, i.e., from a sender to a receiver.

                                            Keywords
                                            Network Steganography, Covert Channels, HTTP


                                1. Introduction
                                Steganography [1, 2] is an ancient technique used for centuries to hide sensitive information
                                from public view in seemingly innocent material. Its development throughout time, spurred
                                by technological breakthroughs, has produced a variety of steganographic methods that have
                                increased its applicability in a wide range of fields. A pivotal moment in 2003 marked the
                                introduction of “network steganography” [3], often referred to as covert channels [4], emerging as
                                a widely implemented type in practical settings. In general terms, covert transfer of information
                                always features the following elements regardless of its specific: a covert sender, the entity that
                                sends secret information, and a covert receiver, the entity that receives secret information. Then,
                                the covert object: is the data carrier in which the covert sender hides secret information. It must
                                be selected so that it does not represent an anomaly but at the same time has enough embedding


                                ITASEC 2024: The Italian Conference on CyberSecurity, April 8-12 2024, Salerno, Italy
                                $ stefano.bistarelli@unipg.it (S. Bistarelli); chiara.luchini@collaboratori.unipg.it (C. Luchini);
                                ivan.mercanti@unipg.it (I. Mercanti); francesco.santini@unipg.it (F. Santini)
                                 https://bista.sites.dmi.unipg.it/ (S. Bistarelli); https://francescosantini.sites.dmi.unipg.it/ (F. Santini)
                                 0000-0001-7411-9678 (S. Bistarelli); 0009-0001-6846-0922 (C. Luchini); 0000-0002-9774-1600 (I. Mercanti);
                                0000-0002-3935-4696 (F. Santini)
                                          © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).


CEUR
                  ceur-ws.org
Workshop      ISSN 1613-0073
Proceedings
capacity. Finally, a representation specifies how secret information is embedded in the covert
object.
   Network steganography can be employed in various legitimate and potentially malicious
applications. Some examples of possible applications include data exfiltration in espionage and
Intelligence, confidential business communication, malicious activities (malware communication
or command-and-control traffic to evade IDSs), anonymous communication and anti-censorship,
and copyright protection (embedding information to identify the origin or ownership of digital
content).
   The creation of a thorough taxonomy for the scientific community was prompted by the
widespread usage of covert channels over time, which addressed issues ranging from cyber-
security to terrorism. When the “Information Hiding Project"1 was first launched, it used a
model-based classification system to classify different steganographic methods and suggested
performance assessment indices to gauge how effective they were over a wide range of covered
channels.
   Practical implementations of network steganography may involve the manipulation of various
communication protocols, such as IP [5], TCP [6], UDP [7], ICMP2 , DNS [8], and HTTP [9].
Researchers in this field continually work to develop new techniques that can evade monitoring
and detection systems, leading to an ongoing evolution of concealment strategies. Network
steganography poses significant challenges in the context of cybersecurity, requiring a constant
effort to develop advanced detection methods. Its growing relevance is highlighted by the
need to explore new approaches and countermeasures to protect digital networks from using
steganography for malicious purposes.
   Our proposal emphasises a detailed definition of an HTTP-level covert channel protocol. The
prototype’s client and server components use HTTP headers for one-way communication while
prioritising particular features. The following sections make up the structure of this paper.
In Section 2, we provide an overview of the HTTP protocol and some technical elements of
network steganography, including nomenclature and taxonomy. Section 3 presents the most
interesting works on HTTP steganography. Section 4 describes our prototype idea. We explore
our findings and possible future directions for this study in Section 5.


2. Background
This Section presents an overview of the network steganography and HTTP protocol.

2.1. Network Steganography
The fundamental concept of network steganography is to hide messages or data within other
seemingly innocuous data, ensuring that the act of concealment does not raise suspicions.
Unlike traditional forms of steganography, which often target images or audio files, network
steganography centres on manipulating data packets, frames, or communication protocols
within a computer network. Krzysztof Szczypiorsky originally presented the idea of network

1
    https://patterns.ztt.hs-worms.de
2
    https://www.rfc-editor.org/rfc/rfc1256.
steganography in 2003 [10]. Compared to other well-known methods like picture steganography,
this new kind of steganography had received little attention before its presentation. This
modern branch has seen tremendous growth regarding communication concealment in the last
few decades, bringing many innovative network steganography techniques to the scientific
community.
   Researchers use terms such as “information hiding" and “covert channel" to refer to the
same technique, which involves concealing information in network protocols [11]. Differences
between Lampson ’s [12] initial definition of a covert channel and a later definition by the US
Department of Defence (US DoD) [13] have partly contributed to this. This paper refers to a
covert channel as a hidden or secret channel intended to facilitate discrete data transmission
between two peers by covertly exchanging information within a network protocol. On the other
hand, a overt channel is a recognised channel where a sender and a receiver can legitimately
communicate information. When discussing the sender and receiver, it is important to make
clear distinctions. Specifically, an Overt Sender (OS) is defined as the individual who transmits
data through a legitimate channel, while an Overt Receiver (OR) is the individual who receives
the data. In contrast, a Secret Sender (SS) is defined as the entity sending data in a hidden
channel, while the Secret Receiver (SR) is the entity receiving it. It is important to note that SS
and SR may not always align with OS and OR, making the latter unaware that third parties
are using their communication for other operations. Illegitimate communication in a covert
channel involves two processes: embedding and extraction. The embedding process allows the
sender to conceal secret data within legitimate communication, while the extraction process
allows retrieving such data.
   Traditionally, covert channels were categorised into Covert Storage Channels (CSC) and
Covert Timing Channels (CTC), although there is no fundamental distinction between them [13,
14]. Storage channels involve the sender’s direct/indirect inscription of object values and
the receiver’s direct/indirect reading of these values. On the other hand, timing channels
entail the sender signalling information by modulating resource usage (e.g. CPU usage) over
time, allowing the receiver to observe and decode the transmitted data. The methodologies
employed in the construction of covert channels are numerous and diverse. However, they may
be broadly classified into two categories: those that alter the bits of packets, thereby storing
information directly in the traffic and those that modify the timing or behaviour of the flow,
allowing the receiver to decode covert data by observing and interpreting the traffic. Recently,
a third category, referred to as hybrid channels, has been introduced alongside storage and
timing channels [15]. These techniques combine the utilisation of both storage and timing
methodologies.

2.2. The HTTP Protocol Header Fields
Hypertext Transfer Protocol (HTTP) is a client-server protocol that is used to fetch resources
such as HTML documents. It is the foundation of web data exchange and is reconstructed from
sub-documents such as text, images, videos, and scripts [16]. HTTP requests are composed of
headers3 and a body. The headers convey essential information for processing the data in the

3
    HTTP headers: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers
Figure 1: Structure of an HTTP request message [17].


body. Clients include fields in the requests to inform the server about their handling capabilities
and preferences for receiving requested resources. For example, clients may specify their ability
to process compressed resources, preferred language, or acceptance of older resource versions.
However, it is essential to note that although clients express these preferences, servers may
disregard them or respond based on their preferences if they cannot comply.
   HTTP request headers in Figure 1, adhere to a consistent structure, comprising a case-
insensitive string followed by a colon (‘:’) and a value. The entire header, including the value, is
on a single line, which may be lengthy. Requests may feature diverse headers categorised into
groups [18]:

    • General headers refers to an HTTP header that may be used in both request and response
      messages but is unrelated to the content.
    • Request headers, such as User-Agent or Accept, further specify or provide context to the
      request. Examples include Accept-Language for language preference and Referer for
      contextual information. Some, like If-None-Match, conditionally restrict the request.
    • Representation headers, including Content-Type, delineate the original format of the mes-
      sage data and any applied encoding. These headers are only present if the message
      includes a body.

   Requests and responses feature standard and personalised fields, encompassing proxy settings,
security configurations, and server-set parameters. Standard request fields communicate the
client’s characters, encodings, manipulations, and language capabilities. They also provide
details about the request, such as date, user agent, or authentication-related data. Meanwhile,
response headers convey data specifications like length, type, encoding, or hash, with additional
fields expediting resource processing. Notably, the Set-Cookie header in responses communicates
cookies the client should set for future interactions.
   Non-standard headers cater to more resource-specific details, such as X-Content-Duration,
indicating the duration of audio or video content in seconds. Given the likelihood of requests
passing through various systems, including proxies, before reaching the server, these intermedi-
aries may modify or control parameters to optimise delivery. We will go over in detail each
HTTP header field that will be employed in our model:
       • Accept: This field contains the MIME types accepted for the response.
       • Accept-Encoding: The Accept-Encoding field contains the encoding formats accepted for
         the response, which can be a value or a list of values.
       • Accept-Language: This field allows clients to choose the language(s) they want to receive
         the requested resource. As a result, one or more values can be specified as well as the
         “:q=” which expresses a preference among several options.
       • Accept-Datetime: This field contains the version date of the requested resource. Dates
         are written using the standard format “<day-name>, <day> <month> <year>
         <hour>:<minute>:<second> <time-zone>".
       • From: The From field can contain the contact information of the person who submitted
         the request, which is useful if any issues need to be resolved by the server. The standard
         requires this contact to have a traditional email address.
       • If-Match: The If-Match field determines whether a resource request has been altered
         when using the POST method. It has an identifier or list of them, and the request won’t
         be handled until one matches the one saved in the server. The most popular method for
         generating resource identifiers is using a hash, such as SHA-256.
       • Range: This field will only be included in the request if the If-Match field is also present.
         This is because, in most circumstances, they are used together to request (or modify) a
         specified resource portion. This field is used to provide the two hypothetical byte ranges.
       • TE: This standard field in GET requests allows clients to specify the transfer encodings
         they accept. In HTTP protocol versions 2 and 3, this field is only permitted when set to
         trailers4 .
       • User-Agent: The User-Agent value is necessary for communication and indicates the
         client version making the request. The server uses this information to determine which
         software it is communicating with. It helps to decide which data to handle and apply
         further optimisations on top of what is supplied in the other fields.


3. Related Work
The highest layers of the ISO/OSI stack, the application layer protocols, have also been utilised
to suggest several hidden channels. At this level, the protocols we uncover can be client-server
or peer-to-peer, where users share information collaboratively. The primary application-level
protocol for information transmission on the Web is HTTP. Although a more secure TLS-based
version (HTTPS) is available, almost all organisations still permit Internet surfing over HTTP.
Dyatlov et al. [19] presented storage channels that exploit the HTTP request/response header
and/or body. The amount of allowed headers changes depending on the web server version,
making an accurate performance evaluation of these strategies impossible.
   One way to transmit instructions and output them secretly over HTTP is using the Reverse
WWW Shell tool [20]. On the other hand, Bowyer [21] encodes messages in URL parameters
4
    https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/TE
or after GET requests and utilises these hidden channels to connect with Trojans hidden
behind firewalls. To build an anonymous overlay network, Bauer [22] suggests using common
online user communications, such as headers, cookies, redirects, HTML components, and
active content. The majority of these methods, along with the currently available HTTP covert
channels analyses, are documented in the work by Brown et al. [23].
  Heilman et al. [24] presents a covert channel that mimics stealthy behaviour by leveraging the
base Linux shell and command language while relying minimally on system resources through
the usage of the User-Agent string in the HTTP Request Header. The channel’s usage of HTTP
enables it to blend in with network traffic and propagate over wide-area networks, possibly
expanding its reach and making it more accessible to Bash shell users.
  Kwecka [25] proposes a method to embed covert data into HTTP headers by leveraging
the protocol’s treatment of various amounts of whitespace as a single character. For instance,
tabulation can represent 1, while a standard space can represent 0. Additionally, the varying
capitalisation of letters can be used for secret data transfer.
  Ji et al. [26] introduced a technique based on the length of HTTP packets, achieving a recorded
performance of 50 bytes, which includes 20 bytes for the TCP Header, 20 bytes for the IP Header,
and 18 bytes for the Ethernet Header.
  Alman [27] elucidates how a connection can be established through a proxy server by
exploiting a vulnerability in the CONNECT method. Van Horenbeeck [28] developed the
Wondjina tool, enabling a client to validate its cached copy using HTTP Entity tags. Similar
concepts are applied in [29], incorporating LSB approaches on the Date and Last Modified fields.


4. A Prototype
Notably, real-world implementations have been articulated primarily for the IP, TCP, UDP,
ICMP, DNS, and HTTP protocols. The prominence of the first three protocols is attributed to
the substantial generation of network traffic at the upper layers of the ISO/OSI stack, enabling
broader applicability to numerous data packages. However, the requisite root permissions for
operations at these layers limit potential application scopes.
   ICMP and DNS emerge as commonly used protocols, offering easily accessible payloads
and the ability to initiate conversations without necessitating root privileges. Nevertheless,
their widespread use has led to the proliferation of monitoring techniques leveraging artificial
intelligence, which poses a challenge.
   The HTTP protocol, operating at the Application level and associated with web browsing,
dominates internet traffic within contemporary computer networks. Monitoring systems that
comprehensively analyse HTTP packets are notably scarce, with a prevalent focus on com-
munication characteristics. Anomalies, such as an imbalance in data transmission between
client and server during a single connection, may indicate potential covert communication.
Despite HTTP protocol communications generally circumventing the need for root access, most
implementations deviate from pure network steganography, employing tunnelling methods
within ostensibly innocuous HTTPS traffic.
   Our objective encompassed the development of a protocol designed to operate seamlessly on
most computers and evade detection by large-scale systems. The choice of the application level
was deliberate, considering the absence of root privileges and the widespread use of the HTTP
protocol among users. The decision to focus on a unidirectional channel, akin to data exfiltration,
aimed to enhance the methodology’s versatility across scenarios. Nevertheless, the outcomes
exhibit adaptability and can be easily extended to facilitate bidirectional communication, thus
broadening its potential applications. The data integration approach within HTTPS packages
was achieved through a unified solution using different header fields of an HTTP request.
   After examining the several recommendations in the literature, we concluded that [30] email
suggestion would be most helpful for our investigation. Initially, they proposed entering infor-
mation in the “Message-ID" and “Content-Type" boxes. We modified the method to represent
the data according to the characteristics of the different fields and expanded the methodology
to HTTP request headers. Before discussing each component’s operation in depth, we specified
the modulation of HTTP headers for data exfiltration.

4.1. HTTP Request Modulation
There are numerous available fields for creating queries, including nonstandard ones. Some
fields imply the presence or absence of others. We also consider the maximum number of bits
that could be injected into a single field and avoid fields that might be modified. To construct
our requests, we follow the format of requesting a portion of a previously passed resource using
the GET method. In Section 2, we listed all the HTTP header fields included in a GET request.
   This Section describes the technique used to represent bit strings for each field. Powers of 2 are
used primarily to facilitate the representation of possible binary strings. The technique proposed
in this paper falls under the PS11 category, as it involves Value Modulation and preserves the
structure [31]. We describe a potential modulation of the GET request’s HTTP header fields and
specify the appropriate course of action for each. Table 2 shows the modulation of the Accept,
Accept-Enconding and Accept-Language fields, while Table 3 displays the modulation of the
Accept-Datetime, From, Range and TE fields.

Accept. The Accept field specifies the MIME types accepted for the response. We focus on
the most common MIME types such as text/plain, text/css and so on. In this case, there
can be 16 possible values, allowing hiding a sequence of up to 4 bits within this field.

Accept-Encoding. Even with the Accept-Encoding field, which lists the accepted encoding
types for the response, the options should be limited to the eight most common types. This
would allow for the hiding of 3-bit sequences.

Accept-Language. The client uses the Accept-Language field to indicate the preferred lan-
guage(s) for receiving the requested resource. In our case, we considered three languages:
Italian, American English, and British English. Italian was chosen because it is the language
of the authors of this paper, while American and British English were selected due to their
widespread usage. We also considered the preference value ‘ q’ and the presence or absence
of the considered languages to expand the representable data set. Unlike the other fields, the
Accept-Language field can steganograph a maximum of three bits, but it can also be used for bit
sequences of shorter lengths.
Accept-Datetime. In this context, the version date of the requested resource is de-
noted, employing a standard date format: “ <day-name><day><month><year><hour>:
<minute>:<second><time-zone>". We use all fields except the first and last to exploit this
date format. The first field is omitted due to its dependency on others, and the last is disregarded
because requests from the same machine cannot differ in time zone.
   Like other fields, the approach involves considering the maximum power of two for each
piece of data, with binary strings assigned to that number of elements. For instance, considering
February the shortest month with 28 days, the largest power of two less than 28 is 16 (24 ). This
logic modulates the day-name value, minutes, hours, and seconds. The first eight (23 ) months
are considered, and the years from 1991 to 2022, resulting in 32 (25 ) valid values, avoiding dates
before the internet’s inception. The day of the week is determined once the date is generated,
and the specified Italian CET time zone can be used. However, these two values are disregarded
during the decoding process. In summary, it is feasible to conceal 26 bits, considering string
lengths of 4 for the day and hour, 3 for the month, and 5 for the year, minutes, and seconds.

From. This field includes the contact information of the request submitter, providing valuable
details for server issue resolution. The standard mandates that this contact information adhere
to a traditional email address format.
   To address this, we acquire databases containing the most popular DNS names in the US over
recent years, merge them, and generate a dictionary comprising 32,768 names. Each entry in
this dictionary corresponds to a 15-bit binary string (215 =32768). Additionally, we identify eight
of the most popular email domains and assign them their respective 3-bit binary strings.

If-Match. The If-Match field is employed to ascertain whether a resource request has un-
dergone alterations when using the POST method. This field includes an identification or a
list of identifiers, and the request is not processed unless it matches the identifier stored on
the server. Typically, resource IDs are produced with a hash algorithm such as SHA-256. We
assume that SHA-256 was used for digest computation and that the request includes two hashes.
We transform 256 bits of the secret message into hexadecimal digits and enter them in the field.
Given that SHA-256 yields a 256-bit digest, we may mask 512 bits in this field by entering two
numbers. Notably, this field and its accompanying Range field (described in the next paragraph)
are not used if less than 256 bits of data must be sent.

Range. As previously stated, the Range field appears in the request only when the If-Match
parameter is included. These fields are commonly used to request or edit a specified portion of
a resource. This parameter is used to provide two hypothetical byte ranges. Recognising that
a resource may consist of several bytes, we consider it fair to utilise integers between 0 and
1023 (210 values) to define two intervals. Each number is now assigned a 10-bit binary string,
which allows for hiding 40 bits inside this field. The first value is extracted from the first ten
bits, followed by the second value from the ten bits after that. The initial number is then added
to the latter to avoid an unreasonable interval in which the end is smaller than the start. The
same approach is used to compute the second interval.
TE. This standard field is present in GET requests and allows the client to declare which
transfer encodings it is ready to accept. Recognising its fundamental role in requests and
consistent presence, it was deemed suitable for concealing steganography information. Given
the restricted choices, the four most regularly used values were assigned to 2-bit binary strings.

User-Agent. The User-Agent field is obligatory in communications and serves to identify the
client version initiating the request. This information is important for the server to discern
the program with which it interacts, enabling the selection of data handling and implementing
optimisations beyond those offered in other fields.
   In this context, modulation is eschewed, and instead, the identifying string of a Mozilla
version is entered into each request. This choice is based on Mozilla being the most prevalent
browser across various systems. Detecting requests from the same computer with multiple
User-Agents in a brief period would likely raise concerns.

                                           Request structure
          Field                                   Example                                     Bit
        Accept:            text/. . . , image/. . . , video/. . . , application/. . .          4
   Accept-Encoding:            gzip, deflate, compress, br, identity, *                        3
   Accept-Language:              it-IT,it;q=0.9,en-US;q=0.8,en;q=0.7                           3
   Accept-Datetime                 Wed, 21 Oct 2015 07:28:00 GMT                               26
         From:                              aristea@libero.it                                  18
       If-Match:                           (x2) hash da 256bit                                512
         Range:                              bytes:1207-2367                                   40
           TE:                     compress, deflate, gzip, trailers                           2
      User-Agent         Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0)                      -
                                      Gecko/20100101 Firefox/47.0
                                              Total bit sent                            608 bit = 76 Byte
Table 1
Example of HTTP header structure.

   Table 1 provides an example of HTTP request fields associated with the corresponding amount
of bits that may be modulated.

4.2. Implementation Details
The proposed prototype has two primary components: sender and receiver. The first takes as
input arguments the file or directory to steganograph inside headers and the URL to make
requests to. Instead, the receiver is a web server listening on a specific port.
  The client executable, denoted as the sender, primarily functions to embed steganographically
or exfiltrate a file or directory via HTTP requests. Its workflow is as follows:

   1. Preliminary operations: The sender prints the tool banner and generates the dictionaries
      for the Range and From fields.
   2. Parameters: The sender takes the URL and the path to the file or directory to be sent as
      input parameters.
   3. Beginning of communication: the communication begins with a request that has the value
      of the Accept field set to application/zip, with the remaining fields either left blank or, if
      necessary, defaulted.
   4. Sending phase: sender manages path reading, file or directory identification, request
      creation, and send. The file is opened in binary read mode, and an estimate of the number
      of requests required is reported to the terminal. The process reads the bit blocks of the file,
      creating requests based on the binary string obtained from the file. Each field is assigned
      a dictionary, with the key being the binary string and the associated value representing
      the information to be written into the request. The file transfer process involves sending
      a new request when the input file’s length is incompatible with the binary string. If this
      happens, a new request is created cyclically until the entire block is sent. The number of
      requests sent is updated at each step, providing an estimate of the completion percentage.
   5. Ending phase: the server is informed of the end of exfiltration operations by sending a
      request with the Accept field set to application/rtf. The total number of requests used in
      the process is displayed on the screen, managed in a variable updated each time a request
      is sent.
   Contrarily, the receiver is a web server listening on a designated port, mirroring the sender’s
structure.
   1. Preliminary operations: the server’s execution is based on a main function, which prints
      the terminal banner and creates dictionaries related to person names and ranges. The
      distinction from those the client uses is that the key is the information found in the
      request, while the binary string represents the related value. The port number and file
      names to be generated are prerequisites; otherwise, default values, specifically port 8000
      and the name output_file, will be automatically assigned.
   2. Extraction phase: the headers are initially stored in a variable upon receiving each GET
      request. Subsequently, the HTML page for the response is chosen, and the headers are
      parsed to extract relevant information. The receiver aims to simulate the functioning of
      a real web server, responding with HTML pages of varying sizes without emphasising
      their actual content. In response to each request, the receiver employs a random web
      page, concluding the communication to prevent anomalies. Post-response, the server
      analyses the previously saved headers for information. It performs a reverse process
      from the client, reading request data and employing dictionaries to derive binary strings,
      which are sequentially stored. Following the parsing of each request, the first byte of
      the constructed binary string is examined, representing the length of the block read and
      sent. The corresponding bytes are written to the destination file if the sum of other bits
      surpasses the previously read value. Throughout this process, the terminal is updated
      with the count of received requests.
   3. Ending phase: once the server receives a request with the Accept field set to application/rtf,
      it writes the last bytes to the file and communicates that operations are complete.
   Notably, such a prototype, operating at the bit level, can transmit any data stream. Table 1
depicts all of the header fields of an HTTP GET request issued; the file sent requires around
eight queries, this being the sixth. A transmitted file of a size of 459 bytes produced a total of 9
requests.
5. Conclusions and Future Work
This paper reviewed network steganography and some related approaches in the literature by
summarising the fundamental characteristics of this research field. Steganography comprises
the science and art of hiding information transfer and storage. It is not to be confused with
cryptography: while they both share the ultimate goal of protecting information, the former
attempts to hide it to make it “difficult to notice”.
    Many existing works call this type of communication a covert channel, referencing a concept
first introduced by Lampson in 1973 [12]. The latter part of this paper outlines our proposed im-
plementation of a covert channel. Section 4 elucidates the rationale behind critical development
decisions and provides a comprehensive account of the implementation details. Our prototype
is rooted in HTTP/HTTPS requests, where we modulate header values to embed our targeted
information steganographically.
    While not explicitly addressed, a fundamental consideration for the technique’s development
is its behaviour in the presence of HTTP proxies along the packet path. In this case, requests
for web resources are routed via the proxy server rather than directly to the destination server.
After retrieving the response from the destination server, the proxy server relays the request
back to the client. HTTP proxies can modify several fields in the HTTP header as they process
requests and responses between clients and servers. For example, proxies might modify the
Accept-Encoding header to perform content compression.
    Given this, we think a future direction could be to adapt channels to different contexts,
i.e., by assembling a portfolio of them, for example, with the proposal in [32] by some of
the authors of this work. Instead of relying solely on one steganography method, a general
parametric framework could switch between several potential alternatives based on information
gathered on the target network. While there is still no guarantee that this channel would
evade the defender’s countermeasures, it has two benefits over a channel that employs a single
steganography approach. The first benefit is that it drives up costs for the defender because
it is probably necessary to deploy more tools (or, at the very least, configure the ones already
there more capillary) to secure the network against different exfiltration strategies. The second
benefit is that as the defender’s setup gets more intricate, there is a greater chance of human
error; if even one exfiltration method works, the attacker could finally prevail.


Acknowledgments
The authors are members of the Gruppo Nazionale Calcolo Scientifico-Istituto Nazionale di Alta
Matematica (GNCS-INdAM). This work has been partially supported by:

    • GNCS-INdAM, CUP_E53C22001930001 and CUP_E53C23001670001;
    • European Union - Next Generation EU PNRR MUR PRIN - Project J53D23007220006
      EPICA: “Empowering Public Interest Communication with Argumentation”;
    • University of Perugia - Fondo Ricerca di Ateneo (2020, 2021, 2022) - Projects
      BLOCKCHAIN4FOODCHAIN, FICO, AIDMIX, “Civil Safety and Security for Society”;
    • European Union - Next Generation EU NRRP-MUR - Project J97G22000170005 VITALITY:
      “Innovation, digitalisation and sustainability for the diffused economy in Central Italy”;
    • Piano di Sviluppo e Coesione del Ministero della Salute 2014-2020 - Project
      I83C22001350001 LIFE: “the itaLian system Wide Frailty nEtwork” (Linea di azione 2.1
      “Creazione di una rete nazionale per le malattie ad alto impatto” - Traiettoria 2 “E-Health,
      diagnostica avanzata, medical devices e mini invasività”).


References
 [1] D. Kahn, The history of steganography, in: R. Anderson (Ed.), Information Hiding, Springer
     Berlin Heidelberg, Berlin, Heidelberg, 1996, pp. 1–5.
 [2] O. I. Abdullaziz, V. T. Goh, H.-C. Ling, K. Wong, Network packet payload parity based
     steganography, in: 2013 IEEE Conference on Sustainable Utilization and Development in
     Engineering and Technology (CSUDET), IEEE, 2013, pp. 56–59.
 [3] J. Lubacz, W. Mazurczyk, K. Szczypiorski, Principles and overview of network steganog-
     raphy, IEEE Communications Magazine 52 (2014) 225–229. doi:10.1109/MCOM.2014.
     6815916.
 [4] S. Bistarelli, M. Ceccarelli, C. Luchini, I. Mercanti, F. Santini, A survey of steganography
     tools at layers 2-4 and HTTP, in: Proceedings of the 18th International Conference on
     Availability, Reliability and Security, ARES 2023, Benevento, Italy, 29 August 2023- 1
     September 2023, ACM, 2023, pp. 81:1–81:9.
 [5] A. Shamir, IP = PSPACE, J. ACM 39 (1992) 869–877.
 [6] J. S. Chase, A. Gallatin, K. G. Yocum, End system optimizations for high-speed TCP, IEEE
     Communications Magazine 39 (2001) 68–74. doi:10.1109/35.917506.
 [7] L.-A. Larzon, M. Degermark, S. Pink, UDP lite for real time multimedia applications,
     Hewlett-Packard Laboratories, 1999.
 [8] J. Jung, E. Sit, H. Balakrishnan, R. Morris, DNS performance and the effectiveness of
     caching, in: Proceedings of the 1st ACM SIGCOMM Workshop on Internet Measurement,
     IMW ’01, Association for Computing Machinery, New York, NY, USA, 2001, p. 153–167.
     URL: https://doi.org/10.1145/505202.505223.
 [9] S. Lederer, C. Müller, C. Timmerer, Dynamic adaptive streaming over HTTP dataset,
     in: Proceedings of the 3rd Multimedia Systems Conference, MMSys ’12, Association for
     Computing Machinery, New York, NY, USA, 2012, p. 89–94. URL: https://doi.org/10.1145/
     2155555.2155570.
[10] K. Szczypiorski, Steganography in TCP/IP networks. state of the art and a proposal of a
     new system-HICCUPS, Warsaw University of Technology, Poland Institute of Telecommu-
     nications, Warsaw, Poland (2003).
[11] S. Zander, G. J. Armitage, P. Branch, A survey of covert channels and countermeasures in
     computer network protocols, IEEE Commun. Surv. Tutorials 9 (2007) 44–57.
[12] B. W. Lampson, A note on the confinement problem, Communications of the ACM 16
     (1973) 613–615.
[13] C. S. C. (US), Computer Security Requirements: Guidance for Applying the Department
     of Defense Trusted Computer System Evaluation Criteria in Specific Environments, Dod
     Computer Security Center, 1985.
[14] U. D. National Computer Security Center, Trusted Computer System Evaluation Criteria,
     Technical Report, DOD 5200.28-STD, National Computer Security Center, Dec 1985. http:
     //csrc.nist.gov/publications/history/dod85.pdf.
[15] A. Ganivev, O. Mavlonov, B. Turdibekov, et al., Improving data hiding methods in network
     steganography based on packet header manipulation, in: 2021 International Conference
     on Information Science and Communications Technologies (ICISCT), IEEE, 2021, pp. 1–5.
[16] mnd web docs, An overview of HTTP, Technical Report, Mozilla, 2022. https://developer.
     mozilla.org/en-US/docs/Web/HTTP/Overview.
[17] J. F. Kurose, K. W. Ross, Computer Networking: A Top-Down Approach, Pearson, 2020.
[18] mnd web docs, HTTP Messages, Technical Report, Mozilla, 2022. https://developer.mozilla.
     org/en-US/docs/Web/HTTP/Messages.
[19] A. Dyatlov, S. Castro, Exploitation of data streams authorized by a network access
     control system for arbitrarydata transfers: Tunneling and covert channels over the HTTP
     protocol, Zugriff am unter http://dl.packetstormsecurity.net/papers/protocols/covert_
     paper.txt (2003).
[20] V. Hauser, Placing backdoors through firewalls, WindowsSecuriy.com (1999).
[21] L. Bowyer, Firewall bypass via protocol steganography, Network Penetration (2002).
[22] M. Bauer, New covert channels in HTTP: adding unwitting web browsers to anonymity
     sets, in: Proceedings of the 2003 ACM workshop on Privacy in the electronic society, 2003,
     pp. 72–78.
[23] E. Brown, B. Yuan, D. Johnson, P. Lutz, Covert channels in the HTTP network protocol:
     Channel characterization and detecting man-in-the-middle attacks, Journal of Information
     Warfare 9 (2010) 26–38.
[24] S. Heilman, J. Williams, D. Johnson, "covert channel in HTTP user-agents", in: 11th Annual
     Symposium on Information Assurance, ASIA’16, 2016, pp. 68–73.
[25] Z. Kwecka, Application layer covert channel analysis and detection, Undergraduate Project
     Dissertation, Napier University (2006).
[26] L. Ji, W. Jiang, B. Dai, X. Niu, A novel covert channel based on length of messages, in: 2009
     International Symposium on Information Engineering and Electronic Commerce, IEEE,
     2009, pp. 551–554.
[27] D. Alman, Http tunnels through proxies, SANS Institute (2003).
[28] M. Van Horenbeeck, Deception on the network: thinking differently about covert channels,
     Australian Information Warfare and Security Conference (2006).
[29] R. Duncan, J. E. Martina, Steganographic message broadcasting using web protocols, in:
     proceedings of: Simposio Brasilerio de Seguranca (SBSeg 2010), Fortaleza, Brasil, 2010, pp.
     61–70.
[30] A. Castiglione, A. d. Santis, U. Fiore, F. Palmieri, E-mail-based covert channels for
     asynchronous message steganography, in: 2011 Fifth International Conference on In-
     novative Mobile and Internet Services in Ubiquitous Computing, 2011, pp. 503–508.
     doi:10.1109/IMIS.2011.133.
[31] S. Wendzel, L. Caviglione, W. Mazurczyk, A. Mileva, J. Dittmann, C. Krätzer, K. Lamshöft,
     C. Vielhauer, L. Hartmann, J. Keller, et al., A generic taxonomy for steganography methods,
     TechRxiv (2022). URL: http://dx.doi.org/10.36227/techrxiv.20215373.v2.
[32] S. Bistarelli, A. Imparato, F. Santini, A tcp-based covert channel with integrity check and
     retransmission, in: 20th Annual International Conference on Privacy, Security and Trust,
     PST 2023, Copenhagen, Denmark, August 21-23, 2023, IEEE, 2023, pp. 1–7.


A. Appendix

                  HTTP field         Value                                 Bit
                                     text/plain                           0000
                                     text/html                            0001
                                     text/css                             0010
                                     text/javascript                      0011
                                     image/gif                            0100
                                     image/png                            0101
                                     image/jpeg                           0110
                  Accept
                                     image/webp                           0111
                                     video/mpeg                           1000
                                     video/webm                           1001
                                     video/ogg                            1010
                                     video/mp4                            1011
                                     application/octet-stream             1100
                                     application/javascript               1101
                                     application/xml                      1110
                                     application/pdf                      1111
                                     gzip                                  000
                                     compress                              001
                                     deflate                               010
                                     identity                              011
                  Accept-Enconding
                                     gzip, compress                        100
                                     gzip, deflate                         101
                                     gzip, identity                        110
                                     gzip, br                              111
                                     it                                      0
                                     it, *                                   1
                                     it, en-US                              00
                                     it, en-US:q=0.8                        01
                                     it:q=0.9, en-US                        10
                                     it:q=0.9, en-US:q=0.8                  11
                                     it, en-US, en-GR                      000
                  Accept-Language
                                     it, en-US, en-GR:q=0.7                001
                                     it, en-US:q=0.8, en-GR                010
                                     it, en-US:q=0.8, en-GR:q=0.7          011
                                     it:q=0.9, en-US, en-GR                100
                                     it:q=0.9, en-US, en-GR:q=0.7          101
                                     it:q=0.9, en-US:q=0.8, en-GR          110
                                     it:q=0.9, en-US:q=0.8, en-GR:q=0.7    111
Table 2
Modulation of Accept, Accept-Enconding and Accept-Language values.
        HTTP field                                    Value                           Bit
                            Thu, 01 Jan 1991 00:00:00 CET      00000000000000000000000000
                                                        ···                           ···
        Accept-Datetime    Wed, 05 Feb 2003 12:24:13 CET       01000010110011001100001101
                                                        ···                           ···
                           Thu, 16 Aug 2022 16:32:32 CET       11111111111111111111111111
                                               gmail.com                              000
                                             outlook.com                              001
                                               yahoo.com                              010
                                                proton.me                             011
        From
                                                 virgilio.it                          100
                                                   libero.it                          101
                                                   email.it                           110
                                                 mail.com                             111
                                                           0                   0000000000
                                                        ···                           ···
        Range                                           673                    1010100001
                                                        ···                           ···
                                                       1023                    1111111111
                                                 compress                              00
                                                    deflate                            01
        TE
                                                       gzip                            10
                                                    trailers                           00
Table 3
Modulation of Accept-Datetime, From, Range and TE values.