<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Advancing Internet-Connected Devices Posture Analysis with a Meta-Search Engine: A Case Study in Energy Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Andrea Bernardini</string-name>
          <email>abernardini@fub.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Mario Lezoche</string-name>
          <email>mario.lezoche@univ-lorraine.fr</email>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Simone Angelini</string-name>
          <email>sangelini@fub.it</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Giovanna Dondossola</string-name>
          <email>giovanna.dondossola@rse-web.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Roberta Terruggia</string-name>
          <email>roberta.terruggia@rse-web.it</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Fondazione Ugo Bordoni, Viale del Policlinico</institution>
          ,
          <addr-line>147, 00161, Rome</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>RSE S.p.A.</institution>
          ,
          <addr-line>Via Raffaele Rubattino, 54, 20134, Milan</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>University of Lorraine, CNRS, CRAN, Nancy</institution>
          ,
          <addr-line>F-54000</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the contemporary digital ecosystem, Internet of Things Search Engines can be used for passive reconnaissance of Internet-connected devices, mapping possible attack surfaces without a direct interaction with the target devices or infrastructures. Each IoT search engine utilizes diverse scanning techniques and analytical methodologies, resulting in metadata with varying levels of coverage, accuracy, and relevance. This research introduces an IoT meta-search engine prototype designed to aggregate and merge metadata from commercial IoT search engines (Shodan, Censys, Netlas, Zoomeye, Binaryedge, Fofa) complemented by Common Vulnerabilities and Exposures (CVE) and Common Weakness Enumeration (CWE) sources. By merging those data, a more comprehensive and detailed perspective of the interconnected device landscape can be provided. Our methodology leverages an ontological framework using Stanford's Protégé and Python, implementing zero-shot learning with a panel of three Large Language Models (LLMs) under human supervision to map IoT search engine taxonomic structures and quantitatively validate the generated Knowledge Base. The IoT meta-search engine is tested on photovoltaic (PV) energy production and monitoring systems, a domain essential to renewable energy grids. Vulnerabilities in PV systems can be exploited by hackers, causing energy disruptions, data breaches, or manipulation of grid operations. Although the findings are preliminary, they serve as a proof of concept to demonstrate the feasibility of the methodology to provide various types of overviews and insights associated with individual and multiple hosts for security posture evaluation.</p>
      </abstract>
      <kwd-group>
        <kwd>Search engine</kwd>
        <kwd>vulnerability</kwd>
        <kwd>security</kwd>
        <kwd>ontology</kwd>
        <kwd>Internet of Things</kwd>
        <kwd>Internet Connected Device</kwd>
        <kwd>LLM</kwd>
        <kwd>energy systems</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Internet-connected devices (ICDs) allow remote control and automation, transforming the human
interaction with the environment in various fields, from homes to industries and critical infrastructures. This
widespread reach, however, may raise serious issues regarding security and privacy [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] since successful
cyber-attacks can lead to financial losses, operational disruptions, or reputational damage.
Vulnerabilities in ICDs are often exploitable because of poor patch management, open ports,
and default credentials. Their reliance on open-source software cuts both ways: code
transparency facilitates external audits and vulnerability remediation, but it also
provides attackers with potential exploit pathways. Moreover, open-source components
may be maintained inconsistently, with no guarantee of timely updates and security patches.
      </p>
      <p>An IoT search engine (IoTSE) is a specialized tool designed to discover, index, and retrieve information
about ICDs connected to a network. Like web search engines, this tool performs automated network
indexing by systematically scanning for active devices, primarily through comprehensive port scans of
individual IP addresses using an advanced internet crawler.</p>
      <p>
        IoT search engines collect data by sending probe packets to network devices and analyzing their
responses. This process extracts metadata such as open ports, services, operating systems, device
types, and geographic locations. The information is then organized into "banners", textual metadata
collections for each device. These banners can be further enriched with additional details about the
type of service or product identified, using specific attributes or tags. IoTSEs differ in the
algorithms and methodologies they use to identify devices, so each generates distinctive outputs
that help to systematically map the digital footprints of systems, services, and infrastructures. However,
their systematic use and integration into vulnerability detection and security assurance processes are
limited by the lack of standards and the heterogeneity of responses across engines. To address this
problem, ontologies, established methodologies for defining and representing information that
provide a structured view of a specific domain [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], can be used. These approaches facilitate efficient,
machine-readable processing of large-scale information, significantly enhancing data integration and
interoperability. The main goal of an ontology is to offer a formal and detailed representation of
knowledge to facilitate the sharing, integration, and understanding of information within a given
domain [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. As a drawback, ontology engineering is a complex and time-consuming
task, since it often demands substantial input from human experts and the creation of specialized
programmatic code. The breakthrough of Large Language Models (LLMs), a type of artificial
intelligence that processes and generates natural language text, has given ontology engineering a significant
boost. LLMs may assist humans in performing tedious and complex tasks by leveraging their ability
to extract semantics from texts and structured information, even when used by non-expert but
semi-knowledgeable users. Despite their well-documented limitations [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] such as occasional inaccuracies
and the generation of unfounded responses, known as "hallucinations", LLMs offer
significant potential to accelerate ontology development. While their contributions remain partial and
require consistent human validation, LLMs represent a promising tool to enhance productivity in this
field. Emerging evidence [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] shows that their evaluations often align with those of experts, and they
have already demonstrated success in various domains [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. As their capabilities improve, LLMs
are set to support humans in judgment and prediction tasks, creating new opportunities for effective
collaboration.
      </p>
      <p>The main contributions of this work are fourfold:
• Analyzing and comparing IoT search engine rankings and results to explore integration scenarios;
• Proposing a novel methodological approach for integrating search results from multiple IoT search
engines, consisting of a data collection pipeline that gathers and stores results and a data processing
pipeline that populates an ontology from heterogeneous data sources to form a Knowledge Base (KB);
• Investigating the use of LLMs under human supervisory control (human in the loop) for
performing: (1) semantic metadata alignment of IoT search engine outputs (ontology alignment)
and (2) quantitative validation of the generated Knowledge Base;
• Validating the proposed methodology on an energy-related case study: photovoltaic production
monitoring systems.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related works</title>
      <p>
        Research on identifying vulnerable ICDs is evolving through diversified methodological approaches
utilizing specialized IoT search engines, vulnerability tools, and external data sources. While Shodan
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] remains a fundamental platform, emerging tools like Censys [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], Zoomeye [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ], Fofa [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], Netlas
[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and Binaryedge [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] have expanded device reconnaissance possibilities, with each platform offering
unique indexing methodologies [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]. In [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ], the authors decode the Common Platform Enumeration
(CPE) identifiers found in Shodan banners and compare them with hash tables containing the entire CPE
database to efficiently extract vulnerabilities from the National Vulnerability Database (NVD) [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. A
similar approach, based on interfacing with Censys, is used by the Scout tool [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ], in which the
basic functionalities are extended with metadata from NVD to identify additional vulnerabilities in
specific services operating on publicly available IP addresses. The analysis then integrates the CPEs
by associating their results with the CVEs. Results were then validated against active industry tools
such as Nessus and OpenVAS. Similarly, in [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ], Shodan was used in conjunction with Nessus for
results evaluation. It is worth noting that none of the tools mentioned above uses ontologies for
data structuring. Regarding the comparison of IoT search engine results, [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] provides preliminary
evaluations without delving into result integration. Other works compare methodologies and strategies
among search engines like Censys and Shodan [
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], and Zoomeye [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], highlighting the differences
in vulnerability identification capabilities. In [25] the author provides a high-level model for merging
results from IoT search engines, focusing on challenges such as identifying relevant devices, reusing
results, and cross-referencing data with external sources, including multiple IoT search engines. [26]
introduces a Dynamic Cybersecurity Ontology (DCO) designed to map dynamic scan results from
services like Shodan and Censys. This approach aims to merge data and create a real-time correlation
mechanism for identifying organizational vulnerabilities, laying the groundwork for a systematic
cyber risk assessment process. A subsequent study [27] explores the use of ontologies to support
the identification and mitigation of vulnerabilities in PLC components. Although preliminary, it
demonstrates an interesting use case for CPE, NVD, NIST controls, and CERT reports. The work [28]
underlines the need to transform IoT search engine data into more effective and structured formats (e.g.,
device name, manufacturer, and software version), which not only facilitates the identification of exposed
devices but also enables vulnerability assessments and the implementation of appropriate security measures.
      </p>
      <p>Regarding the merging of results, [29] integrates Shodan and Binaryedge, while [30] is particularly relevant
to our study as it focuses on combining results from four search engines (Shodan, Censys, Zoomeye, and
Fofa) using 23 keywords related to EV Charging Management Systems. However, rather than analyzing
the complete set of available metadata, the authors limit their examination to banners associated with
the results, followed by manual verification of their relevance to the search query.</p>
      <p>Traditional ontology development (engineering) is performed manually and is often constrained by the biases, expertise,
and perspective of the developers, leading to incomplete or overly rigid structures that fail to adapt to
evolving knowledge domains. It is also a very time-consuming activity. On the other hand, LLMs
have shown significant potential in the development and maintenance of ontologies, addressing some
limitations inherent in human-developed ontologies. LLMs are increasingly used for building [31],
aligning [32] [33] or populating [34] ontologies. In SPIRES [34], LLM technology is used in a knowledge
extraction framework to perform zero-shot learning and respond to queries by leveraging flexible
prompts, ensuring the output is aligned to a user-defined knowledge schema. All these recent works
suggest that a hybrid approach, combining human expertise with LLM capabilities, is needed to ensure
that the resulting ontologies are accurate, flexible, and representative of real-world knowledge.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Integration Scenarios for IoT search engines</title>
      <p>The first evaluated scenario focuses on "ranking fusion", a methodology typical of information retrieval
which aims to combine and aggregate results from multiple sources to improve the overall quality and
relevance of the retrieved information. Each IoT search engine uses different scanning techniques and
has access to distinct yet partially overlapping internet segments, resulting in variations in coverage,
accuracy, and relevance. Ranking fusion combines results from multiple engines to generate a single, more
relevant ordered list, optimizing the overall relevance of the outcomes. To evaluate this scenario, a first
sample of 23 queries was defined and submitted to the search engines. For each engine, the IP addresses
of the top 500 results were recorded. The IP address lists were then compared to identify any
overlaps among the responses of the search engines. The overlap between search engine results was
minimal, as shown in the Appendix, indicating that, unlike traditional search engines, the dataset in
this context is highly sparse. This suggests significant differences in coverage, as well as in the indexing
and retrieval of results.</p>
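      <p>The comparison of top-ranked IP lists can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the Jaccard index is an assumed overlap measure, the function names are hypothetical, and the sample addresses come from documentation-reserved ranges.</p>
      <preformat>
```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard index of two sets; 0.0 when both are empty."""
    union = a.union(b)
    if not union:
        return 0.0
    return len(a.intersection(b)) / len(union)


def pairwise_overlap(results: dict, top_k: int = 500) -> dict:
    """Compare the top_k IPs of every pair of engines.

    results maps an engine name to its ranked list of IP addresses.
    """
    top = {engine: set(ips[:top_k]) for engine, ips in results.items()}
    return {
        (e1, e2): jaccard(top[e1], top[e2])
        for e1, e2 in combinations(sorted(top), 2)
    }


# Hypothetical ranked results (documentation-range IPs, not real data).
sample = {
    "shodan": ["192.0.2.1", "192.0.2.2", "192.0.2.3"],
    "censys": ["192.0.2.3", "192.0.2.4"],
    "zoomeye": ["198.51.100.7"],
}
overlap = pairwise_overlap(sample)
```
      </preformat>
      <p>Values close to zero for most engine pairs would correspond to the sparse situation described above, in which the engines rarely return the same hosts.</p>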
      <p>The second scenario investigated focuses on the "metadata fusion" associated with search engine results.
Metadata fusion combines results from different sources to produce a unified set of data, optimizing
the consistency and completeness of the information. This process can lead to a more accurate and
comprehensive representation of the data. For example, only two of the six IoT search engines (Shodan
and Netlas) reported vulnerabilities in the available configurations. Meanwhile, some engines provide
additional information about an IP address, its ownership, and its location,
which could be of use from a vulnerability assessment perspective. Zoomeye, for instance, indicates
whether an IP address corresponds to a honeypot; Censys provides details about the source and vendor
of a service; Binaryedge, Fofa, and Zoomeye report the identified banners themselves as metadata.
Various search engines use AI-based algorithms to automatically identify services, generate descriptive
tags, assign product labels, and gather metadata about the hosting organization.</p>
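      <p>As a rough sketch of metadata fusion, the records returned by different engines for the same IP address can be grouped so that each field keeps track of which engine supplied it. The record layout and field names below are assumptions made for illustration and do not reproduce the engines' real schemas.</p>
      <preformat>
```python
from collections import defaultdict


def fuse_metadata(records: list) -> dict:
    """Group per-engine records by IP, keeping every engine's values.

    Each record is a dict with "ip", "engine" and arbitrary metadata fields.
    The result maps ip -> field -> {engine: value}, so conflicting values
    stay visible instead of being silently overwritten.
    """
    fused = defaultdict(lambda: defaultdict(dict))
    for record in records:
        ip, engine = record["ip"], record["engine"]
        for field, value in record.items():
            if field in ("ip", "engine") or value is None:
                continue
            fused[ip][field][engine] = value
    return {ip: dict(fields) for ip, fields in fused.items()}


# Hypothetical records; field names are illustrative, not the engines' schemas.
records = [
    {"ip": "192.0.2.10", "engine": "shodan", "vulns": ["CVE-2023-38408"]},
    {"ip": "192.0.2.10", "engine": "zoomeye", "honeypot": False},
    {"ip": "192.0.2.10", "engine": "censys", "vendor": "ExampleVendor"},
]
fused = fuse_metadata(records)
```
      </preformat>
      <p>Keeping the engine name next to each value is a design choice that preserves provenance, which matters when two engines disagree about the same host.</p>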
      <p>This observation inspired the definition of a preliminary methodology for expanding the knowledge
base related to a service or exposed host by combining results from multiple search engines to generate
richer and more comprehensive information by integrating different perspectives.</p>
    </sec>
    <sec id="sec-4">
      <title>4. Proposed methodology for collecting and merging of IoT metadata</title>
      <p>To outline the scope and objectives of the system, it is helpful to
create a set of competency questions (CQs) [35] [36], which describe what the system is expected to
answer or facilitate. These questions help define the scope and requirements of the system, serving as a
clear set of benchmarks that indicate what it should represent and how it should behave. Here, the goal
is to create a meta-engine to gather, merge, and harmonize the results of various IoT search engines
to provide analytical tools for users conducting vulnerability assessments. Below is a list of questions
identified as relevant, each with an example of the corresponding answer extracted
by the proposed meta-search engine, specifically tailored to an energy use case.</p>
      <p>• CQ 1: What vulnerabilities are associated with exposed hosts in a specific country?
Example: for queries targeting "ABB solar inverters" in Italy, the system identified three hosts
with several CVEs, including some recent ones, such as CVE-2023-48795<sup>1</sup> and CVE-2023-38408<sup>2</sup>,
which have also been reported by the Italian CSIRT (Computer Security Incident Response Team).
Such information could be essential for regional cybersecurity assessments.
• CQ 2: Do the identified results expose critical vulnerabilities for which exploits have
already been identified? Example: by cross-referencing IoTSE data with the CVE database, the
system flagged devices where exploits were actively discussed in security forums. For instance,
one host associated with "Modicon M340" revealed several documents and links (e.g., a GitHub
repository of exploit code) through the ontology enrichment module, demonstrating the system’s
capability to detect threats.
• CQ 3: Given a set of IP addresses, which services and products are identified by the IoT
engines and with which CPEs? Example: 12 devices tagged as End of Life (EOL) by IoTSEs
under the "SmartPOWER Energy Management System" may be at risk, as they will no longer
receive updates. Additionally, multiple firmware versions, ranging from 3.11 to 4.0.3, for the
Solarview product indicate potential vulnerability to known exploits.
• CQ 4: What weaknesses are associated with each host that could be exploited by a
malicious actor? Example: for a device affected by CVE-2022-28615, the system identified
CWE-190 (Integer Overflow) as a significant weakness. This insight highlights the possibility of
attackers using arithmetic manipulations to disrupt the host’s logic, causing resource exhaustion
or service disruption.</p>
      <p><sup>1</sup> https://www.csirt.gov.it/contenuti/la-settimana-cibernetica-del-21-aprile-2024
<sup>2</sup> https://www.csirt.gov.it/contenuti/poc-pubblico-per-lo-sfruttamento-della-cve-2023-38408-relativa-a-openssh-al01-230720csirt-ita</p>
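      <p>To make the first competency question concrete, the sketch below answers it over a tiny in-memory set of triples using plain pattern matching. It is only a stand-in for a query against the real Knowledge Base: the property name hasCountry, the helper functions, and the sample hosts are hypothetical, and the data is invented for illustration.</p>
      <preformat>
```python
def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def cves_by_country(triples, country):
    """CQ 1: map each host located in `country` to its sorted CVE ids."""
    hosts = [s for s, _, _ in match(triples, p="hasCountry", o=country)]
    return {
        host: sorted(o for _, _, o in match(triples, s=host, p="hasCVE"))
        for host in hosts
    }


# Hypothetical knowledge-base fragment (hosts use documentation-range IPs).
kb = [
    ("192.0.2.10", "hasCountry", "IT"),
    ("192.0.2.10", "hasCVE", "CVE-2023-48795"),
    ("192.0.2.10", "hasCVE", "CVE-2023-38408"),
    ("198.51.100.5", "hasCountry", "FR"),
    ("198.51.100.5", "hasCVE", "CVE-2022-28615"),
]
answer = cves_by_country(kb, "IT")
```
      </preformat>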
      <p>After defining the system’s objectives through the competency questions, the focus moves to the information
flow, which starts with a query sent to search engines and concludes with the aggregation of results in
a format suitable for further vulnerability assessment. To illustrate this flow, we present a sequence
diagram in Fig. 1 that represents the temporal flow of communications between the actors and
entities, highlighting the order in which messages are exchanged and operations are carried out.</p>
      <p>The diagram illustrates how a user query is progressively processed through the various components,
adding value or transforming the data at each step. The process concludes with the population of the
ontology and the return of results to the user through the system’s various layers in textual or graphical
form, addressing the competency questions. In more detail, the "User" initiates the interaction with
the system by submitting a query to the "Query Mapper", a component that receives the user’s query
and translates it into a format compatible with IoT search engines. The translated queries are sent to the
"IoTSEs" component, which processes them to retrieve relevant results. These are passed to a "Merging
Module" that combines and organizes the results obtained from the "IoTSEs". The merged results are
then augmented by an "Enrichment Module" with additional information from external sources, such as
CWE and CVE databases. Lastly, the "Analysis Module" is in charge of exporting the results in multiple
formats, from data visualization to CSV exports.</p>
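      <p>The flow through these components can be summarised as a chain of functions. The sketch below is a simplified illustration under stated assumptions: the function names, the quoting performed by the query mapper, the stub backends that stand in for real IoTSE API calls, and the sample data are all hypothetical.</p>
      <preformat>
```python
def query_mapper(user_query: str, engines: list) -> dict:
    """Translate one user query into one query string per engine.

    The real translation is engine specific; here it is only quoted.
    """
    return {engine: '"' + user_query + '"' for engine in engines}


def run_engines(queries: dict, backends: dict) -> list:
    """Send each translated query to its backend and tag the results."""
    results = []
    for engine, query in queries.items():
        for hit in backends[engine](query):
            results.append({"engine": engine, **hit})
    return results


def merge(results: list) -> dict:
    """Group raw hits by IP address."""
    merged = {}
    for hit in results:
        merged.setdefault(hit["ip"], []).append(hit)
    return merged


def enrich(merged: dict, cve_to_cwe: dict) -> dict:
    """Attach CWE ids to every hit that reports CVEs."""
    for hits in merged.values():
        for hit in hits:
            cves = hit.get("cves", [])
            hit["cwes"] = sorted({cve_to_cwe[c] for c in cves if c in cve_to_cwe})
    return merged


def analyse(enriched: dict) -> list:
    """Export one summary row per host, ready for CSV or plotting."""
    rows = []
    for ip, hits in sorted(enriched.items()):
        rows.append({
            "ip": ip,
            "engines": sorted({h["engine"] for h in hits}),
            "cwes": sorted({c for h in hits for c in h["cwes"]}),
        })
    return rows


# Stub backends standing in for real IoTSE API calls (hypothetical data).
backends = {
    "shodan": lambda q: [{"ip": "192.0.2.10", "cves": ["CVE-2022-28615"]}],
    "zoomeye": lambda q: [{"ip": "192.0.2.10"}],
}
queries = query_mapper("Solarview compact", list(backends))
rows = analyse(enrich(merge(run_engines(queries, backends)),
                      {"CVE-2022-28615": "CWE-190"}))
```
      </preformat>
      <p>Each stage takes the previous stage's output as its only input, which mirrors the left-to-right order of messages in the sequence diagram.</p>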
      <p>The corresponding architecture, presented in Fig. 2, is structured into two main components: (1) a
data collection pipeline responsible for querying search engines and gathering and storing the results,
and (2) a data processing pipeline tasked with transforming raw data into an ontology, further enriched
with information from external data sources related to CVEs and CWEs.</p>
      <sec id="sec-4-1">
        <title>4.1. Data collection pipeline</title>
        <p>The data collection pipeline is organized as a partially automated process in which (1) Censys is used
for an initial textual query (e.g., to search for a component under investigation), and (2) the hosts
identified by Censys are individually analyzed with all IoT engines. The selection of Censys as the
primary engine is tied to the expressiveness of its query language, which enables the development of
queries that allow for precise device fingerprinting [37]. Starting from a general query about a device
identified as potentially vulnerable, a more specific query is then developed using distinctive features
such as hashes, labels, titles, favicons, and other characteristics. Each source contributes its stream of
raw data to the central system. The middle component of the architecture hosts the IoT connector
components, which establish connections using their respective APIs to ensure the format normalization
and integrity of acquired data. The emission system manages the distribution of processed information
through multiple output modalities: a console interface for operational monitoring, a file storage system for
data persistence, Redis integration for cache management, and a MongoDB connection for definitive
information storage.</p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Data merging pipeline</title>
        <p>The next step, preliminary to ontology deployment, is the definition of a conceptual data
model (Fig. 3) which includes entities, relationships, and attributes that will be used in the logical
modeling process, particularly in defining the TBox (Terminology Box) and the ABox (Assertion Box). This
conceptual distinction allows for the separation of general domain knowledge from specific information.
To simplify the complex ontology engineering process, we detail the merging and alignment steps:</p>
        <p>Ontology Conceptualization: A conceptual model defines core entities (e.g., IoTSE Result, CVE, CWE),
their relationships (e.g., hasCPE, hasCVE), and attributes (e.g., Exploit Link, Location). Protégé [38] is
employed to translate the model in Fig. 3 into an OWL ontology, as in Fig. 4.</p>
        <p>Data Harmonization and Semantic Mapping: Semantic heterogeneity across search engine results is
addressed through alignment. Different IoT search engines may label the concept of a service using
different terms, such as "app", "category" or "product".</p>
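        <p>One simple way to apply such an alignment is a lookup table from engine-specific field names to a shared concept. The sketch below is illustrative only: the engine identifiers, the table contents, and the harmonize helper are hypothetical and do not reflect which engine actually uses which term.</p>
        <preformat>
```python
# Hypothetical alignment table: (engine, native field) -> shared concept.
# The engine/field pairs are illustrative, not the engines' actual schemas.
ALIGNMENT = {
    ("engine_a", "app"): "service",
    ("engine_b", "category"): "service",
    ("engine_c", "product"): "service",
}


def harmonize(engine: str, record: dict) -> dict:
    """Rename aligned fields to their shared concept; keep the rest as is."""
    return {
        ALIGNMENT.get((engine, field), field): value
        for field, value in record.items()
    }


aligned = harmonize("engine_a", {"app": "nginx", "port": 443})
```
        </preformat>
        <p>Keying the table on the engine as well as the field name avoids renaming a field that means something different in another engine's output.</p>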
        <p>Ontology Population: Data is imported into the ontology using a Jupyter Colab Notebook<sup>3</sup>, where
instances from IoTSE are merged with data generated from CVE and CWE databases. For instance, a
Shodan search result is transformed into an OWL instance as in Fig. 5. This approach minimizes human
intervention in aligning heterogeneous data, reducing manual effort while ensuring high accuracy.</p>
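        <p>The population step can be pictured as turning one nested search result into a set of statements about a single instance. The following sketch assumes a dotted-path flattening of nested fields and plain (subject, property, value) tuples instead of real OWL individuals; the result layout, the subject naming scheme, and the helper names are hypothetical.</p>
        <preformat>
```python
def flatten(value, prefix=""):
    """Yield (path, leaf) pairs for a nested dict/list structure."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = prefix + "." + key if prefix else key
            yield from flatten(child, path)
    elif isinstance(value, list):
        for child in value:
            yield from flatten(child, prefix)
    else:
        yield prefix, value


def to_instance(engine: str, result: dict) -> list:
    """Turn one search result into (subject, property, value) triples."""
    subject = engine + "_" + result["ip"]
    triples = [(subject, "rdf:type", "IoTSE_Result")]
    for path, leaf in flatten(result):
        triples.append((subject, path, leaf))
    return triples


# Hypothetical, heavily simplified result; field names are illustrative.
result = {
    "ip": "192.0.2.10",
    "port": 80,
    "location": {"country": "IT"},
    "cpe": ["cpe:/a:example:server:1.0"],
}
triples = to_instance("shodan", result)
```
        </preformat>
        <p>In a real pipeline the tuples would be written through an OWL or RDF library rather than kept as Python tuples; the sketch only shows the shape of the transformation.</p>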
        <p>The integration of results from different IoT search engines is complex due to their structural
differences and varying levels of nesting, as well as the type of response, which may focus on a single
port or multiple ports belonging to a host. This semantic fragmentation makes it difficult to achieve
the interoperability and effective aggregation of IoT data. During the ontology population, it was therefore
decided: (1) to maintain this structural heterogeneity when storing instances related to the analyzed IPs,
which aligns with the prototypical nature of the tool and allows for individual exploration of the results, as
well as enabling future comparisons to assess the level of detail in the responses provided; and (2) to address
the heterogeneity of IoTSE results in terms of data nesting levels and the presence of multiple subclasses
by using annotations in a flat, single-level structure for each class, rather than a complex hierarchy with
multiple subclasses. This methodological approach simplified data management and loading, facilitating
the integration and comparison of information from heterogeneous sources.</p>
        <p>The task of aligning
the structures of the results was framed within the context of ontology alignment, an area focused
on identifying and establishing semantic correspondences between equivalent concepts in different
data structures to facilitate data integration. An experimental approach was used to extend the initial
conceptual mappings, which had been drawn up manually on a sample of results. Outside the pipelines, three
LLMs<sup>4</sup> were prompted to identify analogous concepts across the results metadata, enabling the creation
of a mapping table that converts and aggregates these concepts during the querying/visualization phase
of the data. The choice to use three LLMs is motivated by the fact that LLMs, like humans, can have
different biases [39] and can be vulnerable to exploits that misguide the system [40] [41]. A possible approach
to improve the robustness and the fairness of automated judging systems is to include multiple LLM
models to form a sort of "LLM-judging committee". While LLMs provide significant support in mapping
and aligning heterogeneous IoT metadata, their limitations must be acknowledged and mitigated to
ensure the accuracy and reliability of the ontology.</p>
        <p><sup>3</sup> Configured with baseline specifications: Python 3 Google Compute Engine backend, CPU, up to 12.7 GB of system RAM, 30
GB of disk storage without any performance guarantees.
<sup>4</sup> Claude - Haiku, Gemini - 1.5 Flash, and ChatGPT - GPT-4</p>
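        <p>A committee of models can be combined in several ways; one simple option is a majority vote with unresolved cases left to the human supervisor. The sketch below assumes that rule, which is not stated in the text, and uses hypothetical model names, field names, and proposals.</p>
        <preformat>
```python
from collections import Counter


def committee_vote(proposals: dict, quorum: int = 2) -> dict:
    """Combine per-model field mappings by majority vote.

    proposals maps a model name to {source_field: proposed_concept}.
    A field is accepted when at least `quorum` models agree; every other
    field is returned with None so a human reviewer can decide.
    """
    fields = sorted({f for mapping in proposals.values() for f in mapping})
    decision = {}
    for field in fields:
        votes = Counter(
            mapping[field] for mapping in proposals.values() if field in mapping
        )
        concept, count = votes.most_common(1)[0]
        decision[field] = concept if count >= quorum else None
    return decision


# Hypothetical answers from three models for two source fields.
proposals = {
    "model_a": {"app": "service", "org": "organization"},
    "model_b": {"app": "service", "org": "asn"},
    "model_c": {"app": "product", "org": "isp"},
}
decision = committee_vote(proposals)
```
        </preformat>
        <p>Returning None for fields without agreement keeps the human in the loop exactly where the models disagree, instead of accepting an arbitrary answer.</p>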
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Case study: Photovoltaic production monitoring systems</title>
      <p>The use case for testing the methodology is the PV sector, which has seen a proliferation
of devices and monitoring tools that are accessible via the internet and controlled by end users, third parties, and
service companies. This creates a vast attack surface in which device-level threats can be used to gain access
to the power grid. Initial coordinated attacks on thousands of devices could lead to scenarios in which
energy distribution networks, from energy communities to critical infrastructures, are compromised. Eight
components were identified from evidence in the literature and in security
forums: "Solarview compact", "Enphase envoy", "Altenergy", "ABB solar inverter", "Huawei smarterlog 2000
Power System Inspire", "SmartPOWER Energy Management System", "Modicon M340", "Victron energy". For
each component, a specific query using the Censys query language is then generated and submitted to
the pipelines. The KB generated contains over 1.9 million triples (33% from IoTSE results, 66% from the
CVE database, and less than 1% from the CWE database).</p>
      <p>Several strategies with varying levels of granularity are available for accessing and visualizing the
data. Data can be accessed by focusing on single instances and search engines, as in Fig. 5, where a single
result from Shodan is shown in Protégé. In the left panel, the list of loaded instances is presented, while
the central-right panel displays the list of metadata, including the services and products identified
with CPE, the Autonomous System Name (ASN), and the date field. Data can also be accessed
through the Colab interface, allowing users to query a single IP address and to visualize content
representing the fusion of information extracted from IoTSEs, as shown by answering the competency
questions in the Appendix and highlighting end-of-life products, outdated firmware or software versions,
and CVEs. Beyond the identification of vulnerabilities, what stands out is that merging the information
generated by the advanced proprietary methodologies of each engine enables the identification of the product,
the service, the operating system, or the possible associated CPE. All collected information
potentially signals vulnerabilities, yet these indicators require meticulous verification, as they might
represent server-side patches or false positives that demand careful and systematic investigation. This
insight is useful as it can be paired and studied alongside banners, page HTML code, and other details
(such as identifying whether it is a honeypot or determining its vendor) to build a digital footprint
associated with single or multiple hosts.</p>
      <p>In the Appendix, a qualitative evaluation of the proposed solution is presented. By answering the
competency questions that guided the development of the ontology, a sample of the possible data extrapolations
is shown using SPARQL queries and Python functionalities for aggregating and visualizing the results.
For the quantitative evaluation of the ontology, three LLMs are used with a zero-shot prompting methodology,
without any task-specific training, on the basis of the ontology statistics produced
by Protégé. The three evaluations produced by the LLMs describe an ontology that is "large, data-driven,
and focused on individual instances rather than complex terminological structures. The ontology features
a significant number of instances, annotations, and axioms, which suggest a well-documented system.
However, it remains relatively simple in logical complexity, emphasizing basic hierarchical structures and
object/data properties over complex logical constructs. Despite its size, the class hierarchy is not overly
intricate, with fewer equivalent or disjoint classes." The LLMs have clearly identified the
limitations and potential of this ontology, confirming some of the objectives for the continuation of this
activity. These objectives pertain to reducing the amount of metadata associated with instances and
improving organization through relationships.</p>
    </sec>
    <sec id="sec-6">
      <title>6. Conclusions</title>
      <p>IoT search engines are valuable tools for analyzing the attack surface of infrastructures or studying
vulnerabilities associated with specific devices. This work proposes an ontological approach to
developing a prototype IoT meta-search engine that merges results from multiple IoT search engines and
external sources, such as CVE and CWE databases. It also integrates zero-shot prompting, supervised
by experts, for ontology alignment and Knowledge Base (KB) validation. The objective is to obtain
"better" and potentially more useful information for each identified host for subsequent vulnerability
assessment and risk mitigation stages. This work presents a unique perspective not previously explored
in terms of the number of properties analyzed and involved in IoT search engines. The proposed IoT
meta-search engine has been evaluated on a use case of devices for the production and monitoring of
photovoltaic systems. The generated KB contains over 1.9 million triples, and each identified host can
be characterized by up to 175 properties. This work has identified several indicators of potential
vulnerabilities, unpatched systems, and discontinued products. However, our primary focus was evaluating
the technical feasibility of incorporating these IoT findings into an ontology-driven solution. This will
help establish a foundation for more comprehensive assessments in future studies. From the obtained
results, in addition to specific analyses on a single host, more complex reasoning can be developed, such
as examining the presence of a vulnerable device within a particular country, generating aggregated
statistics for devices, and preliminarily evaluating the impact and the reliability of integrating a panel
of LLMs in the process of ontology engineering.</p>
    </sec>
    <sec id="sec-7">
      <title>7. Acknowledgments</title>
      <p>The authors would like to express gratitude to Shodan, Censys, Zoomeye, Fofa, Binaryedge, Netlas,
and Vulners for their collaboration and generosity in providing access to their search engines and
vulnerability scanners. This work is original and has been partly supported by a collaboration between
RSE S.p.A. and Fondazione Ugo Bordoni, financed by the Research Fund for the Italian Electrical System
under the Three-Year Research Plan 2022-2024 (DM MITE n. 337, 15.09.2022), in compliance with the
Decree of April 16th, 2018. The authors would like to thank our colleagues Claudio Carpineto and
Gianni Romano from Fondazione Ugo Bordoni for their help in the preliminary activities related to the
ranking merging scenario.</p>
    </sec>
    <sec id="sec-8">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Grammarly for grammar and spelling
checks. After using this tool, the authors reviewed and edited the content as needed and take full
responsibility for the publication’s content.</p>
      <sec id="sec-8-1">
        <title>Ranking merging</title>
        <p>To evaluate the overlap of search engine results and understand their coverage and similarity, we
created 23 general search queries. We submitted the queries to the IoTSEs, collecting and documenting
the returned results. Table 1 shows the results of the comparison between the five search engines, as
well as between combinations of the engines. It should be noted that at the time of this analysis, an
agreement had not yet been reached with Netlas for its use in the project activities, and therefore it was
not included among the engines analyzed. The 'intersection' column indicates the IP addresses common
to all five engines. For each engine, the number of IPs found is listed, with the number of distinct IPs
for that engine in parentheses. For example, for query 18, Binaryedge reports 500 IPs, but only 50 are
distinct. The intersection among all the engines is calculated only on the distinct IPs. The intersection
of the rankings yielded very few matches. We suppose this is primarily due to differences
in evaluation criteria and result aggregation methods. Each engine may assign a different relevance
score to devices depending on its search algorithm, the quality of available data, or the priorities set by
its ranking system. Moreover, IoT search engines may operate on different network architectures and
use varying methods for data collection, further influencing the consistency of the results.</p>
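        <p>The counting scheme described above (total IPs, distinct IPs, and the intersection over distinct IPs) can be sketched as follows. This is a minimal illustration, not the code used in the study: the engine names, the helper functions, and the IP addresses (taken from a documentation-only range) are all assumptions.</p>

```python
from functools import reduce


def distinct_ips(results):
    """Return the set of distinct IP addresses in one engine's result list."""
    return set(results)


def summarize(per_engine):
    """Return per-engine (total, distinct) counts and the IPs common to all engines."""
    counts = {
        name: (len(ips), len(distinct_ips(ips)))
        for name, ips in per_engine.items()
    }
    # The intersection is computed on distinct IPs only, as in the text.
    common = reduce(
        set.intersection,
        (distinct_ips(ips) for ips in per_engine.values()),
    )
    return counts, common


# Hypothetical results for a single query (placeholder data, not from Table 1).
results = {
    "engine_a": ["192.0.2.1", "192.0.2.2", "192.0.2.2", "192.0.2.3"],
    "engine_b": ["192.0.2.2", "192.0.2.4"],
    "engine_c": ["192.0.2.2", "192.0.2.3", "192.0.2.5"],
}

counts, common = summarize(results)
print(counts["engine_a"])  # (4, 3)
print(sorted(common))      # ['192.0.2.2']
```

        <p>Because every engine must contain an address for it to survive the intersection, a single engine with sparse coverage is enough to shrink the common set, which is consistent with the small overlaps reported.</p>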
      </sec>
      <sec id="sec-8-2">
        <title>Qualitative Evaluation: Answers to Competency Questions</title>
        <p>CQ 1: What vulnerabilities are associated with exposed hosts in a specific country?</p>
        <p>Fig. 6 shows a table listing the IP address, domain, and ASN (Autonomous System Name) related to the
"ABB solar inverter" in Italy. Regarding the identified CVEs, it is worth noting that CVEs were
identified on only three out of seven hosts, exclusively by the Shodan engine. Moreover, the three identified
hosts show an identical distribution of vulnerabilities, which could be a topic for further investigation.</p>
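        <p>A check of this kind, filtering hosts by country, collecting the CVEs reported per engine, and grouping hosts that share an identical CVE set, might look like the sketch below. The record layout, field names, and CVE identifiers are placeholders chosen for illustration; they do not reflect the actual KB schema or the hosts in Fig. 6.</p>

```python
from collections import defaultdict

# Hypothetical merged host records; field names and values are illustrative only.
hosts = [
    {"ip": "192.0.2.10", "country": "IT",
     "cves": {"shodan": ["CVE-0000-0001", "CVE-0000-0002"]}},
    {"ip": "192.0.2.11", "country": "IT",
     "cves": {"shodan": ["CVE-0000-0002", "CVE-0000-0001"]}},
    {"ip": "192.0.2.12", "country": "IT", "cves": {}},
    {"ip": "192.0.2.13", "country": "FR",
     "cves": {"shodan": ["CVE-0000-0003"]}},
]


def cves_by_host(records, country):
    """Map each host in `country` to (set of CVEs, set of engines reporting them)."""
    out = {}
    for rec in records:
        if rec["country"] != country:
            continue
        cves = {c for found in rec["cves"].values() for c in found}
        engines = {e for e, found in rec["cves"].items() if found}
        out[rec["ip"]] = (cves, engines)
    return out


def identical_profiles(per_host):
    """Group hosts that share exactly the same non-empty CVE set."""
    groups = defaultdict(list)
    for ip, (cves, _engines) in per_host.items():
        if cves:
            groups[frozenset(cves)].append(ip)
    return [sorted(ips) for ips in groups.values() if len(ips) > 1]


per_host = cves_by_host(hosts, "IT")
print(identical_profiles(per_host))  # [['192.0.2.10', '192.0.2.11']]
```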
        <p>CQ 2: Do the identified results expose vulnerabilities considered dangerous for which
exploits have already been identified?</p>
        <p>To answer this question, it is possible to correlate metadata such as the number of CVEs, the presence of an exploit,
and links to online resources explaining how to use such an exploit, as in Fig. 7 for the "Modicon M340" query.
The direct link to the exploit is available in the KB, but it falls outside the scope of this work.</p>
        <p>CQ 3: Given a set of IP addresses, which services and products are identified by the IoT
engines and with which CPEs?</p>
        <p>The merged results combine the methodologies used by IoTSE to classify and describe devices,
services, or systems detected during network scanning using tags. The analysis can be broad, as in Fig. 8,
where all products identified for a query are shown. Alternatively, one can conduct more sophisticated
analysis by examining specific characteristics as in Fig. 9 where 12 devices identified by searching for
"SmartPOWER Energy Management System" (column 4 in the chart) are tagged as EOL (End of Life),
indicating they will no longer receive updates and may be at risk. Another investigative approach
involves examining firmware and software versions to identify older editions. Fig. 10 highlights partial
details of CPEs (standard or enhanced 2.3 versions), showing firmware versions and occurrences for the
query Solarview from firmware version 3.11 up to 4.0.3. These indicators serve as a starting point for
deeper security analysis. However, such findings require rigorous validation, as they may have been
patched on the server side through backporting of security updates. Nonetheless, they offer a valuable
external reconnaissance snapshot of an internet-connected device’s potential vulnerabilities.</p>
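        <p>Extracting firmware versions from CPE 2.3 formatted strings and counting their occurrences can be sketched as follows. The CPE values are invented placeholders (only the version numbers echo the range mentioned above), the helper names are assumptions, and the parser deliberately ignores escaped colons, which a complete implementation would need to handle.</p>

```python
from collections import Counter


def cpe_version(cpe):
    """Return the version field of a CPE 2.3 formatted string, or None if unset."""
    parts = cpe.split(":")  # simplification: escaped colons are not handled
    if len(parts) != 13 or parts[:2] != ["cpe", "2.3"]:
        return None
    version = parts[5]
    return None if version in ("*", "-") else version


def version_key(version):
    """Sort key that compares dotted versions numerically, not as text."""
    return tuple(int(p) if p.isdigit() else 0 for p in version.split("."))


def version_histogram(cpes):
    """Count occurrences per version, ordered from oldest to newest."""
    counts = Counter(v for v in map(cpe_version, cpes) if v is not None)
    return sorted(counts.items(), key=lambda item: version_key(item[0]))


# Placeholder CPE strings; vendor and product names are hypothetical.
cpes = [
    "cpe:2.3:o:example:fw:4.0.3:*:*:*:*:*:*:*",
    "cpe:2.3:o:example:fw:3.11:*:*:*:*:*:*:*",
    "cpe:2.3:o:example:fw:3.2:*:*:*:*:*:*:*",
    "cpe:2.3:o:example:fw:3.11:*:*:*:*:*:*:*",
    "cpe:2.3:o:example:fw:*:*:*:*:*:*:*:*",
]

print(version_histogram(cpes))  # [('3.2', 1), ('3.11', 2), ('4.0.3', 1)]
```

        <p>The numeric sort key matters here: a plain string sort would place 3.11 before 3.2 and misrepresent which firmware editions are older.</p>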
        <p>CQ 4: What weaknesses are associated to an IP address that could be exploited by a malicious
actor?</p>
        <p>This question can be addressed by exploring individual IoT results using the Protégé interface or
querying the knowledge base (KB). The process involves extracting all CVEs associated with the
specified IP address and using the ontology’s hasCWE relationship to cross-reference and identify the
corresponding weaknesses linked to each CVE.</p>
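        <p>The two-step lookup described above (host to CVEs, then CVEs to CWEs) can be illustrated over a toy set of triples. Only the hasCWE relationship is named in the text; the hasCVE predicate, the host identifiers, the CVE identifiers, and the CVE-to-CWE mappings below are placeholders, and a real query would run against the KB with SPARQL rather than over a Python list.</p>

```python
from collections import Counter

# Toy triples (subject, predicate, object). `hasCWE` is named in the text;
# `hasCVE` and every identifier and mapping below are illustrative placeholders.
triples = [
    ("host:192.0.2.10", "hasCVE", "CVE-0000-0001"),
    ("host:192.0.2.10", "hasCVE", "CVE-0000-0002"),
    ("host:192.0.2.10", "hasCVE", "CVE-0000-0003"),
    ("host:192.0.2.20", "hasCVE", "CVE-0000-0004"),
    ("CVE-0000-0001", "hasCWE", "CWE-190"),
    ("CVE-0000-0002", "hasCWE", "CWE-190"),
    ("CVE-0000-0003", "hasCWE", "CWE-79"),
    ("CVE-0000-0004", "hasCWE", "CWE-20"),
]


def objects(graph, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def weaknesses_for_host(graph, host):
    """Follow host -> CVE -> CWE and count how often each CWE occurs."""
    counts = Counter()
    for cve in objects(graph, host, "hasCVE"):
        counts.update(objects(graph, cve, "hasCWE"))
    return counts.most_common()


print(weaknesses_for_host(triples, "host:192.0.2.10"))
# [('CWE-190', 2), ('CWE-79', 1)]
```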
        <p>The most frequent CWE identified (Fig. 11) is CWE-190 ("Integer Overflow or
Wraparound"), which occurs when an arithmetic operation produces a result too large for the
assigned integer variable to hold. An attacker exploiting this weakness can manipulate application logic
so that standard program flow is disrupted, potentially leading to infinite loops,
system resource exhaustion, or complete application failure.</p>
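        <p>As a worked illustration of the wraparound, the sketch below emulates unsigned 32-bit addition with a modulo operation (Python integers do not overflow on their own). The variable names, sizes, and the length check are hypothetical and are not taken from any device analyzed in this work.</p>

```python
UINT32_MODULUS = 2 ** 32


def add_uint32(a, b):
    """Add two values the way a fixed-width unsigned 32-bit integer would."""
    return (a + b) % UINT32_MODULUS


header_size = 16
payload_size = UINT32_MODULUS - 8   # attacker-chosen, very large length field
buffer_limit = 1024

total = add_uint32(header_size, payload_size)
print(total)                  # 8
print(buffer_limit >= total)  # True: the check passes despite the huge payload
```

        <p>The computed total wraps around to a small number, so a size check based on it succeeds even though the declared payload is far larger than the buffer.</p>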
      </sec>
    </sec>
  </body>
  <back>
  </back>
</article>