<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Orchestration and Energy-Optimized Data Management across the Edge-Cloud Continuum: The GLACIATION Approach</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Aidan O'Mahony</string-name>
          <email>aidan.omahony@dell.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Pournima Sonawane</string-name>
          <email>pournima.sonawane@dell.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shraddha Gupta</string-name>
          <email>shraddha.gupta@dell.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Applied Research</institution>
          ,
          <addr-line>Dell Technologies, Cork</addr-line>
          ,
          <country country="IE">Ireland</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Distributed Knowledge Graphs</institution>
          ,
          <addr-line>Data Placement, Workload Orchestration, AI Scheduling, SPARQL with LLMs</addr-line>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>SEMANTiCS'25: International Conference on Semantic Systems</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>The rapid growth of data-intensive applications across the edge-cloud continuum presents a dual challenge: immense energy consumption and complex data governance. This paper presents the GLACIATION project, which tackles these issues through a platform for energy-efficient and privacy-preserving data operations. We showcase an integrated approach that combines a Distributed Knowledge Graph (DKG) with AI-driven orchestration to automate and optimize data management. Key innovations include the Green Index, a real-time metric for sustainable energy-aware workload shifting; a bio-inspired scheduling engine using Ant Colony Optimization; and a lightweight, zero-shot Large Language Model (LLM) interface that enables semantic exploration of the DKG via natural language. The effectiveness of this framework is demonstrated through its validation in real-world industrial and public sector pilot deployments.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The proliferation of IoT and data-intensive applications is driving a shift towards the edge-cloud
computing continuum, a paradigm that introduces significant challenges related to data governance,
latency, and energy consumption. As data generation at the edge explodes, the need for intelligent,
automated, and efficient data operations becomes critical. Traditional cloud-centric models are often
ill-equipped to handle the complex trade-offs between processing data locally to reduce latency and
moving it to centralized clouds for powerful analytics, all while adhering to strict privacy regulations
and minimizing the environmental footprint.</p>
      <p>This paper outlines the key innovations of the GLACIATION platform. We begin by describing the
project’s architecture and the core semantic metadata model (GLC-MRM) that enables interoperability.
We then detail the platform’s AI-driven optimization capabilities, including the use of Ant Colony
Optimization for data discovery, a bi-level scheduling approach for managing trade-offs, and a lightweight
Large Language Model (LLM) interface for democratizing data access. Finally, we present the validation
of our approach through real-world industrial pilots, with a focus on the Green Index for sustainable,
energy-aware orchestration in the energy sector.</p>
      <p>GLACIATION (Green responsibLe privACy preservIng dAta operaTIONS) is a Horizon Europe project
designed to address the significant energy consumption and carbon emissions resulting from the rapid
growth of big data analytics across the edge-to-cloud continuum. The project’s central ambition is to
create a platform for energy-efficient, privacy-preserving data operations. The core of this initiative is
the development of a novel Distributed Knowledge Graph (DKG) that spans the entire edge-core-cloud
architecture. By leveraging AI-enforced minimal data movement and optimizing the physical location of
analytics and data storage, GLACIATION aims to achieve substantial reductions in power consumption.</p>
      <p>The technical approach revolves around a modular, microservices-based platform that supports the
DKG. The conceptual architecture, illustrated in Figure 1, comprises key components including
a Metadata service to manage data annotations, a Trade-off service to balance latency and resource
use, and a Prediction service for forecasting data popularity and workload needs. A vital element is
the project’s metadata framework, which provides the tools to embed privacy and trust requirements
directly into data operations. The efficacy and generality of the GLACIATION platform will be validated
through four demanding, real-world use cases in the public-service, manufacturing, enterprise, and
energy sectors, led by partners MEF/Sogei, Dell, SAP, and IPTO, respectively.<sup>1</sup></p>
    </sec>
    <sec id="sec-2">
      <title>3. Semantic Metadata Model</title>
      <p>
        GLACIATION introduces the GLACIATION Metadata Reference Model (GLC-MRM) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], a flexible and
extensible ontology designed to enable interoperable data and metadata representation within a
Distributed Knowledge Graph (DKG). Founded on Linked Data principles, the GLC-MRM uses standards
like RDF, RDFS, and OWL to formalize a generic conceptualization of a task scheduling and resource
monitoring environment. The core of the model consists of primary classes such as Task, Resource,
Constraint, and Measurement, which describe the assignment of tasks to resources under specific
hard or soft constraints, while monitoring their real-time and predicted performance.
      </p>
      <p>The GLC-MRM is designed for extensibility through a use-case-driven specialization methodology,
where the core ontology is adapted with modular vocabularies to describe specific assets, services, and
contexts. This includes detailed specializations for Kubernetes resources (nodes, pods, workloads),
IoT devices, and crucial energy metadata such as the Green Index. To ensure broad interoperability,
the model is formally mapped to established external ontologies, including the DCAT vocabulary for
describing datasets and the ETSI SAREF ontology for modeling devices and energy-related measurements.
This semantic framework, supported by technologies like SHACL for data validation and JSON-LD for
lightweight data exchange, enables both declarative SPARQL queries and automated policy enforcement,
forming the cornerstone of the platform’s orchestration capabilities.</p>
      <p><sup>1</sup>The source code and use cases of the GLACIATION platform are available at https://github.com/glaciation-heu</p>
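      <p>As a minimal sketch of how the core GLC-MRM classes could be expressed as RDF triples, the following Python fragment describes one task assigned to one resource under a hard constraint and serializes the result as N-Triples. The namespace http://example.org/glc# and the property names assignedTo and hasConstraint are hypothetical placeholders chosen for illustration; the actual GLC-MRM identifiers are defined by the reference model and are not reproduced here.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass

# Hypothetical namespace and property names: the real GLC-MRM IRIs are not given in the text.
GLC = "http://example.org/glc#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str

    def to_ntriples(self) -> str:
        """Serialize as one N-Triples line (all three terms are treated as IRIs)."""
        return f"<{self.subject}> <{self.predicate}> <{self.obj}> ."


def describe_assignment(task_id: str, resource_id: str, constraint_id: str, hard: bool) -> list:
    """Describe a task assigned to a resource under one hard or soft constraint."""
    task = GLC + task_id
    resource = GLC + resource_id
    constraint = GLC + constraint_id
    kind = "HardConstraint" if hard else "SoftConstraint"
    return [
        Triple(task, RDF_TYPE, GLC + "Task"),
        Triple(resource, RDF_TYPE, GLC + "Resource"),
        Triple(constraint, RDF_TYPE, GLC + kind),
        Triple(task, GLC + "assignedTo", resource),
        Triple(task, GLC + "hasConstraint", constraint),
    ]


triples = describe_assignment("task-42", "node-a", "max-latency", hard=True)
for triple in triples:
    print(triple.to_ntriples())
```
]]></preformat>
      <p>In practice an RDF library would handle literals, blank nodes, and prefixes; plain strings are used here only to keep the sketch dependency-free.</p>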
    </sec>
    <sec id="sec-3">
      <title>4. Semantic-Based Optimization</title>
      <p>
        The GLACIATION optimization strategy is centered on a Distributed Knowledge Graph (DKG), which
serves as the foundation for a novel metadata fabric codenamed IceStream. The approach taken is
inspired by the Agri-Gaia project [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. The DKG’s primary objective is to furnish a real-time,
comprehensive, and global perspective on data distributed across the network, encompassing cluster information,
resource usage, and application status. This is achieved by representing all entities—such as nodes,
services, data, and their associated properties—within a graph-based data model built on W3C standards
like the Resource Description Framework (RDF) and Web Ontology Language (OWL). This approach
creates an expressive, machine-readable semantic layer that describes the entire state of the
edge-cloud continuum, enabling AI-driven decisions to optimize resource allocation and minimize energy
consumption.
      </p>
      <p>This rich semantic representation facilitates powerful optimization mechanisms through
standardized protocols. The architecture supports declarative, decision-making queries via the SPARQL query
language, allowing the orchestration engine to retrieve precise metadata for various microservices. For
instance, the Trade-off Service queries the DKG to acquire metadata on latency, resource usage, energy,
privacy, and security to inform workload placement decisions. Furthermore, the system leverages the
Shapes Constraint Language (SHACL) to validate RDF data against a set of predefined conditions and
policies, ensuring data quality, consistency, and compliance with domain-specific constraints. This
combination of semantic querying and constraint validation provides a robust framework for
implementing and enforcing contextual policies, enabling dynamic, automated, and intelligent optimization
across the platform.</p>
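      <p>The interplay of querying and constraint validation can be illustrated with a dependency-free Python sketch. It is not SPARQL or SHACL: match emulates a single triple pattern with variables, and violations_min_count emulates a minimum-cardinality rule of the kind SHACL can express. The prefixed names (ex:Node, ex:latencyMs) and the sample graph are assumptions made for the example.</p>
      <preformat><![CDATA[
```python
def match(graph, pattern):
    """Return variable bindings for one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable that appears twice must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


def violations_min_count(graph, target_class, required_predicate):
    """Constraint check: every instance of target_class needs a value for required_predicate."""
    instances = [b["?s"] for b in match(graph, ("?s", "rdf:type", target_class))]
    return [s for s in instances if not match(graph, (s, required_predicate, "?v"))]


# Hypothetical sample data: two nodes, only one of which reports a latency.
graph = [
    ("ex:node-a", "rdf:type", "ex:Node"),
    ("ex:node-a", "ex:latencyMs", "12"),
    ("ex:node-b", "rdf:type", "ex:Node"),
]

latencies = match(graph, ("?node", "ex:latencyMs", "?latency"))
missing = violations_min_count(graph, "ex:Node", "ex:latencyMs")
print(latencies)
print(missing)
```
]]></preformat>
      <p>A placement service built this way would query for the metadata it needs and refuse to act on nodes that fail validation, which mirrors the division of labour between querying and validation described above.</p>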
    </sec>
    <sec id="sec-4">
      <title>5. AI-Driven Scheduling and Trade-Off</title>
      <p>
        GLACIATION employs a bio-inspired approach for distributed data discovery and movement, utilizing
a variant of Ant Colony Optimization (ACO) [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. In this model, search queries generate forward ants
that traverse the Distributed Knowledge Graph (DKG) following pheromone trails. These trails are
dynamically laid by backward ants, which retrace the path of successful queries, reinforcing routes
that lead to desired RDF triples. Beyond discovery, the established pheromone gradients guide a data
movement strategy, relocating frequently accessed data closer to its point of demand. This process
improves search efficiency and hit rates by optimizing the physical location of data within the
edge-fog-cloud architecture, thereby reducing network traffic and query latency.
      </p>
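      <p>The forward and backward ant mechanism can be sketched in a few lines of Python. This is a generic illustration of pheromone-guided search, not the variant used in GLACIATION: the class PheromoneTable, the deposit and evaporation values, and the four-node topology are all assumptions made for the example.</p>
      <preformat><![CDATA[
```python
import random


class PheromoneTable:
    """Pheromone levels on directed edges (node, neighbour); parameter values are illustrative."""

    def __init__(self, initial=1.0, evaporation=0.1):
        self.initial = initial
        self.evaporation = evaporation
        self.trails = {}

    def level(self, node, neighbour):
        return self.trails.get((node, neighbour), self.initial)

    def choose_next(self, node, neighbours, rng):
        """Forward ant: pick a neighbour with probability proportional to its pheromone level."""
        weights = [self.level(node, n) for n in neighbours]
        return rng.choices(neighbours, weights=weights, k=1)[0]

    def reinforce(self, path, deposit=1.0):
        """Backward ant: retrace a successful path and deposit pheromone on every hop."""
        for node, neighbour in zip(path, path[1:]):
            self.trails[(node, neighbour)] = self.level(node, neighbour) + deposit

    def evaporate(self):
        """Decay all recorded trails so that stale routes fade over time."""
        for edge in self.trails:
            self.trails[edge] *= 1.0 - self.evaporation


def forward_ant(graph, holders, start, table, rng, max_hops=10):
    """Walk from start until a node holding the wanted triple is reached; return the path or None."""
    path = [start]
    for _ in range(max_hops):
        if path[-1] in holders:
            return path
        neighbours = [n for n in graph[path[-1]] if n not in path]
        if not neighbours:
            return None
        path.append(table.choose_next(path[-1], neighbours, rng))
    return path if path[-1] in holders else None


# Hypothetical topology: the wanted triple is stored on fog-2 only.
graph = {"edge-1": ["fog-1", "fog-2"], "fog-1": ["cloud"], "fog-2": ["cloud"], "cloud": []}
table = PheromoneTable()
rng = random.Random(0)
for _ in range(50):
    found = forward_ant(graph, {"fog-2"}, "edge-1", table, rng)
    if found is not None:
        table.reinforce(found)
    table.evaporate()
print(table.level("edge-1", "fog-1"), table.level("edge-1", "fog-2"))
```
]]></preformat>
      <p>The same pheromone levels could serve as a demand signal for data movement: an edge that accumulates pheromone indicates that the data behind it is requested often from that direction.</p>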
      <p>
        The scheduling and orchestration logic is framed as a Bi-level Multi-Objective Optimization Problem
(BMO-TSLB), addressed by an Improved Multi-Objective Ant Colony Optimization (IMOACO)
algorithm [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. At the upper level, the system optimizes for makespan, cost, and energy consumption of tasks.
This is dependent on the lower-level optimization, which focuses on load balancing by minimizing
response time and maximizing resource utilization across available fog nodes. A dedicated Trade-Off
Service orchestrates this process, reasoning over the often-conflicting objectives of performance, power
efficiency, and policy compliance. By integrating the outputs of the ACO-based data discovery with
workload predictions, the service selects optimal workload placements that satisfy the multi-faceted
constraints of the system.
      </p>
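      <p>To make the notion of conflicting upper-level objectives concrete, the sketch below filters candidate placements down to the non-dominated (Pareto-optimal) set over makespan, cost, and energy. It illustrates only the comparison of objectives, not the IMOACO algorithm or the lower-level load balancing; the Placement type, its field names, and the sample values are assumptions.</p>
      <preformat><![CDATA[
```python
from typing import NamedTuple


class Placement(NamedTuple):
    """A candidate workload placement with three objectives, all to be minimized."""
    name: str
    makespan_s: float
    cost: float
    energy_kwh: float


def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    a_obj, b_obj = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(a_obj, b_obj))
    better = any(x < y for x, y in zip(a_obj, b_obj))
    return no_worse and better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical candidates: C is worse than A in all three objectives.
candidates = [
    Placement("A", makespan_s=10.0, cost=5.0, energy_kwh=2.0),
    Placement("B", makespan_s=12.0, cost=4.0, energy_kwh=2.0),
    Placement("C", makespan_s=12.0, cost=6.0, energy_kwh=3.0),
]
front = pareto_front(candidates)
print([p.name for p in front])
```
]]></preformat>
      <p>Choosing a single placement from the resulting front still requires a policy, for example a weighting of the objectives or a hard constraint on one of them.</p>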
    </sec>
    <sec id="sec-5">
      <title>6. Interfacing with LLMs</title>
      <p>
        To democratize access to the Distributed Knowledge Graph (DKG), we introduce a framework for
question answering that leverages Large Language Models (LLMs) in a zero-shot setting [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. This
approach enables non-technical domain experts to query the DKG using natural language, eliminating
the need for familiarity with SPARQL or the underlying KG schema. The framework operates by
taking a user’s question and automatically extracting relevant context, such as pertinent class types
and predicates from the KG schema. This contextual information, along with the natural language
question, is then passed to an LLM, which generates a corresponding SPARQL query. This initial query
can be further refined through an iterative enhancement loop, where the LLM itself parses, validates,
and improves the generated query, thereby increasing the likelihood of a successful execution without
requiring pre-existing question-query training pairs or model fine-tuning.
      </p>
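      <p>The pipeline described above (context extraction, prompt construction, query generation, and an enhancement loop) can be sketched as follows. This is an illustrative outline under several assumptions: the language model is passed in as a plain callable, the word-overlap heuristic in extract_context and the syntactic check in looks_valid are simple stand-ins, and fake_llm is a stub used only so that the example runs without a model.</p>
      <preformat><![CDATA[
```python
import re


def extract_context(question, schema_terms):
    """Keep schema terms that share a word with the question (a naive stand-in for context extraction)."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    selected = []
    for term in schema_terms:
        # Split prefixed camelCase names such as glc:hasEnergy into glc, has, energy.
        tokens = {t.lower() for t in re.findall(r"[A-Za-z][a-z]*", term)}
        if words & tokens:
            selected.append(term)
    return selected


def build_prompt(question, context):
    return "\n".join([
        "Write a SPARQL query for the question below.",
        "Relevant classes and predicates: " + ", ".join(context),
        "Question: " + question,
    ])


def looks_valid(query):
    """Very rough syntactic check; a real system would parse the query properly."""
    has_select = re.search(r"\bSELECT\b", query, re.IGNORECASE) is not None
    balanced = "{" in query and query.count("{") == query.count("}")
    return has_select and balanced


def answer(question, schema_terms, llm, max_rounds=3):
    """Generate a query, then ask the model to repair it until it passes the check."""
    query = llm(build_prompt(question, extract_context(question, schema_terms)))
    for _ in range(max_rounds):
        if looks_valid(query):
            break
        query = llm("Fix this SPARQL query so that it parses:\n" + query)
    return query


def fake_llm(prompt):
    """Stub model: returns a broken query first and a repaired one when asked to fix it."""
    if prompt.startswith("Fix"):
        return "SELECT ?n WHERE { ?n a glc:Node }"
    return "SELECT ?n WHERE { ?n a glc:Node"


schema = ["glc:Node", "glc:hasEnergy", "glc:Task"]
result = answer("Which node uses the most energy?", schema, fake_llm)
print(result)
```
]]></preformat>
      <p>Because the model is injected as a callable, the same loop can be exercised with a stub in tests and with a hosted or local model in deployment.</p>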
      <p>Our preliminary evaluations demonstrate the viability and effectiveness of this approach. For the
task of generating SPARQL queries from natural language questions, using models such as Llama2-7B,
Llama3-8B, and Mistral-7B, we observed that increasing the number of examples (n-shot prompting, n ≥ 1)
generally leads to significant improvements in BERT scores (Precision, Recall, and F1) for pretrained
LLMs compared to a 0-shot setting. Notably, a fine-tuned Mistral-7B model (Mistral-7B FT), even when
fine-tuned on a different dataset, achieved the best overall F1 score of 0.9671 with 5-shot prompting.
This fine-tuned model also significantly improved performance in a 0-shot setting compared to its
non-fine-tuned counterpart, indicating its adaptability to new datasets for the SPARQL2Q task. The query
enhancer component within our framework has been particularly effective in improving
the quality of generated queries. For example, the average Acc@10 score (the fraction of correct answers
out of 10 runs) increased from 0.22 (without enhancer) to 0.57 (with enhancer) across a set of 30 questions.
For specific complex questions, Acc@10 improved from zero to 0.9 and 1.0, respectively, with the
enhancer.</p>
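      <p>For clarity, the Acc@10 metric as defined above can be computed as in the following sketch. The function names and the sample outcomes are hypothetical and are not taken from the evaluation.</p>
      <preformat><![CDATA[
```python
def acc_at_k(run_outcomes):
    """Fraction of runs that produced a correct answer for one question."""
    return sum(1 for ok in run_outcomes if ok) / len(run_outcomes)


def mean_acc_at_k(per_question_outcomes):
    """Average the per-question accuracy over all questions."""
    scores = [acc_at_k(runs) for runs in per_question_outcomes]
    return sum(scores) / len(scores)


# Hypothetical outcomes for two questions, 10 runs each.
question_1 = [True] * 5 + [False] * 5
question_2 = [True] * 10
print(acc_at_k(question_1), mean_acc_at_k([question_1, question_2]))
```
]]></preformat>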
    </sec>
    <sec id="sec-6">
      <title>7. Real-World Evaluation</title>
      <p>The GLACIATION framework is validated across three diverse, real-world pilot deployments,
demonstrating its practical applicability in industrial and public sector settings. These use cases serve to test
and measure the effectiveness of the platform’s core capabilities, including energy-aware workload
distribution, privacy-preserving data orchestration, and semantic policy enforcement.</p>
      <p>
        The pilots showcase the integrated functionality of the GLACIATION approach. At the Independent
Power Transmission Operator (IPTO) in Greece, the system utilizes the Green Index derived from
real-time SCADA data to perform energy-aware workload distribution for grid anomaly detection tasks
across three interconnected data centers [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Once computed, the Green Index values, along with their temporal and
spatial context, are represented in the DKG as RDF triples following the GLC-MRM ontology.
      </p>
      <p>
        In a smart manufacturing scenario in Ireland, the platform manages privacy-preserving orchestration
using data from robotic and IoT sensors, enforcing data sovereignty and access control policies to govern
data movement and computation. A third pilot with MEF/Sogei in Italy focuses on optimizing data
movement and energy consumption in a public administration context. Across these deployments, the
Open Policy Agent (OPA) framework is utilized to implement semantic policy control, with real-time
dashboards providing visibility into performance and compliance [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <sec id="sec-6-1">
        <title>7.1. Example Pilot - Green-Aware Orchestration at IPTO</title>
        <p>To facilitate green-aware workload placement, GLACIATION partner IPTO introduces the Green Index,
a supply-side metric that quantifies the availability of sustainable energy at potential execution sites.
The index is designed to be comparable across different locations and times, enabling the orchestrator
to dynamically steer computational tasks toward sites and times with higher green energy availability,
thereby minimizing environmental cost. It is derived directly from real-time Supervisory Control and
Data Acquisition (SCADA) telemetry data provided by the Transmission System Operator (TSO), which
monitors the power flow at grid substations.</p>
        <p>The calculation of the Green Index begins with the net power balance at a substation, which indicates
whether the site is a net exporter or importer of energy. This local value is then combined with the
national ratio of renewable energy sources (RES) to produce a RES-adjusted power balance. To
ensure comparability, this value is normalized using a signed percentile rank over a one-week historical
window and scaled to a final range of 0 to 1. The resulting index is strongly correlated with national
CO<sub>2</sub> emissions, where a higher index value (approaching 1) signifies greater availability of green energy
and signals optimal conditions for executing workloads.</p>
      </sec>
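      <p>The normalization steps of the Green Index can be sketched as follows. The text does not give the exact formulas, so this fragment rests on explicit assumptions: the RES adjustment is modelled as a simple product of net balance and RES ratio, positive balances denote net export, the signed percentile rank compares a value only with historical values of the same sign, and the sample history stands in for one week of SCADA readings. The published definition in the cited Green Index paper may differ.</p>
      <preformat><![CDATA[
```python
def res_adjusted_balance(net_balance_mw, res_ratio):
    """Assumed combination: weight the local net balance by the national RES share."""
    return net_balance_mw * res_ratio


def signed_percentile_rank(value, history):
    """Rank |value| among historical values of the same sign; the result lies in [-1, 1]."""
    if value == 0:
        return 0.0
    same_sign = [abs(h) for h in history if h * value > 0]
    if not same_sign:
        # Assumption: with no comparable history, treat the value as extreme.
        return 1.0 if value > 0 else -1.0
    rank = sum(1 for h in same_sign if h <= abs(value)) / len(same_sign)
    return rank if value > 0 else -rank


def green_index(value, history):
    """Scale the signed rank from [-1, 1] to the final range [0, 1]."""
    return (signed_percentile_rank(value, history) + 1.0) / 2.0


# Hypothetical RES-adjusted balances standing in for a one-week window.
history = [-40.0, -10.0, 5.0, 20.0, 35.0, 50.0]
for current in (20.0, 0.0, -40.0):
    print(current, green_index(current, history))
```
]]></preformat>
      <p>Under these assumptions a site that exports more than it usually does scores close to 1, a balanced site scores 0.5, and a site importing unusually heavily scores close to 0.</p>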
    </sec>
    <sec id="sec-7">
      <title>8. Conclusion</title>
      <p>This paper presented the GLACIATION framework, a platform for energy-optimized and
semantically aware data management. We demonstrated how three integrated technologies combine into an
intelligent orchestration platform: a Distributed Knowledge Graph (DKG) based on the GLC-MRM ontology,
an AI-driven scheduling engine using Ant Colony Optimization, and the Green Index.</p>
      <p>The successful validation highlights the potential of semantic technologies to solve complex
optimization problems in distributed environments. The DKG, paired with a lightweight LLM interface, is
a significant step toward making sophisticated data infrastructures more autonomous, efficient, and
accessible. Future work will focus on scaling the AI models, optimizing data movement strategies for
heterogeneous networks, and improving human-in-the-loop support for domain experts.</p>
    </sec>
    <sec id="sec-8">
      <title>Acknowledgments</title>
      <p>This work is funded by the European Union’s Horizon Europe research and innovation programme
under grant agreement No 101070141 (GLACIATION).</p>
    </sec>
    <sec id="sec-9">
      <title>Declaration on Generative AI</title>
      <p>The author(s) have not employed any Generative AI tools.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>GLACIATION</given-names>
            <surname>Consortium</surname>
          </string-name>
          ,
          <article-title>GLACIATION Metadata Reference Model (GLC-MRM) v1.1.0</article-title>
          ,
          <year>2024</year>
          . URL: https://glaciation-project.eu/MetadataReferenceModel/1.1.0/, accessed: 2025-07-03.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Wamhof</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Bernardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Martini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Leinberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sinha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Tapken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Schliebitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Graf</surname>
          </string-name>
          ,
          <article-title>Metadata management and asset exchange in the agricultural data ecosystem of the project agri-gaia</article-title>
          ,
          <source>Datenbank-Spektrum</source>
          <volume>23</volume>
          (
          <year>2023</year>
          )
          <fpage>107</fpage>
          -
          <lpage>115</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>H.</given-names>
            <surname>Hamann</surname>
          </string-name>
          , et al.,
          <article-title>Ant-search algorithm for distributed knowledge graphs</article-title>
          ,
          <source>in: Swarm Intelligence: 14th International Conference, ANTS 2024, Konstanz, Germany, October 9-11, 2024, Proceedings</source>
          , volume
          <volume>14987</volume>
          , Springer,
          <year>2024</year>
          , p.
          <fpage>243</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>N.</given-names>
            <surname>Kouka</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Piuri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Samarati</surname>
          </string-name>
          ,
          <article-title>Tasks scheduling with load balancing in fog computing: a bilevel multi-objective optimization approach</article-title>
          ,
          <source>in: Proceedings of the Genetic and Evolutionary Computation Conference</source>
          ,
          <year>2024</year>
          , pp.
          <fpage>538</fpage>
          -
          <lpage>546</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>G.</given-names>
            <surname>Piao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Mountantonakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Papadakos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Sonawane</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>O'Mahony</surname>
          </string-name>
          ,
          <article-title>Toward exploring knowledge graphs with LLMs</article-title>
          (
          <year>2022</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>O.</given-names>
            <surname>Vantzos</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Skipis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Chassioti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Moraitis</surname>
          </string-name>
          ,
          <article-title>Green index: A measure of locally available green energy</article-title>
          ,
          <source>in: Proceedings of the 25th International Conference on Environment and Electrical Engineering (EEEIC)</source>
          ,
          <year>2025</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>S.</given-names>
            <surname>Paraboschi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abbadini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Böhler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Capano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>De Capitani di Vimercati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Facchinetti</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Foresti</surname>
          </string-name>
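          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Livraga</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Oldani</surname>
          </string-name>
          ,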
          <string-name>
            <given-names>M.</given-names>
            <surname>Rossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Samarati</surname>
          </string-name>
          ,
          <article-title>Deliverable D4.1 - Policies and Techniques for Data Protection in Modern Distributed Environments</article-title>
          ,
          <source>Technical Report 101070141</source>
          ,
          <string-name>
            <given-names>GLACIATION</given-names>
            <surname>Consortium</surname>
          </string-name>
          ,
          <year>2023</year>
          . EU Horizon Europe Project GLACIATION.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>