<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Approach: AI for Digitalised Carbon Storage Analysis</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Yuanwei Qu</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Arild Waaler</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Anita Torabi</string-name>
          <email>anita.torabi@geo.uio.no</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Baifan Zhou</string-name>
          <email>baifan.zhou@oslomet.no</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, Oslo Metropolitan University</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Geosciences, University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Department of Informatics, University of Oslo</institution>
          ,
          <country country="NO">Norway</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <fpage>17</fpage>
      <lpage>18</lpage>
      <abstract>
        <p>Carbon Capture and Storage (CCS) plays an essential role in mitigating climate change. A comprehensive CO2 storage analysis is key to CCS success; however, its advancement is hindered by interdisciplinary complexities, data fragmentation, and the need for enhanced transparency. This paper presents ongoing work on an integrated digitalisation framework for CO2 storage analysis that addresses three major challenges: knowledge complexity across geosciences, reservoir engineering, and computer science; the heterogeneous, multi-scale nature of geological data; and the requirement for explainable decision-making in high-risk subsurface storage scenarios. Our approach leverages both symbolic AI and generative AI. On the symbolic AI side, it includes ontology-driven knowledge engineering to harmonise domain-specific terminologies, and employs a layered information modelling and data integration system to standardise diverse datasets. On the generative AI side, an AI-powered query and explanation system provides context-aware, transparent analyses. This framework aims to streamline multidisciplinary collaboration and enhance the accountability and reliability of long-term CO2 storage, thereby supporting the transition to a carbon-neutral future.</p>
      </abstract>
      <kwd-group>
        <kwd>carbon capture and storage</kwd>
        <kwd>knowledge engineering</kwd>
        <kwd>information modelling</kwd>
        <kwd>explainable AI</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <p>CEUR Workshop Proceedings (ceur-ws.org)</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Carbon Capture and Storage (CCS) captures CO2 from industrial sources or the atmosphere and securely
stores it in geological formations. This technology mitigates greenhouse gas emissions and supports
the transition to net-zero emissions [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. To achieve its full potential in global CO2 mitigation and
meet predicted storage capacities, CCS requires advanced, cost-effective analysis methods that ensure
long-term storage safety and integrity [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ]. A key advancement in this area is digital transformation:
transitioning from traditional workflows to data- and knowledge-driven methods that improve the
efficiency and accuracy of storage analysis.
      </p>
      <p>
        However, digitalising carbon storage analysis remains challenging despite its recognised importance
in the energy sector [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. This challenge arises from the need for CO2 storage analysts to integrate vast
amounts of domain-specific knowledge and analyse data in various formats, from structured datasets
to images and diagrams. While machine learning advances offer promise for specific tasks [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], their effectiveness is often constrained by the unstructured nature of CO2 storage data,
which suffers from accessibility, quality, and heterogeneity issues.
      </p>
      <p>Compounding these technical difficulties, the interdisciplinary nature of CO2 storage analysis requires
collaboration across domains such as geoscience, reservoir engineering, and computer science. Each
domain has its own methodologies and terminologies, and the resulting differences in approaches and
tools hinder effective collaboration and complicate data processing and interpretation. In this context, we
characterise three major challenges for the design and operation of digitalised carbon storage analysis:</p>
      <p>[Figure 1. Panel titles: Knowledge complexity and diversity; Data fragmentation and heterogeneity;
Accountability. Labels recovered from the figure: Leakage or Stable?; Fractures; Geologists; Computer Scientist;
Reservoir Engineer; Seismic survey; Petrophysical plot; Microscopic image; Outcrop; Fieldwork data; Well log;
Data Integrity; Compliance; Validation; Explainability; Traceability; Responsibility.]</p>
      <list list-type="bullet">
        <list-item>
          <p>C1, Knowledge complexity and diversity highlights the need to reconcile disparate terminologies and
conceptual frameworks across geoscience, reservoir engineering, and computer science. Experts may
struggle to interpret concepts or data presented in another discipline’s terminology.</p>
        </list-item>
        <list-item>
          <p>C2, Data fragmentation and heterogeneity is a critical issue, as CO2 storage analysis relies on
heterogeneous data types such as well logs, seismic surveys, core samples, and analogues from
onshore fields. These data sources are highly fragmented across data silos, rendering manual data
interpretation and integration by human experts inefficient.</p>
        </list-item>
        <list-item>
          <p>C3, Accountability is crucial, as CO2 storage processes are high-stakes. All aspects of analysis and
system operation, including data queries and knowledge retrieval, must be transparent and easily
explainable to ensure trust, accountability, and reliability.</p>
        </list-item>
      </list>
      <p>To address these challenges, we propose an integrated approach. For C1, knowledge engineering
bridges semantic gaps across disciplines to create a unified conceptual framework. C2 is addressed
through information modelling and data integration techniques that standardise and process diverse
datasets, providing unified data access. Finally, C3 is tackled with an AI-enhanced system for data
and knowledge query that provides transparent, context-aware responses and explanations for storage
analyses and knowledge queries. Together, these strategies aim to enable effective interdisciplinary
collaboration, supported by high-quality data and knowledge, positioning digitalised carbon storage
analysis as a key element in decarbonisation efforts.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Challenges</title>
      <p>
        Digitalising carbon storage analysis demands tight collaboration among geoscientists, reservoir
engineers, and computer scientists. This section elaborates on the three major challenges, with
examples illustrated in Figure 1.
      </p>
      <p>2.1. Challenge C1: Knowledge Complexity and Diversity</p>
      <p>CO2 storage analysis spans multiple disciplines, each with its own specialised knowledge and conceptual
models. These frameworks are not only complex individually but also diverse in terminology and
perspective. For instance, the term fracture can vary in meaning: reservoir engineers may describe it as lines
or 2D planes in models, while geoscientists may define it as a 3D volume network with a core and damage
zone. These semantic differences affect the interpretation of caprock integrity analysis, potentially
leading to inconsistent leakage predictions. Computer scientists, unfamiliar with these domain nuances, face
difficulties understanding such contextual variations. Furthermore, existing analysis methods are largely
adapted from petroleum engineering, which prioritises extraction. In contrast, CO2 storage demands
a focus on long-term containment, requiring a shift in both objectives and conceptual understanding.</p>
      <p>2.2. Challenge C2: Data Fragmentation and Heterogeneity</p>
      <p>CO2 storage analysis depends on diverse geological datasets that are often fragmented across isolated
systems. Data may originate from well logs that provide detailed measurements of subsurface properties,
seismic surveys that map geological structures, core samples that offer physical insights into rock
composition, and onshore outcrop data that provide analogue scenarios which geologists reference when
building geomechanical models to predict the behaviour of the reservoir and caprock. These datasets differ
not only in format and resolution but also in the type of information they convey. For example, seismic
data are often high-dimensional and require sophisticated signal processing, while well log data are
typically time-series measurements with varying scales. This fragmentation and heterogeneity hinder
effective data access and analysis, complicating data interpretation and limiting the efficiency of both
human analysis and data science methods.</p>
      <p>2.3. Challenge C3: Accountability</p>
      <p>
        CO2 storage involves high-stakes infrastructure where decisions directly affect safety, environmental
impact, and operational efficiency. As AI-supported methods become increasingly integral to storage
analysis [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], ensuring transparency, explainability, and traceability in decision-making is essential. For
instance, during the CO2 injection phase, when an anomaly, such as a sudden pressure change in a
storage reservoir, is detected, the geologists and reservoir engineers need to understand the underlying
reasoning behind any automated response or control adjustment. Accountability also demands that
data queries and knowledge retrieval are traceable and interpretable, supporting compliance with
regulations, verification of data integrity, and clear responsibility in operations.
      </p>
      <p>Each of these challenges highlights the complexity inherent in creating a robust digitalised CO2
storage analysis method. In the following section, we describe our approach that targets these challenges
through a combination of knowledge engineering, information modelling with data integration, and
large language model-based query and explanation systems.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Approach</title>
      <p>To overcome the challenges in digitalising carbon storage analysis, our approach is structured around
three key strategies, each targeting a specific challenge (Figure 2).</p>
      <p>3.1. Knowledge Engineering</p>
      <p>To address knowledge complexity and diversity (C1), we establish a unified conceptual framework that
aims to bridge the semantic gaps across geoscience, reservoir engineering, and computer science. Our
approach employs knowledge engineering techniques to develop a unified ontology that harmonises
domain-specific terminologies and concepts. For example, the ontology should define terms such as
cap rock, sealing stable time, and permeability with machine-readable definitions that are agreed upon
by domain experts. In the current knowledge model (Unified Conceptual Framework in Figure 2), we
separate the concept of cap rock from rock as material and make a conceptual distinction between the
temporal region of the sealing stable time and the predicted value of storage time.</p>
      <p>
        To avoid reinventing the wheel, this framework builds on existing ontological resources:
the top-level Basic Formal Ontology [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], the core GeoCore Ontology [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and
the domain-specific GeoFault Ontology [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]. By mapping and clarifying these concepts into a
unified conceptual model, this framework enables experts from different domains to share and interpret
information accurately, supporting effective multidisciplinary collaboration.
      </p>
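      <p>The conceptual distinctions made in the knowledge model can be illustrated with a minimal Python sketch.
It is not the ontology itself, which would normally be written in an ontology language and aligned with the
resources cited above. All class names, field names, and values here (RockMaterial, CapRock, TemporalRegion,
PredictedStorageTime, prediction_within_sealing) are hypothetical and chosen only to show that cap rock is
kept apart from rock as material, and that the sealing stable time (a temporal region) is kept apart from a
predicted storage time (a value).</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RockMaterial:
    """Rock as material, e.g. shale (what an object is made of)."""
    name: str


@dataclass(frozen=True)
class CapRock:
    """Cap rock as a geological object that is constituted by a rock material."""
    identifier: str
    constituted_by: RockMaterial


@dataclass(frozen=True)
class TemporalRegion:
    """A stretch of time, used here for the sealing stable time."""
    start_year: float
    end_year: float

    def duration(self) -> float:
        return self.end_year - self.start_year


@dataclass(frozen=True)
class PredictedStorageTime:
    """A predicted value (a piece of information), not a temporal region."""
    years: float
    source_model: str


def prediction_within_sealing(region: TemporalRegion,
                              prediction: PredictedStorageTime) -> bool:
    """Illustrative check: does the predicted value fit the sealing stable time?"""
    return prediction.years <= region.duration()


# Made-up example values, for illustration only.
shale = RockMaterial("shale")
seal = CapRock("caprock-A", constituted_by=shale)
stable = TemporalRegion(start_year=0.0, end_year=1000.0)
forecast = PredictedStorageTime(years=800.0, source_model="hypothetical-simulator")
print(seal.constituted_by.name, prediction_within_sealing(stable, forecast))
```
]]></preformat>
      <p>Keeping the object and its material in separate types, and the temporal region and the predicted value in
separate types, means that a comparison between them has to be written explicitly, which is one way to make
the agreed definitions visible to every discipline.</p>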
      <p>[Figure 2. Labels recovered from the figure: Unified Conceptual Framework; Ontology layer; Information
model layer; Data layer; Facility; Reservoir Digital System; Text Database; Image Database.]</p>
      <p>3.2. Information Modelling and Data Integration</p>
      <p>The second challenge (C2) relates to fragmented, multi-scale data sources, such as petrographic thin
sections, well logs, core samples, seismic surveys, and onshore outcrops. Our approach aims at a unified
data access framework, enabling a consistent and structured way of accessing diverse data through a
common semantic and modelling framework comprising three layers:</p>
      <list list-type="bullet">
        <list-item>
          <p>The ontology layer relies on the unified conceptual framework to ensure clarity and consistent
terminology.</p>
        </list-item>
        <list-item>
          <p>The information model layer serves as a mediator between the ontology layer and the data layer by
constructing information models from the perspectives of geoscience, reservoir engineering, and
computer science, and by implementing data integrity and constraint checks.</p>
        </list-item>
        <list-item>
          <p>The data layer is mapped to and accessed via the information model layer and comprises diverse
data, such as text data (e.g., documents, reports), image data (e.g., tomography, seismic survey), and table
data (e.g., sensor data).</p>
        </list-item>
      </list>
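      <p>As a rough illustration of how the three layers could interact, the following Python sketch maps one raw
record from the data layer to ontology terms through an information model that also performs constraint checks.
It is a sketch under stated assumptions rather than the system described here: the term names, the field names
perm_md and phi, the units, the value ranges, and the function integrate are all hypothetical.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass
from typing import Callable

# Ontology layer: agreed terms (hypothetical identifiers).
ONTOLOGY_TERMS = {"Permeability", "Porosity"}


@dataclass
class Attribute:
    """Information-model attribute linking a source field to an ontology term."""
    source_field: str
    term: str
    unit: str
    is_valid: Callable[[float], bool]


# Information model layer: mappings plus integrity constraints (assumed ranges).
CORE_SAMPLE_MODEL = [
    Attribute("perm_md", "Permeability", "mD", lambda v: v >= 0.0),
    Attribute("phi", "Porosity", "fraction", lambda v: 0.0 <= v <= 1.0),
]


def integrate(record: dict, model: list) -> tuple:
    """Map a raw record to ontology terms and collect constraint violations."""
    mapped, errors = {}, []
    for attr in model:
        if attr.term not in ONTOLOGY_TERMS:
            raise ValueError(f"unknown ontology term: {attr.term}")
        if attr.source_field not in record:
            errors.append(f"missing field: {attr.source_field}")
            continue
        value = float(record[attr.source_field])
        if attr.is_valid(value):
            mapped[attr.term] = (value, attr.unit)
        else:
            errors.append(f"constraint violated: {attr.source_field}={value}")
    return mapped, errors


# Data layer: one raw table row (values are made up).
row = {"perm_md": 0.002, "phi": 1.4}
mapped, errors = integrate(row, CORE_SAMPLE_MODEL)
print(mapped)
print(errors)
```
]]></preformat>
      <p>In this sketch the out-of-range porosity value is reported as a violation instead of being passed on, so
that consumers of the unified access only see values that satisfied the information model.</p>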
      <p>
        The unified data access approach organises geological and operational data into a structured, queryable
format that can be efficiently accessed and interpreted by both humans and automated systems. To
ensure that the results are well aligned with industrial standards, the construction of the information
model will follow the recommended practice for the asset information modelling framework from DNV
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] and apply the suggested Information Modelling Framework language [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ].
      </p>
      <p>3.3. AI-enhanced System for Data and Knowledge Query</p>
      <p>Given the risks associated with CO2 storage analysis and the operating processes that follow it,
accountability (C3), including transparency and explainability, is critical. To meet this need, we aim to develop an
AI-enhanced system for data and knowledge query that provides context-aware, human-understandable
answers and explanations. The system relies on the unified data access and provides semantically enriched
responses. It employs retrieval-augmented generation (RAG) techniques to draw upon
up-to-date domain knowledge and operational data, ensuring that each decision or output is accompanied by
clear and traceable explanations. For example, when an operator queries the caprock sealing capability,
which is crucial for ensuring that the subsurface carbon storage system does not leak, the system will
not only assess the sealing capability based on the retrieved relevant sensor data, but also explain the
knowledge and reasoning behind the assessment in plain language. This approach
enhances operators’ trust and supports informed decision-making in complex subsurface environments.</p>
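      <p>The traceability aspect of such a query system can be sketched in Python as follows. This is only a toy
illustration of the retrieval step with provenance tracking: ranking uses plain word overlap as a stand-in for a
real retriever, and the generation step is replaced by a fixed template rather than a language model. The
Evidence class, the functions retrieve and answer_with_sources, and all example records are hypothetical.</p>
      <preformat><![CDATA[
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One retrievable item with an identifier so that answers stay traceable."""
    source_id: str
    text: str


def tokenize(text: str) -> set:
    """Lower-case words with surrounding punctuation removed."""
    words = (word.strip(".,?").lower() for word in text.split())
    return {word for word in words if word}


def retrieve(question: str, corpus: list, top_k: int = 2) -> list:
    """Rank evidence by word overlap with the question (stand-in retriever)."""
    query = tokenize(question)
    scored = [(len(query & tokenize(item.text)), item) for item in corpus]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_k]]


def answer_with_sources(question: str, corpus: list) -> str:
    """Template stand-in for the generation step; cites every item it used."""
    hits = retrieve(question, corpus)
    if not hits:
        return "No supporting evidence found."
    cited = "; ".join(f"{hit.text} [{hit.source_id}]" for hit in hits)
    return f"Based on the retrieved evidence: {cited}"


# Made-up records, for illustration only.
corpus = [
    Evidence("sensor-17", "Caprock pressure is stable at the monitoring well."),
    Evidence("report-3", "Seismic survey shows no fractures in the caprock."),
    Evidence("note-9", "Outcrop fieldwork planned for next season."),
]
print(answer_with_sources("Is the caprock sealing stable?", corpus))
```
]]></preformat>
      <p>Because every statement in the answer carries the identifier of the record it came from, an operator can
trace the response back to the underlying data, which is the property that the accountability challenge asks for.</p>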
    </sec>
    <sec id="sec-5">
      <title>4. Summary</title>
      <p>This paper presents our ongoing effort to develop a data- and knowledge-driven approach enhanced
by AI for CO2 storage analysis. We identify three core challenges: the complexity and diversity of
domain-specific knowledge, the heterogeneous and multi-scale nature of data from different sources, and
the critical need for transparency and explainability raised by accountability in carbon storage analysis
and operations. To address these challenges, our approach integrates: (1) knowledge engineering to
create a unified conceptual framework, (2) information modelling and data integration to standardise
and fuse diverse datasets, enabling unified data access aligned with industry best practices, and (3) a
generative AI-enhanced system for data and knowledge query to support accountable, transparent, and
traceable decision-making. Together, these components form a foundation for a more efficient, reliable,
collaborative, and digitalised CO2 storage analysis framework, contributing to a carbon-neutral future.</p>
      <p>Declaration on Generative AI. The authors have used ChatGPT to assist with the polishing of
human-authored text. The authors take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <surname>IEA</surname>
          </string-name>
          ,
          <source>Net zero by 2050</source>
          ,
          <year>2021</year>
          . URL: https://www.iea.org/reports/net-zero-by-2050, accessed 10 April 2025.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Kazlou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cherp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Jewell</surname>
          </string-name>
          ,
          <article-title>Feasible deployment of carbon capture and storage and the requirements of climate targets</article-title>
          ,
          <source>Nature Climate Change</source>
          <volume>14</volume>
          (
          <year>2024</year>
          )
          <fpage>1047</fpage>
          -
          <lpage>1055</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gholami</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Raza</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Iglauer</surname>
          </string-name>
          ,
          <article-title>Leakage risk assessment of a CO2 storage site: A review</article-title>
          ,
          <source>Earth-Science Reviews</source>
          <volume>223</volume>
          (
          <year>2021</year>
          )
          <fpage>103849</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>IEA</surname>
          </string-name>
          ,
          <source>Digitalisation and energy</source>
          ,
          <year>2017</year>
          . URL: https://www.iea.org/reports/digitalisation-and-energy, accessed 9 April 2025.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>F.</given-names>
            <surname>Hussin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A. N. M.</given-names>
            <surname>Rahim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. S. M.</given-names>
            <surname>Hatta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Aroua</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. A.</given-names>
            <surname>Mazari</surname>
          </string-name>
          ,
          <article-title>A systematic review of machine learning approaches in carbon capture applications</article-title>
          ,
          <source>Journal of CO2 Utilization</source>
          <volume>71</volume>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fawad</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. J.</given-names>
            <surname>Rahman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. H.</given-names>
            <surname>Mondol</surname>
          </string-name>
          ,
          <article-title>Seismic reservoir characterization of potential CO2 storage reservoir sandstones in Smeaheia area, northern North Sea</article-title>
          ,
          <source>Journal of Petroleum Science and Engineering</source>
          <volume>205</volume>
          (
          <year>2021</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>A.</given-names>
            <surname>Torabi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Alaei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <article-title>Fault characteristics in exhumed basement rocks; implications for understanding subsurface basement faults</article-title>
          ,
          <source>Tectonophysics</source>
          <volume>887</volume>
          (
          <year>2024</year>
          )
          <fpage>230445</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Nassabeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>You</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Keshavarz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Iglauer</surname>
          </string-name>
          ,
          <article-title>Sub-surface geospatial intelligence in carbon capture, utilization and storage: a machine learning approach for offshore storage site selection</article-title>
          ,
          <source>Energy</source>
          <volume>305</volume>
          (
          <year>2024</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>R.</given-names>
            <surname>Arp</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. D.</given-names>
            <surname>Spear</surname>
          </string-name>
          ,
          <article-title>Building ontologies with basic formal ontology</article-title>
          , MIT Press,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L. F.</given-names>
            <surname>Garcia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrin</surname>
          </string-name>
          , R. dos Santos Alvarenga,
          <article-title>The GeoCore ontology: a core ontology for general use in geology</article-title>
          ,
          <source>Computers &amp; Geosciences</source>
          <volume>135</volume>
          (
          <year>2020</year>
          )
          <fpage>104387</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Perrin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Torabi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Abel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Giese</surname>
          </string-name>
          ,
          <article-title>GeoFault: A well-founded fault ontology for interoperability in geological modeling</article-title>
          ,
          <source>Computers &amp; Geosciences</source>
          <volume>182</volume>
          (
          <year>2024</year>
          )
          <fpage>105478</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <surname>DNV</surname>
          </string-name>
          , DNV-RP-0670,
          <article-title>Recommended practice for asset information modelling framework</article-title>
          ,
          <year>2024</year>
          . URL: https://www.dnv.com/digital-trust/recommended-practices/asset-information-modelling-dnv-rp-0670/, accessed 10 April 2025
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A.</given-names>
            <surname>Waaler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Skjaeveland</surname>
          </string-name>
          (editors),
          <source>Information modelling framework manual version 0.3.0</source>
          ,
          <year>2024</year>
          . URL: https://www.imfid.org/, accessed 10 April 2025
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>