<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Proactive concurrency control for data lakehouse: a meta-scheduling framework for urban construction data pipelines</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Olga Solovei</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Kyiv National University of Construction and Architecture</institution>
          ,
          <addr-line>31, Air Force Avenue, Kyiv, 03037</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2026</year>
      </pub-date>
      <abstract>
        <p>Transactional conflicts caused by concurrent operations on shared Delta Lake partitions pose significant challenges to reliability and predictability in modern data lakehouse environments. This paper introduces a dynamic, partition-aware meta-scheduling framework that operates above existing platform schedulers (e.g., Databricks Jobs) to proactively prevent such conflicts. The core of the framework involves analyzing SQL task semantics to infer resource access patterns, constructing a conflict graph, and applying a greedy graph coloring algorithm to produce conflict-free execution plans. Unlike static schedulers or ML-based approaches relying on historical data, the proposed method dynamically triggers scheduling decisions only when material changes occur, such as new task arrivals or task completions. Experimental results demonstrate the scalability of the algorithm, with execution time following the expected O(n2) trend based on task count, and reveal performance sensitivity to conflict density. While performance anomalies were observed at higher task volumes due to structural graph complexity, the framework remains computationally feasible for real-world engineering pipelines. Future extensions will incorporate multi-objective optimization to account for factors such as cost, deadlines, and priority in the scheduling process.</p>
      </abstract>
      <kwd-group>
        <kwd>Delta table</kwd>
        <kwd>concurrent transaction conflict</kwd>
        <kwd>scheduling algorithm</kwd>
        <kwd>resource aware conflict graph</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        The Internet of Things (IoT) has emerged as a core technology applied across all five key
stages of modern construction projects: investment, planning, construction, operations, and
demolition [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Empirical studies have statistically validated a strong and significant
correlation between IoT adoption and measurable improvements in critical areas such as
environmental monitoring, equipment administration, predictive maintenance, and on-site
safety monitoring [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Despite these documented benefits, the adoption of IoT for managing
complex urban construction projects remains slower than anticipated, hindered by several
significant barriers [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Key among these technical challenges are the demand for substantial computing power,
issues with scalability, and the underperformance of technologies when deployed in
real-world conditions [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ].
      </p>
      <p>© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        In this context, the Data Lakehouse architecture is well-suited to serve as a centralized
platform for managing the vast amounts of structured and unstructured data generated by
these systems [
        <xref ref-type="bibr" rid="ref5 ref6">5-7</xref>
        ]. The foundation of the Data Lakehouse, built on Apache Spark and cloud
storage, provides three fundamental advantages: 1) the decoupling of compute and storage,
which enables cost efficiency and elastic scaling; 2) distributed processing, ensuring high
performance on large datasets; 3) distributed storage, offering immense scalability.
      </p>
      <p>However, while this architecture effectively addresses challenges of scale and performance,
it introduces new operational complexities, particularly in managing transactional
concurrency. In dynamic construction environments, numerous tasks attempt to read from and
write to the same Delta Lake tables simultaneously. This concurrency can lead to transaction
failures and cascading job retries, especially in high-velocity streaming scenarios [8].</p>
      <p>For instance, Figure 1 illustrates a common multi-task data engineering job designed to
implement a typical Extract Transform and Load (ETL) pattern. The job consists of parallel
tasks to ingest IoT sensor data from different device types (Tasks 1-2), followed by parallel
data validation tasks (Tasks 3-4). The validated streams are then combined in an enrichment
task (Task 5), after which two final tasks are triggered concurrently: Task 6 to update a status
table and Task 7 to clean up the now-processed raw data. Under a standard, resource-agnostic
scheduler, both Task 6 and Task 7 would be executed in parallel. This creates a race condition
and a high probability of a transactional conflict: if the DELETE operation from Task 7
modifies a table partition while Task 6 is attempting to read or update that same partition, the
latter task will fail.</p>
      <p>The root cause of this conflict lies in the interaction between the parallel execution model and
the optimistic concurrency control mechanism of Delta Lake. As shown in Figure 2, Spark's
ability to run tasks in concurrent threads allows both Task 6 and Task 7 to proceed
independently until the final commit stage. The conflict is only detected at the storage layer
when one thread attempts to commit its transaction to a table version that has already been
modified by the other, leading to an exception.</p>
      <p>Existing solutions for managing concurrent transactions in Delta Lake, such as partitioning
strategies and the built-in Optimistic Concurrency Control, have notable limitations. While
partitioning can isolate operations and reduce contention, its effectiveness diminishes
significantly in the presence of skewed workloads where many operations target the same
partition [9]. Optimistic Concurrency, by design, is a reactive mechanism. It assumes conflicts
are rare, allowing jobs to proceed and only checking for conflicts at the commit stage. When a
conflict occurs, the failing job must be rolled back and retried, leading to wasted
computational resources, increased data latency, and unpredictable job completion times [10].</p>
      <p>To address the issue of transactional conflicts, several studies have explored the integration
of machine learning models with transaction schedulers. These approaches typically predict
conflicts based on learned patterns from historical data and generate a fixed execution plan.
However, such solutions assume a static snapshot of tasks and resources, lacking support for
dynamic task arrivals. Consequently, these static methods exhibit limited adaptability to the
variability and elasticity inherent in cloud computing environments.</p>
      <p>Hence, an enhanced solution is required, involving the design of a dynamic scheduling
layer positioned above the data platform. By constructing and analyzing a partition-aware
conflict graph, the proposed meta-scheduler can generate and continuously adapt a
conflict-free execution plan. This paradigm shifts concurrency control from a reactive, storage-level
mechanism to a proactive, orchestration-level strategy, thereby mitigating the need for costly
transactional rollbacks and yielding significant improvements in performance, cost efficiency,
and reliability of critical data pipelines.</p>
      <p>This paper is structured as follows: following the Introduction and Literature Review, the
Materials and Methods section provides a formal definition of the proposed dynamic
scheduling framework, including the scheduling algorithm and an analysis of its
computational complexity. The Experiment Preparation section outlines the test scenario and
procedure. The Results and Discussion section presents and interprets the practical outcomes
of the experiments. Finally, the Conclusions section summarizes the key findings and outlines
the focus of future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Literature Review</title>
      <p>Transactional conflicts and efficient resource scheduling are critical challenges in
cloud-based data processing systems, especially under optimistic concurrency control models such
as those employed by Delta Lake. Several recent studies have proposed solutions that combine
machine learning and graph-based algorithms to address these challenges, each with distinct
strengths and limitations.</p>
      <p>In [11], the authors introduce a classifier-based system to proactively predict hot key
access patterns for transaction scheduling. This system is trained on historical transaction
traces to identify common operational footprints. The training pipeline involves: (i) encoding
transactions as integer-based metadata vectors based on transaction type and known hot keys;
(ii) clustering these vectors using Euclidean distance, with the optimal number of clusters
determined through validation error minimization; and (iii) identifying canonical hot key
access patterns for each cluster. At runtime, the system applies a K-Nearest Neighbors (KNN)
classifier to assign incoming transactions to a cluster, thereby predicting their access patterns.
However, the method assumes that transactions of the same type exhibit a single, predictable
access pattern—a premise that does not hold in complex systems where the same transaction
type may produce diverse access patterns depending on its parameters.</p>
      <p>The “CCaaLF” model proposed in [12] (Concurrency Control as a Learnable Function)
takes a different approach. It is a workload-adaptive concurrency control mechanism that
predicts optimal waiting times for transactions using a machine learning model trained on
historical traces. An oracle is used during training to generate conflict-free scheduling
sequences. At runtime, for each transaction, a feature vector is extracted and passed through
the CCaaLF model, which predicts a wait time to minimize conflicts. Similarly, [13] presents a
binary classification model that estimates the likelihood of a conflict between two transactions
based on past interactions. The output of this model informs the transaction scheduler, which
reorders the execution queue accordingly. Despite their predictive accuracy, both [12] and
[13] depend heavily on historical data and are limited in environments where workloads and
data access patterns evolve rapidly, as is typical in cloud-based systems.</p>
      <p>In [14], a graph-based approach is explored for optimizing task scheduling in a
general-purpose cloud environment. The methodology involves constructing a conflict graph, where
each node represents a task, and an edge denotes contention over a shared resource. A greedy
graph coloring algorithm is applied to assign time slots or execution batches to tasks such that
no two conflicting tasks run concurrently. While effective in reducing contention, the
proposed solution assumes a static set of tasks and available resources, performing scheduling
once prior to execution. This static planning fails to address the need for runtime adaptability
and dynamic task arrival, which are common in cloud-native data pipelines.</p>
      <p>In summary, while prior works have made significant progress toward proactive
scheduling, they tend to rely on assumptions of static workloads and deterministic access
patterns. These limitations highlight the need for a dynamic, runtime-aware scheduling
mechanism capable of adapting to transactional variability and concurrency in modern cloud
environments.</p>
    </sec>
    <sec id="sec-3">
      <title>3. Materials and methods</title>
      <p>To address the challenge of transactional conflicts inherent in the optimistic concurrency
model of Delta Lake, this study proposes a novel scheduling framework. As illustrated in
Figure 3, this framework is placed before the “Job Scheduler” and has the capability to:
 Analyze the SQL queries for each task of the submitted job to infer its resource access
patterns. This includes identifying the target tables, the access mode (e.g., READ,
WRITE, MERGE), and the specific partitions being modified.
 Construct a partition-aware conflict graph based on the discovered resource access
patterns.</p>
      <p>To maintain computational efficiency, the graph coloring algorithm is triggered under
conditions that indicate a material change in the conflict graph. These include:</p>
      <p>The arrival of a new task tnew whose resource accesses overlap with those of an existing running
task ti, defined as:
∃ r : (r, mnew) ∈ Rtnew ∧ (r, mi) ∈ Rti, (1)
where r is the requested resource and mi, mnew are the resource access modes of the running and
new task, respectively.</p>
      <p>The completion of a task that previously introduced exclusive access constraints:
ΔE = {(tk, tj) ∈ ET : ∃ (r, m) ∈ A(tk), m ∈ {MERGE, DELETE, UPDATE}}, (2)
where A(tk) is the set of resource accesses made by the completed task tk; ΔE represents the
set of edges removed from the conflict graph G due to the task’s completion; the access modes
MERGE, DELETE, and UPDATE are treated as exclusive and may have blocked other tasks.</p>
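      <p>As a sketch of these two trigger conditions, the following Python fragment shows how re-scheduling decisions might be made. The function names and the representation of resource accesses as (resource, mode) pairs are assumptions of this illustration, not part of the framework's specification.</p>

```python
# Illustrative sketch (not the framework's implementation) of the two
# re-scheduling triggers: condition (1) on task arrival, condition (2)
# on completion of a task holding exclusive access.
EXCLUSIVE_MODES = {"MERGE", "DELETE", "UPDATE"}

def should_reschedule_on_arrival(new_accesses, running_accesses):
    """Condition (1): the new task requests a resource already accessed
    by a running task, so the conflict graph materially changes."""
    held = {resource for resource, _mode in running_accesses}
    return any(resource in held for resource, _mode in new_accesses)

def edges_released_on_completion(task_id, conflict_edges, accesses):
    """Condition (2): the edge set ΔE that can be removed from the conflict
    graph when a task holding exclusive access (MERGE/DELETE/UPDATE) completes."""
    if not any(mode in EXCLUSIVE_MODES for _resource, mode in accesses):
        return set()
    return {edge for edge in conflict_edges if task_id in edge}
```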
      <p>The following subsections (3.1-3.3) provide a formal mathematical model and a detailed
specification of the scheduling algorithm. The proposed approach is designed to be general,
applying to any data platform that utilizes Delta Lake or a similar storage format employing
optimistic concurrency control.</p>
      <sec id="sec-3-1">
        <title>3.1. Mathematical model of tasks conflict graph</title>
        <p>Let a workload be defined by a set of jobs J={j1, j2, …, jm}. Each job is composed of a set of tasks
T, such that the set of all tasks in the workload is VT. Each task t∈VT is defined by the tuple:
&lt;Id, Job_Id, Dt, Rt&gt;, (3)
where Id is the task’s unique identifier, Job_Id is the identifier of the parent job, Dt is the set of
task IDs that must complete before task t can start, and Rt is the set of resources the task
accesses, where each resource is a tuple:
&lt;table, mode, P&gt;, (4)
where table is the name of a delta table, mode is the access type (e.g., MERGE, DELETE,
UPDATE, READ), and P is a set of partition key-value pairs defining the specific partition being
accessed. An empty set P denotes a full-table operation.</p>
        <p>A transactional conflict between two tasks (tp∈T, tq∈T) (from any jobs) occurs if they
attempt to modify the same resource simultaneously. We formally define the conflict function as:
C(tp, tq) = 1 if ∃ (table, modep, Pp) ∈ Rtp, (table, modeq, Pq) ∈ Rtq such that modep or modeq is
write-like and (Pp = ∅ ∨ Pq = ∅ ∨ Pp ∩ Pq ≠ ∅); otherwise C(tp, tq) = 0. (5)
This definition specifies that a conflict exists if two tasks access the same table, at least one
is a write-like operation, and either one operates on the full table or their partition
specifications overlap.</p>
        <p>The workload's dependencies form a Directed Acyclic Graph (DAG) GD=(VT, ED), where a
directed edge (ti, tj)∈ED exists if ti∈Dtj. (6)
Based on the conflict function, we construct the Task Conflict Graph GT=(VT, ET), where an
edge (tp, tq)∈ET exists if C(tp, tq)=1. (7)</p>
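        <p>The conflict definition can be sketched in code. The following is a minimal Python illustration, assuming each resource access is a (table, mode, partitions) triple as in the resource tuple, with partitions given as a frozenset of key-value pairs and an empty set denoting a full-table operation.</p>

```python
# Minimal sketch of the conflict function C(tp, tq); the (table, mode,
# partitions) layout mirrors the resource tuple defined in the model.
WRITE_LIKE = {"WRITE", "MERGE", "DELETE", "UPDATE"}

def conflict(accesses_p, accesses_q):
    """Return True if any pair of resource accesses conflicts."""
    for table_p, mode_p, parts_p in accesses_p:
        for table_q, mode_q, parts_q in accesses_q:
            if table_p != table_q:
                continue  # different tables never conflict
            if mode_p not in WRITE_LIKE and mode_q not in WRITE_LIKE:
                continue  # two reads never conflict
            # conflict if either side is a full-table operation (empty
            # partition set) or the partition specifications overlap
            if not parts_p or not parts_q or parts_p & parts_q:
                return True
    return False
```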
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Scheduling Algorithm to prevent concurrent transaction conflict in delta tables</title>
        <p>The objective of the scheduling algorithm is to assign each task t to a discrete execution batch
(time slot), denoted by the function χ(t), such that all dependencies are respected and no two
conflicting tasks are assigned to the same batch. This is subject to two constraints: task
dependencies are respected (8) and conflicts are avoided (9):
(ti, tj) ∈ ED ⇒ χ(ti) &lt; χ(tj), (8)
(tp, tq) ∈ ET ⇒ χ(tp) ≠ χ(tq). (9)</p>
        <p>The algorithm operates iteratively. At each iteration k, it identifies the set of "runnable"
tasks Rk - those whose predecessors have all completed. Let Ck be the set of all tasks
completed up to iteration k. The set of runnable tasks is then:
Rk = {t ∈ VT \ Ck | Dt ⊆ Ck}. (10)</p>
        <p>The algorithm, detailed in Algorithm 1, proceeds by building a conflict subgraph GTk
induced by the runnable tasks Rk. It then applies a greedy graph coloring algorithm to GTk,
which assigns a "color" c to each runnable task such that no two conflicting tasks share the
same color. All tasks assigned the same color form a conflict-free execution batch S{k,c} that can
be run in parallel. The union of all these color-based batches forms the set of tasks Sk to be
scheduled in the current macro-step. This process repeats until all tasks have been scheduled.</p>
        <p>Algorithm 1. Conflict-free scheduler
Input: a set of tasks VT, dependency edges ED, conflict edges ET.
Output: a partitioned schedule S = {S₀, S₁, ..., SN}
Initialization: C0 ← ∅, k ← 0
repeat
1. Identify runnable tasks Rk ← {t ∈ VT \ Ck | Dt ⊆ Ck}
2. Build conflict subgraph GTk induced by Rk
3. Assign colors to resolve conflicts: χ(t) ← GreedyColoring(GTk)
4. Sk ← ∅
5. for each color c in χk do
6. S{k,c} ← {t ∈ Rk | χ(t) = c}
7. Sk ← Sk ∪ S{k,c}
8. end for
9. Update completed tasks C{k+1} ← Ck ∪ Sk
10. k ← k + 1
until Ck = VT</p>
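        <p>The scheduling loop and greedy coloring described above can be sketched as follows. This Python version is a single-machine illustration (the paper's implementation uses PySpark on Databricks); the task and edge representations are assumptions chosen for brevity.</p>

```python
from collections import defaultdict

def greedy_coloring(nodes, edges):
    """Assign each node the smallest color not used by a conflicting neighbor."""
    adjacent = defaultdict(set)
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    colors = {}
    for node in nodes:
        used = {colors[n] for n in adjacent[node] if n in colors}
        c = 0
        while c in used:
            c += 1
        colors[node] = c
    return colors

def conflict_free_schedule(deps, conflict_edges):
    """deps: dict task -> set of prerequisite tasks (Dt).
    conflict_edges: iterable of (task, task) pairs (ET).
    Returns a list of batches; tasks in one batch can run in parallel."""
    completed, schedule = set(), []
    edges = list(conflict_edges)
    while completed != set(deps):
        # Step 1: runnable tasks, all of whose predecessors have completed
        runnable = [t for t in deps if t not in completed and deps[t] <= completed]
        # Steps 2-3: induced conflict subgraph and greedy coloring
        sub = [(u, v) for u, v in edges if u in runnable and v in runnable]
        coloring = greedy_coloring(runnable, sub)
        # Steps 4-8: group tasks by color; each color is a conflict-free batch
        batches = defaultdict(list)
        for task, color in coloring.items():
            batches[color].append(task)
        for color in sorted(batches):
            schedule.append(sorted(batches[color]))
        # Step 9: mark every scheduled task as completed
        completed |= set(runnable)
    return schedule
```

On the example job from the Introduction, the two conflicting final tasks end up in separate batches while everything else keeps its parallelism.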
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Computational complexity of the scheduling algorithms to prevent a concurrent transaction conflict in delta tables</title>
        <p>Let n = |VT| be the total number of tasks and |ED| be the number of dependency edges. The
initial construction of the full conflict graph GT requires a pairwise comparison of all tasks,
resulting in a complexity of O(n2). The main loop runs at most n times. In each iteration k,
identifying the runnable tasks Rk can be done efficiently in O(n + |ED|), and building the
induced subgraph GTk and applying a greedy coloring algorithm has a complexity of
O(|Rk| + |ETk|). In the worst case, |Rk| can be up to n, giving a complexity of O(n + |ET|).</p>
        <p>Therefore, the dominant step is the initial all-pairs conflict discovery, leading to an overall
worst-case complexity of O(n2) for generating the complete schedule. This quadratic
complexity is computationally feasible for typical data engineering workloads, where n is less
than a hundred, making the algorithm practical for real-world implementation.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Experiment preparation</title>
      <p>To validate the correctness and practical utility of the proposed meta-scheduling framework, a
test scenario was designed to verify the scheduler's ability to detect and resolve a delete-update
concurrent transaction conflict that occurs within a single, multi-task job.</p>
      <p>The initial state of the workflow is defined in a job with the following Directed Acyclic
Graph (DAG) of dependencies (Figure 4):</p>
      <p>Two parallel ingestion tasks (Ingest_air_sensor_data, Ingest_meteo_sensor_data) load
raw data from IoT sensors of different types.</p>
      <p>These are followed by parallel verification tasks (Verify_air_sensor_data,
Verify_meteo_sensor_data), which validate the data against the expected schema.</p>
      <p>The verified data is then combined in a central (Aggregate_to_Enrich) task. This task
writes its output to a primary data mart table for further enrichment with geospatial
objects.</p>
      <p>After the aggregation step, two tasks are designed to run in parallel: 1)
Clean_up_raw_data performs a DELETE operation on the raw data tables that
have now been processed (deletes records from a staging table, staging.raw_events,
where status = 'processed'); 2) Update_Status reads from the same
staging.raw_events table to gather metrics before updating a final status table.</p>
      <p>Under the standard scheduler, both Clean_up_raw_data and Update_Status would be
triggered simultaneously after Aggregate_to_Enrich completes. This creates a concurrent
transaction conflict: if the DELETE operation from Clean_up_raw_data modifies the table
while Update_Status is attempting to UPDATE it, a ConcurrentDeleteReadException is raised,
causing the Update_Status task to fail (Figure 5).</p>
      <p>To evaluate the computational performance and scalability of the proposed meta-scheduling
algorithm, a test scenario was designed with the objective of measuring the algorithm's
execution time as a function of both the total number of tasks in a workload (n) and the
degree of conflicts (d).</p>
      <p>A synthetic workload with controlled characteristics was generated: 1) the total number of
tasks, which represents the number of vertices in the conflict graph, was varied across the range
[50, 100, 200, ..., 1000] to simulate workloads of increasing scale; 2) the conflict density (d)
represents the probability that any two tasks in the workload have a resource conflict. Three
distinct levels of conflict density were defined for this study. Small density: a low probability of
conflict (approx. 14-20% conflicting tasks), representing a largely independent set of tasks
where high parallelism is expected. Medium density: an intermediate level of conflicts
(approx. 20-28%), representing a more complex and interconnected workload. High density: a
high probability of conflict (approx. 33%), representing a "worst-case" scenario where many
tasks compete for the same resources, necessitating significant serialization.</p>
      <p>The performance evaluation was conducted by executing the following procedure. For each
value of n in the specified range, and for each conflict density level d:
1. Generate a synthetic workload of n tasks with the given conflict density d.
2. Record a start timestamp.
3. Execute the complete meta-scheduling Algorithm 1.
4. Record an end timestamp.
5. Calculate the total scheduler execution time as the difference between the end and
start timestamps.</p>
      <p>Based on our theoretical analysis, we formulated two key hypotheses to be validated by this
experiment:
 Hypothesis 1 (scalability with n): the execution time of the scheduler will grow at a
rate consistent with quadratic O(n2) complexity as the number of tasks n increases,
due to the all-pairs comparison required for initial conflict discovery.
 Hypothesis 2 (sensitivity to d): while execution time is primarily a function of n, it will
also be influenced by the conflict density d, reflecting the varying computational cost
of the graph coloring algorithm on graphs of different structures and densities.
The experiment was executed with Delta Lake tables on the Databricks platform. Algorithm 1 is
implemented with PySpark libraries.</p>
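      <p>The experimental procedure can be sketched as follows. This is a hedged, single-machine illustration: the generator names (random_workload, measure) and the choice to draw each conflict edge independently with probability d are assumptions of this sketch, not the paper's exact workload generator.</p>

```python
import random
import time

def random_workload(n, density, seed=0):
    """Generate n independent tasks plus a random conflict edge set in which
    each task pair conflicts with probability `density` (the parameter d)."""
    rng = random.Random(seed)
    deps = {i: set() for i in range(n)}  # no dependencies: stresses coloring
    conflicts = [(i, j) for i in range(n) for j in range(i + 1, n)
                 if rng.random() < density]
    return deps, conflicts

def measure(scheduler, n, density):
    """Time one scheduler run on a synthetic workload of n tasks:
    steps 2-5 of the procedure (start timestamp, run, end timestamp, diff)."""
    deps, conflicts = random_workload(n, density)
    start = time.perf_counter()
    scheduler(deps, conflicts)
    return time.perf_counter() - start
```

Sweeping n over [50, 100, 200, ..., 1000] at each density level then reproduces the measurement grid described above.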
    </sec>
    <sec id="sec-5">
      <title>5. Results and discussions</title>
      <p>The job configuration in Figure 6 represents the output of the meta-scheduling algorithm after it
has analyzed the submitted job and identified that Clean_up_raw_data (a DELETE operation) and
Update_Status (an UPDATE operation) both target the same table, staging.raw_events, and the
same partition values. The scheduler modifies the dependency graph by adding a conflict edge
between the Clean_up_raw_data and Update_Status nodes.</p>
      <p>Figure 8 (a) shows the number of conflicting tasks for each level of conflict density,
with the total number of tasks ranging from 50 to 1,000; Figure 8 (b) shows the execution
time the algorithm took to construct the conflict graph, apply graph coloring, and
generate the final conflict-free schedule.
Figure 8 (b) makes visible that as the total number of tasks (n) increases, the scheduler's
execution time grows at a non-linear rate, which is consistent with the O(n2) complexity
derived from the all-pairs conflict discovery phase. For example, under a small conflict
density, increasing the workload from 100 tasks to 1,000 tasks (a 10x increase) resulted in the
execution time growing from approximately 3.45 seconds to 742.72 seconds (a ~215x increase),
clearly demonstrating a quadratic-like trend. This confirms that the dominant computational
cost of the algorithm is the initial construction of the conflict graph, as predicted by our
complexity analysis.</p>
      <p>In addition, for a fixed number of tasks, the conflict density influences the execution time
(Figure 9). This is most visible when the total number of tasks is 100 (n=100): the execution
time is 3.45 s, 18.48 s, and 2.75 s for small, medium, and high conflict density, respectively.
The value of 18.48 s is not expected according to the rule Time (Small) &lt; Time (Medium) &lt;
Time (High), so the proposed algorithm exhibits a performance anomaly. Outliers in the
collected performance statistics were identified using the IQR method and the Z-score formula:
 an execution time of 792.7 s for 900 tasks at small conflict density;
 an execution time of 742.7 s for 1,000 tasks at small conflict density.</p>
      <p>The reason for the outliers lies in the characteristics of the underlying greedy graph coloring
algorithm, which is sensitive to the graph's structure. When the total number of tasks exceeds
800, the algorithm explores many more color options for each vertex; this leads to more
cache misses and less predictable branching in the algorithm's execution path, resulting in
longer runtimes. As running more than 800 tasks simultaneously is a rare scenario for an
engineering pipeline, we conclude that the proposed algorithm demonstrates predictable
O(n2) scaling, confirming its feasibility for real-world application.</p>
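      <p>The outlier detection mentioned above can be sketched as follows. This illustration assumes the conventional 1.5×IQR fences and a |z| &gt; 3 threshold, since the exact parameters are not stated in the text.</p>

```python
import statistics

def iqr_outliers(values):
    """Flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (the IQR method)."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

def zscore_outliers(values, threshold=3.0):
    """Flag values whose Z-score exceeds the threshold in absolute value."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * stdev]
```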
    </sec>
    <sec id="sec-6">
      <title>6. Conclusion</title>
      <p>This research set out to develop and evaluate a proactive, partition-aware meta-scheduling
framework for dynamically preventing transactional conflicts in modern data lakehouse
environments. The proposed scheduling framework constructs a partition-aware conflict graph
based on the discovered resource access patterns and triggers the graph coloring algorithm
only under conditions that indicate a material change. The primary objectives were to design a
computationally feasible algorithm and to validate its performance scalability.</p>
      <p>The experimental results presented in this paper have successfully verified our initial
hypotheses. Firstly, the scheduler's execution time was shown to scale quadratically with the
number of tasks, consistent with the theoretical O(n2) complexity of the conflict discovery
phase (Hypothesis 1). Secondly, the results confirmed that the algorithm's performance is
sensitive to the structural complexity of the conflict graph, which is influenced by the conflict
density (Hypothesis 2). These findings validate the proposed algorithm as a practical and
effective solution.</p>
      <p>The current research limitation is that the proposed meta-scheduler creates a valid,
conflict-free schedule but does not consider other factors such as cost, deadlines, or job priority. The
"greedy" coloring algorithm produces a valid schedule, but not necessarily the optimal one
from a business perspective.</p>
      <p>Future work will focus on incorporating multi-objective optimization into the execution
planner. This involves extending the conflict graph into a weighted, directed acyclic graph
where vertices are weighted by job priority, estimated compute cost, and expected runtime.</p>
      <p>The scheduling algorithm would then move beyond simple graph coloring to employ more
advanced techniques from operations research, such as critical path analysis or list scheduling
algorithms, to generate a schedule that not only avoids conflicts but also aims to minimize
total workflow cost.</p>
      <p>Declaration on Generative AI: The author(s) have not employed any Generative AI tools.</p>
      <p>[7] K. Gade, “Data Lakehouses: Combining the Best of Data Lakes and Data Warehouses,”
Journal of Computational Innovation, vol. 2, no. 1, 2022.
[8] Isolation levels and write conflicts on Databricks. Available at:
https://docs.databricks.com/aws/en/optimizations/isolation-level.
[9] P. Liu, C. Li, H. Chen, “Enhancing storage efficiency and performance: A survey of data
partitioning techniques,” Journal of Computer Science and Technology, vol. 39, no. 2, 2024,
pp. 346-368.
[10] Concurrency control - Delta Lake Documentation. Available at:
https://docs.delta.io/latest/concurrency-control.html.
[11] A. Cheng, A. Kabcenell, J. Chan, X. Shi, P. Bailis, N. Crooks, I. Stoica, “Towards optimal
transaction scheduling,” Proceedings of the VLDB Endowment, vol. 17, no. 11, 2024,
pp. 2694-2707. doi:10.14778/3681954.3681956.
[12] H. Pan, S. Cai, T. T. A. Dinh, Y. Wu, Y. M. Chee, G. Chen, B. C. Ooi, “CCaaLF: Concurrency
Control as a Learnable Function,” arXiv preprint arXiv:2503.10036, 2025.
[13] S. Chen, C. Shen, C. Wu, “Intelligent Transaction Scheduling to Enhance Concurrency in
High-Contention Workloads,” Applied Sciences, vol. 15, no. 11, 2025, 6341.
doi:10.3390/app15116341.
[14] S. De, “An efficient technique of resource scheduling in cloud using graph coloring
algorithm,” Global Transitions Proceedings, vol. 3, no. 1, 2022, pp. 169-176.
doi:10.1016/j.gltp.2022.03.005.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>Althoey</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Waqar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>SH</given-names>
            .
            <surname>Alsulamy</surname>
          </string-name>
          , AM. Khan,
          <string-name>
            <given-names>A.</given-names>
            <surname>Alshehri</surname>
          </string-name>
          , Il Falqi, M. Abuhussain, “
          <article-title>Influence of IoT implementation on Resource management in construction,” Heliyon</article-title>
          , vol.
          <volume>10</volume>
          , no.
          <issue>15</issue>
          ,
          <year>August 2024</year>
          . doi:
          <volume>10</volume>
          .1016/j.heliyon.
          <year>2024</year>
          .e32193.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>A.</given-names>
            <surname>Khan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Alrasheed</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Waqar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Almujibah</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Benjeddou</surname>
          </string-name>
          , “
          <article-title>Internet of things (IoT) for safety and efficiency in construction building site operations,” Scientific reports</article-title>
          , vol
          <volume>14</volume>
          , no.
          <issue>2</issue>
          ,
          <year>2024</year>
          . doi:
          <volume>10</volume>
          .1038/s41598-024-78931-0.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>D.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Tao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Sun</surname>
          </string-name>
          , P. Fu, “
          <article-title>Lifecycle management of urban renewal enabled by Internet of Things: Development, application</article-title>
          , and challenges,” Results in Engineering,
          <year>2025</year>
          . doi:
          <volume>10</volume>
          .1016/j.rineng.
          <year>2025</year>
          .
          <volume>105706</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>W.</given-names>
            <surname>Zonghui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Veniaminovna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Vladimirovna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Ivan</surname>
          </string-name>
          , H. Isleem, “
          <article-title>Sustainability in construction economics as a barrier to cloud computing adoption in small-scale Building projects,” Scientific Reports</article-title>
          , vol.
          <volume>15</volume>
          , no.
          <issue>1</issue>
          ,
          <year>2025</year>
          . doi:
          <volume>10</volume>
          .1038/s41598-025-93973-8
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>C.</given-names>
            <surname>Rucco</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Longo</surname>
          </string-name>
          , M. Saad, “
          <article-title>Efficient Data Ingestion in Cloud-based architecture: a Data Engineering Design Pattern Proposal”</article-title>
          ,
          <year>2025</year>
          . arXiv preprint arXiv:
          <volume>2503</volume>
          .
          <fpage>16079</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <surname>I. Hassan</surname>
          </string-name>
          , “
          <article-title>Storage structures in the era of big data: from data warehouse to lakehouse</article-title>
          ,
          <source>” Journal of Theoretical and Applied Information Technology</source>
          , vol.
          <volume>102</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>2428</fpage>
          -
          <lpage>2441</lpage>
          ,
          <year>March 2024</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>