<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Cache-Aware Real-Time Scheduling Simulator: Implementation and Return of Experience</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Hai Nam Tran</string-name>
          <email>hai-nam.tran@univ-brest.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Frank Singhoff</string-name>
          <email>singhoff@univ-brest.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Stéphane Rubini</string-name>
          <email>rubini@univ-brest.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Jalil Boukhobza</string-name>
          <email>boukhobza@univ-brest.fr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Univ. Bretagne Occidentale, UMR 6285, Lab-STICC</institution>
          ,
          <addr-line>F-29200 Brest</addr-line>
          ,
          <country country="FR">France</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2015</year>
      </pub-date>
      <issue>0</issue>
      <fpage>2</fpage>
      <lpage>7</lpage>
      <abstract>
        <p>Evaluating cache related preemption delay (CRPD) in preemptive scheduling context for Real-Time Embedded System (RTES) stays an open issue despite of its practical importance. Indeed, various parameters should be taken into account such as cache utilization of tasks, memory layout of tasks, processor utilization and priority assignment algorithm. In state-of-the-art work, dependencies amongst those parameters are not investigated with precision because of the lack of scheduling analysis tool taking them into account. In this article, we propose a tool to investigate and evaluate scheduling analysis of RTES with cache memory and various scheduling parameters. Both modeling guidelines and implementation is detailed. Implementation is made in Cheddar, an open-source scheduling analyzer, which is freely available; Experiments are conducted in order to illustrate performance and applicability of our tool. Furthermore, we discuss about implementation issues, problems raised and lessons learned from those experiments.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. INTRODUCTION</title>
      <p>
        Cache memory is a crucial hardware component used to
reduce the performance gap between processors and main
memories. In the context of real-time embedded system
(RTES), the popularization of processors with large size and
multi level caches motivates the proposition of veri cation
methods handling this hardware component [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>Scheduling simulation is a classical veri cation method
performed at design step of RTES. It provides means to
investigate that all timing constraints of a RTES are satis ed.
To perform scheduling simulation, several assumptions are
usually made in order to simplify system modeling. One of
them is that preemption costs are negligible. Preemption
cost is the additional time to process interrupts, manipulate
task queues and actually perform context switches.</p>
      <p>
        Integrating cache memory in RTES generally enhances the
whole performance in term of execution time, but
unfortunately it can lead to an increase in preemption cost and
execution time variability [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ]. When a task is preempted,
memory blocks belonging to the task could be removed from
the cache. Once the task resumes, previously removed
memory blocks have to be reloaded into the cache. Thus, a new
preemption cost named Cache Related Preemption Delay
(CRPD) is introduced. By de nition, CRPD is the
additional time to re ll the cache with memory blocks evicted
by the preemption. In [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], the authors showed that the
cost of context switching raises from the range of 4.2 s to
8.7 s to the range of 38.6 s to 203.2 s when the size of the
data sets of programs is larger than the cache. Thus,
taking CRPD into account may be important when performing
scheduling analysis of a RTES.
      </p>
      <p>Problem statement: Scheduling analysis of RTES with
cache memory in xed priority preemptive scheduling
context is complex because there are many parameters a ecting
the outcome and dependencies amongst them. For example,
CRPD analysis, which is the process of evaluating the
impact of CRPD on a RTES, cannot be only based on
singletask analysis as it depends on correlations amongst tasks.
For a set of tasks, CRPD analysis has to take into account
parameters such as cache utilizations of tasks, memory
layout of tasks, processor utilization but also priority
assignment algorithms and WCET tasks.</p>
      <p>To address all parameters, many tools are involved, in
two domains: WCET/Cache analysis and scheduling
analysis. Each of those domains is only dedicated to a sub-part
of those parameters. Unfortunately, there are no tools
addressing the whole problem in state-of-the-art work.</p>
      <p>
        Contributions : We propose a tool to investigate and
evaluate scheduling analysis of RTES with cache memory and
various scheduling parameters. The work is implemented in
Cheddar, an open-source scheduling analyzer, which is freely
available to researchers and practitioners. We propose an
approach to use the analysis models and results of
WCET/cache analysis tools into a scheduling analysis tool. The
programming model we used is compliant with the existing
one in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. Experiments are performed to
illustrate performance and applicability of our tool.
      </p>
      <p>The rest of the article is organized as follows. Section 2
presents background of our work. In Section 3, we give an
overview of our approach. In Section 4, we present
development process and detailed information about the
implementation of our work. In Section 5, an evaluation of our
proposed scheduling simulator is given. Section 6 discusses
related works and Section 7 concludes the article.
2.</p>
    </sec>
    <sec id="sec-2">
      <title>BACKGROUND</title>
      <p>In this section, we introduce the system model and we
explain how preemption cost and CRPD are computed. We
assume an uniprocessor RTES with direct-mapped
instruction cache. As far as we know, instruction cache is popular in
practical implementation of RTES. The assumption about
direct-mapped is used to simplify the data ow analysis,
it could be easily relaxed. There are n independent tasks,
1. Modeling
System Model
(Cheddar-ADL)
2. Data Flow</p>
      <p>Analysis
: Input</p>
      <p>
        : Output
1; 2; :::; n scheduled by a preemptive scheduler. CPRD is
bounded by g BRT , where g is an upper bound on the
number of cache block reloaded due to preemption, and BRT is
an upper-bound on the time necessary to reload a memory
block in the cache (block reload time). To analyze the e ect
of preemption on a preempted task, Lee et al.[
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] introduced
the concept of useful cache block (UCB ):
      </p>
      <p>
        De nition 1. A memory block m is called a useful cache
block (UCB) at program point P , if m may be cached at P
and m may be reused at program point P 0 after P that may
be reached from P without eviction of m on this path.
The number of UCB at program point P gives an upper
bound on the number of additional reloads due to a
preemption at P . The maximum possible preemption cost for
a task is determined by the program point with the highest
number of UCB. In [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ], the authors exploit the fact that for
the i-th preemption, only the i-th highest number of UCB
has to be considered. However, as shown in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] and [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], a
signi cant reduction typically only occurs at a high number
of preemptions. Thus, we only consider the program point
with highest number of UCB.
      </p>
      <p>
        The impact of preempting task is given by the number
of cache blocks that the task may evict during its
execution. Busquet et al.[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] introduce the concept of evicting
cache block (ECB ):
      </p>
      <p>De nition 2. A memory block of the preempting task is
called an evicting cache block (ECB), if it is accessed during
the execution of the preempting task.</p>
      <p>The notation UCBi and ECBi are used to present the set
of UCBs and ECBs of a task i. Assume that the sets of UCB
and ECB of each task are preliminary computed, UCB'i is
the set of UCBs currently in the cache of the preempted
task. i;j is the preemption cost (i.e. the CRPD) when a
task j directly preempt task i. In case of a preemption
between two tasks i and j, i;j is computed by:
i;j = BRT j UCB'i \ ECBj j
(1)
However, in case of nested preemptions, a task j can
preempt more than one task. Thus, computation of CRPD
must take all preempted tasks into account.</p>
    </sec>
    <sec id="sec-3">
      <title>OUR APPROACH</title>
      <p>Our approach consists of three steps as shown in Fig. 1.
First, we modeled a RTES with components required to
apply analysis methods stated in Section 2. Those components
include hardware and software parts. Hardware components
are processor, core and cache memory. Software components
are tasks and control ow graphs of tasks.</p>
      <p>
        Second, from this model, we applied data ow analysis
presented in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] in order to compute the set of UCB and
3. Scheduling
Simulation
      </p>
      <p>Cache Access
Profile (UCB, ECB)</p>
      <p>
        ECB of each task, which is called cache access pro le in the
sequel. A detailed description of these rst and second steps
could be found in [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], which is a preliminary work toward
cache integration.
      </p>
      <p>Third, system model with computed cache access pro les
are loaded into the scheduling simulator. Scheduling
simulation is done and provides various outcomes such as feasibility
of the system, worst case response time of tasks, number of
preemption, CRPD per task and total CRPD.</p>
      <p>In our work, the cache utilization of a program at step 1
is modeled at both low level, which is the control ow graph
of program produced by a WCET analysis tool, and at high
level, which is a pre-computed set of UCB and ECB. We
expect the user to re-use results produced by a WCET
analysis tool within Cheddar. We can extract all information
related to CRPD from the scheduling simulator as written
above to fully investigate the impact of CRPD. Detailed
implementation of each step is presented in the next section.</p>
    </sec>
    <sec id="sec-4">
      <title>4. IMPLEMENTATION</title>
      <p>In this section, we present the implementation of our
approach. We introduce our framework, discuss about the
development process and point out several implementation
issues. The source code of the presented work is available
under GNU GPL licence at http://beru.univ-brest.fr/
svn/CHEDDAR/trunk/src/.</p>
      <p>
        The work took place in the context of the Cheddar project
[
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. Cheddar is an open source real-time scheduling
analysis tool. Classical feasibility tests, scheduling algorithms and
scheduling simulator for real-time systems are implemented
in Cheddar. System architecture is de ned with Cheddar
Architecture Description Language (Cheddar-ADL).
Cheddar class les are automatically generated by the tool
Platypus [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] through a model-driven process. The Cheddar
metamodel de nes hardware components such as: processor, core
and shared resource; and software components such as: task
and task group [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        The development process consisted of three steps. First,
Cheddar-ADL is extended to model RTES with cache
memory and cache access pro le. Second, we implemented UCB
computation by data ow analysis. Third, a cache-aware
scheduling simulator is implemented by extending the
scheduling simulator of Cheddar. In [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ], we presented how
CheddarADL is extended to take into account cache memory and
how data ow analysis in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] is implemented. In this
article, we provide a summary of our models and focus more on
the implementation of the cache-aware scheduling simulator
and its application.
4.1
      </p>
    </sec>
    <sec id="sec-5">
      <title>Cache Model and Cache Access Profile</title>
      <p>To support cache aware scheduling analysis, the Cheddar
meta-model has been extended with the entities below:
The scheduling simulation in Cheddar works as follows.
First, a system architecture model, including
hardware/software components, is loaded. Then, the scheduling is
computed by three successive steps: computing priority,
inserting ready task into queues and electing task. The elected
task will receive the processor for the next unit of time.</p>
      <p>The scheduling simulator records di erent events raised
during the simulation, such as task releases, task
completions and shared resources lockings or unlockings. The
result of the scheduling analysis is the set of events produced
at simulation time.</p>
      <p>We extended the scheduling simulator of Cheddar as
follows. First, we extended the set of events Cheddar can
produce. For example, an event PREEMPTION, which is
raised when a preemption occurs, is added. Second, event
RUNNING TASK, which is raised when a task executes, is
extended with the assumption about CRPD that any partial
execution of a task uses all its ECBs and UCBs.</p>
      <p>The pseudo code of the event handler is written below.
The notation i.cUCB represents the set of UCBs of task
Several issues were raised when designing and
implementing the simulator. Most of them were raised because we need
to mix timing speci cations of di erent orders of magnitude.
Others are related to tools interoperability.</p>
      <p>
        In practice, cache block reload time is signi cantly smaller
than period or capacity of a task. In Cheddar, we do not
prescribe 1 unit of time is equivalent to 1 ms or 1 s, which
are the unit of task period and block reload time. Mixing
timing speci cations of di erent orders of magnitude makes
complex the computation of the feasibility interval. We
recall that a feasibility interval is an interval for which testing
of task feasibility is needed [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ]. The scheduling
simulation interval needed to verify the schedulability of a task set
could be signi cantly large if a s is chosen as a time unit.
A solution in practice is to design a system with harmonic
task set in order to reduce the feasibility interval; however,
it is clearly not always possible. In addition, instead of
using 1 s, we use the cache block reload time as a base value
for 1 unit of time as in our experiment. Furthermore, a
long scheduling simulation interval also raises issues
regarding performance and scalability. Even with harmonic task
set, the tool must be able to perform scheduling simulation
in a large interval to overcome the di erent between cache
block reload time and task period, which may be CPU and
memory expensive. As Cheddar stores scheduling
simulation results into XML les, it can also be I/O intensive. To
reduce memory and I/O overhead, we selected a subset of
events the simulator has to handle and store.
      </p>
      <p>A second issue we are facing is about tool interoperability.
The input data of the CRPD analysis in our tool is designed
to be compatible with data provided by a WCET analysis
tool. We also support import data in XML format. At
the moment, we do not enforce tool interoperability and we
expect to investigate WCET tools in order to overcome this
issue.</p>
    </sec>
    <sec id="sec-6">
      <title>EXPERIMENT AND DISCUSSION</title>
      <p>
        In this section, we show that our tool can handle
parameters compliant with the existing works in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] in
the rst experiment. In addition, we discuss about the
dependency between CRPD and scheduling parameters.
Furthermore, we point out that our tool can run CRPD
optimization techniques by taking an example of memory layout
optimization by simulated annealing following the work of
[
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] in the second experiment. We also provide performance
and scalability tests of the tool in the third experiment.
      </p>
      <p>
        Task period and cache utilization generation of our
experiments is based on the existing work in [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Task sets
are generated with the following con guration. Task
periods are uniformly generated from 5 ms to 500 ms, as found
in most automotive and aerospace hard real-time
applications [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Generated task sets are harmonic in order to have
a low feasibility interval and scheduling simulation period.
Task deadlines are implicit, i.e. 8i : Di = Ti. Processor
utilization values (PU) are generated using the UUniFast
algorithm [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. Task execution times are set based on the
processor utilizations and the generated periods: 8i : Ci = Ui Ti,
where Ui is the processor utilization of task i. Task o sets
are uniformly distributed from 1 to 30 ms.
      </p>
      <p>Cache memory and cache access pro le of tasks are
generated as follows. The cache is direct mapped. The number of
cache blocks is equal to 256. The block-reload time is 8 s.
The cache usage of each task is determined by the number
of ECBs. They are generated using UUniFast algorithm for
a total cache utilization (CU) of 5. UUniFast may produce
values larger than 1 which means a task lls the whole cache.
ECBs of each tasks are consecutively arranged from a cache
set. For each task, the UCBs are generated according to a
uniform distribution ranging from 0 to the number of ECB
multiplied by a reuse factor (RF). If the set of ECBs
generated exceeds the number of cache sets, the set of ECBs is
limited to the number of cache sets. For the generation of
the UCBs, the original set of ECBs is used.
5.1</p>
    </sec>
    <sec id="sec-7">
      <title>CRPD with Priority Assignment and Processor Utilization (PU)</title>
      <p>In this experiment, we present CRPD analysis with
different priority assignments or scheduling algorithms. In
addition, we discuss about the impact of changing priority
assignment/scheduling algorithm and increasing PU to CRPD.</p>
      <p>PU is varied from 0.5 to 0.95 in steps of 0.05. RF is xed
at 0.3. For each value of PU, we performed scheduling
simulations with 100 task set and computed the average
number of preemptions and average total CRPD in a scheduling
interval of 1000 ms. Experiments are conducted with two
priority assignment algorithms: Rate Monotonic (RM) and
one named PA*, which assigns the highest priority level to
the task with the largest set of UCB. In addition, we also
take into account Earliest Deadline First (EDF) scheduler.</p>
      <p>The result of this experiment is Fig. 2. As the graph
illustrates, the number of preemptions and the preemption
cost increases steadily from the processor utilization of 50%
50 55 60 65 70 75 80 85 90 95</p>
      <p>Processor Utilization (%)
RM (#)
RM (CRPD)</p>
      <p>EDF (#)
EDF (CRPD)</p>
      <p>PA*(#)</p>
      <p>PA*(CRPD)
to 80%. After this point, there is a downward trend in the
preemption cost and in the number of preemptions of EDF
while there is an upward trend in those data of RM and
PA*. Observed from the scheduler, when PU is larger than
80%, many task sets are not schedulable.</p>
      <p>
        In conclusion, rst, when PU increases, the total
number of preemption and CRPD also increase. However, the
change is not linear. Second, a priority assignment
algorithm with less number of preemption tends to give lower
total CRPD. EDF and PA* generate less preemption and
CRPD than RM. In fact, to enforce the xed priority order,
the number of preemptions that typically occur under RM
is higher than under EDF [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. From this experiment, we see
that CRPD depends on the chosen priority assignment or
scheduler.
      </p>
      <p>
        In addition, this experiment shows that both scheduling
analysis and CRPD analysis should be performed jointly.
PA*, a priority assignment taking CRPD into account has a
signi cant lower total CRPD. The decrease in total CRPD
of PA* with RM and EDF is roughly 30ms on a scheduling
interval of 1000ms. However, comparing to RM and DM,
feasibility constraints of tasks are not satis ed with PA*,
only total CRPD is reduced. In [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ], we proposed a priority
assignment heuristic to take into account both feasibility
constraints and CRPD.
5.2
      </p>
    </sec>
    <sec id="sec-8">
      <title>CRPD with Memory Layout Optimization by Simulated Annealing</title>
      <p>
        The objective of this experiment is to show that users can
use CRPD optimization approaches with our tool. We apply
memory layout optimization by simulated annealing (SA)
based on the work of [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] with our generated task sets. In our
experiment, the objective of SA is to lower the total CRPD
after a scheduling simulation over a scheduling interval of
1000 ms.
      </p>
      <p>For each iteration of SA, we perform a swap in memory
layout between two random tasks. Changes are made to
the layout of tasks in memory, and then mapped to their
cache layout for evaluation. The total CRPD is computed
by scheduling simulation. The optimum layout is the layout
which has the lowest total CRPD. Initial temperature of SA
is 1.0, and after every iteration, it is reduced by multiplying
it by a cooling rate of 0.5 until it reaches the target
temperature of 0.2. Number of iteration for each temperature
is 10.</p>
      <p>The result of this experiment is Fig. 3. From the graph,
we can see the impact of memory layout optimization to
total CRPD. We can reduce roughly 20-30 ms of total CRPD.
To sum up, this experiment shows that our tool allows users
to perform a speci c optimization of CRPD.
50 55 60 65 70 75 80 85 90 95</p>
      <p>Processor Utilization (%)</p>
      <p>RM</p>
      <p>RM-SA
Average Computation Time</p>
      <p>Max Computation Time</p>
      <p>The objective of this experiment is to test the performance
and the scalability of our tool when scheduling simulation
interval increases. In general, there are four factors a ecting
the performance of a scheduling simulator: (1) the number
of tasks, (2) the scheduling simulation interval, (3) the cache
size and (4) the number of events. The rst three factors
depend on a chosen RTES. The number of events depends on
characteristics of the RTES; for example, a higher processor
utilization means a higher number of preemption events. In
this experiment, we choose to test a system model of 10 tasks
and 256 cache blocks. Processor utilization is set to 70 %.
Scheduling simulation is ranging from 100,000 to 1,000,000
units of time where 1 unit = 8 s.</p>
      <p>Fig. 4 displays results of our experiment on a PC with
Intel Core i5-3360 CPU, running Ubuntu 12.04. For each
simulation interval, 100 task sets are generated. We perform
scheduling simulation and compute the max and average
computation time.</p>
      <p>As we can see, while maximum computation time increases
slightly when simulation interval increases, average
computation time only uctuates around 6 seconds. This shows
that the tool is scalable when simulation interval is high.</p>
    </sec>
    <sec id="sec-9">
      <title>RELATED WORKS</title>
      <p>In this section, we present several real-time scheduling
analysis and WCET analysis tools.</p>
      <p>
        MAST[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] is a modeling and analysis suite for real-time
applications. The hardware component abstraction of MAST
model is generic and it includes processing resources and
shared resources. The shared resource component is not
supposed to model a cache memory unit. However, MAST
considers the overhead parameters of the components that
may be used to model CRPD.
      </p>
      <p>
        STORM[
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] and YARTISS [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] are scheduling simulation
tools mainly designed for evaluating and comparing
realtime scheduling algorithm for multiprocessor architectures.
      </p>
      <p>
        SymTA/S[
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] and RealTime-at-Work 1 are model-based
scheduling analysis tools targeting automotive industry. The
hardware components supported in those tools are speci c
to their domains (ECU, CAN and AFDX Networks).
      </p>
      <p>To the best of our knowledge, the support for cache
memory does not exists in the tools above.</p>
      <p>
        SimSo[
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is a scheduling simulation tool. It supports cache
sharing on multi-processor systems. It takes into account
impact of the caches through statistical models and also the
direct overheads such as context switches and scheduling
decisions. The memory behavior of a program is modeled
based on Stack Distance Pro les (SDP) - the distribution
of the stack distances for all the memory accesses of a task,
where a stack distance is by de nition the number of unique
cache lines accessed between two consecutive accesses to a
same line [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ]. The di erence between SDP and our model
is that SDP is achieved by on-line monitored counters such
as valgrind [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] while UCB and ECB are achieved by an o
line WCET analysis tools as below. At the moment, there
is no archived comparison between the two. UCB analysis
with scheduling simulation could be more pessimistic but
safer because it takes into account the program point with
largest number of UCB.
      </p>
      <p>
        Several WCET analysis tools allow designers to perform
cache analysis. SymTA/P[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ], HEPTANE[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], Chronos[
        <xref ref-type="bibr" rid="ref16">16</xref>
        ]
and aiT2 are examples of them. UCB computation of a
program is supported by aiT. The analysis of those tools are
based on program path analysis of the control ow graph of
the program. It is compliant with the requirement of our
proposed tool.
      </p>
      <p>
        In conclusion, WCET tools focus on the evaluation of
program's control ow graph to compute the WCET and also a
few tools can compute cache access pro le. The analysis
result could be used as an input for a scheduling analysis tool.
In the domain of real-time scheduling analysis, the support
for cache and CRPD when performing scheduling analysis
is not very well speci ed. As far as we know, only SimSo
clearly supports scheduling simulation with cache analysis
based on SDP. Then, we proposed a tool available to the
community which can either compute cache access pro le
of a task from its control ow graph or re-use
information obtained from a WCET/cache analysis tool to perform
scheduling analysis. Our model is compliant with existing
work in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ]. In addition, because Cheddar
provides a large set of scheduling analysis methods, we can
fully investigate the dependency between CRPD and other
scheduling parameters in order to either adjust or optimize
a RTES design to meet its timing constraints.
7.
      </p>
    </sec>
    <sec id="sec-10">
      <title>CONCLUSIONS</title>
      <p>In the article, we presented an approach to implement a
cache-aware scheduling simulator. The work was proceeded
in the context of the Cheddar real-time scheduling
analyzer, which is open-source, freely available to researchers
and practitioners that want to investigate scheduling
analysis of RTES with cache memory. Our solution consists of
three parts: modeling cache memory and cache access
prole, implementing several cache analysis methods and
per1RealTime-at-Work, http://www.realtimeatwork.com/
2AbsInt Inc., http://www.absint.com/
forming scheduling simulation. We extended Cheddar to be
able to deal with cache memory and illustrated the
dependency between cache and other scheduling parameters.</p>
      <p>
        There are open problems we are aiming to address in
the future. The rst one concerns the re nement of other
CRPD analysis methods. Several improvements have been
proposed in [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] to reduce the upper-bound of the CRPD.
In addition, we plan to compare our approach with the
approach based on Stack Distance Pro le in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. Second, we are
going to study the problem of feasibility interval of RTES
with cache memory.
      </p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>S.</given-names>
            <surname>Altmeyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. I.</given-names>
            <surname>Davis</surname>
          </string-name>
          , and
          <string-name>
            <given-names>C.</given-names>
            <surname>Maiza</surname>
          </string-name>
          .
          <article-title>Improved cache related pre-emption delay aware response time analysis for xed priority pre-emptive systems</article-title>
          .
          <source>Real-Time Systems</source>
          ,
          <volume>48</volume>
          (
          <issue>5</issue>
          ):
          <volume>499</volume>
          {
          <fpage>526</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>S.</given-names>
            <surname>Altmeyer</surname>
          </string-name>
          and
          <string-name>
            <given-names>C. Maiza</given-names>
            <surname>Burguiere</surname>
          </string-name>
          .
          <article-title>Cache-related preemption delay via useful cache blocks: Survey and rede nition</article-title>
          .
          <source>Journal of Systems Architecture</source>
          ,
          <volume>57</volume>
          (
          <issue>7</issue>
          ):
          <volume>707</volume>
          {
          <fpage>719</fpage>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>M.</given-names>
            <surname>Bertogna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Xhani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Marinoni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Esposito</surname>
          </string-name>
          , and
          <string-name>
            <given-names>G.</given-names>
            <surname>Buttazzo</surname>
          </string-name>
          .
          <article-title>Optimal selection of preemption points to minimize preemption overhead</article-title>
          .
          <source>In Real-Time Systems (ECRTS)</source>
          ,
          <year>2011</year>
          23rd Euromicro Conference on, pages
          <volume>217</volume>
          {
          <fpage>227</fpage>
          . IEEE,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>E.</given-names>
            <surname>Bini</surname>
          </string-name>
          and
          <string-name>
            <given-names>G. C.</given-names>
            <surname>Buttazzo</surname>
          </string-name>
          .
          <article-title>Measuring the performance of schedulability tests</article-title>
          .
          <source>Real-Time Systems</source>
          ,
          <volume>30</volume>
          (
          <issue>1-2</issue>
          ):
          <volume>129</volume>
          {
          <fpage>154</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Busquets-Mataix</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. J.</given-names>
            <surname>Serrano</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Ors</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Gil</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Wellings</surname>
          </string-name>
          .
          <article-title>Adding instruction cache e ect to schedulability analysis of preemptive real-time systems</article-title>
          .
          <source>In Proceedings of the 2nd IEEE Real-Time Technology and Applications Symposium (RTAS)</source>
          , pages
          <fpage>204</fpage>
          {
          <fpage>212</fpage>
          ,
          <year>1996</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>G. C.</given-names>
            <surname>Buttazzo</surname>
          </string-name>
          .
          <article-title>Rate monotonic vs. edf: judgment day</article-title>
          .
          <source>Real-Time Systems</source>
          ,
          <volume>29</volume>
          (
          <issue>1</issue>
          ):5{
          <fpage>26</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Chandarli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Fauberteau</surname>
          </string-name>
          , D. Masson, S. Midonnet,
          <string-name>
            <given-names>M.</given-names>
            <surname>Qamhieh</surname>
          </string-name>
          , et al.
          <article-title>Yartiss: A tool to visualize, test, compare and evaluate real-time scheduling algorithms</article-title>
          .
          <source>In Proceedings of the 3rd International Workshop on Analysis Tools and Methodologies for Embedded and Real-time Systems</source>
          , pages
          <fpage>21</fpage>
          {
          <fpage>26</fpage>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>M.</given-names>
            <surname>Cheramy</surname>
          </string-name>
          ,
          <string-name>
            <surname>A.-M. Deplanche</surname>
            ,
            <given-names>P.-E.</given-names>
          </string-name>
          <string-name>
            <surname>Hladik</surname>
          </string-name>
          , et al.
          <article-title>Simulation of real-time multiprocessor scheduling with overheads</article-title>
          .
          <source>In International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SIMULTECH)</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Colin</surname>
          </string-name>
          and
          <string-name>
            <given-names>I.</given-names>
            <surname>Puaut</surname>
          </string-name>
          .
          <article-title>Worst-case timing analysis of the rtems real-time operating system</article-title>
          .
          <source>Rapport N^aUe, PI1277</source>
          ,
          <string-name>
            <surname>IRISA</surname>
          </string-name>
          , France,
          <year>1999</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>L.</given-names>
            <surname>Cucu</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Goossens</surname>
          </string-name>
          .
          <article-title>Feasibility intervals for xed-priority real-time scheduling on uniform multiprocessors</article-title>
          .
          <source>In IEEE Conference on Emerging Technologies and Factory Automation (ETFA)</source>
          , pages
          <fpage>397</fpage>
          {
          <fpage>404</fpage>
          ,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>C.</given-names>
            <surname>Fotsing</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Singho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Plantec</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Gaudel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rubini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. N.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Lemarchand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dissaux</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Legrand</surname>
          </string-name>
          .
          <article-title>Cheddar architecture description language</article-title>
          .
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>M.</given-names>
            <surname>Gonzalez Harbour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Gutierrez</given-names>
            <surname>Garc</surname>
          </string-name>
          <article-title>a</article-title>
          , J. Palencia Gutierrez, and
          <string-name>
            <given-names>J. Drake</given-names>
            <surname>Moyano</surname>
          </string-name>
          .
          <article-title>Mast: Modeling and analysis suite for real time applications</article-title>
          .
          <source>In Real-Time Systems, 13th Euromicro Conference on</source>
          ,
          <year>2001</year>
          ., pages
          <volume>125</volume>
          {
          <fpage>134</fpage>
          . IEEE,
          <year>2001</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>R.</given-names>
            <surname>Henia</surname>
          </string-name>
          , A. Hamann, M. Jersak,
          <string-name>
            <given-names>R.</given-names>
            <surname>Racu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Richter</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Ernst</surname>
          </string-name>
          .
          <article-title>System level performance analysis{the symta/s approach</article-title>
          .
          <source>IEE Proceedings-Computers and Digital Techniques</source>
          ,
          <volume>152</volume>
          (
          <issue>2</issue>
          ):
          <volume>148</volume>
          {
          <fpage>166</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>C.-G. Lee</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Hahn</surname>
          </string-name>
          , Y.
          <string-name>
            <surname>-M. Seo</surname>
            ,
            <given-names>S. L.</given-names>
          </string-name>
          <string-name>
            <surname>Min</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Ha</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Hong</surname>
            ,
            <given-names>C. Y.</given-names>
          </string-name>
          <string-name>
            <surname>Park</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Lee</surname>
            , and
            <given-names>C. S.</given-names>
          </string-name>
          <string-name>
            <surname>Kim</surname>
          </string-name>
          .
          <article-title>Analysis of cache-related preemption delay in xed-priority preemptive scheduling</article-title>
          .
          <source>IEEE Transactions on Computers</source>
          ,
          <volume>47</volume>
          (
          <issue>6</issue>
          ):
          <volume>700</volume>
          {
          <fpage>713</fpage>
          ,
          <year>1998</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>C.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ding</surname>
          </string-name>
          , and
          <string-name>
            <given-names>K.</given-names>
            <surname>Shen</surname>
          </string-name>
          .
          <article-title>Quantifying the cost of context switch</article-title>
          .
          <source>In Proceedings of the 2007 workshop on Experimental computer science. ACM</source>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>X.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mitra</surname>
          </string-name>
          ,
          <article-title>and</article-title>
          <string-name>
            <given-names>A.</given-names>
            <surname>Roychoudhury</surname>
          </string-name>
          .
          <article-title>Chronos: A timing analyzer for embedded software</article-title>
          .
          <source>Science of Computer Programming</source>
          ,
          <volume>69</volume>
          (
          <issue>1</issue>
          ):
          <volume>56</volume>
          {
          <fpage>67</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>W.</given-names>
            <surname>Lunniss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Altmeyer</surname>
          </string-name>
          , and
          <string-name>
            <surname>R. I. Davis.</surname>
          </string-name>
          <article-title>Optimising task layout to increase schedulability via reduced cache related pre-emption delays</article-title>
          .
          <source>In Proceedings of the 20th International Conference on Real-Time and Network Systems</source>
          , pages
          <fpage>161</fpage>
          {
          <fpage>170</fpage>
          . ACM,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>R. L.</given-names>
            <surname>Mattson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gecsei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. R.</given-names>
            <surname>Slutz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>and I. L.</given-names>
            <surname>Traiger</surname>
          </string-name>
          .
          <article-title>Evaluation techniques for storage hierarchies</article-title>
          .
          <source>IBM Systems journal, 9</source>
          (
          <issue>2</issue>
          ):
          <volume>78</volume>
          {
          <fpage>117</fpage>
          ,
          <year>1970</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>N.</given-names>
            <surname>Nethercote</surname>
          </string-name>
          and
          <string-name>
            <given-names>J.</given-names>
            <surname>Seward</surname>
          </string-name>
          .
          <article-title>Valgrind: a framework for heavyweight dynamic binary instrumentation</article-title>
          .
          <source>In ACM Sigplan notices</source>
          , volume
          <volume>42</volume>
          , pages
          <fpage>89</fpage>
          {
          <fpage>100</fpage>
          ,
          <year>2007</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>A.</given-names>
            <surname>Plantec</surname>
          </string-name>
          and
          <string-name>
            <given-names>F.</given-names>
            <surname>Singho</surname>
          </string-name>
          .
          <article-title>Refactoring of an ada 95 library with a meta case tool</article-title>
          .
          <source>In ACM SIGAda Ada Letters</source>
          , volume
          <volume>26</volume>
          , pages
          <fpage>61</fpage>
          {
          <fpage>70</fpage>
          . ACM,
          <year>2006</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>F.</given-names>
            <surname>Singho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Legrand</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Nana</surname>
          </string-name>
          , and
          <string-name>
            <given-names>L.</given-names>
            <surname>Marce</surname>
          </string-name>
          .
          <article-title>Cheddar: a exible real time scheduling framework</article-title>
          .
          <source>ACM SIGAda Ada Letters</source>
          ,
          <volume>24</volume>
          (
          <issue>4</issue>
          ):1{
          <issue>8</issue>
          ,
          <year>2004</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>J.</given-names>
            <surname>Staschulat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schliecker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Ernst</surname>
          </string-name>
          .
          <article-title>Scheduling analysis of real-time systems with precise modeling of cache related preemption delay</article-title>
          .
          <source>In Proceedings of the 17th Euromicro Conference on Real-Time Systems (ECRTS)</source>
          , pages
          <fpage>41</fpage>
          {
          <fpage>48</fpage>
          ,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>J.</given-names>
            <surname>Staschulat</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schliecker</surname>
          </string-name>
          , and
          <string-name>
            <given-names>R.</given-names>
            <surname>Ernst</surname>
          </string-name>
          .
          <article-title>Scheduling analysis of real-time systems with precise modeling of cache related preemption delay</article-title>
          .
          <source>In Euromicro Conference on Real-Time Systems (ECRTS)</source>
          , Palma de Mallorca, Spain,
          <year>2005</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>H. N.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Singho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rubini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Boukhobza</surname>
          </string-name>
          .
          <article-title>Instruction cache in hard real-time systems: modeling and integration in scheduling analysis tools with AADL</article-title>
          .
          <source>In Proceedings of the 12th IEEE International Conference on Embedded and Ubiquitous Computing</source>
          , Milan, Italy,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>H. N.</given-names>
            <surname>Tran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Singho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Rubini</surname>
          </string-name>
          , and
          <string-name>
            <given-names>J.</given-names>
            <surname>Boukhobza</surname>
          </string-name>
          .
          <article-title>Addressing cache related preemption delay in xed priority assignment</article-title>
          .
          <source>In Proceedings of the 20th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA)</source>
          ,
          <year>Luxembourg</year>
          ,
          <year>2015</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>R.</given-names>
            <surname>Urunuela</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Deplanche</surname>
          </string-name>
          , and
          <string-name>
            <given-names>Y.</given-names>
            <surname>Trinquet</surname>
          </string-name>
          .
          <article-title>Storm, a simulation tool for real-time multiprocessor scheduling evaluation</article-title>
          .
          <source>In IEEE Conference on Emerging Technologies and Factory Automation (ETFA)</source>
          , pages
          <fpage>1</fpage>
          <issue>{8</issue>
          ,
          <year>2010</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>