
Cache-Aware Real-Time Scheduling Simulator: Implementation and Return of Experience

Hai Nam Tran

hai-nam.tran@univ-brest.fr

Frank Singhoff

singhoff@univ-brest.fr

Stéphane Rubini

rubini@univ-brest.fr

Jalil Boukhobza

boukhobza@univ-brest.fr

Univ. Bretagne Occidentale, UMR 6285, Lab-STICC, F-29200 Brest, France

2015


Evaluating cache related preemption delay (CRPD) in a preemptive scheduling context for Real-Time Embedded Systems (RTES) remains an open issue despite its practical importance. Indeed, various parameters should be taken into account, such as the cache utilization of tasks, the memory layout of tasks, the processor utilization and the priority assignment algorithm. In state-of-the-art work, dependencies amongst those parameters are not investigated with precision because of the lack of scheduling analysis tools taking them into account. In this article, we propose a tool to investigate and evaluate scheduling analysis of RTES with cache memory and various scheduling parameters. Both modeling guidelines and implementation are detailed. The implementation is made in Cheddar, an open-source scheduling analyzer, which is freely available. Experiments are conducted in order to illustrate the performance and applicability of our tool. Furthermore, we discuss implementation issues, problems raised and lessons learned from those experiments.

1. INTRODUCTION

Cache memory is a crucial hardware component used to reduce the performance gap between processors and main memories. In the context of real-time embedded systems (RTES), the growing use of processors with large, multi-level caches motivates the proposal of verification methods handling this hardware component [14], [5], [2].

Scheduling simulation is a classical verification method performed at the design step of a RTES. It provides means to check that all timing constraints of a RTES are satisfied. To perform scheduling simulation, several assumptions are usually made in order to simplify system modeling. One of them is that preemption costs are negligible. Preemption cost is the additional time needed to process interrupts, manipulate task queues and actually perform context switches.

Integrating cache memory in RTES generally enhances overall performance in terms of execution time, but unfortunately it can lead to an increase in preemption cost and execution time variability [15]. When a task is preempted, memory blocks belonging to the task could be removed from the cache. Once the task resumes, previously removed memory blocks have to be reloaded into the cache. Thus, a new preemption cost named Cache Related Preemption Delay (CRPD) is introduced. By definition, CRPD is the additional time needed to refill the cache with memory blocks evicted by the preemption. In [15], the authors showed that the cost of context switching rises from the range of 4.2 µs to 8.7 µs to the range of 38.6 µs to 203.2 µs when the size of the data sets of programs is larger than the cache. Thus, taking CRPD into account may be important when performing scheduling analysis of a RTES.

Problem statement: Scheduling analysis of RTES with cache memory in a fixed priority preemptive scheduling context is complex because many parameters affect the outcome and there are dependencies amongst them. For example, CRPD analysis, which is the process of evaluating the impact of CRPD on a RTES, cannot be based on single-task analysis alone, as it depends on correlations amongst tasks. For a set of tasks, CRPD analysis has to take into account parameters such as the cache utilization of tasks, the memory layout of tasks and the processor utilization, but also the priority assignment algorithm and the WCET of tasks.

To address all parameters, many tools are involved, in two domains: WCET/Cache analysis and scheduling analysis. Each of those domains is only dedicated to a sub-part of those parameters. Unfortunately, there are no tools addressing the whole problem in state-of-the-art work.

Contributions: We propose a tool to investigate and evaluate scheduling analysis of RTES with cache memory and various scheduling parameters. The work is implemented in Cheddar, an open-source scheduling analyzer, which is freely available to researchers and practitioners. We propose an approach to use the analysis models and results of WCET/cache analysis tools in a scheduling analysis tool. The programming model we used is compliant with the existing ones in [14], [5], [2] and [17]. Experiments are performed to illustrate the performance and applicability of our tool.

The rest of the article is organized as follows. Section 2 presents the background of our work. In Section 3, we give an overview of our approach. In Section 4, we present the development process and detailed information about the implementation of our work. In Section 5, an evaluation of our proposed scheduling simulator is given. Section 6 discusses related works and Section 7 concludes the article.

2. BACKGROUND

In this section, we introduce the system model and we explain how preemption cost and CRPD are computed. We assume a uniprocessor RTES with a direct-mapped instruction cache. As far as we know, instruction caches are popular in practical implementations of RTES. The direct-mapped assumption is used to simplify the data flow analysis; it could be easily relaxed. There are n independent tasks, $\tau_1, \tau_2, \ldots, \tau_n$, scheduled by a preemptive scheduler. CRPD is bounded by $g \cdot BRT$, where g is an upper bound on the number of cache blocks reloaded due to preemption, and BRT is an upper bound on the time necessary to reload a memory block in the cache (block reload time). To analyze the effect of preemption on a preempted task, Lee et al. [14] introduced the concept of useful cache block (UCB):

Definition 1. A memory block m is called a useful cache block (UCB) at program point P, if m may be cached at P and m may be reused at program point P' after P that may be reached from P without eviction of m on this path.

The number of UCBs at program point P gives an upper bound on the number of additional reloads due to a preemption at P. The maximum possible preemption cost for a task is determined by the program point with the highest number of UCBs. In [22], the authors exploit the fact that for the i-th preemption, only the i-th highest number of UCBs has to be considered. However, as shown in [1] and [3], a significant reduction typically only occurs at a high number of preemptions. Thus, we only consider the program point with the highest number of UCBs.

The impact of a preempting task is given by the number of cache blocks that the task may evict during its execution. Busquets-Mataix et al. [5] introduced the concept of evicting cache block (ECB):

Definition 2. A memory block of the preempting task is called an evicting cache block (ECB), if it is accessed during the execution of the preempting task.

The notations $UCB_i$ and $ECB_i$ denote the sets of UCBs and ECBs of a task $\tau_i$. Assume that the sets of UCBs and ECBs of each task are computed beforehand. $UCB'_i$ is the set of UCBs of the preempted task currently in the cache. $\gamma_{i,j}$ is the preemption cost (i.e. the CRPD) when a task $\tau_j$ directly preempts task $\tau_i$. In case of a preemption between two tasks $\tau_i$ and $\tau_j$, $\gamma_{i,j}$ is computed by:

$$\gamma_{i,j} = BRT \cdot |UCB'_i \cap ECB_j| \qquad (1)$$

However, in case of nested preemptions, a task $\tau_j$ can preempt more than one task. Thus, the computation of CRPD must take all preempted tasks into account.
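As an illustration of Equation (1), the following minimal Python sketch computes the CRPD of one direct preemption from two sets of cache block indices. The function name `crpd`, the example sets and the use of an 8 µs block reload time (the value used later in the experiments) are assumptions made for this example; they are not taken from the Cheddar source code.

```python
BRT_US = 8  # assumed block reload time, in microseconds


def crpd(ucb_in_cache, ecb_preempting, brt=BRT_US):
    """CRPD of one direct preemption: BRT * |UCB'_i intersect ECB_j|."""
    return brt * len(set(ucb_in_cache) & set(ecb_preempting))


# Hypothetical cache block indices for a preempted and a preempting task.
ucb_i = {2, 3, 4, 5}      # UCBs of the preempted task currently in the cache
ecb_j = {4, 5, 6, 7, 8}   # cache blocks accessed by the preempting task

cost = crpd(ucb_i, ecb_j)
print(cost)  # blocks 4 and 5 must be reloaded: 2 * 8 = 16
```

Only the blocks that are both useful to the preempted task and touched by the preempting task contribute to the delay, which is why the memory layout of tasks influences the CRPD.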

3. OUR APPROACH

Our approach consists of three steps as shown in Fig. 1. First, we modeled a RTES with the components required to apply the analysis methods stated in Section 2. Those components include hardware and software parts. Hardware components are processor, core and cache memory. Software components are tasks and control flow graphs of tasks.

Second, from this model, we applied the data flow analysis presented in [14] in order to compute the sets of UCBs and ECBs of each task, which are called the cache access profile in the sequel. A detailed description of these first and second steps can be found in [24], which is a preliminary work toward cache integration.

[Figure 1: 1. Modeling; 2. Data Flow Analysis; 3. Scheduling Simulation. Artifacts: System Model (Cheddar-ADL); Cache Access Profile (UCB, ECB). Legend: Input, Output.]

Third, the system model with the computed cache access profiles is loaded into the scheduling simulator. Scheduling simulation is then performed and provides various outcomes such as the feasibility of the system, the worst case response time of tasks, the number of preemptions, the CRPD per task and the total CRPD.

In our work, the cache utilization of a program at step 1 is modeled both at a low level, which is the control flow graph of the program produced by a WCET analysis tool, and at a high level, which is a pre-computed set of UCBs and ECBs. We expect the user to re-use results produced by a WCET analysis tool within Cheddar. We can extract all information related to CRPD from the scheduling simulator, as written above, to fully investigate the impact of CRPD. The detailed implementation of each step is presented in the next section.

4. IMPLEMENTATION

In this section, we present the implementation of our approach. We introduce our framework, discuss the development process and point out several implementation issues. The source code of the presented work is available under the GNU GPL licence at http://beru.univ-brest.fr/svn/CHEDDAR/trunk/src/.

The work took place in the context of the Cheddar project [21]. Cheddar is an open source real-time scheduling analysis tool. Classical feasibility tests, scheduling algorithms and a scheduling simulator for real-time systems are implemented in Cheddar. The system architecture is defined with the Cheddar Architecture Description Language (Cheddar-ADL). Cheddar class files are automatically generated by the tool Platypus [20] through a model-driven process. The Cheddar meta-model defines hardware components such as processor, core and shared resource, and software components such as task and task group [11].

The development process consisted of three steps. First, Cheddar-ADL is extended to model RTES with cache memory and cache access profile. Second, we implemented UCB computation by data flow analysis. Third, a cache-aware scheduling simulator is implemented by extending the scheduling simulator of Cheddar. In [24], we presented how Cheddar-ADL is extended to take into account cache memory and how the data flow analysis in [14] is implemented. In this article, we provide a summary of our models and focus more on the implementation of the cache-aware scheduling simulator and its application.

4.1 Cache Model and Cache Access Profile

To support cache-aware scheduling analysis, the Cheddar meta-model has been extended with entities that describe the cache memory and the cache access profile of tasks (see [24]).

The scheduling simulation in Cheddar works as follows. First, a system architecture model, including hardware and software components, is loaded. Then, the scheduling is computed in three successive steps: computing priorities, inserting ready tasks into queues and electing a task. The elected task receives the processor for the next unit of time.

The scheduling simulator records different events raised during the simulation, such as task releases, task completions and shared resource locking or unlocking. The result of the scheduling analysis is the set of events produced at simulation time.

We extended the scheduling simulator of Cheddar as follows. First, we extended the set of events Cheddar can produce. For example, an event PREEMPTION, which is raised when a preemption occurs, is added. Second, event RUNNING TASK, which is raised when a task executes, is extended with the assumption about CRPD that any partial execution of a task uses all its ECBs and UCBs.
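The handling of the RUNNING TASK event can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Cheddar implementation: the `Task` class, the `started` flag, and the functions `on_running_task` and `on_completion` are hypothetical names, and `c_ucb` is assumed to hold the UCBs of a task that are currently in the cache. The sketch applies the assumption stated above that any partial execution of a task uses all its ECBs and UCBs.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    ucb: frozenset                            # all UCBs of the task
    ecb: frozenset                            # all ECBs of the task
    c_ucb: set = field(default_factory=set)   # UCBs currently in the cache (assumed meaning)
    started: bool = False                     # True once the current job has executed
    crpd: int = 0                             # accumulated CRPD, in time units


def on_running_task(running, preempted, brt):
    """Handle one RUNNING TASK event and return the CRPD charged to `running`."""
    delay = 0
    if running.started:
        # The task resumes: it reloads the UCBs evicted while it was preempted.
        delay = brt * len(running.ucb - running.c_ucb)
    running.started = True
    running.c_ucb = set(running.ucb)          # all its UCBs are cached again
    for task in preempted:
        task.c_ucb -= running.ecb             # the running task evicts with all its ECBs
    running.crpd += delay
    return delay


def on_completion(task):
    """Reset the cache state of a task when its job completes."""
    task.started = False
    task.c_ucb.clear()


low = Task("low", frozenset({1, 2, 3}), frozenset({1, 2, 3, 4}))
high = Task("high", frozenset({3}), frozenset({3, 4, 5}))

on_running_task(low, [], brt=8)               # low starts, no CRPD
on_running_task(high, [low], brt=8)           # high preempts low and evicts block 3
on_completion(high)
print(on_running_task(low, [], brt=8))        # low reloads block 3: prints 8
```

Because every running task evicts blocks from all currently preempted tasks, nested preemptions are covered without any special case.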

The pseudo code of the event handler is written below. The notation $\tau_i$.cUCB represents the set of UCBs of task $\tau_i$.

Several issues were raised when designing and implementing the simulator. Most of them were raised because we need to mix timing specifications of different orders of magnitude. Others are related to tool interoperability.

In practice, the cache block reload time is significantly smaller than the period or capacity of a task. In Cheddar, we do not prescribe whether 1 unit of time is equivalent to 1 ms or 1 µs, which are the typical units of task periods and block reload times, respectively. Mixing timing specifications of different orders of magnitude complicates the computation of the feasibility interval. We recall that a feasibility interval is an interval for which testing of task feasibility is needed [10]. The scheduling simulation interval needed to verify the schedulability of a task set can be significantly large if 1 µs is chosen as the time unit. A solution in practice is to design a system with a harmonic task set in order to reduce the feasibility interval; however, this is clearly not always possible. In addition, instead of using 1 µs, we use the cache block reload time as the base value for 1 unit of time, as in our experiments.

Furthermore, a long scheduling simulation interval also raises issues regarding performance and scalability. Even with a harmonic task set, the tool must be able to perform scheduling simulation over a large interval to overcome the difference between cache block reload time and task period, which may be CPU and memory expensive. As Cheddar stores scheduling simulation results in XML files, it can also be I/O intensive. To reduce memory and I/O overhead, we selected a subset of events that the simulator has to handle and store.
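The effect of the time unit and of harmonic periods on the simulation length can be illustrated with a short Python sketch. It assumes that the simulation interval grows with the least common multiple of the task periods (the hyperperiod); the helper `to_units`, the period values and the 8 µs unit are hypothetical choices for this example.

```python
from math import lcm  # available since Python 3.9

BRT_US = 8  # assumed block reload time, used as the simulation time unit


def to_units(duration_ms, unit_us=BRT_US):
    """Convert a duration in milliseconds into units of `unit_us` microseconds."""
    total_us = duration_ms * 1000
    if total_us % unit_us:
        raise ValueError("duration is not a multiple of the time unit")
    return total_us // unit_us


harmonic_ms = [5, 10, 20, 40]       # hypothetical harmonic periods
non_harmonic_ms = [5, 7, 11, 13]    # hypothetical non-harmonic periods

h_harmonic = lcm(*[to_units(p) for p in harmonic_ms])
h_other = lcm(*[to_units(p) for p in non_harmonic_ms])

print(h_harmonic)  # 5000 units, i.e. 40 ms
print(h_other)     # 625625 units, i.e. 5005 ms
```

With a 1 µs unit the same hyperperiods would span 40,000 and 5,005,000 units, which shows why both the harmonic task sets and the coarser time unit help to keep the simulation tractable.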

A second issue we are facing is tool interoperability. The input data of the CRPD analysis in our tool is designed to be compatible with data provided by a WCET analysis tool. We also support importing data in XML format. At the moment, we do not enforce tool interoperability and we expect to investigate WCET tools in order to overcome this issue.

5. EXPERIMENT AND DISCUSSION

In this section, we show in the first experiment that our tool can handle parameters compliant with the existing works in [14], [5], [2]. In addition, we discuss the dependency between CRPD and scheduling parameters. Furthermore, in the second experiment, we point out that our tool can run CRPD optimization techniques, taking as an example memory layout optimization by simulated annealing following the work of [17]. We also provide performance and scalability tests of the tool in the third experiment.

Task period and cache utilization generation in our experiments is based on the existing work in [1]. Task sets are generated with the following configuration. Task periods are uniformly generated from 5 ms to 500 ms, as found in most automotive and aerospace hard real-time applications [1]. Generated task sets are harmonic in order to have a short feasibility interval and scheduling simulation period. Task deadlines are implicit, i.e. $\forall i: D_i = T_i$. Processor utilization values (PU) are generated using the UUniFast algorithm [4]. Task execution times are set based on the processor utilizations and the generated periods: $\forall i: C_i = U_i \cdot T_i$, where $U_i$ is the processor utilization of task $\tau_i$. Task offsets are uniformly distributed from 1 to 30 ms.
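The generation of utilizations and execution times described above can be sketched in Python as follows. The function `uunifast` follows the UUniFast algorithm of [4]; the function `execution_times`, the fixed seed and the example periods are assumptions made for this illustration and do not reproduce the generator actually used in the experiments.

```python
import random


def uunifast(n, total_u, rng=random):
    """Draw n task utilizations that sum to total_u (UUniFast)."""
    utilizations = []
    remaining = total_u
    for i in range(1, n):
        next_remaining = remaining * rng.random() ** (1.0 / (n - i))
        utilizations.append(remaining - next_remaining)
        remaining = next_remaining
    utilizations.append(remaining)
    return utilizations


def execution_times(utilizations, periods_ms):
    """Apply C_i = U_i * T_i to every task."""
    return [u * t for u, t in zip(utilizations, periods_ms)]


rng = random.Random(42)                 # fixed seed, for reproducibility only
periods_ms = [5, 10, 20, 40, 80]        # hypothetical harmonic periods
u = uunifast(len(periods_ms), 0.7, rng)
c = execution_times(u, periods_ms)

print(round(sum(u), 6))  # 0.7
```

The same generator can be reused for the cache utilization of tasks, where the total is allowed to exceed 1.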

Cache memory and the cache access profiles of tasks are generated as follows. The cache is direct mapped. The number of cache blocks is equal to 256. The block reload time is 8 µs. The cache usage of each task is determined by its number of ECBs. They are generated using the UUniFast algorithm for a total cache utilization (CU) of 5. UUniFast may produce values larger than 1, which means that a task fills the whole cache. The ECBs of each task are consecutively arranged from a cache set. For each task, the UCBs are generated according to a uniform distribution ranging from 0 to the number of ECBs multiplied by a reuse factor (RF). If the set of ECBs generated exceeds the number of cache sets, the set of ECBs is limited to the number of cache sets. For the generation of the UCBs, the original set of ECBs is used.

5.1 CRPD with Priority Assignment and Processor Utilization (PU)

In this experiment, we present CRPD analysis with different priority assignments or scheduling algorithms. In addition, we discuss the impact on CRPD of changing the priority assignment or scheduling algorithm and of increasing PU.

PU is varied from 0.5 to 0.95 in steps of 0.05. RF is fixed at 0.3. For each value of PU, we performed scheduling simulations with 100 task sets and computed the average number of preemptions and the average total CRPD in a scheduling interval of 1000 ms. Experiments are conducted with two priority assignment algorithms: Rate Monotonic (RM) and one named PA*, which assigns the highest priority level to the task with the largest set of UCBs. In addition, we also take into account the Earliest Deadline First (EDF) scheduler.
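The two priority assignments compared here can be sketched in a few lines of Python. The dictionary-based task representation, the function names `rate_monotonic` and `pa_star`, and the example values are hypothetical; ties are broken by the input order, which is an assumption of this sketch.

```python
def rate_monotonic(tasks):
    """Return task names from highest to lowest priority: shortest period first."""
    return [t["name"] for t in sorted(tasks, key=lambda t: t["period"])]


def pa_star(tasks):
    """Return task names from highest to lowest priority: largest UCB set first."""
    return [t["name"] for t in sorted(tasks, key=lambda t: len(t["ucb"]), reverse=True)]


# Hypothetical task set: periods in ms, UCBs as sets of cache block indices.
tasks = [
    {"name": "A", "period": 5, "ucb": {0, 1}},
    {"name": "B", "period": 10, "ucb": {2, 3, 4, 5, 6, 7}},
    {"name": "C", "period": 20, "ucb": {8, 9, 10, 11}},
]

print(rate_monotonic(tasks))  # ['A', 'B', 'C']
print(pa_star(tasks))         # ['B', 'C', 'A']
```

Under PA*, the task with the most useful blocks is the least likely to be preempted, which reduces the number of reloads but ignores periods and deadlines.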

The result of this experiment is shown in Fig. 2. As the graph illustrates, the number of preemptions and the preemption cost increase steadily from a processor utilization of 50% to 80%. After this point, there is a downward trend in the preemption cost and in the number of preemptions of EDF, while there is an upward trend in those of RM and PA*. As observed from the scheduler, when PU is larger than 80%, many task sets are not schedulable.

[Figure 2: number of preemptions (#) and total CRPD of RM, EDF and PA* as a function of Processor Utilization (%), from 50 to 95.]

In conclusion, first, when PU increases, the total number of preemptions and the total CRPD also increase. However, the change is not linear. Second, a priority assignment algorithm with fewer preemptions tends to give a lower total CRPD. EDF and PA* generate fewer preemptions and less CRPD than RM. In fact, to enforce the fixed priority order, the number of preemptions that typically occur under RM is higher than under EDF [6]. From this experiment, we see that CRPD depends on the chosen priority assignment or scheduler.

In addition, this experiment shows that scheduling analysis and CRPD analysis should be performed jointly. PA*, a priority assignment taking CRPD into account, has a significantly lower total CRPD. The decrease in total CRPD of PA* compared with RM and EDF is roughly 30 ms on a scheduling interval of 1000 ms. However, compared to RM and DM, feasibility constraints of tasks are not satisfied with PA*; only the total CRPD is reduced. In [25], we proposed a priority assignment heuristic that takes into account both feasibility constraints and CRPD.

5.2 CRPD with Memory Layout Optimization by Simulated Annealing

The objective of this experiment is to show that users can use CRPD optimization approaches with our tool. We apply memory layout optimization by simulated annealing (SA) based on the work of [ 17 ] with our generated task sets. In our experiment, the objective of SA is to lower the total CRPD after a scheduling simulation over a scheduling interval of 1000 ms.

For each iteration of SA, we perform a swap in the memory layout between two random tasks. Changes are made to the layout of tasks in memory, and then mapped to their cache layout for evaluation. The total CRPD is computed by scheduling simulation. The optimum layout is the layout which has the lowest total CRPD. The initial temperature of SA is 1.0; it is reduced by multiplying it by a cooling rate of 0.5 until it reaches the target temperature of 0.2. The number of iterations for each temperature is 10.
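The annealing loop described above can be sketched in Python as follows. Several elements are assumptions made for this illustration: the cost function `overlap_cost` is a simple stand-in (total pairwise cache overlap) for the total CRPD that the tool obtains by scheduling simulation, the Metropolis acceptance rule is the usual choice for simulated annealing but is not stated in the text, and the task sizes are hypothetical.

```python
import math
import random

CACHE_BLOCKS = 256


def cache_sets(order, sizes):
    """Lay tasks out consecutively in memory and map them to cache blocks."""
    mapping, address = {}, 0
    for task in order:
        mapping[task] = {(address + k) % CACHE_BLOCKS for k in range(sizes[task])}
        address += sizes[task]
    return mapping


def overlap_cost(order, sizes):
    """Stand-in for the simulator: total pairwise overlap of cache blocks."""
    sets = cache_sets(order, sizes)
    names = list(order)
    return sum(len(sets[a] & sets[b])
               for i, a in enumerate(names) for b in names[i + 1:])


def anneal(order, cost, t_init=1.0, t_target=0.2, cooling=0.5, iters=10, rng=random):
    """Swap two random tasks per iteration; keep the best layout found."""
    current, current_cost = list(order), cost(order)
    best, best_cost = list(current), current_cost
    temperature = t_init
    while temperature >= t_target:
        for _ in range(iters):
            candidate = list(current)
            i, j = rng.sample(range(len(candidate)), 2)
            candidate[i], candidate[j] = candidate[j], candidate[i]
            delta = cost(candidate) - current_cost
            # Metropolis rule (assumed): always accept improvements,
            # sometimes accept a worse layout.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, current_cost + delta
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        temperature *= cooling
    return best, best_cost


sizes = {"A": 100, "B": 120, "C": 60, "D": 90}   # hypothetical ECB counts
initial = ["A", "B", "C", "D"]
layout, value = anneal(initial, lambda o: overlap_cost(o, sizes), rng=random.Random(0))
print(layout, value)
```

With the parameters of the experiment, the temperature takes the values 1.0, 0.5 and 0.25 before dropping below the target, so each run evaluates 30 candidate layouts.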

The result of this experiment is shown in Fig. 3. From the graph, we can see the impact of memory layout optimization on total CRPD: we can reduce the total CRPD by roughly 20-30 ms. To sum up, this experiment shows that our tool allows users to perform a specific optimization of CRPD.

[Figure 3: total CRPD with memory layout optimization (RM-SA) as a function of Processor Utilization (%), from 50 to 95.]

[Figure 4: Average Computation Time and Max Computation Time of the scheduling simulation.]

The objective of the third experiment is to test the performance and the scalability of our tool when the scheduling simulation interval increases. In general, there are four factors affecting the performance of a scheduling simulator: (1) the number of tasks, (2) the scheduling simulation interval, (3) the cache size and (4) the number of events. The first three factors depend on the chosen RTES. The number of events depends on the characteristics of the RTES; for example, a higher processor utilization means a higher number of preemption events. In this experiment, we choose to test a system model of 10 tasks and 256 cache blocks. Processor utilization is set to 70%. The scheduling simulation interval ranges from 100,000 to 1,000,000 units of time, where 1 unit = 8 µs.

Fig. 4 displays results of our experiment on a PC with Intel Core i5-3360 CPU, running Ubuntu 12.04. For each simulation interval, 100 task sets are generated. We perform scheduling simulation and compute the max and average computation time.

As we can see, while the maximum computation time increases slightly when the simulation interval increases, the average computation time only fluctuates around 6 seconds. This shows that the tool is scalable when the simulation interval is high.

6. RELATED WORKS

In this section, we present several real-time scheduling analysis and WCET analysis tools.

MAST[ 12 ] is a modeling and analysis suite for real-time applications. The hardware component abstraction of MAST model is generic and it includes processing resources and shared resources. The shared resource component is not supposed to model a cache memory unit. However, MAST considers the overhead parameters of the components that may be used to model CRPD.

STORM [26] and YARTISS [7] are scheduling simulation tools mainly designed for evaluating and comparing real-time scheduling algorithms for multiprocessor architectures.

SymTA/S [13] and RealTime-at-Work¹ are model-based scheduling analysis tools targeting the automotive industry. The hardware components supported in those tools are specific to their domains (ECU, CAN and AFDX networks).

To the best of our knowledge, support for cache memory does not exist in the tools above.

SimSo [8] is a scheduling simulation tool. It supports cache sharing on multi-processor systems. It takes into account the impact of caches through statistical models, as well as direct overheads such as context switches and scheduling decisions. The memory behavior of a program is modeled with Stack Distance Profiles (SDP): the distribution of the stack distances for all the memory accesses of a task, where a stack distance is by definition the number of unique cache lines accessed between two consecutive accesses to a same line [18]. The difference between SDP and our model is that SDP is obtained by on-line monitoring with tools such as Valgrind [19], while UCBs and ECBs are obtained off-line by the WCET analysis tools presented below. At the moment, there is no available comparison between the two. UCB analysis with scheduling simulation could be more pessimistic but safer because it takes into account the program point with the largest number of UCBs.

Several WCET analysis tools allow designers to perform cache analysis. SymTA/P [23], HEPTANE [9], Chronos [16] and aiT² are examples of them. UCB computation of a program is supported by aiT. The analysis performed by those tools is based on program path analysis of the control flow graph of the program. It is compliant with the requirements of our proposed tool.

In conclusion, WCET tools focus on the evaluation of a program's control flow graph to compute the WCET, and a few tools can also compute the cache access profile. The analysis result could be used as an input for a scheduling analysis tool. In the domain of real-time scheduling analysis, the support for cache and CRPD when performing scheduling analysis is not very well specified. As far as we know, only SimSo clearly supports scheduling simulation with cache analysis based on SDP. Therefore, we propose a tool available to the community which can either compute the cache access profile of a task from its control flow graph or re-use information obtained from a WCET/cache analysis tool to perform scheduling analysis. Our model is compliant with existing work in [14], [5], [2] and [17]. In addition, because Cheddar provides a large set of scheduling analysis methods, we can fully investigate the dependency between CRPD and other scheduling parameters in order to either adjust or optimize a RTES design to meet its timing constraints.

7. CONCLUSIONS

In this article, we presented an approach to implement a cache-aware scheduling simulator. The work was carried out in the context of the Cheddar real-time scheduling analyzer, which is open-source and freely available to researchers and practitioners who want to investigate scheduling analysis of RTES with cache memory. Our solution consists of three parts: modeling cache memory and cache access profiles, implementing several cache analysis methods and performing scheduling simulation. We extended Cheddar to be able to deal with cache memory and illustrated the dependency between cache and other scheduling parameters.

There are open problems we are aiming to address in the future. The first one concerns the refinement of other CRPD analysis methods. Several improvements have been proposed in [2] to reduce the upper bound of the CRPD. In addition, we plan to compare our approach with the approach based on Stack Distance Profiles in [8]. Second, we are going to study the problem of the feasibility interval of RTES with cache memory.

¹ RealTime-at-Work, http://www.realtimeatwork.com/
² AbsInt Inc., http://www.absint.com/

[1] Altmeyer, R. I. Davis, and Maiza. Improved cache related pre-emption delay aware response time analysis for fixed priority pre-emptive systems. Real-Time Systems, 48(5):499-526, 2012.

[2] Altmeyer and C. Maiza Burguiere. Cache-related preemption delay via useful cache blocks: Survey and redefinition. Journal of Systems Architecture, 57(7):707-719, 2011.

[3] Bertogna, Xhani, Marinoni, Esposito, and Buttazzo. Optimal selection of preemption points to minimize preemption overhead. In Real-Time Systems (ECRTS), 2011 23rd Euromicro Conference on, pages 217-227. IEEE, 2011.

[4] Bini and G. C. Buttazzo. Measuring the performance of schedulability tests. Real-Time Systems, 30(1-2):129-154, 2005.

[5] J. V. Busquets-Mataix, J. J. Serrano, Ors, Gil, and Wellings. Adding instruction cache effect to schedulability analysis of preemptive real-time systems. In Proceedings of the 2nd IEEE Real-Time Technology and Applications Symposium (RTAS), pages 204-212, 1996.

[6] G. C. Buttazzo. Rate monotonic vs. EDF: judgment day. Real-Time Systems, 29(1):5-26, 2005.

[7] Chandarli, Fauberteau, D. Masson, S. Midonnet, Qamhieh, et al. Yartiss: A tool to visualize, test, compare and evaluate real-time scheduling algorithms. In Proceedings of the 3rd International Workshop on Analysis Tools and Methodologies for Embedded and Real-time Systems, pages 21-26, 2012.

[8] Cheramy, A.-M. Deplanche, P.-E. Hladik, et al. Simulation of real-time multiprocessor scheduling with overheads. In International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SIMULTECH), 2013.

[9] Colin and Puaut. Worst-case timing analysis of the RTEMS real-time operating system. Rapport N° PI1277, IRISA, France, 1999.

[10] Cucu and Goossens. Feasibility intervals for fixed-priority real-time scheduling on uniform multiprocessors. In IEEE Conference on Emerging Technologies and Factory Automation (ETFA), pages 397-404, 2006.

[11] Fotsing, Singhoff, Plantec, Gaudel, Rubini, Li, H. N. Tran, Lemarchand, Dissaux, and Legrand. Cheddar architecture description language. 2014.

[12] Gonzalez Harbour, J. Gutierrez García, J. Palencia Gutierrez, and J. Drake Moyano. Mast: Modeling and analysis suite for real time applications. In Real-Time Systems, 13th Euromicro Conference on, 2001, pages 125-134. IEEE, 2001.

[13] Henia, A. Hamann, M. Jersak, Racu, Richter, and Ernst. System level performance analysis - the SymTA/S approach. IEE Proceedings-Computers and Digital Techniques, 152(2):148-166, 2005.

[14] C.-G. Lee, H. Hahn, Y.-M. Seo, S. L. Min, R. Ha, S. Hong, C. Y. Park, M. Lee, and C. S. Kim. Analysis of cache-related preemption delay in fixed-priority preemptive scheduling. IEEE Transactions on Computers, 47(6):700-713, 1998.

[15] Li, Ding, and Shen. Quantifying the cost of context switch. In Proceedings of the 2007 workshop on Experimental computer science. ACM, 2007.

[16] Li, Liang, Mitra, and Roychoudhury. Chronos: A timing analyzer for embedded software. Science of Computer Programming, 69(1):56-67, 2007.

[17] Lunniss, Altmeyer, and R. I. Davis. Optimising task layout to increase schedulability via reduced cache related pre-emption delays. In Proceedings of the 20th International Conference on Real-Time and Network Systems, pages 161-170. ACM, 2012.

[18] R. L. Mattson, Gecsei, D. R. Slutz, and I. L. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal, 9(2):78-117, 1970.

[19] Nethercote and Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation. In ACM Sigplan notices, volume 42, pages 89-100, 2007.

[20] Plantec and Singhoff. Refactoring of an Ada 95 library with a meta case tool. In ACM SIGAda Ada Letters, volume 26, pages 61-70. ACM, 2006.

[21] Singhoff, Legrand, Nana, and Marce. Cheddar: a flexible real time scheduling framework. ACM SIGAda Ada Letters, 24(4):1-8, 2004.

[22] Staschulat, Schliecker, and Ernst. Scheduling analysis of real-time systems with precise modeling of cache related preemption delay. In Proceedings of the 17th Euromicro Conference on Real-Time Systems (ECRTS), pages 41-48, 2005.

[23] Staschulat, Schliecker, and Ernst. Scheduling analysis of real-time systems with precise modeling of cache related preemption delay. In Euromicro Conference on Real-Time Systems (ECRTS), Palma de Mallorca, Spain, 2005.

[24] H. N. Tran, Singhoff, Rubini, and Boukhobza. Instruction cache in hard real-time systems: modeling and integration in scheduling analysis tools with AADL. In Proceedings of the 12th IEEE International Conference on Embedded and Ubiquitous Computing, Milan, Italy, 2014.

[25] H. N. Tran, Singhoff, Rubini, and Boukhobza. Addressing cache related preemption delay in fixed priority assignment. In Proceedings of the 20th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Luxembourg, 2015.

[26] Urunuela, Deplanche, and Trinquet. Storm, a simulation tool for real-time multiprocessor scheduling evaluation. In IEEE Conference on Emerging Technologies and Factory Automation (ETFA), pages 1-8, 2010.