=Paper= {{Paper |id=Vol-1464/ewili15_16 |storemode=property |title=Abstract Timers and their Implementation onto the ARM Cortex-M family of MCUs |pdfUrl=https://ceur-ws.org/Vol-1464/ewili15_16.pdf |volume=Vol-1464 |dblpUrl=https://dblp.org/rec/conf/ewili/LindgrenLLFPP15 }} ==Abstract Timers and their Implementation onto the ARM Cortex-M family of MCUs== https://ceur-ws.org/Vol-1464/ewili15_16.pdf
                     Abstract Timers and their Implementation
                      onto the ARM Cortex-M family of MCUs
 Per Lindgren, Emil Fresk, Marcus Lindner and Andreas Lindner Luleå University of Technology
            Email:{per.lindgren, emil.fresk, marcus.lindner, andreas.lindner}@ltu.se
                             David Pereira and Luís Miguel Pinho
                CISTER / INESC TEC, ISEP Email: {dmrpe,lmp}@isep.ipp.pt
ABSTRACT                                                                 amendable to both static and run-time verification. The
Real-Time For the Masses (RTFM) is a set of languages and                supporting rtfm-core compiler produces C code that com-
tools being developed to facilitate embedded software devel-             piled together with a RTFM run-time system renders an exe-
opment and provide highly efficient implementations geared               cutable. The RTFM-kernel is an architecture targeting bare
to static verification. The RTFM-kernel is an architecture               metal (single-core) platforms designed to provide highly ef-
designed to provide highly efficient and predicable Stack Re-            ficient and predicable Stack Resource Policy (SRP) based
source Policy based scheduling, targeting bare metal (single-            scheduling by exploiting the underlying interrupt hardware.
core) platforms.                                                            However, in prior work no kernel support was given for
   We contribute beyond prior work by introducing a plat-                asynchronous tasks with timing offsets. In this paper we
form independent timer abstraction that relies on existing               address this problem with the goal to provide a transparent,
RTFM-kernel primitives. We develop two alternative im-                   abstract, and generic way of managing timer queue(s) and
plementations for the ARM Cortex-M family of MCUs: a                     underlying hardware timer(s). Transparent w.r.t its use, i.e.
generic implementation, using the ARM defined SysTick-                   the programmer should not need to think in terms of hard-
/DWT hardware; and a target specific implementation, us-                 ware timers when specifying the application at hand. Ab-
ing the match compare/free running timers. While sacri-                  stract in terms of the RTFM-kernel, (the obligation of the
ficing generality, the latter is more flexible and may reduce            kernel is merely to manage scheduling) thus we seek a solu-
overall overhead. Invariants for correctness are presented,              tion where the kernel itself is free of dependencies both to
and methods to static and run-time verification are dis-                 timer queue implementations and timer hardware specifics.
cussed. Overhead is bound and characterized. In both cases               Furthermore, the solution should be generic enough to cover
the critical section from release time to dispatch is less than          a broad range of embedded platforms with little or no effort
2us on a 100MHz MCU. Queue and timer mechanisms are                      of porting. Additional requirements for robustness, perfor-
directly implemented in the RTFM-core language and can                   mance and predictability are efficient, bound time imple-
be included in system-wide scheduling analysis.                          mentations, complying with the task and resource model of
                                                                         SRP, along with invariants for correctness.
                                                                            In this paper we contribute beyond prior work by intro-
1.    INTRODUCTION                                                       ducing a platform independent timer abstraction that relies
   In the mainstream of embedded programming C-code still                on the existing kernel primitives. The proposed abstrac-
remains the predominant means for software development.                  tion allows application and target specific implementations
To facilitate the development a vast number of light-weight              of timer queues and timer handlers. The timer handlers are
operating systems are available, e.g., FreeRTOS [1], ChibiOS             treated as ordinary tasks in the system, while each queue is
[2], and RIOT [3] and for larger platforms, Linux/POSIX                  managed under protection of a critical section (resource) in
based and Win32 derivates. In common, they provide a                     the system.
thread based concurrency model, where the programmer has                    Requirements to support abstract timers with respect to
to take the full responsibility of coordinating scheduling and           analysis and code generation in the rtfm-core compiler
resource management as very little support is given by the               are discussed along with their performance implications. As
programming models and supporting tools [4].                             a proof of concept, we develop and characterize two alter-
   In this paper, we explore a language based approach. The              native timer implementations for the ARM Cortex-M fam-
reactive programming model of RTFM-core (-core in the fol-               ily of MCUs: a generic (single queue/handler) implementa-
lowing) provides tasks with timing constraints and critical              tion using the ARM defined SysTick/DWT hardware), and a
sections (treated as single-unit resources). As such -core               multi-queue/handler implementation exploiting vendor spe-
provides a model suitable to specify the timely behavior of              cific match-compare/free running timer hardware.
the embedded software, as well as a formal underpinning                     Our experimental results indicate that for both generic
                                                                         and vendor specific timers the critical section from task re-
This work was partially supported by Portuguese National Funds through   lease time to dispatch is less than 2us on a 100MHz MCU.
FCT (Portuguese Foundation for Science and Technology) and by ERDF       We show that the vendor specific timers can be exploited to
(European Regional Development Fund) through COMPETE (Opera-             reduce latency, total overhead and priority inversion in the
tional Programme âĂŸThematic Factors of CompetitivenessâĂŹ), within   system. Furthermore, we discuss the outsets for SRP based
project FCOMP-01-0124-FEDER-037281 (CISTER); and by FCT and EU
                                                                         analysis of programs scheduled by virtual timers under the
ARTEMIS JU, within project ARTEMIS/0001/2013, JU grant nr. 621429
(EMC2)                                                                   RTFM-kernel.
                                                                            Finally we present ongoing and future undertakings and
EWiLi’15, October 8th, 2015, Amsterdam, The Netherlands.                 sum up the presented contributions to conclude the work.
Copyright retained by the authors.
2.     THE RTFM-CORE LANGUAGE                                       Definition We introduce a mapping M from virtual timers
  The RTFM-core language is based on the notions of tasks           V T ’s to physical timers P T ’s, allocated on the target hard-
and resources in correspondence to the Stack Resource Pol-          ware.
icy (SRP) model defined in [5]. For a detailed description
                                                                    A physical timer is shared if M (V Ti ) = M (V Tj ), i 6= j.
on the original work on -core we refer the reader to [6]. Here
                                                                    We have the two edge cases, when M is a 1-1 (complete)
we give a brief overview.
                                                                    mapping between virtual and physical timers, and the case
2.1      RTFM-core programming model                                when we have a single (shared) physical timer.
   In -core tasks execute concurrently and run-to-completion.       Definition For a physical timer P Ti , we denote bw(P Ti ) as
A task may request asynchronous execution of other tasks            the bit-width and f (P Ti ) as the frequency of operation (in
and claim (named) single-unit resource(s) for the duration          Hz), ra(P Ti ) as the range of the timer (in seconds), derived
of critical section(s) in a nested manner. Functionality is         from 2bw /f , and pr(P Ti ) as the precision of the timer (in
expressed using ordinary C-code. In recent work [7] the             seconds), given as pr = 1/f .
-core language has been extended to provide messages (task
execution requests with timing offsets):                            E.g., the range is given by 2bw(P Ti ) /f , e.g. a 32-bit timer
   async after X before Y t(...), where X, defines the              operating a 1MHz gives a range of 232 /1 ∗ 106 Hz = 4295s,
(baseline) offset from the release time of the sender (base-        with a precision of 1 ∗ 10−6 s = 1us.
line); Y , gives the relative deadline and t is the identifier of
the task to execute.                                                3.2     ARM Cortex-M defined timers
                                                                      The Cortex-M range of MCUs share the ARM defined core
2.2      RTFM-kernel design                                         providing a 24-bit SysTick timer and a 32-bit debug timer
   In short each task is implemented directly as an inter-          (defined in the DWT unit).
rupt handler bound to the interrupt vector. Requesting
a task for execution amounts to pending the correspond-             3.2.1    SysTick timer
ing interrupt, while claiming a resource for a critical sec-           The SysTick timer is provided in order to generate peri-
tion amounts to manipulating the interrupt hardware such            odic interrupts. When enabled, it counts downwards, and
to reflect the semantics of the system ceiling under SRP.           when transitioning from 1 to 0 it sets a flag and (option-
The RTFM-kernel encapsulate the operations required for             ally) generates a SysTick interrupt. On zero, it assumes the
SRP based scheduling in a minimalistic API implemented              value of the RELOAD register, hence a periodic behavior can
as C-code macros. Those of interest for the presentation            be achieved with a minimal of programming effort. The cur-
are: RTFM_pend(i), which requests execution of the cor-             rent counter value (CURRENT) can be read, while a write to
responding task i; RTFM_lock(c), which reads and stores             CURRENT, forces CURRENT = RELOAD. The frequency of op-
the old ceiling value on the stack and sets the new ceiling;        eration, is determined by setting the clock source (core/ex-
and finally, RTFM_unlock(c), which restores the old ceil-           ternal). (Some implementations provide the option to pre-
ing value from the stack.                                           scale the core clock, e.g., /8.) The priority of the SysTick
   Currently the scheduling primitives have been implemented        interrupt is programmable, and an interrupt can be pended
for the ARM Cortex-M range of MCUs [8]. The system ceil-            by setting the PENDSTSET bit in the ICSR (Interrupt Con-
ing is enforced either through interrupt masking (M0/M0+),          trol and State Register). The SysTick timer is stopped when
or through (atomic) accesses to the NVIC BASEPRI register           the processor is halted during debug.
(M3 and above).
                                                                    3.2.2    Debug timer
2.3      RTFM-core compiler                                            The debug unit (DWT) provides a 32-bit, free running
  The rtfm-core compiler analyses the declarative (static)          cycle count register (DWT_CYCCNT). However, the DWT is
task, resource and communication structure, and generates           instrumental for providing debugging support, and hence
a C-code output referring to the RTFM-kernel primitives.            not free to arbitrary use. However we can safely enable and
Code generation and kernel primitives can be tailored to C-         read the current DWT_CYCCNT value and use it as a 32-bit
compiler specifics (currently supporting gcc and compcert).         glitch-free time base. When the CPU is halted (e.g., during
                                                                    debugging) the counter is stopped.
3.     TIMER ABSTRACTION
                                                                    3.3     Generic timer implementation
3.1      Definitions                                                   A flow chart is given in Figure 1. Whenever a new message
     We introduce the following definitions:                        enters first in the queue (Fyes ) the timer handler (task) is
                                                                    invoked. In the timer handler, if the release time has already
Definition We denote a task to be postponed if stemming             expired (Eyes ), the queued task is pended for execution, else
from an asynchronous message:                                       (Eno ) the timer is programmed for releasing the the task at
   async after X before Y                                           its time for expire. In case a task is pended, the timer is
   with a defined baseline offset (X > 0). We denote the set        iteratively dequeued until either the queue is empty (Qno ),
of postponed tasks as OT .                                          or the release time not expired (Eno ). In the latter case, the
                                                                    timer is setup to generate an interrupt for next task to be
Definition We have a set of virtual timers {V T1 . . . V Tn }.      released.
Each virtual timer i is associated with a set of postponed             The timer handler is sketched in Listing 1, while the Sy-
tasks ot(V Ti ) ∈ OT , and a timer queue tq(V Ti ) (sorted by       sTick (set timer specific) implementation is outlined in List-
release time).                                                      ing 2, along with a flow chart for its operation Figure 2.
     async                        Idle : T 1    timer                                                                      M yes
                                                                                              set systick      D > max?             set max   exit
                                  Wait : T 2
                          LOCKED(R(tq))                            LOCKED(R(tq))                                    M no
                  F yes                                    E yes
      first?                 enqueue           expired?              pend task                                             E yes
                                                                                                                D ≤ 0?              pend st   exit
           F no                                     E no
                                                                                 Q yes                              E no
                                                                     dequeue?
     enqueue                 enable
                                                                                                                 set D                        exit
                                                                          Q no
                            pend timer         set timer              disable                           Figure 2: SysTick implementation.
                                                    T2                     T3
                                                                                          1     typedef struct TQ {
      exit                     exit              exit                  exit               2       RT_Time bl;
                                                                                          3       RT_Tid id;
                                                                                          4       struct TQ* next;
Figure 1: Timer queue (left) and timer handler                                            5     } TQ;
(right) flow charts.                                                                      6
                                                                                          7     TQ tq[TQ_LEN];           // queue
                                                                                          8     volatile
 1     ISR SysTick_Handler {                                                              9     TQ* tq_h = NULL;         // head pointer
 2       lq = RTFM_LOCK(Q);                                                              10     TQ* tq_f = tq;           // free pointer
 3       while ((T_CURR() - tq_h->bl) >= 0) {                                            11     TQ* tq_n;                // new
 4         // E_yes                                                                      12     TQ* tq_c;                // current
 5         RTFM_pend(tq_h->id);
 6         if (tq_deq() == NULL) {                                                                            Listing 3: tq.h header.
 7           // Q_no
 8           T_DISABLE();
 9           RTFM_UNLOCK(lq);
10           return;
11         }                                                                                  • Idle state:
12         RTFM_UNLOCK(lq);
13         lq = RTFM_LOCK(Q);                                                                         – the timer queue is empty, and
14       }
15       T_SET(tq_h->bl);                                                                             – the timer interrupt is disabled.
16       RTFM_UNLOCK(lq);
17     }
                                                                                              • Wait state:
                  Listing 1: SysTickHandler.core.
                                                                                                      – the timer queue is non-empty, and
                                                                                                      – the timer interrupt is enabled.
  T_CURR is a macro to read the DWT_CYCCNT (debug cycle
counter) while T_ENABLE()/T_DISABLE() are macros to                                        The invariants are upheld by the implementation following
enable/disable the SysTick interrupt.                                                    the (informal) reasoning.
  The SYSTICK_MASK is defined to the max reload value for
the 24-bit counter. For brevity, initialization code is omit-                            Queue correctness.
ted. However, worth to mention is that we read DWT_CYCCNT
                                                                                         Idle Assume the timer is in Idle state (the queue is empty),
to obtain a defined point in time (baseline) for the birth of
                                                                                              and the application emits an
the system. As a proof of concept we have implemented a
simple insertion sort queue (Listings 3 and 4).                                                  async after X ... t (...).
                                                                                                 Since the timer queue is empty we follow the right
3.3.1          Invariants for correctness                                                        branch (Fyes ), i.e., we enqueue (X, t), and enable the
   The invariants concern the logic of the interaction between                                   timer interrupt T_ENABLE() and pend the timer in-
the queue and the timer handler. Figure 3, depicts the over-                                     terrupt T_PEND, which causes the transition T1 to be
all timer operation. The following invariants should hold:                                       taken. At this point, the queue is non-empty, and the
                                                                                                 timer is enabled.
                                                                                         Wait Assume the timer is in Wait state (the queue is non-
 1     #define SYSTIC_MASK       ((1<<24)-1)
 2     void T_SET(RT_Time t) {                                                                empty), and the application emits an
 3       RT_time diff = t - T_CURR();
 4       if (diff > SYSTIC_MASK) {
                                                                                                 async after X ... t (...).
 5         SYSTICK_RELOAD = SYSTIC_MASK;                                                         In this case we take an (implicit) T2 transition and
 6       } else {
 7         if (diff <= 0) {
                                                                                                 remain in Wait state.
 8           PEND_SYSTIC()
 9         }
10         SYSTICK_RELOAD = (diff & SYSTIC_MASK)-1;                                      Timer handler correctness.
11       }
12       SYSTICK_CURRENT = 0; // write to force reload
13     }                                                                                 Idle Assuming the Idle invariant, the timer interrupt is dis-
                                                                                              abled. (Thus, the time-interrupt handler is not invoked
                     Listing 2: SysTickSet.core.                                              even in case a compare match occurs and the interrupt
                                                                                              is raised.)
                                                                                           T1                 T2
 1   void tq_enq(RT_Time t, RT_Tid id) {
 2     RT_lock_t lq;
 3     RT_LOCK(lq,R(tq));                                                        Idle               Wait
 4     if (tq_f == NULL) TQ_panic();                                                       T3
 5     // allocate and fill new entry
 6     tq_n = tq_f;
 7     tq_n->bl = t; tq_n->id = id;
 8     tq_f = tq_f->next;
                                                                   Figure 3: Timer states, Idle is the initial state.
 9     if (tq_h == NULL || tq_h->bl - t > 0) {
10       // F_yes, put first in list
11       tq_n->next = tq_h; tq_h = tq_n;                        concurrently, and potentially preemptively, to other tasks.
12       RT_UNLOCK(lq);
13       T_ENABLE(); T_PEND(); return;                          Hence, we may be exposed to race conditions. To this end
14     }                                                        we may either turn to re-entrant (lock-free) queue imple-
15     // F_no, put in middle or last                           mentations [9] or protect the queue as a resource in the sys-
16     tq_c = tq_h;
17     while (tq_c->next != NULL && tq_c->next->bl < t) {       tem. For this presentation, we turn to the locking mech-
18       tq_c = tq_c->next;                                     anisms provided by the RTFM-kernel. In Figure 1, the
19     }                                                        LOCKED(R(tq)) areas (marked yellow/boxed), indicates the
20     tq_n->next = tq_c->next; tq_c->next = tq_n;
21     RT_UNLOCK(lq);
                                                                critical sections on the resource R(tq). For the implemen-
22   }                                                          tation this amounts to RT_LOCK(lq, R(tq)) on entering
23                                                              and RT_UNLOCK(lq) on exiting. Since the queue opera-
24   TQ* tq_deq() {
25     tq_c = tq_h;
                                                                tions are protected by the resource R(tq), they are from the
26     if (tq_h != NULL) {                                      outset of concurrency safe. While the release of an expired
27       tq_h = tq_h->next;                                     task t [pend task] is executed while holding the resource
28       tq_c->next = tq_f;                                     R(tq), the SRP protocol ensures that t is only dispatched
29       tq_f = tq_c;
30     }                                                        if it has a priority higher than the current ceiling. If t ac-
31     return tq_c;                                             cesses the queue (through an async after X ...), then
32   }                                                          dR(tq)e ≥ p(t) which prevents dispatching t until R(tq) is
                                                                unlocked. (Moreover, under the assumption that the timer
           Listing 4: tq.c implementation.                      handler task is given a priority equal or greater than t, t will
                                                                not be dispatched until the timer handler task finish.)

Wait The time-interrupt handler is invoked when an inter-       3.3.3    Characterization
     rupt occurs and the interrupt is enabled. The interrupt      The presented timer abstraction and its implementation
     has been raised either due to a T1 transition or due       gives the following key characteristics:
     to the timer hardware on a compare match. Assum-
     ing the Wait invariants there is (at least) one element       • Given a bound size queue, tq_enq is a bound time
     in the queue, thus we can safely access tq_h for the            operation,
      check. From there the following cases ap-
     ply:                                                          • tq_deq is a constant time operation (accessing and
                                                                     advancing only the head of the list),
Eno On Eno we program the SysTick timer [set timer],
    and return from the time-interrupt handler [exit].             • timer handling is safe w.r.t. invariants, and it
    This corresponds to a transition T2 where we remain in
    Wait state (waiting for a compare match). This occurs          • allows implementation (and analysis) as part of the
    in the case the timer is programmed first time or on             -core application1 .
    an overflow (when range of the timer is insufficient to
    reach the release time of the queued task). Notice,            Timing characteristics have been determined by measur-
    the latter may occur repeatedly until the          ing the clock cycle count (DWT_CYCCNT) on the current im-
    condition is met.                                           plementations (as presented in the paper). The experiments
                                                                have been conducted on a STM32 F4, running at full speed
Eyes We release the expired task [pend task] and check          (168MHz). The measurements have been repeated and con-
     if more messages are queued . From the           sistent cycle counts have been observed. For the experiments
     following cases apply:                                     we have used gcc v4.8.3 (OL gives the optimization level),
                                                                with the default settings for the target architecture. All
Qno No further messages are queued and we disable the
                                                                measurements include the overhead of the instrumentation
    interrupt [disable]. This corresponds to the tran-
                                                                code, hence safe and pessimistic w.r.t. actual performance.
    sition T3 back to Idle state. At this point, the queue
                                                                   The queue implementation has been characterized, Table
    is empty and the interrupt is disabled.
                                                                1. The Baseline gives the cycle count (including the cal-
Qyes There is still at least one message in the queue, and we   l/return) overhead for inserting last in a queue holding 1
     check the  condition for the next queued         element. The LC gives the Linear coefficient (added cost
     task (a T2 transition).                                    1
                                                                  In particular, the tq_enq is part of the execution time for
                                                                the sending task, and the critical section (holding R(tq)) of
3.3.2    Correctness under Concurrency                          the timer handler is constant time (although the execution
                                                                of the timer handler may involve iterations). Notice here
   The sending task (accessing the timer queue through emit-    the “slight escape” from the critical section when releasing
ting an async after X ...) and the timer handler runs           multiple tasks.
for each element in worst case). As expected for insertion      equivalent and fully programmable 32-bit timers (the latter
sort, we found the coefficient indeed to be linear.             meeting all our requirements suitability criteria). In the case
   Table 2, shows the latency from set release-time to dis-     of the STM32 F407VET (and similar), we find a set of 12
patch in clock cycles. This gives an upper bound to the dis-    16-bit timers and 2 32-bit timers, meeting the requirements
patch overhead, (dispatching multiple queued tasks without      and suitability criteria.
leaving the handler always infer lower latency). The block-       For the implementation, the specialization to a vendor
ing (related to tq) inferred by the timer handler is brought    specific timer is isolated to the [set timer]. The writing
to a constant by escaping the critical section for each iter-   the match compare is always 32-bit under the ARM memory
ation. The Best/Nominal values, give the execution path         model, (the underlying timer hardware merely discards the
when the queued task is not at the end of the queue, while      16 MSBs), hence the characterization applies in all cases. In
the Worst case includes disabling the timer interrupt.          Table 3 gives the overhead for the isolated SetSysTick, while
                                                                Table 4 depicts the overhead of setting a Vendor Specific
  Table 1: Complexity of the queueing algorithm.                (STM32 F407VET 32-bit) timer.
 OL                       Baseline               LC
                                                                Table 3: Characterization of the Timer Set function
 O0                         178                  26.5           for the SysTick Timer.
 O1                          95                   10
 O2                          78                   9              OL                      Best Case            Worst Case
 O3                          78                    9             O0                         54                    67
 Og                          96                  9.5             O1                         35                    49
                                                                 O2                         33                    41
                                                                 O3                         33                    41
Table 2: Latency from set release time to dispatch.              Og                         33                    41
 OL                    Best        Nominal         Worst
                       Case         Case           Case
 O0                     229          298            338         Table 4: Characterization of the Timer Set function
 O1                     122          153            188         for a Vendor Specific timer.
 O2                     123          153            217
 O3                     123          153            217          OL                      Best Case            Worst Case
 Og                     124          155            195          O0                         37                    45
                                                                 O1                         21                    28
   From this we can conclude that the generic implementa-        O2                         21                    22
tion is capable of a low latency dispatch (< 2us, scaled to a    O3                         21                    22
100MHz MCU). We have given the necessary WCET charac-            Og                         21                    22
terization for blocking, useful to SRP based timing analysis
(e.g., response time and overall schedulability).

3.4   Vendor specific timers                                    3.5    Compiler support
   An ARM Cortex based MCU typically comprise an ARM               In order to automatically generate code for the proposed
defined core and a set of vendor specific peripherals (typ-     virtual timers, the -core to C compiler is required to under-
ically including a set of timers/counters). Each counter/-      take the following (additional) steps in the analysis:1) derive
timer has a defined set of features (supporting the intended    the set of postponed tasks OT , 2) associate each postponed
use). The requirements for implementing the abstract timer      task ti ∈ OT to a V Tj , such that p(V Tj ) = p(ti ) (i.e., as-
architecture boils down to the following:                       sign a virtual timer to each priority level in the tasks set
                                                                OT ), 3) derive a mapping M from V T to P T . 4) derive for
   • n-bit width counter (+ for larger n) with                  each tqi , whereP Ti ∈ P T the static queue length (tqi being
                                                                a potentially shared queue for P Ti , M (V Ti ) = P Ti ). 5)
   • interrupt capability (r), programmable priority (+)        associate each tqi to a resource R(tqi ), with a ceiling value
                                                                assigned under SRP (derived from the priorities of the tasks
   • frequency (rate) relation to core-clock defined (r) or
                                                                accessing the queue, and 6) derive a time base tb(P Ti ) for
     programmable (+), and
                                                                each P Ti . 7) Generate C code definitions accordingly.
   • programmable reload (r), match compare register (+).          In the generated C-code, each task has a defined baseline
                                                                set by reading the hardware timer (T_CURR()) for exter-
While (r) this is required/sufficient, the suitability is im-   nally triggered task or given by the sending task). To the
proved (+) by a larger bit width, programmable priority,        kernel we introduce a (queue and timer implementation in-
programmable frequency and match compare functionality.         dependent) macro RTFM_async_i(...) scaling the virtual
  As representative uses cases we have studied two popular      time based (in us) to that of the target P Ii . Our prototype
ARM Cortex MCUs, namely the NXP LPC1769 and the                 -core compiler implementation, assumes the case of a single
STM32 F4. In the case of the NXP LPC1769 (and similar) a        (shared) physical timer. (The evaluation of multiple timers
Repetitive Interrupt Timer (RIT) is provided, and a set of 4    has been conducted by manually.)
3.6   Timing performance                                         for multiple alternative queue implementations. By analyz-
  For scheduling analysis the timer handlers can be seen as      ing the task set, the compiler could chose the best fit (lin-
ordinary tasks, invoked once for the release of each message     ear, heap, etc.) for each queue according to its characteris-
(plus the number of the range overflows present, e.g. in         tics (Section 3.3.3) and overall requirements (w.r.t. timing,
case of SysTick based solutions). With the outset that the       memory, etc.).
mapping M is complete there will be no priority inversion
introduced by the timer handlers (as they operate operate at     5.   CONCLUSIONS
the same priority as the tasks they release). A timer handler       In this paper we have introduced abstract timers to the
th for shared timer, may preempt a task tj (p(th ) > p(tj )),    purpose of platform independent support for postponed tasks.
while p(tr ) ≤ p(tj ), tr being the released task.               The abstraction allows timer tasks (handlers) and queues to
  For vendor specific timers we typically have the option        be statically allocated and included in system wide compile-
to set the frequency f (P Tn ) (increased frequency gives a      time analysis under the task and resource model of RTFM.
improved precision, while at the same time may increase          We have proposed a generic timer implementation that re-
the background load for processing timer overruns). The          lies solely on the ARM defined Cortex-M core and existing
precision occurs as a jitter parameter to the scheduling. (In    RTFM-kernel primitives, and is thus directly applicable to
case the timer operates at the core clock frequency of the       a wide range of commercially available MCUs. Correctness
MCU (e.g., for our SysTick implementation), jitter is 0.)        has been argued from invariants for queue and timer task
                                                                 interactions and queue consistency in a concurrent setting.
3.7   Run-time verification                                      Our experiments validate the feasibility of the abstract timer
   The proof of correctness for the implementation is infor-     architecture and the presented characterizations of queuing
mal. To this end, the T_ENABLE()/T_DISABLE() macros              overhead and generic/vendor specific timer implementations
and tq_eng/tq_deq implementations have been extended             gives concrete bounds, useful as input to further response
to check the invariants. For run-time verification of tim-       time and schedulability analysis.
ing constraints, the code generation for tasks is extended to
check on return of each task ti the condition:                   6.   REFERENCES
   bl_t_i + dl_t_i > T_CURR(), where bl_t_i is the
(dynamic) task release time (baseline) and dl_t_i the spec-       [1] FreeRTOS. (webpage) Last accessed 2015-09-18.
ified (relative) deadline.                                            [Online]. Available: http://www.freertos.org
                                                                  [2] ChibiOS/RT. (webpage) Last accessed 2015-09-18.
3.8   Assumptions                                                     [Online]. Available: http://www.chibios.org
                                                                  [3] RIOT. (webpage) Last accessed 2015-09-18. [Online].
  Following the general -core assumption on schedulability,
                                                                      Available: http://riot-os.org
any message can have at most one outstanding instance.
This allows the required (safe) queue length to be derived        [4] E. A. Lee, “The problem with threads,” Computer,
directly as the sum of tasks associated to the queue In con-          vol. 39, no. 5, pp. 33–42, May 2006.
sequence baseline offsets (after X ...) must be less or           [5] T. Baker, “A stack-based resource allocation policy for
equal to the sender’s inter-arrival time.                             realtime processes,” in Real-Time Systems Symposium,
                                                                      1990. Proceedings., 11th, Dec. 1990, pp. 191 –200.
                                                                  [6] P. Lindgren, M. Lindner, and et.al, “RTFM-core:
4.    RELATED AND FUTURE WORK                                         Language and Implementation,” in
   In the context of light-weight operating systems, neither          ESWEEK/CPSArch 2014, 2014.
ChibiOS[2], RIOT[3] nor FreeRTOS[1] provide official char-        [7] P. Lindgren, M. Lindner, A. Lindner, V. Vyatkin,
acterized queue/timer implementations. TinyOS [10] (TEP               D. Pereira, and L. M. Pinho, “A real-time semantics
102/108) suggest an HAL virtualization layer. However,                for the IEC 61499 standard,” in ETFA 2015,
timers in TinyOS are outside their model of computation               September 8-11, 2015, Luxembourg, 2015.
and treated as any other (arbitrary) event source. Con-           [8] J. Eriksson, F. Häggstrom, S. Aittamaa, A. Kruglyak,
tiki [11] provides the Rtimer library for scheduling real-time        and P. Lindgren, “Real-time for the masses, step 1:
task, however unlike our approach their timer tasks are un-           Programming API and static priority SRP kernel
safe. Hence, our work presented can be considered as a                primitives.” in SIES. IEEE, 2013, pp. 110–113.
baseline for future benchmarking.                                 [9] A. Kogan and E. Petrank, “A methodology for
   Future work includes supporting baseline offsets larger            creating fast wait-free data structures,” ser. PPoPP
than inter-arrival time for the sender. As mentioned in Sec-          ’12. New York, NY, USA: ACM, 2012, pp. 141–150.
tion 3.5, the support for abstract timers is currently limited
                                                                 [10] P. Levis, S. Madden, and et. al., “TinyOS: An
to a single queue/timer handler. The time-base T_CURR is
                                                                      operating system for sensor networks,” in in Ambient
32-bit, defined by the DWT. This limits the absolute time
                                                                      Intelligence. Springer Verlag, 2004.
offsets. Longer offsets can be obtained at application level
                                                                 [11] A. Dunkels, B. Grönvall, and et. al., “Contiki - a
(manually keeping track of number of activations until de-
                                                                      lightweight and flexible operating system for tiny
sired time has expired). Automatic allocation and assign-
                                                                      networked sensors,” in Emnets-I, Nov. 2004.
ment of (potentially multiple) timer handlers according the
requirements of the application can support arbitrary off-
sets, as well as reducing priority inversion and overall over-
head. Besides temporal properties, issues of energy con-
sumption may be considered for multi-domain optimization.
Moreover, the presented abstract timer architecture allows