=Paper=
{{Paper
|id=Vol-3008/paper3
|storemode=property
|title=Towards a Quantum Algorithm for Evaluating WCETs
|pdfUrl=https://ceur-ws.org/Vol-3008/paper3.pdf
|volume=Vol-3008
|authors=Gabriella Bettonte,Stéphane Louise,Renaud Sirdey
|dblpUrl=https://dblp.org/rec/conf/qce/BettonteLS21
}}
==Towards a Quantum Algorithm for Evaluating WCETs==
Towards a quantum algorithm for evaluating WCETs Gabriella Bettontea , Stéphane Louisea and Renaud Sirdeya a Université Paris-Saclay, CEA List Abstract In this paper we propose a quantum-based solution to the problem of counting the cache hits, an impor- tant issue when analyzing real-time embedded applications. This field has already seen the development of “quantum-inspired” classical algorithms which are competitive with the state of the art. We designed a polynomial-time dynamic programming algorithm for computing the lowest number of cache hits realized by a deterministic sequence of memory accesses, in the presence of preemptions. Our contri- bution consists in porting that algorithm to the quantum framework, improving the complexity of the algorithm from 𝑂(𝑁 3 ) to 𝑂(𝑁 2 + 𝑁 ). Keywords WCETs, preemption, quantum computing, dynamic programming. 1. Introduction For the most part, the interest addressed to quantum computing comes from the ability of qubits to store a superposition state that reflects all the possible inputs/outputs of a given algorithm, until a measurement is done. This property is called quantum parallelism and can, in certain cases, give a massive performance boost for quantum algorithms. But the advantages of quantum computing do not come without caveats: only some classes of problems can be solved by quantum computing, with a definite gain in terms of efficiency with respect to classical computing. Indeed, one important research issue related to quantum computing is defining with precision such problems. Yet, within the variety of quantum algorithms we can identify two main design blueprints. The first one is defined by quantum algorithms capable to reach an exponential speedup over classical algorithms on precisely defined and heavily structured mathematical problems, like for instance Shor’s algorithm for integer factorization. The second one is defined by quantum algorithms with a quadratic speedup over classical ones, like for instance Grover’s algorithm for searching an unstructured array. Still, a lot of questions remain open about the real world applications and benefits of quantum computing, especially for more arbitrary problems and fields without any a priori “quantum-friendlyness”. In this paper, we consider the problem of worst-case execution time (WCET) evaluation by static analysis of programs. The WCET analysis is highly relevant to the design of safety- critical real-time systems, which needs to respect all the timing constraints they are specified to meet for the verification of their safety properties. Also, WCET analysis has recently seen the developments of “quantum-inspired” classical algorithms which are competitive (in terms of 2nd Quantum Software Engineering and Technology Workshop | IEEE Quantum Week 2021 © 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org) CEUR http://ceur-ws.org Workshop ISSN 1613-0073 Proceedings 66 precision and efficiency) with state-of-the-art approaches. Furthermore, the problems arising in WCET evaluation cover a wide-range of complexity classes, from undecidability in the general case down to 𝑁 𝑃-hardness and polynomial-time solvability in some restricted cases [17]. As such, it appears to provide a relevant playground to put the quantum computing promise to the test, although other fields may be as relevant and should be explored as well. In this direction, in this paper, we tackle only a restricted setup with the most simple pro- gram model: the evaluation of the worst-case number of cache misses of programs performing deterministic sequences of memory accesses, in case of arbitrary preemptions. A preemption is the act of interrupting one task to allow the execution of another task on a machine. There is indeed a strong connection between the number of cache misses done by program and its exe- cution time, as uncached memory accesses are highly time-consuming on modern processors. In other words, we consider a single path program (linear sequences of instructions) and the complexity of the model comes from the arbitrary-placed interruptions, due to other programs running on the same system, which create interferences in the cache memory. For our model, we consider 𝐾 preemption points and they can occur at any time in the sequence, making the cache behaviour non-predictable. In this paper, we propose a dynamic programming classical algorithm, with polynomial- time complexity, to compute the minimal number of cache hits in the sequence of memory accesses, as a proxy measure of the WCET for the considered program. Although our model is standard the WCET community, we consider it, as well as the classical polynomial-time algorithm solving it, as the basis to derive a lower complexity (still polynomial-time) hybrid quantum-classical algorithm. In doing so, we demonstrate a first benefit of explicitly using the quantum computing paradigm to the field of WCET calculations, albeit in simplified setting. Dynamic programming is an algorithmic technique to solve a problem consisting in find- ing one optimal solution by solving a family of easier sub-problems. We designed a dynamic programming algorithm to compute the minimal number of cache hits while executing a de- terministic sequence with 𝐾 preemptions. This technique to solve WCETs is consistent with others classical solutions in the literature [20]. Still, we want to emphasize that designing the best classical algorithm to solve the considered problem is not the point of our work: we are focused on the portability of classical algorithms to the quantum framework. This paper is organized as follows: Section II provides a brief overview about cache memories and preemptions. Then, Section III places the paper in the context of the state-of-the-art, and presents our program model. In Section IV we propose a dynamic programming algorithm for evaluating WCET. Section V then contains the quantum version of that algorithm. Lastly, in Section VI we compare the complexity of the two algorithms and in Section VII we provide some perspectives for future works. 2. Background on cache and preemption Cache memories impact the variability of execution times, since the access time between an element stored in the cache memories and an element which is not, can be up to two orders of magnitude. As cache memories are limited in size, the hardware uses the history of previous accesses in the cache to decide which currently stored elements should be replaced when a new 67 one must be stored in the cache memory. For instance, LRU (Least Recently Used) cache policy privileges, as replacement candidate, the oldest used line of memory in a set. In the general case, when an access to an element in memory performed by a program is already in the cache, we call it a “cache hit”, otherwise we say that a “cache miss” occurred. Cache misses impact on the execution time because the missing element needs to be fetched from the main memory, which induces additional delays. The advantage of preemptions is the possibility to make optimal utilization of the computing power. In particular, in fully preemptive systems, as the one we present in this paper, the run- ning task can be interrupted at any time by another task with higher priority and be resumed to continue when all higher priority tasks have completed [6]. When the task is preempted, the memory blocks corresponding to the task are usually considered as flushed from the cache memory1 , between the time the task is preempted and the time the task resumes execution. Therefore a substantial amount of time is spent to restore the previous content of the cache, greatly increasing the task’s execution time [3]. In certain cases, preemption has not a great impact because it happens at a moment of the program’s execution in which data stored into the cache are not useful: for instance, before a cache miss. Still, in general, preemption dam- ages program locality and therefore it causes a degradation of system predictability, making WCETs not easy to characterize and predict [4] [5] [3] [6]. 3. Deterministic memory access with preemptions We consider programs that perform deterministic accesses to the memory, that can be inter- rupted at any time, using the count of cache misses as a first proxy evaluation of the WCET (or as the literature calls it, the Cache memory preemption delay). A “dual” approach has al- ready been explored in the literature, with papers about applying static analysis techniques to quantum algorithms to evaluate their performance and formally analyze their functional properties [18, 19]. On the other hand, the applications of quantum computing to improve the static analysis of programs has been only scarcely explored [1] and even lesser so is the issue of static analysis of cache misses and WCET where only “quantum-inspired” classic algorithm have been proposed [2]. So a lot of questions remain open about the applications and benefits of quantum computing for software engineering issues [1]. Numerous research works exist in the direction of improving classical polynomial algorithms with a quantum inspired approach, allowing to gain a polynomial factor of complexity [7, 8, 9, 10, 11, 12, 13, 14]. In this paper, we propose a dynamic programming algorithm for computing the minimal number of cache hits and improve it into a quantum-classic hybrid version, leading to a lower complexity. As model case, we take into consideration the problem of evaluating the WCET of a deter- ministic sequence of memory accesses, in case of preemptions. We denote the sequence of memory accesses as 𝑎0 , 𝑎1 , … , 𝑎𝑁 −1 and we are supposed to know if each of them is a cache miss or a cache hit in an execution without preemptions (this is done in linear time as a pre- 1 Whilst it is possible that, for a given set of tasks, preemption would preserve some cache lines, the situation when one wants to calculate WCET of a task in the general case requires the hypothesis that no useful cache line would be retained by a preemption. 68 processing step). We denote 𝑋 [𝑖] the 𝑖-th access of the sequence and { 1 if cache miss 𝑋 [𝑖] = 0 if cache hit. 𝑃 is the set of possible contents of those memory accesses 𝑝0 , … , 𝑝𝑀−1 . The number of preemp- tions interrupting the program is a fixed number 𝐾 . Preemption can happen at any time, this leads to non-predictability of the model. We suppose also that when a preemption occurs, all the content of the cache is flushed from it. 4. Our classical starting point We designed a dynamic programming algorithm to compute the minimal number of cache hits while executing a deterministic sequence with 𝐾 preemptions. The minimal number of cache hits corresponds to the maximal number of cache misses. First at all, we scan the sequence of memory accesses, as it would be without any point of preemption, and we store the information of being a cache hit or miss into an array 𝑋 : if the considered access is cache miss 𝑋 [𝑖] = 1, 𝑋 [𝑖] = 0 otherwise. We compute also the 𝑁𝑖 [𝑝𝑡 ] > 𝑖, the table of indexes of the first access to the object 𝑝𝑡 , after the 𝑖-th access, into the sequence of memory access. If there is no other access to 𝑝𝑡 , we write ∞. We suppose to know the solution of the problem for all the sequence 𝑎𝑖+1 , ⋯ , 𝑎𝑁 −1 and the number of preemptions 𝑘 ′ , where 0 < 𝑘 ′ < 𝐾 . In other words, it is known: • 𝑆(𝑖 + 1, 𝑝𝑡, 𝑘 ′ ): the smallest number of cache hits for the object 𝑝𝑡 and 𝑘 ′ preemptions; • 𝑌 (𝑖 + 1, 𝑘 ′ ) > 𝑖 + 1 : index of the next preemption in this solution. Algorithm 1 describes how to determine 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) and 𝑌 (𝑖, 𝑘 ′ ) from the previous data, in case of cache miss (i.e if 𝑋 [𝑖] = 1) Now, Algorithm 2 describes how to determine 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) and 𝑌 (𝑖, 𝑘 ′ ) from the previous data, in case of cache hit (i.e if 𝑋 [𝑖] = 0). After performing the algorithm we obtain the smallest number of cache hits for each object 𝑝𝑡 , when 𝐾 preemptions occur, in symbols 𝑆(0, 𝑝𝑡 , 𝐾 ). ∑𝑝𝑡 𝑆(0, 𝑝𝑡 , 𝐾 ) is the solution for the problem of computing the smallest number of cache hits for the sequence of memory accesses. 4.1. Example of application of the classic algorithm As example, we take into consideration a cache of size 2, with 𝐾 = 2 (i.e two preemptions occur). The set of possible contents of the cache is 𝑃 = {𝑝0 , 𝑝1 , 𝑝2 } = {𝐴, 𝐵, 𝐶}. The sequence of memory accesses is 𝑎0 𝑎1 𝑎2 𝑎3 𝑎4 = 𝐴𝐵𝐴𝐵𝐶. The first two accesses and the last one are cache misses, while the third and fourth accesses are cache hits. It is simple to see that the worst scenario is when the two points of preemption happen immediately before the two cache hits, because in this case the content of the cache is flushed and 𝐴 and 𝐵 has to be restored into the cache when re-called lately. In this scenario the algorithm returns 𝑆0 = 0 as minimal number 69 Algorithm 1: Cache hits counter - cache miss case Result: The choice between A and B, for a fixed 𝑘 ′ , is the one that minimize ∑𝑘 ′ 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) 𝑖 index in the sequence 0 < 𝑘 ′ < 𝐾 number of preemptions; case A no preemption for all 𝑝𝑡 in P for all 𝑘 ′ if 𝑝𝑡 = 𝑎𝑖 then 𝑆(𝑖, 𝑎𝑖 , 𝑘 ′ ) = 1 + 𝑆(𝑖 + 1, 𝑎𝑖 , 𝑘 ′ ) 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) = 𝑆(𝑖 + 1, 𝑝𝑡 , 𝑘 ′ ) 𝑌 (𝑖, 𝑘 ′ ) = 𝑌 (𝑖 + 1, 𝑘 ′ ) end case B preemption for all 𝑝𝑡 in P for all 𝑘 ′ if 𝑋 [𝑁𝑖 [𝑝𝑡 ]] = 0 and 𝑌 (𝑖 + 1, 𝑘 ′ − 1) > 𝑁𝑖 [𝑝𝑡 ] then 𝐵 = 𝐵1 = 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) = 𝑆(𝑖 + 1, 𝑝𝑡 , 𝑘 ′ − 1) − 1 else 𝐵 = 𝐵2 = 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) = 𝑆(𝑖 + 1, 𝑝𝑡 , 𝑘 ′ − 1) 𝑌 (𝑖, 𝑘 ′ ) = 𝑖 of cache hits, meaning that all the accesses of the sequence are cache misses. If a preemption happens before a cache miss it does not impact in the performances. After the algorithm performs the first scan of the sequence of memory accesses we have 𝑋 = 11001. The table 𝑁 is: i \t 0 (A) 1 (B) 2 (C) 0 2 1 4 1 2 3 4 𝑁 = 2 ∞ 3 4 3 ∞ ∞ 4 4 ∞ ∞ ∞ We give below the details only of the two first iterations, being the most representative. In order to obtain the final result, it suffices to execute again the algorithm, until the sequence of accesses is completed. 70 Algorithm 2: Cache hits counter - cache hit case Result: The choice between A and B, for a fixed 𝑘 ′ , is the one that minimize ∑𝑘 ′ 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) 𝑖 index in the sequence 0 < 𝑘 ′ < 𝐾 number of preemptions; case A no preemption for all 𝑝𝑡 in P for all 𝑘 ′ if 𝑝𝑡 = 𝑎𝑖 𝐴 = 𝑆(𝑖, 𝑎𝑖 , 𝑘 ′ ) = 1 + 𝑆(𝑖 + 1, 𝑎𝑖 , 𝑘 ′ ) else 𝐴 = 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) = 𝑆(𝑖 + 1, 𝑝𝑡 , 𝑘 ′ ) 𝑌 (𝑖, 𝑘 ′ ) = 𝑌 (𝑖 + 1, 𝑘 ′ ) case B preemption for all 𝑝𝑡 in P for all 𝑘 ′ if (𝑋 [𝑁𝑖 [𝑝𝑡 ]] = 0 and 𝑌 (𝑖 + 1, 𝑘 ′ − 1) > 𝑁𝑖 [𝑝𝑡 ]and 𝑝𝑡 ≠ 𝑎𝑖 ) 𝐵 = 𝐵1 = 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) = 𝑆(𝑖 + 1, 𝑝𝑡 , 𝑘 ′ − 1) − 1 else 𝐵 = 𝐵2 = 𝑆(𝑖, 𝑝𝑡 , 𝑘 ′ ) = 𝑆(𝑖 + 1, 𝑝𝑡 , 𝑘 ′ − 1) 𝑆(𝑖, 𝑎𝑖 , 𝑘 ′ ) = 𝑆(𝑖 + 1, 𝑎𝑖 , 𝑘 ′ − 1) 𝑌 (𝑖, 𝑘 ′ ) = 𝑖 • a4=C t \k’ 0 1 2 0 0 0 0 (A) 𝑆(𝑖 = 4, 𝑝𝑡 , 𝑘 ′ ) = 1 0 0 0 (B) 2 0 0 0 (C) k’ 0 1 2 𝑌 (𝑖 = 4, 𝑘 ′ ) = ∞ 4 4 • a3=B 71 Without preemption: t \k’ 0 1 2 0 0 0 0 (A) 𝐴 = 𝑆(𝑖 = 3, 𝑝𝑡, 𝑘 ) = 1 ′ 1 1 1 (B) 2 0 0 0 (C) With preemption: t \k’ 0 1 2 0 - 0 0 (A) 𝐵 = 𝑆(𝑖 = 3, 𝑝𝑡 , 𝑘 ′ ) = 1 (B) - 1 1 2 - 0 0 (C) Here, note that "-" is equivalent to 0 for our purpose. Final table: t \k’ 0 1 2 0 0 0 0 (A) 𝑆(𝑖 = 3, 𝑝𝑡 , 𝑘 ′ ) = 1 1 1 1 (B) 2 0 0 0 (C) k’ 0 1 2 𝑌 (𝑖 = 3, 𝑘 ′ ) = ∞ 3 3 5. Going quantum In this section we present the “quantum version” of the previous algorithm. We suppose to already know the behaviour of the cache in case of program’s execution without preemptions (i.e the array 𝑋 ) and the table 𝑁 , both deterministic. While searching for the minimal number of cache miss, in the classic algorithm, we perform three nested loops: • over the sequence memory accesses (𝑁 iterations) • over all the possible objects that can be into the cache (𝑁 iterations) • over all the 𝑘 ′ with 0 < 𝑘 ′ < 𝐾 (𝐾 iterations) To compute the solution (i.e. the number of cache hits corresponding to 𝐾 preemption and the last memory access) the algorithm needs all the number of cache hits corresponding to 𝑘 ′ 72 with 0 < 𝑘 ′ < 𝐾 . We reformulated the dynamic programming classic algorithm into an hybrid quantum-classical algorithm in which the third loop (the one over the number of preemptions) is suitable to be executed on a quantum computer. In particular, we will give below the details of the first case (when cache miss happens) of our quantum algorithm, being the second case (when cache hit happens) similar. We build a superposed state represented as |𝑘 ′ , 𝑆⟩ where 𝑘 ′ is the number of considered preemptions and 𝑆 the corresponding number of cache hits. In other words, for each number of preemptions 𝑘 ′ , it is known the minimum possible number of cache hits while performing the sequence of memory accesses. Considering, for simplicity of notation, the case of our example, 𝑆 can be either 0 or 1 so we build the superposed state ∑𝑘 ′ 𝑎𝑝 |𝑘 ′ , 0⟩ + 𝑏𝑝 |𝑘 ′ , 1⟩, for each 𝑝 ∈ 𝑃. We recall that 𝑃 is the set of possible contents of the memory slots. Since each block (a 𝑘 ′ iteration) is independent from the others, we can consider them one by one. Therefore we perform a projection on the 𝑘 ′ , so we represent the qubits as 𝑞𝑝 = 𝑎𝑝 |0⟩ + 𝑏𝑝 |1⟩, with 𝑎𝑝 , 𝑏𝑝 ∈ {0, 1} and 𝑎𝑝 + 𝑏𝑝 = 1. If the considered memory access is a cache miss (i.e 𝑋 [𝑖] = 1), a preemption may (case B) or may not (case A) occur before accessing it. When a preemption does not occur (case A), we simply apply the identity matrix to the superposed state. When a preemption occurs (case B) the situation is more complex and we need to define two different operators, 𝐺 and 𝐷. Those operators are made of the sub-operators 02 , 𝐼2 , 𝑆2 , applied to the single blocks. 0 0 1 0 0 1 02 = 𝐼 = 𝑆 = (1) [0 0] 2 [0 1] 2 [1 0] ⎡02 𝐼2 02 ⋯ 02 ⎤ ⎢ ⎥ 0 0 𝐼 ⋯ 02 ⎥ 𝐷=⎢ 2 2 2 ⎢⋮ ⋮ ⋮ ⋱ 02 ⎥ ⎢ 𝐼 2 02 02 02 02 ⎥ ⎣ ⎦ ⎡02 𝐼2 02 ⋯ 02 ⎤ ⎡𝑆2 02 02 02 ⎤ ⎢ ⎥ ⎢ ⎥ ⎢02 02 𝐼2 ⋯ 02 ⎥ ⎢02 𝑆2 03 02 ⎥ 𝐺= × ⎢⋮ ⋮ ⋮ ⋱ 02 ⎥ ⎢ ⋮ ⋮ ⋱ ⋮⎥ ⎢ 𝐼2 02 02 02 02 ⎥ ⎢02 02 02 𝑆2 ⎥ ⎣ ⎦ ⎣ ⎦ In order to maintain reversibility we need to compute all the more outputs of the algorithm respect to the classical algorithm and keep track of them. Starting from the superposed state, we will apply the identity operator to obtain 𝐴, we apply 𝐺 to obtain 𝐵1 and 𝐷 to obtain 𝐵2 The first oracle 𝑂1 has to choose between 𝐵1 and 𝐵2, always block by block: 𝑂1 ∶ |𝑞1 , 𝑞2 , ⋯ , 𝑞𝑀 , 𝑞𝑀+1 , 𝑞𝑀+2 , ⋯ , 𝑞2𝑀 , 𝑐, 𝑦⟩ ⟶ |𝑞1 𝑓 ⊕ 𝑞𝑀+1 (𝑓 ⊕ 1), 𝑞2 𝑓 ⊕ 𝑞𝑀+2 (𝑓 ⊕ 1), ⋯ , 𝑞𝑀 𝑓 ⊕ 𝑞2𝑀 (𝑓 ⊕ 1), 𝑞1 (1 ⊕ 𝑓 ) ⊕ 𝑞𝑀+1 𝑓 , 𝑞2 (1 ⊕ 𝑓 ) ⊕ 𝑞𝑀+2 𝑓 , ⋯ , 𝑞𝑀 (1 ⊕ 𝑓 ) ⊕ 𝑞2𝑀 𝑓 , 𝑐 ⊕ 𝑓 , 𝑦⟩ where: 𝑞𝑖 = 𝑎𝑝 |0⟩ + 𝑏𝑝 |1⟩ 73 and 𝑐 a boolean. We have also: 𝑦 = 𝑐0 |∞⟩ + 𝑐1 |1⟩ + ⋯ + 𝑐𝑁 |𝑁 ⟩ where 𝑐0 , 𝑐1 , ⋯ , 𝑐𝑁 ∈ {0, 1} and 𝑐0 + 𝑐1 , ⋯ + 𝑐𝑁 = 1. Finally: { 1 if condition is true, 𝑓 = 𝑓 = (𝑞1 , ⋯ , 𝑞2𝑀 ) = 0 otherwise with condition true if: 𝑦 > 𝑁𝑖 (𝑝𝑡 ) . Now, calling 𝐵 the result of the choice between 𝐵1 and 𝐵2, we need to choose between 𝐴 and 𝐵, picking the option that minimizes the number of cache hits. The oracle 𝑂2 is an oracle choosing between A and B. 𝑂2 ∶ |𝑞1 , 𝑞2 , ⋯ , 𝑞𝑀 , 𝑞𝑀+1 , 𝑞𝑀+2 , ⋯ , 𝑞2𝑀 , 𝑐⟩ ⟶ |𝑞1 𝑓 ⊕ 𝑞𝑀+1 (𝑓 ⊕ 1), 𝑞2 𝑓 ⊕ 𝑞𝑀+2 (𝑓 ⊕ 1), ⋯ , 𝑞𝑀 𝑓 ⊕ 𝑞2𝑀 (𝑓 ⊕ 1), 𝑞1 (1 ⊕ 𝑓 ) ⊕ 𝑞𝑀+1 𝑓 , 𝑞2 (1 ⊕ 𝑓 ) ⊕ 𝑞𝑀+2 𝑓 , ⋯ , 𝑞𝑀 (1 ⊕ 𝑓 ) ⊕ 𝑞2𝑀 𝑓 , 𝑐 ⊕ 𝑓 ⟩ where 𝑞𝑖 = 𝑎𝑝 |0⟩ + 𝑏𝑝 |1⟩ and 𝑐 is a boolean. Then, we have: { 1 if condition is true, 𝑓 = 𝑓 = (𝑞1 , ⋯ , 𝑞2𝑀 ) = 0 otherwise with condition true if: 𝑞1 + ⋯ + 𝑞𝑀 < 𝑞𝑀+1 + ⋯ + 𝑞2𝑀 where "+" is the usual sum over qubits, executable by a Quantum Adder [15], [16]. In Fig. 1 it is represented the quantum algorithm’s flow. 5.1. Example of application of the quantum algorithm In this section we give a glimpse of the functioning of our algorithm though a simple example. We suppose to have a sequence of memory accesses of length 𝑁 = 3: 𝐴𝐵𝐵. We have 𝑃 = {𝐴, 𝐵}, then 𝑀 = 2. The choice of this simplified example comes from the fact that the number of qubits needed to perform the quantum algorithm increases quickly with the number of memory accesses. That being said, with such a low 𝑁 and consequently low 𝐾 , it is not possible to appreciate the advantage of the quantum algorithm in terms of complexity. We suppose that the cache has one line. This means that in the best scenario the number of cache hits is 1, while in the worst scenario, if a preemption occurs before the third access, the number of cache hits is 0. Finally, we suppose 𝐾 = 1. We can build the deterministic tables 𝑁 and 𝑌 : i \t 0 (A) 1 (B) 0 2 1 𝑁 = 1 2 ∞ 2 ∞ ∞ 74 Figure 1: Quantum circuit for a block (i.e. for a fixed 𝑘 ′ ). Oracle 𝑂1 chooses between 𝐵1 and 𝐵2, while 𝑂2 chooses between 𝐴 and 𝐵, where 𝐵 is the answer of 𝑂1 . We build the superposition representing all possible number of cache hits and all possible num- ber of preemption |𝑘 ′ , 𝑆⟩ with 𝑘 ′ = {0, 1} and 𝑆 = {0, 1} for one object in 𝑃: 𝛼0 |0, 0⟩ + 𝛼1 |0, 1⟩ + 𝛼2 |1, 0⟩ + 𝛼3 |1, 1⟩ . In this superposition we have that either 𝛼𝑖 = 0 either 𝛼𝑖+1 = 0, with 𝑖 even. Then, ⎡0 0 1 0⎤ ⎢ ⎥ 0 0 0 1⎥ 𝐷=⎢ ⎢1 0 0 0⎥ ⎢0 1 0 0⎥ ⎣ ⎦ and ⎡0 0 0 1⎤ ⎢ ⎥ 0 0 1 0⎥ 𝐺=⎢ ⎢0 1 0 0⎥ ⎢1 0 0 0⎥ ⎣ ⎦ . In order to build a quantum circuit, we can express the matrices 𝐷 and 𝐺 as boolean expres- sion on the qubits 𝑎, 𝑏: 𝐷 ∶ |𝑎, 𝑏⟩ ⟶ |𝑁 𝑂𝑇𝑎, 𝑏⟩ and 𝐺 ∶ |𝑎, 𝑏⟩ ⟶ |𝑁 𝑂𝑇𝑎, 𝑁 𝑂𝑇 𝑏⟩ 75 Figure 2: Choice between 𝐵1 and 𝐵2 Figure 3: 𝐷 and 𝐺 operators Now, the oracle 𝑂2 has to choose between the case 𝐴 and the case 𝐵, where 𝐵 is the result of the first oracle 𝑂1 . case A 𝛼0 |0, 0⟩ + 𝛼1 |0, 1⟩ + 𝛼2 |1, 0⟩ + 𝛼3 |1, 1⟩ case B 𝛼0′ |0, 0⟩ + 𝛼1′ |0, 1⟩ + 𝛼2′ |1, 0⟩ + 𝛼3′ |1, 1⟩ The answer of the 𝑂2 oracle is 𝐴 if 𝛼1 + 𝛼3 < 𝛼1′ + 𝛼3′ , 𝐵 otherwise. 5.2. Post-processing After applying the previous quantum-algorithm, for each block (i.e. for each 𝑘 ′ ), we have a superposed state ∑𝑘 ′ |𝑘 ′ , 𝑆0 ⟩ where 𝑆0 is the minimal number of cache hits if 𝑘 ′ preemptions occur and 0 < 𝑘 ′ < 𝐾 . From this superposed state we need to extrapolate the 𝑆0 corresponding to 𝐾 preemption. The idea is to use a Grover-style searching algorithm over the superposed state to isolate the solution (i.e 𝑆0 for 𝑘 ′ = 𝐾 ). In particular, we expect to have as input for the oracle the superposition: 0 |0, 0⟩ + 0 |0, 1⟩ + ⋯ + 𝛼0 |0, 𝑆0 ⟩ + ⋯ +0 |𝑘 ′ , 𝑁 − 1⟩ + 0 |𝑘 ′ , 0⟩ + 0 |𝑘 ′ , 1⟩ + ⋯ +𝛼𝑘 ′ |𝑘 ′ , 𝑆0 ⟩ + ⋯ + 0 |𝑘 ′ , 𝑁 − 1⟩ + ⋯ +0 |𝐾 , 0⟩ + 0 |𝐾 , 1⟩ + ⋯ + +𝛼𝐾 |𝐾 , 𝑆0 ⟩ + ⋯ + 0 |𝐾 , 𝑁 − 1⟩ 76 We suppose then to build an oracle 𝑂𝐺 that selects the qubits with "index" corresponding to 𝐾 . We suppose that 𝑂𝐺 implements, in a Grover-style, an Amplitude Amplifier that amplifies coefficients 𝛼𝑖 and leaves to zero the zeros coefficients, see Fig.4. If at the end of the algorithm the result is |𝑥, 𝑦⟩ where 𝑥 ≠ 𝐾 , it means that the searching algorithm didn’t worked and we should apply again the algorithm. In Fig. 5 we can see that the oracle takes as input superposed state of the results of the circuit for each 𝑘 ′ in Fig. 1 and returns the minimal number of cache hits in the sequence of memory access in case of 𝐾 preemptions. 6. Complexity analysis In order to compute the final solution, at each iteration of the first loop, we create a table of dimensions 𝑀 × 𝐾 , containing the number of the cache hits, for each item in 𝑃 and each 𝑘 ′ , with 0 < 𝑘 ′ ≤ 𝐾 . Each iteration of the second loop is independent from the others, meaning that we can compute any row of the next table independently from the others rows. Then, the complexity of "updating" a row (most inner loop) is 𝑂(𝐾 ). Therefore the dynamic pro- gramming classical algorithm has a complexity of 𝑂(𝑀 × 𝑁 × 𝐾 ) (see Fig 6a). In the context of porting classical applications to quantum computing, we designed a “hybrid quantum-classical version” of the previous algorithm, whose output is the number of cache hits corresponding to 𝐾 preemption. We consider a row in one table as a superposition of states representing all the numbers of preemptions 𝑘 ′ and all the possible values of the number of cache hits. In this way, the complexity of the act of "updating" a row becomes 𝑂(1) (see Fig. 6b). Therefore the complexity of the resulting quantum algorithm becomes 𝑂(𝑀 × 𝑁 ). As a drawback, we need to post-process the result we obtained at the previous step: superpositions of dimension 𝐾 × 𝑈 , where 𝑈 < 𝑁 is the maximum number of cache hits. However, we only need the value of the number of cache hits when 𝐾 preemptions occur (i.e. only one value per row in the last set of rows). Therefore, for each row, we can use Grover’s algorithm to extract the value of interest √ in the √ associated√ superposition. This means that this post-processing has complexity 𝐾 × 𝑈 < 𝐾 × 𝑁 < 𝑁 2 = 𝑁 . Hence, the complexity of the whole quantum algorithm is decreased to 𝑂(𝑁 2 + 𝑁 ) compared to 𝑂(𝑁 3 ) for the classical algorithm. 7. Conclusions and perspectives In this paper, as a first step to deal with WCET-related problems by means of quantum-classic hybrid algorithms, we worked on a highly simplified program model since we just considered single-path programs, yet in the presence of arbitrary preemptions. A natural perspective would be to attempt to generalize this approach to more complex program models allowing for some non-determinism in the control-flow. However, such a generalisation would require a careful control of the size of the quantum state representing the cache hit counters array in order to avoid a complexity blow up at the post-processing amplification step. Also, in this paper, we have ported a polynomial-time dynamic programming algorithm to the quantum framework. In essence, most such algorithms follow the same regular pattern of several nested loops updating an array data structure with the final result obtained in only one 77 Figure 4: Amplitude amplification performed by the oracle 𝑂𝐺 . The coefficient 𝛼, the one correspond- ing to the solution |𝐾 , 𝑆0 ⟩, is amplified while the coefficients 0s, corresponding to the non-solutions, are not modified by the amplification. In the figure we can see how the 𝛼 is at first reversed and then doubled, leading to an amplification, in a Grover-style fashion. of the entries of the last array. As such, the approach in this paper, consists in turning the inner loop of our algorithm in a “quantum parallel for” associated to a Grover-style amplification on the single result of interest. As a consequence, the approach in this paper may potentially be generalized to other (exponential-time or polynomial-time) dynamic programming algorithms 78 Figure 5: Post-processing circuit. The oracle 𝑂𝐺 takes as input the answer of 𝑂2 for each block, after 𝑁 application of the circuit in Fig.1. It returns as final answer the number of cache hits 𝑆𝑂 for the considered memory accesses sequence, in case of 𝐾 preemptions. in order to derive (polynomial-only) quantum speedups. References [1] R. Hall,"A Quantum Algorithm for Software Engineering Search." 2009. [2] S. Louise, "A First Step Toward Using Quantum Computing for Low-Level WCETs Esti- mations." 2019. [3] C.-G.Lee, J.Hahn, Y.-M.Seo, S.L.Min, R.Ha,S.Hong, C.Y.Park, M. Lee, and C. S. Kim, “Anal- ysis ofcache-related preemption delay infixed-priority preemptive scheduling.” 1998. [4] H. Ramaprasad, F. Mueller, “Tightening the bounds on feasible pre-emption points.” 2006. [5] H. Ramaprasad, F. Mueller, “Bounding worst-case response timefor tasks with non- preemptive regions.” 2008. [6] G. Buttazzo, M. Bertogna, G. Yao, "Limited Preemptive Scheduling for Real-Time Systems. A Survey." 2013 [7] A. Manju, M.J. Nigam, "Applications of quantum inspired computational intelligence: a survey". [8] S. B. Ramezani, A. Sommers, H. K. Manchukonda, S. Rahimi and A. Amirlatifi, "Machine Learning Algorithms in Quantum Computing: A Survey," 2020. [9] E. Tang, "A quantum-inspired classical algorithm for recommendation systems," 2019. [10] H. Talbi, A. Draa and M. Batouche, "A new quantum-inspired genetic algorithm for solving the travelling salesman problem," 2004. 79 (a) Classic algorithm (b) Quantum algorithm Figure 6: In orange: the number of cache hits, for one object in 𝑃 and all 𝑘 ′ [11] D. Jethwani and F. Le Gall and Sanjay K. Singh, "Quantum-Inspired Classical Algorithms for Singular Value Transformation," 2020. [12] G. Ripoll, J. Jose, "Quantum-inspired algorithms for multivariate analysis: from interpo- lation to partial differential equations," 2021. [13] N. Chia and H. Lin and C. Wang. , "Quantum-inspired algorithms for multivariate analysis: from interpolation to partial differential equations," 2021. [14] A. Gilyen and Z. Song and E. Tang, "An improved quantum-inspired algorithm for linear regression," 2020. [15] G. Florio, D. Picca, "Quantum implementation of elementary arithmetic operations," 2004. [16] A.V. Cherkas and S.A. Chivilikhin, "Quantum adder of classical numbers," 2016. [17] R. Wilhelm, J. Engblom, A. Ermedahl, N. Holsti, S. Thesing, D. Whalley, G. Bernat, C. F.d Heckmann, T. Mitra, F. Mueller, I. Puaut, P. Puschner, J. Staschulat, and . Stenström. " The worst-case execution-time problem—overview of methods and survey of tools." 2008 [18] A. J. Abhari, S. Patil, D. Kudrow, J. Heckey, A. Lvov, F. T. Chong et M. Martonosi, "Scaffcc : a framework for compilation and analysis of quantum computing programs." 2014 [19] A. Facon, S. Guilley, M. Lec’Hvien, A. Schaub et Y. Souissi, "Detecting cache-timing vul- nerabilities in post-quantum cryptography algorithms." 2018 [20] J. Cavicchio, C. Tessler and N. Fisher, "Minimizing cache overhead via loaded cache blocks and preemption placement." 2015 80