<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CPS Summer School PhD Workshop, September</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>PyCacheGen: A Highly Configurable Open-Source Generator for Synthesizable Caches</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Richard Müller</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Konstantin Lübeck</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Michael Kuhn</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Paul Palomero Bernardo</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Oliver Bringmann</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>University of Tübingen</institution>
          ,
          <addr-line>Embedded Systems, Sand 13, 72076 Tübingen</addr-line>
          ,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>22</volume>
      <issue>2025</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
<p>Caches are essential components of modern computer architectures, playing a crucial role in bridging the performance gap between fast processors and relatively slow memory. However, designing and integrating highly configurable caches into complex system-on-chip (SoC) designs remains a significant challenge. To address this, we present PyCacheGen, a highly configurable open-source generator for synthesizable caches. PyCacheGen enables the generation of synthesizable Verilog cache modules with a broad range of configurable parameters, including associativity, write policy, allocation policy, write buffer, and multiple request ports. Notably, the caches generated by PyCacheGen support the write-back policy, which can lead to substantial performance improvements over the write-through policy. To showcase the capabilities of PyCacheGen, we generated various cache designs, successfully integrated them into the RISC-V PULPissimo SoC, and synthesized the designs using GlobalFoundries' 22FDX+ technology node. Our results demonstrate that caches generated by PyCacheGen can significantly enhance performance for slow memory hierarchies and reduce energy by up to 5.51% for unit-latency memories, all while increasing the area of the PULPissimo SoC by only 0.57%.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        As modern processor speeds have increased dramatically over the years, the latency for accessing data
from memories has not kept pace, leading to a bottleneck in overall system performance known as the
processor-memory performance gap or memory wall [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. This gap necessitates the use of a memory
hierarchy, including a cache, to bridge the performance divide. Caches are designed to store frequently
accessed data and allow the processor to access this subset within only a single clock cycle. This enables
faster data access latencies, which in turn improve the overall performance and throughput, allowing
processors to operate close to their theoretical maximum speed.
      </p>
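      <p>The lookup a cache performs on every access can be sketched in plain Python. This is an illustrative model, not part of PyCacheGen; the function name and the parameter values are our own. It shows how a memory address is split into the tag, index, and block-offset fields that decide whether a request hits.</p>
      <preformat>
```python
# Illustrative sketch: splitting an address into tag, index, and offset
# for a cache with power-of-two geometry. Not PyCacheGen code.

def split_address(addr: int, num_sets: int, block_size: int) -> tuple[int, int, int]:
    """Return (tag, index, offset) for the given cache geometry."""
    offset_bits = block_size.bit_length() - 1   # log2(block_size)
    index_bits = num_sets.bit_length() - 1      # log2(num_sets)
    offset = addr & (block_size - 1)            # word within the block
    index = (addr >> offset_bits) & (num_sets - 1)  # which set to probe
    tag = addr >> (offset_bits + index_bits)    # compared against stored tags
    return tag, index, offset

# Example: a cache with 16 sets and 4-word blocks, address 0x1234
print(split_address(0x1234, num_sets=16, block_size=4))  # (72, 13, 0)
```
      </preformat>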
      <p>
        However, caches have a wide array of design parameters, and identifying the optimal parameter
set is highly application-dependent. This requires a configurable cache generator to find the best
cache design parameters for specific application requirements. Therefore, we propose PyCacheGen, a
highly configurable open-source cache generator implemented in the Python-based Amaranth
hardware description language (HDL) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ], which allows for the generation of synthesizable Verilog code.
PyCacheGen implements the following features: generation of fully associative, set-associative, and
direct-mapped caches, supporting the write-through and write-back policies, with write allocate and
no-write allocate policies, an optional write buffer, configurable block size and data width, single-cycle
latency for read and write requests, different replacement policies, multiple request ports, flushing,
and multiple cache levels. To evaluate the performance, power consumption, energy, and area of the
generated caches, we integrated them into the PULPissimo RISC-V SoC platform [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] and synthesized
it using GlobalFoundries’ 22FDX+ technology node, and conducted register-transfer level (RTL) and
post-synthesis simulations, executing representative benchmarks from the BEEBS [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] benchmark suite.
      </p>
      <table-wrap id="tab1">
        <label>Table 1</label>
        <caption>
          <p>Comparison of configurable open-source caches and cache generators.</p>
        </caption>
        <table>
          <thead>
            <tr>
              <th></th>
              <th>HDL</th>
              <th>Direct-mapped</th>
              <th>Set-associative</th>
              <th>Fully associative</th>
              <th>Replacement policies</th>
              <th>Write-through policy</th>
              <th>Write-back policy</th>
              <th>Write buffer</th>
              <th>Write allocate</th>
              <th>No-write allocate</th>
              <th>Mult. request ports</th>
              <th>License</th>
              <th>Eval. technology</th>
            </tr>
          </thead>
          <tbody>
            <tr>
              <td>[5]</td>
              <td>Verilog</td>
              <td>✓</td>
              <td>(✓) two-way only</td>
              <td>✓</td>
              <td>Counter / LRU</td>
              <td>✓</td>
              <td>✗</td>
              <td>✗</td>
              <td>n/a</td>
              <td>n/a</td>
              <td>✗</td>
              <td>n/a</td>
              <td>Altera Stratix FPGA</td>
            </tr>
            <tr>
              <td>PoC [6]</td>
              <td>VHDL</td>
              <td>n/a</td>
              <td>✓</td>
              <td>n/a</td>
              <td>LRU</td>
              <td>✓</td>
              <td>✗</td>
              <td>✗</td>
              <td>✗</td>
              <td>✓</td>
              <td>✗</td>
              <td>Apache License 2.0</td>
              <td>Xilinx &amp; Altera FPGA</td>
            </tr>
            <tr>
              <td>IOb-Cache [7]</td>
              <td>Verilog</td>
              <td>✓</td>
              <td>✓</td>
              <td>n/a</td>
              <td>LRU / PLRU (tree &amp; MRU)</td>
              <td>✓</td>
              <td>✗</td>
              <td>✓</td>
              <td>✗</td>
              <td>✓</td>
              <td>✗</td>
              <td>MIT License</td>
              <td>Xilinx FPGA</td>
            </tr>
            <tr>
              <td>OpenCache [8]</td>
              <td>nMigen (Verilog)</td>
              <td>✓</td>
              <td>✓</td>
              <td>n/a</td>
              <td>FIFO / LRU / random</td>
              <td>✗</td>
              <td>✓</td>
              <td>n/a</td>
              <td>n/a</td>
              <td>n/a</td>
              <td>✗</td>
              <td>BSD 3-Clause License</td>
              <td>n/a</td>
            </tr>
            <tr>
              <td>HPDcache [9]</td>
              <td>SystemVerilog</td>
              <td>n/a</td>
              <td>✓</td>
              <td>n/a</td>
              <td>n/a</td>
              <td>✓</td>
              <td>(✓)<sup>a</sup></td>
              <td>✓</td>
              <td>n/a</td>
              <td>n/a</td>
              <td>✓</td>
              <td>Solderpad HW License v2.1</td>
              <td>(GF 22FDX)<sup>b</sup></td>
            </tr>
            <tr>
              <td>This Work</td>
              <td>Amaranth HDL (Verilog)</td>
              <td>✓</td>
              <td>✓</td>
              <td>✓</td>
              <td>FIFO / PLRU (tree &amp; MRU) / LRU</td>
              <td>✓</td>
              <td>✓</td>
              <td>✓</td>
              <td>✓</td>
              <td>✓</td>
              <td>✓</td>
              <td>BSD 3-Clause License</td>
              <td>GF 22FDX+</td>
            </tr>
          </tbody>
        </table>
        <table-wrap-foot>
          <p><sup>a</sup> added in public repository, <sup>b</sup> only planned tape-out mentioned</p>
        </table-wrap-foot>
      </table-wrap>
    </sec>
    <sec id="sec-2">
      <title>2. Related Work</title>
      <p>
        Several configurable open-source data caches have been proposed. The work in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] presents a configurable Verilog
cache generator for FPGAs that supports fully associative, direct-mapped, and 2-way set-associative
caches using the write-through policy with counter-based and LRU replacement policies. The authors
provide area and performance results for an Altera Stratix FPGA for all three cache variants.
      </p>
      <p>
        The VLSI-EDA Pile of Cores (PoC) library [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] offers implementations of various hardware modules
in VHDL, enabling the generation of set-associative caches with write-through and no-write allocate
policies. The library includes synthesis scripts for different FPGA vendors.
      </p>
      <p>
        The IOb-Cache [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] is a configurable Verilog cache for direct-mapped and set-associative caches
supporting the write-through policy together with a write buffer and no-write allocate. The authors
present resource and performance results for Xilinx FPGAs using the Dhrystone benchmark [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ].
      </p>
      <p>
        OpenCache [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] is a Verilog cache generator that utilizes the open-source OpenRAM [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] SRAM
compiler for internal cache memory. OpenCache supports generating direct-mapped and set-associative
caches that employ the write-back policy.
      </p>
      <p>
        HPDcache [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is a configurable data cache designed for RISC-V cores in SystemVerilog, utilizing
the write-through policy with a write buffer. It also supports multiple request ports and out-of-order
request processing.
      </p>
      <p>While these caches and generators offer a range of open-source implementations, each supports only
one of the write-through and write-back policies. In contrast, our configurable cache generator, PyCacheGen,
generates synthesizable multi-port direct-mapped, set-associative, and fully associative caches using
either the write-through or the write-back policy, along with a configurable write buffer and write or
no-write allocation, all within an easy-to-use Python interface. Table 1 provides a detailed comparison
of the different open-source caches and generators.</p>
    </sec>
    <sec id="sec-3">
      <title>3. PyCacheGen</title>
      <p>PyCacheGen is implemented in the Python-based Amaranth HDL, which allows for register-transfer
level simulations with the integrated RTL simulator. Additionally, the Amaranth HDL allows for
the generation of synthesizable Verilog code for simulation and synthesis with other electronic
design automation tools. PyCacheGen is available as open-source under the BSD 3-clause license from
https://github.com/ekut-es/pycachegen.</p>
      <p>To generate a cache hierarchy, an object of the CacheWrapper class must be created, which gets
passed a list of CacheConfig objects containing the parameters of each cache level, which also
allow for passing technology-specific memory macros for the caches’ internal memories. An example
for instantiating a CacheWrapper object is presented in Listing 1. Utilizing the Amaranth HDL’s
verilog.convert function, the cache_wrapper object is converted into Verilog code. Fig. 1 shows
a block diagram of the CacheWrapper together with its front and back end interfaces.</p>
      <p>To connect the different cache levels to each other and the CacheWrapper to the front and back end,
the MemoryBus is used, which combines several wires into a single bus. The CacheWrapper front end
has an additional hit_o wire, which is set to high when a request results in a cache hit on the first
cache level. To connect the CacheWrapper back end to a memory, a memory interface adapter is
needed, which translates the MemoryBus signals to the signals of the memory interface.</p>
      <p>
        <preformat>
from pycachegen import *
from amaranth.back import verilog

cache_wrapper = CacheWrapper(
    num_ports=1, address_width=25,
    cache_configs=[
        CacheConfig(data_width=32, num_ways=2, num_sets=1, block_size=2,
                    replacement_policy=ReplacementPolicies.PLRU_TREE,
                    write_policy=WritePolicies.WRITE_BACK,
                    write_allocate=True, write_buffer_size=4),
        CacheConfig(data_width=32, num_ways=2, num_sets=2, block_size=4,
                    replacement_policy=ReplacementPolicies.FIFO,
                    write_policy=WritePolicies.WRITE_BACK,
                    write_allocate=True, write_buffer_size=8)],
    main_memory_data_width=128, create_main_memory=False)

with open("cache_wrapper.v", "w") as f:
    f.write(verilog.convert(cache_wrapper, name="CacheWrapper"))
        </preformat>
      </p>
      <p>Listing 1: Example Python code for generating a cache hierarchy using the CacheWrapper class.</p>
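      <p>As a back-of-the-envelope check, the data capacity implied by the CacheConfig parameters in Listing 1 can be computed in plain Python. The helper below is our own sketch, not a PyCacheGen API; it assumes block_size counts data words per block and data_width is given in bits.</p>
      <preformat>
```python
# Sketch: data capacity implied by CacheConfig-style parameters,
# mirroring the parameter names used in Listing 1. Not a PyCacheGen API.

def cache_data_bytes(data_width: int, num_ways: int, num_sets: int,
                     block_size: int) -> int:
    """Data capacity in bytes; block_size counts data words per block."""
    return num_sets * num_ways * block_size * (data_width // 8)

# The two cache levels configured in Listing 1:
l1 = cache_data_bytes(data_width=32, num_ways=2, num_sets=1, block_size=2)
l2 = cache_data_bytes(data_width=32, num_ways=2, num_sets=2, block_size=4)
print(l1, l2)  # 16 64
```
      </preformat>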
    </sec>
    <sec id="sec-4">
      <title>4. Results</title>
      <p>
        Experimental Setup To evaluate caches generated by PyCacheGen, we integrated the
CacheWrapper Verilog module as a data cache into the PULPissimo SoC platform [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], positioned
between the CV32E40P RISC-V processor and the 128 kB main memory. The main memory was
configured with read and write latencies of 8 and 12 clock cycles, respectively, to assess the runtime
effects of the data cache for slow memory hierarchies. We also present runtime results using an ideal
memory with unit latencies, representing the optimal performance for each benchmark. Runtime results
were collected through RTL simulations. For power, energy, and area data, the PULPissimo SoC was
synthesized using GlobalFoundries’ 22FDX+ technology node at a clock frequency of 200 MHz, utilizing
standard cells and low-leakage memory macros from Synopsys. The replacement policy employed was
PLRU with a tree data structure. We selected ten benchmarks from the BEEBS suite [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ], including cubic,
dijkstra, mergesort, nbody, picojpeg, qrduino, rijndael, sqrt, stb_perlin, and whetstone, chosen for their
memory reliance and diverse memory access patterns.
      </p>
      <p>Performance Fig. 2 (a) presents the runtime results for a 2-way set-associative cache with a block
size of one utilizing the write-back policy and no write buffer. [Fig. 2: runtimes of the benchmarks cubic,
dijkstra, mergesort, nbody, picojpeg, qrduino, rijndael, sqrt, and stb_perlin with no cache and with
cache sizes from 16 B to 4096 B.] For all benchmarks, the runtime decreases
when the cache size is increased by increasing the number of sets. Especially for caches of size 2048 B
and 4096 B, the runtime closely matches the best achievable runtime when using no cache and a memory
with single-cycle access latencies.</p>
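      <p>The way the miss count collapses once the cache can hold a program's working set can be reproduced with a toy direct-mapped cache model in plain Python. This is illustrative only; the trace and sizes are arbitrary and unrelated to the BEEBS runs.</p>
      <preformat>
```python
# Toy sketch (not PyCacheGen): a direct-mapped cache with one block per set
# and block size one, showing how misses drop once the number of sets is
# large enough to hold the whole working set.

def count_misses(addresses, num_sets: int) -> int:
    tags = [None] * num_sets        # one stored tag per set
    misses = 0
    for addr in addresses:
        index, tag = addr % num_sets, addr // num_sets
        if tags[index] != tag:      # tag mismatch or empty set: miss
            misses += 1
            tags[index] = tag       # fill the set with the new block
    return misses

# Ten sequential sweeps over a 32-word working set:
trace = list(range(32)) * 10
for sets in (4, 16, 64):
    print(sets, count_misses(trace, sets))
# 4 320
# 16 320
# 64 32
```
      </preformat>
      <p>With 4 or 16 sets the sweep evicts every block before it is reused, so all 320 accesses miss; with 64 sets only the 32 cold misses of the first sweep remain.</p>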
      <p>In Fig. 2 (b), runtime results for caches with increasing associativity are shown. All generated caches
use the write-back policy with no write buffer and have a size of 64 B, while each block contains one
data word. Generally, increasing the associativity has only a minor effect on the runtime.</p>
      <p>Fig. 2 (c) depicts the runtime results for 2-way set-associative caches with increasing block sizes and
a cache size of 64 B. For all programs, the runtime increases when the block size is increased, while the
best runtime is achieved with caches whose block size is set to one. This is because the miss penalty
increases with the block size since as many memory read transactions as data words in a block need to
be issued to populate one cache block.</p>
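      <p>This relationship can be sketched with the memory read latency from our experimental setup. The calculation below is illustrative, not PyCacheGen code; the actual cycle count also depends on the memory interface.</p>
      <preformat>
```python
# Sketch: one memory read transaction per data word is needed to fill a
# block, so the miss penalty grows linearly with the block size.

READ_LATENCY = 8  # cycles per memory read transaction (setup in Sec. 4)

def miss_penalty(block_size_words: int, read_latency: int = READ_LATENCY) -> int:
    return block_size_words * read_latency

for bs in (1, 2, 4, 8):
    print(bs, miss_penalty(bs))
# 1 8
# 2 16
# 4 32
# 8 64
```
      </preformat>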
      <p>A runtime comparison between the write-through and write-back policies with increasing write
buffer sizes is shown in Fig. 2 (d). All generated caches are configured as 2-way set-associative, with
a size of 64 B and a block size of one, along with increasing write buffer sizes. For each benchmark,
hatched and unhatched bars of the same color represent the same write buffer size but different policies.
The hatched bars indicate runtime with the write-through policy and no-write allocation, while the
unhatched bar to the right uses the write-back policy with write allocation. The write-back policy
generally yields better runtime results than the write-through policy for most benchmarks, except
picojpeg and rijndael, as it issues memory write transactions only upon block eviction, reducing the
memory access frequency. Increasing the write buffer size improves runtime for the write-through
policy, as it allows the cache to handle incoming requests without stalling until the write buffer is full.
However, for the write-back policy, a larger write buffer does not enhance the runtime, since it issues
fewer memory write transactions, leaving the write buffer mostly empty.</p>
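      <p>The difference in memory write traffic between the two policies can be illustrated with a minimal Python sketch. This is our own model, not PyCacheGen's implementation, and it considers only stores that hit a single cached block.</p>
      <preformat>
```python
# Minimal sketch: main-memory write transactions caused by a stream of
# stores that all hit the same cached block, under the two write policies.
# Write-through issues one memory write per store; write-back defers all
# writes until the dirty block is evicted.

def memory_writes(policy: str, num_stores_to_block: int,
                  block_evicted: bool) -> int:
    if policy == "write-through":
        return num_stores_to_block          # every store goes to memory
    elif policy == "write-back":
        return 1 if block_evicted else 0    # only the eviction writes back
    raise ValueError(policy)

# 100 stores to one cached block:
print(memory_writes("write-through", 100, block_evicted=False))  # 100
print(memory_writes("write-back", 100, block_evicted=True))      # 1
```
      </preformat>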
      <p>Lastly, Fig. 2 (e) shows runtime results when using a memory with single-cycle read and write
latencies together with 2-way set-associative caches utilizing the write-back policy and no write
buffer with increasing cache sizes. In this case, using a small cache increases the runtime slightly
because of frequent conflict misses. Since each memory transaction is routed through the caches, the
average access latency is increased compared to using no cache. When increasing the cache size, the
runtime almost matches the runtime using no cache across all benchmarks due to a low miss rate.</p>
      <p>Power and Energy After presenting runtime results for varying cache parameters, we now provide
power and energy measurements from post-synthesis simulations of the entire PULPissimo SoC,
including memories and I/O cells, with integrated caches of different sizes compared to no cache.
The memories are configured with single-cycle access latencies, and the generated caches are 2-way
set-associative with a block size of one, using the write-back policy without a write buffer.</p>
      <p>Fig. 3 (top) shows the average power for the PULPissimo SoC with and without integrated caches of
increasing size. A cache size smaller than 2048 B results in lower average power consumption for all
benchmarks due to stalling from frequent cache misses and routing all memory requests through the
cache. However, as the cache size increases to 2048 B or 4096 B, average power consumption exceeds
that of the no-cache scenario due to larger internal cache memory and reduced stalling.</p>
      <p>Fig. 3 (bottom) shows the energy for the PULPissimo SoC with and without integrated cache.
Generally, smaller caches lead to higher energy across all benchmarks. However, a 1024 B cache shows a
0.81% lower energy summed across all benchmarks compared to no cache integration, with a mean
runtime increase of only 0.93%. Notably, for dijkstra with a 1024 B cache, the energy decreases by 5.51%
compared to the no-cache scenario, representing the largest decrease among all benchmarks, with a
runtime increase of 2.04%. These energy improvements are attributed to the reduced number of memory
accesses facilitated by the cache.</p>
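      <p>As a consistency sketch (our own arithmetic, using energy as the product of average power and runtime), the dijkstra numbers above imply that the average power itself dropped by roughly 7%:</p>
      <preformat>
```python
# Consistency sketch for the dijkstra numbers: a 5.51% energy reduction
# combined with a 2.04% runtime increase implies the average power dropped
# by about 7.4%, since energy = average power * runtime.

energy_ratio = 1 - 0.0551     # energy with 1024 B cache / energy without
runtime_ratio = 1 + 0.0204    # runtime with cache / runtime without
power_ratio = energy_ratio / runtime_ratio
print(f"{(1 - power_ratio) * 100:.1f}")  # 7.4
```
      </preformat>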
      <p>Area Lastly, we present post-synthesis gate-level area results. Fig. 4 shows the whole PULPissimo
SoC area, including memories and I/O cells, with no integrated cache and with caches of different sizes
either employing the write-back policy without a write buffer or the write-through policy with a write
buffer of size 8. The CacheWrapper Verilog module generated by PyCacheGen only slightly increases
the area of the PULPissimo SoC, by at most 1.69% when using the write-back policy and 1.56%
when using the write-through policy with a write buffer. For caches with sizes greater than 1024 B,
the area increase when utilizing the write-back policy is larger than the area increase when using
the write-through policy with a write buffer. This can be attributed to the more complex write-back
logic, which scales with the cache size. For a 1024 B cache that implements the write-back policy, which
shows the best energy improvements, the area is only increased by 0.57%.</p>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion</title>
      <p>This paper introduced PyCacheGen, a highly configurable open-source generator for synthesizable
caches. Using the Python-based Amaranth HDL, PyCacheGen generates fully associative, set-associative,
and direct-mapped caches that support both write-back and write-through policies, along with an
optional write buffer and various replacement and allocation strategies. This flexibility provides more
configuration options than existing open-source caches and generators. Our results show that caches
generated by PyCacheGen can be integrated into a SoC platform with only a 0.57% area increase
while achieving energy reductions of up to 5.51%. RTL and post-synthesis simulations indicate that
finding an optimal cache configuration for runtime, power, and energy is highly application-dependent,
highlighting the need for a user-friendly and configurable cache generator.</p>
      <p>Going forward, we plan to integrate a cache coherency protocol such as MESI to support multicore
architectures with multiple caches.</p>
    </sec>
    <sec id="sec-6">
      <title>Acknowledgments</title>
      <p>This work has been funded by the German Federal Ministry of Research, Technology, and Space (BMFTR)
under grant numbers 16ME0129 (Scale4Edge) and 01IS22086H (MANNHEIM-FlexKI).</p>
    </sec>
    <sec id="sec-7">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used LanguageTool to check grammar and spelling
and to paraphrase and reword. After using this tool, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>D. A.</given-names>
            <surname>Patterson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. L.</given-names>
            <surname>Hennessy</surname>
          </string-name>
          ,
          <source>Computer Architecture: A Quantitative Approach, 6th edition</source>
          , Morgan Kaufmann Publishers Inc., San Francisco, CA, USA,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <source>Amaranth HDL</source>
          ,
          <year>2025</year>
          . URL: https://github.com/amaranth-lang/amaranth.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>P. D.</given-names>
            <surname>Schiavone</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rossi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pullini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. Di</given-names>
            <surname>Mauro</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Conti</surname>
          </string-name>
          , L. Benini,
          <article-title>Quentin: an Ultra-Low-Power PULPissimo SoC in 22nm FDX</article-title>
          , in: 2018
          <source>IEEE SOI-3D-Subthreshold Microelectronics Technology Unified Conference (S3S)</source>
          ,
          <year>2018</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          . doi:10.1109/S3S.2018.8640145.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>Pallister</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Hollis</surname>
          </string-name>
          , J. Bennett,
          <source>BEEBS: Open Benchmarks for Energy Measurements on Embedded Platforms</source>
          , arXiv abs/1308.5174 (
          <year>2013</year>
          ). URL: http://arxiv.org/abs/1308.5174.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>P.</given-names>
            <surname>Yiannacouras</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Rose</surname>
          </string-name>
          ,
          <article-title>A Parameterized Automatic Cache Generator for FPGAs</article-title>
          , in:
          <source>Proceedings of the 2003 IEEE International Conference on Field-Programmable Technology (FPT)</source>
          ,
          <year>2003</year>
          , pp.
          <fpage>324</fpage>
          -
          <lpage>327</lpage>
          . doi:10.1109/FPT.2003.1275768.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6] Chair of VLSI Design, Diagnostics and Architecture,
          <source>PoC - Pile of Cores</source>
          ,
          <year>2016</year>
          . URL: https://github.com/VLSI-EDA/PoC.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J. V.</given-names>
            <surname>Roque</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. D.</given-names>
            <surname>Lopes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. P.</given-names>
            <surname>Véstias</surname>
          </string-name>
          , J. T. de Sousa,
          <article-title>IOb-Cache: A High-Performance Configurable Open-Source Cache</article-title>
          ,
          <source>Algorithms</source>
          <volume>14</volume>
          (
          <year>2021</year>
          ). doi:10.3390/a14080218.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>E.</given-names>
            <surname>Dogan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. F.</given-names>
            <surname>Ugurdag</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Guthaus</surname>
          </string-name>
          ,
          <article-title>OpenCache: An Open-Source OpenRAM Based Cache Generator</article-title>
          ,
          <year>2025</year>
          . URL: https://github.com/VLSIDA/OpenCache.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>C.</given-names>
            <surname>Fuguet</surname>
          </string-name>
          ,
          <article-title>HPDcache: Open-Source High-Performance L1 Data Cache for RISC-V Cores</article-title>
          , in:
          <source>Proceedings of the 20th ACM International Conference on Computing Frontiers, CF '23</source>
          , Association for Computing Machinery, New York, NY, USA,
          <year>2023</year>
          , pp.
          <fpage>377</fpage>
          -
          <lpage>378</lpage>
          . doi:10.1145/3587135.3591413.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>A. R.</given-names>
            <surname>Weiss</surname>
          </string-name>
          ,
          <source>Dhrystone Benchmark, History, Analysis, "Scores" and Recommendations</source>
          , White Paper, ECL/LLC (
          <year>2002</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>M. R.</given-names>
            <surname>Guthaus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. E.</given-names>
            <surname>Stine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ataei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Wu</surname>
          </string-name>
          , M. Sarwar,
          <article-title>OpenRAM: An Open-Source Memory Compiler</article-title>
          , in:
          <source>2016 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)</source>
          ,
          <year>2016</year>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          . doi:10.1145/2966986.2980098.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>