<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Towards MRAM Byte-Addressable Persistent Memory in Edge Database Systems</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Luís Meruje Ferreira</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Fábio Coelho</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>José Orlando Pereira</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>INESC TEC, Campus da Faculdade de Engenharia da Universidade do Porto</institution>
          ,
          <addr-line>Rua Dr. Roberto Frias, 4200-465 Porto</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>University of Minho, Campus de Gualtar, Rua da Universidade</institution>
          ,
          <addr-line>4710-057 Braga</addr-line>
          ,
          <country country="PT">Portugal</country>
        </aff>
      </contrib-group>
      <fpage>98</fpage>
      <lpage>111</lpage>
      <abstract>
        <p>There is a growing demand for persistent data in IoT, edge, and similar resource-constrained devices. However, standard FLASH memory-based solutions present performance, energy, and reliability limitations in these applications. We propose MRAM persistent memory as an alternative to FLASH-based storage. Preliminary experimental results show that its performance, power consumption, and reliability in typical database workloads are competitive for resource-constrained devices. This opens up new opportunities, as well as challenges, for small-scale database systems. MRAM is tested for its raw performance and applicability to key-value and relational database systems on resource-constrained devices. Improvements of as much as three orders of magnitude in write performance for key-value systems were observed in comparison to an alternative NAND FLASH-based device.</p>
      </abstract>
      <kwd-group>
        <kwd>MRAM</kwd>
        <kwd>edge databases</kwd>
        <kwd>persistent memory</kwd>
        <kwd>microcontroller</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The following contributions are provided:
• Comparison of MRAM and FLASH storage characteristics - We compare the nominal endurance, energy expenditure, storage capacity, and monetary cost of each type of device as advertised by the corresponding vendors. Moreover, we determine and discuss the advantages and disadvantages of each type of device.
• Comparison of MRAM and FLASH storage performance - We experimentally compare the throughput of each type of device under varying I/O operation sizes, both in terms of their raw performance and their performance under relevant use cases for resource-constrained devices, namely key-value and relational database systems.</p>
      <p>To evaluate MRAM’s capabilities, a new prototype was developed, combining a state-of-the-art MCU with an MRAM memory device. Results show that MRAM is capable of providing full throughput at much smaller I/O operations when compared to FLASH storage, enabling it to provide 3 orders of magnitude better performance in key-value applications. For the case of relational databases, MRAM can forego FLASH-specific mechanisms such as wear-leveling, thus freeing resources that can be used instead by the DataBase Management System’s (DBMS) query engine.</p>
      <p>The rest of the paper is organized as follows: Section 2 provides the necessary background on MCUs and MPUs, FLASH storage, and MRAM persistent memory; Section 3 compares the characteristics of both storage solutions in terms of vendor-provided information; Section 4 details how different data management systems were adapted to work with MRAM memory; Section 5 provides the results of our practical evaluation; and Section 6 discusses the results. Finally, Section 7 draws conclusions and provides possible paths for future work.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Background</title>
      <p>MPUs vs MCUs Microprocessors (MPUs) are processing units including multiple processing cores (e.g., up to 5 cores in recent offerings [15]), with frequencies over 1GHz, usually coupled to a few GB of memory (under 10) and hundreds of GB of storage (e.g., SD cards). MPUs usually serve as intermediaries between cloud and end devices, i.e., IoT or edge gateways [13, 16, 11]. The Raspberry Pi is a popular MPU-based device that is recurrently used by researchers in IoT- and edge-related publications [17, 18]. MPUs, due to their greater amount of resources and support for the appropriate primitives, often support the use of operating systems such as Linux, so developing systems for these devices is very similar to doing so for a commodity server.</p>
      <p>Microcontrollers (MCUs), on the other hand, are limited to up to 1 MB of memory, a few MB of internal FLASH storage, and single-core processing units with frequencies under 0.5GHz. The capabilities of these devices can be extended by adding external memory and storage. They are mainly used as end devices in IoT systems, such as sensors or actuators, and are frequently powered by batteries, so power consumption is of major concern. Furthermore, they do not support full-fledged operating systems, so programming MCUs is a lower-level experience.</p>
      <p>MCUs constitute the first layer of edge databases, often being the data generators of these systems [16, 9, 10, 13]. Historically, this data was mostly offloaded to MPUs or more capable cloud nodes; however, with the increase in the number of MCU devices, concerns about network overload started to arise, as the increasing number of parallel connections to centralized processing servers imposes a large load on network resources [19]. Furthermore, publications have shown that for an MCU, data transmission can consume more energy than local storage and processing [6]. When coupled with the fact that the capabilities of MCUs continue to increase and that, compared to MPU systems, MCUs are more affordable and consume less energy [20], it makes sense to push as many processing and storage tasks as possible towards MCUs.</p>
      <p>FLASH storage FLASH storage is the most common type of storage solution used in both MPU and MCU devices, with the two main types of FLASH used being NOR FLASH and NAND FLASH.
• NOR FLASH. NOR FLASH memory has fast read speeds, but less storage capacity and a higher cost per byte than NAND. As such, NOR FLASH memory is often used to store application code in MCUs, since code tends to be small and fast read speeds mean faster execution times (although developers can use unused NOR space however they want). Overall, NOR FLASH memory tends to be used in predominantly read-intensive workloads.
• NAND FLASH. NAND FLASH memory, on the other hand, because it is cheaper, provides better write performance, and offers greater storage capacity, is the most widely used of the two technologies. It is the underlying technology of most digital storage media, such as SD cards and SSDs.</p>
      <sec id="sec-2-1">
        <p>FLASH storage devices (both NAND and NOR) are organized into blocks. Each block contains a series of pages (e.g., 128 pages), where each page can store, for example, 8KB of data [21].</p>
        <p>Erase operations are the only way in flash memory to convert data bits to 1. Furthermore, the smallest unit that can be erased is a block, affecting multiple pages. Generally, NOR FLASH tends to have much slower erase speeds than NAND FLASH.</p>
        <p>Program (i.e., write) operations are done at the page level, and data bits can only be changed from 1 to 0. This means that a data section can only be written once after each erase operation. Similarly, read operations are also performed at the page level.</p>
        <p>The fact that the lowest unit of control in FLASH storage is a page hinders its overall performance. For example, if an operation affects only a part of a page, the entire page must still be read or written, and the unwanted data will be ignored. In cases where the data to be read or written fits within a single page, but the data happens to be unaligned such that it is split between two pages, both pages must be read or written. Erase operations being performed at the block level restricts write operations. For example, for a write operation to be performed over a page which is not erased, and the data in the remaining pages to be kept, all pages in the block must be erased and rewritten. This is why erase operations are often delayed until multiple pages have been marked for deletion.</p>
        <p>Furthermore, FLASH storage devices support a relatively low number of erase-write cycles per storage block, after which a block can no longer be modified. Therefore, systems must adopt wear-leveling mechanisms, where write operations are carefully spread so that no block is subjected to substantially more write operations than the others. Finally, FLASH storage devices provide asymmetrical performance: accessing random addresses is slower than accessing sequential positions, and write operations are slower than read operations.</p>
      </sec>
      <sec id="sec-2-2">
        <title>MRAM persistent memory</title>
        <p>Magnetoresistive Random Access Memory (MRAM) is a type of persistent memory where data is truly byte-addressable. For the case of the devices showcased here, data is organized into 1- or 2-byte cells, with the possibility for each byte to be read or written independently. Furthermore, data bits can be freely converted between 0 and 1 by write operations, forgoing the need for data to be erased.</p>
        <p>Compared to FLASH storage, MRAM provides better read/write performance, more write cycles per cell, as well as symmetric performance for sequential and random accesses. Furthermore, due to being byte-addressable, reads and writes can reach their maximum throughput even with very small operations, whereas FLASH storage only achieves maximum throughput for operations involving multiple kilobytes of data.</p>
        <p>MRAM presents similar characteristics to 3D XPoint [22], the byte-addressable persistent storage technology on which Intel Optane is based. Contrary to 3D XPoint, however, MRAM chips are available for use with MPUs and MCUs, whereas Intel Optane is only available for more capable computers. Despite MRAM’s technology being available since the 1980s [23], it was only recently that significant advances in performance and chip density have made MRAM attractive for data management applications. It is important to understand how these new MRAM chips compare to current FLASH storage technologies, in order to understand the viability of MRAM as either a replacement, or complement, to the standard FLASH technologies currently in use.</p>
      </sec>
      <sec id="sec-2-3">
        <title>3. MRAM vs FLASH</title>
        <p>To understand the viability of MRAM as an alternative to FLASH storage, four MRAM devices with increasing capacity and read/write performance are compared with NAND FLASH and NOR FLASH devices. The MRAM devices chosen were: (M1) AS3004316 [24], (M2) MR4A16BMA35 [25], (M3) EMxxLx [26] and (M4) EMD4E001GAS2 [27]. The corresponding M1-M4 notations are used for each of these devices to ease referencing during the rest of this Section. As for FLASH storage, MT29F128G08AJAAAWP-ITZ:A [21] was selected to represent NAND FLASH and MT28EW512ABA1HPC-0SIT [28] was chosen to represent NOR FLASH.</p>
        <p>Furthermore, the following characteristics for each device are analyzed:
• Read/Write/Erase throughput - the throughput of a device for read, write, and erase operations. The metric considered was Megabytes per second (MB/s). Note that, as per the discussion in Section 2, erase operations do not apply to MRAM devices.
• Capacity - the amount of data a given device is able to store, in megabits.
• Endurance - the number of writes or erases that a particular data cell can endure before the vendor no longer guarantees correct functioning of the data cell.
• Energy - the amount of energy required to perform a write operation. The metric considered was nanojoules per byte written.
• Cost - the monetary cost of a given device per amount of storage capacity. The metric considered was euros per megabit of storage capacity.</p>
        <p>The values of the characteristics analyzed for each device are presented in Table 1. Values were calculated based on information made available by each device’s datasheet. For performance throughput, the values presented correspond to the maximum nominal values. The energy consumption figures are based on either peak consumption or typical consumption values, depending on the information made available by vendors.</p>
        <p>Performance All considered MRAM devices outperform both FLASH devices in write performance, between 2.42× and 1040×. As for read performance, both FLASH devices are outperformed by the M3 and M4 MRAM devices by a factor of between 1.18× and 11×. Furthermore, NAND FLASH is 479× faster than NOR FLASH when erasing a block of data. Since MRAM can override data without first deleting it, its operations are not affected by erase performance, which also greatly simplifies the management of data being stored on MRAM, when compared to FLASH storage.</p>
        <p>Endurance MRAM supports at least 100000× more operations per cell than FLASH memories, and some devices claim an unlimited number of operations during the lifetime of the chip. As such, there is no need for employing wear-leveling mechanisms, meaning less operational overhead. This also translates into a longer life for the device, making it a better choice for scenarios with high data churn.</p>
        <p>Energy In the case of MRAM, the energy required to write a single byte has an inverse correlation with its throughput performance. All MRAM devices show a lower energy consumption when writing data compared to FLASH devices, requiring 4×-13× less energy compared to NAND FLASH, and 42×-127× less energy compared to NOR FLASH.</p>
        <p>The two major drawbacks of MRAM are capacity and cost.</p>
        <p>Capacity The most capable MRAM device, M4, has a storage capacity of 1000 Megabits, which is 2× the capacity of the NOR FLASH device, but 128× less than the capacity of the NAND FLASH device. Recent advances have achieved multi-Gb capacity in single MRAM chips [29]; however, we have not considered these devices for analysis, as they are not yet widely available for commercial use, with vendors marketing those devices only for space-grade applications. Although this is still significantly less than the hundreds of gigabits that a NAND chip can support, it may be enough for current edge and IoT persistent storage requirements.</p>
        <p>Cost MRAM has a higher cost per MB than NAND and NOR FLASH. The M4 MRAM device (the least expensive per byte) is 4.26× more expensive than the representative NOR FLASH device and 81× more expensive than the NAND FLASH device. However, there is a logarithmic relationship between the capacity of the MRAM chip and its price per megabit, i.e., as the density increases, the price decreases significantly. If the MRAM chip density continues to increase and this relationship is maintained, we can expect the gap between the cost of MRAM and FLASH memory chips to decrease.</p>
      </sec>
      <sec id="sec-2-8">
        <title>4. Data Systems on MRAM</title>
        <p>Three systems were either implemented or adapted to run over MRAM to understand how MRAM memory can impact each of the two use cases previously identified for data storage in resource-constrained devices: key-value stores and relational database systems. Since MRAM works similarly to common volatile Random-Access Memory (RAM), two structures commonly used for in-memory key-value storage were selected: a Linear Probing Hash Table (LPHT) and a Cache-Line Hash Table (CLHT) [30]. Since MRAM is persistent, such data structures can easily be adapted to provide the equivalent of a key-value store. For comparison, RocksDB, a well-established persistent key-value store, was selected as a baseline. Since RocksDB is a more complex system than the selected hash tables, a more capable computation unit was assigned to it, to offset the increased computational overhead (see Section 5).</p>
        <p>For the case of relational databases, we needed a system that could easily be adapted to run on either an MPU or MCU without changing its core functionality, in order to provide a fair comparison. With that objective, SQLite was selected, since portability across different operating systems is guaranteed by its separate OS layer, which allows for custom implementations. Each of these systems interacts with MRAM through a custom driver which supports write and read operations in multiples of 1, 2, 4 or 8 bytes. Below, we detail how each system was adapted to run over MRAM.</p>
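        <p>The driver itself is not listed in the paper; as an illustrative sketch only, its interface might look as follows, with a RAM array standing in for the memory-mapped MRAM region and the names mram_write/mram_read being our own stand-ins. The point to note is that, because MRAM is byte-addressable and overwritable, the driver reduces to splitting a request into 1/2/4/8-byte accesses, with no page buffering or erase step.</p>

```c
#include <stdint.h>
#include <string.h>

/* RAM array standing in for the memory-mapped MRAM window (4 Mb = 512 KB).
 * On the real board this would be a fixed external-bus address range. */
#define MRAM_CAPACITY (512u * 1024u)
static uint8_t mram[MRAM_CAPACITY];

/* Write `len` bytes at `addr`, split into 8/2/1-byte accesses.
 * No erase is needed: MRAM cells can be overwritten freely. */
int mram_write(uint32_t addr, const void *src, uint32_t len) {
    if (addr > MRAM_CAPACITY || len > MRAM_CAPACITY - addr) return -1;
    const uint8_t *p = (const uint8_t *)src;
    while (len >= 8) { memcpy(&mram[addr], p, 8); addr += 8; p += 8; len -= 8; }
    while (len >= 2) { memcpy(&mram[addr], p, 2); addr += 2; p += 2; len -= 2; }
    if (len) mram[addr] = *p;
    return 0;
}

/* Read `len` bytes from `addr` into `dst`. */
int mram_read(uint32_t addr, void *dst, uint32_t len) {
    if (addr > MRAM_CAPACITY || len > MRAM_CAPACITY - addr) return -1;
    memcpy(dst, &mram[addr], len);
    return 0;
}
```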
      </sec>
      <sec id="sec-2-4">
        <p>Update operations, however, would need further mechanisms to ensure crash consistency. As it is unclear from vendor datasheets whether such elementary operations guarantee atomicity, this issue deserves further study.</p>
      </sec>
      <sec id="sec-2-5">
        <p>Cache-Line Hash Table The CLHT [30] is a dynamic hash table that increases its size as more pairs are added. The table consists of a series of buckets, where each bucket contains a set of key-value pairs, a lock, and a pointer to the next bucket. As such, keys are hashed into positions of the hash table, where each position is composed of a linked list of buckets. CLHT supports insert, read, and remove operations. CLHT’s main advantage is the fact that each bucket is sized to fit into a cache line, thus greatly accelerating consecutive operations in the same bucket, a common occurrence both when inserting and when fetching key-value pairs.</p>
        <p>To run a CLHT on MRAM, a series of modifications were applied to the original implementation [31], more specifically to the lock-based version. First, locking was disabled, as the prototype developed only has a single core (see Section 5 for setup details). Although a lock-free version is also provided, that version of CLHT uses snapshotting mechanisms to allow concurrent operation, which incurs computational overhead that is undesirable on an MCU. Secondly, all read and write operations of the hash table on the underlying storage device are redirected through the MRAM driver. Third, a simple custom heap memory area was implemented on MRAM, since the original implementation relied on malloc for space allocation, which caused memory fragmentation when enforcing alignment constraints. By using our own heap implementation, no memory space is wasted. Our heap implementation currently only supports allocating more space; we leave implementing deallocation and defragmentation operations to future work. Finally, the size of the bucket and key-value pair was adjusted to fit the cache line size of the MCU selected to interface with the MRAM device. Each key or value occupies 4 bytes, and a bucket is set to a size of 32 bytes, holding 3 key-value pairs and additional metadata. The rest of the codebase remained unchanged.</p>
        <p>Linear Probing Hash Table The LPHT was implemented from scratch, supporting Insert, Read and Update operations. It separates MRAM’s space into two sections: one for metadata, which keeps track of the occupation state of each key-value slot, and a second for data, which stores the actual key-value pairs. The size of these pairs must be set before the hash table is used, and all key-value pairs share the same size.</p>
        <p>Information on occupied slots is stored in an array of bits, where each bit keeps the occupation state of a key-value pair slot. If the bit is set to one, the slot is occupied; otherwise, it is free.</p>
        <p>• Insert Operation - Insert operations are performed through the put(key,value) command. When the put() command is called, the key is hashed into one of the key-value slots. If the slot is occupied, a try is made for the slot that follows immediately after, and so on, until an empty slot is found. When an empty slot is found, the key-value pair is written into that slot, and then the bit indicating that the slot is occupied is set to 1. If no slot is found, the hash table is full and the insert operation fails.
• Update Operation - Update operations are also performed when the put() command is called. If, during an insert operation, the key is found already stored in the hash table, the corresponding value is replaced with the new one, i.e., update operations replace the old value with a new one.
• Read Operation - Read operations are performed through a get(key) command. Similarly to an insert operation, a read is performed by hashing the key to a slot, and traversing the corresponding and successive slots until either the key is found in an occupied slot, in which case the value is returned; or until an empty slot is found, or all slots are traversed, returning a null value in that case.</p>
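        <p>The LPHT operations described above can be sketched as follows. This is an illustrative reimplementation, not the authors’ code: key and value sizes are fixed to 4 bytes for brevity, RAM arrays stand in for the MRAM metadata and data sections, and the hash function (FNV-1a) is an arbitrary choice. Note the insert ordering: the pair is written first and its occupation bit is set only afterwards, so a crash between the two steps leaves the slot logically free.</p>

```c
#include <stdint.h>
#include <string.h>

#define SLOTS   1024u   /* capacity is fixed before the table is used */
#define KV_SIZE 4u      /* all keys and all values share the same size */

static uint8_t meta[SLOTS / 8];            /* occupation bitmap (metadata section) */
static uint8_t data[SLOTS][2 * KV_SIZE];   /* key-value pairs (data section) */

static int occupied(uint32_t s) { return (meta[s / 8] >> (s % 8)) & 1; }

static uint32_t hash_key(const uint8_t *key) {  /* FNV-1a, an arbitrary choice */
    uint32_t h = 2166136261u;
    for (uint32_t i = 0; i < KV_SIZE; i++) { h ^= key[i]; h *= 16777619u; }
    return h % SLOTS;
}

/* put(key,value): insert, or update if the key is already stored.
 * Returns 0 on success, -1 if the table is full. */
int lpht_put(const uint8_t *key, const uint8_t *val) {
    uint32_t s = hash_key(key);
    for (uint32_t probed = 0; probed < SLOTS; probed++, s = (s + 1) % SLOTS) {
        if (occupied(s)) {
            if (memcmp(data[s], key, KV_SIZE) == 0) {   /* update: replace value */
                memcpy(data[s] + KV_SIZE, val, KV_SIZE);
                return 0;
            }
            continue;                                    /* linear probing */
        }
        memcpy(data[s], key, KV_SIZE);                   /* 1. write the pair   */
        memcpy(data[s] + KV_SIZE, val, KV_SIZE);
        meta[s / 8] |= (uint8_t)(1u << (s % 8));         /* 2. then set the bit */
        return 0;
    }
    return -1;
}

/* get(key): returns a pointer to the value, or NULL if the key is absent. */
const uint8_t *lpht_get(const uint8_t *key) {
    uint32_t s = hash_key(key);
    for (uint32_t probed = 0; probed < SLOTS; probed++, s = (s + 1) % SLOTS) {
        if (!occupied(s)) return NULL;   /* empty slot ends the probe chain */
        if (memcmp(data[s], key, KV_SIZE) == 0) return data[s] + KV_SIZE;
    }
    return NULL;
}
```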
      </sec>
      <sec id="sec-2-6">
        <p>Although not implemented, removing a key-value pair is as simple as flipping the occupation bit corresponding to the affected pair to 0.</p>
        <p>Each operation in the MRAM memory is split into 16-bit or 8-bit operations, which are performed one at a time over the memory. Assuming that these operations are atomic, insert, remove (if implemented), and read operations are crash-consistent, meaning that in case of failure, the hash table would guarantee a consistent state.</p>
        <p>SQLite SQLite is a highly portable embedded relational database. However, it is more commonly used on MPUs, since previous MCUs were not able to run this database system [7]. Even so, with advances in MCU capabilities, and by augmenting an MCU with MRAM, we were able to successfully run SQLite on an STM32 (a popular line of MCUs). To do so, a custom OS portability layer is required [32]. The OS layer establishes how SQLite interacts with the underlying file system and OS calls.</p>
      </sec>
      <sec id="sec-2-7">
        <p>It includes functions for retrieving random values and the current time, as well as functions for opening, reading, writing, and closing files.</p>
        <p>To build the custom OS layer, three components were required: the OS layer implementation itself; LittleFS [33], a file system for MCUs; and the MRAM driver. The MRAM driver performs low-level read and write operations on the MRAM. LittleFS, in turn, provides a lightweight file system that requires only a handful of functions to be implemented, such as writing and reading data to the storage medium. In this case, this functionality is provided to LittleFS through the MRAM driver. Finally, the custom OS layer makes use of LittleFS to implement file operations, while OS functions such as random number generation are implemented using functions provided by native STM32 libraries.</p>
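        <p>How such a file system plugs into the MRAM driver can be illustrated with the following sketch. The function shapes are modeled loosely on LittleFS’s block-device hooks (read, prog, erase), but the simplified signatures and the RAM-backed buffer below are our stand-ins, not the real lfs_config callbacks; they show only that, on byte-addressable MRAM, each hook reduces to a plain copy and erase becomes a no-op.</p>

```c
#include <stdint.h>
#include <string.h>

/* RAM buffer standing in for the MRAM region, presented to the file
 * system as BLOCK_COUNT blocks of BLOCK_SIZE bytes each. */
#define BLOCK_SIZE  4096u
#define BLOCK_COUNT 128u
static uint8_t mram_fs[BLOCK_SIZE * BLOCK_COUNT];

/* Read `size` bytes at offset `off` inside block `block`. */
int bd_read(uint32_t block, uint32_t off, void *buf, uint32_t size) {
    if (block >= BLOCK_COUNT || off + size > BLOCK_SIZE) return -1;
    memcpy(buf, &mram_fs[block * BLOCK_SIZE + off], size);
    return 0;
}

/* Program (write) `size` bytes. On FLASH this would require the block to
 * have been erased first; on MRAM it is a plain overwrite. */
int bd_prog(uint32_t block, uint32_t off, const void *buf, uint32_t size) {
    if (block >= BLOCK_COUNT || off + size > BLOCK_SIZE) return -1;
    memcpy(&mram_fs[block * BLOCK_SIZE + off], buf, size);
    return 0;
}

/* Erase a block. MRAM needs no erasing, so the hook is a no-op, kept
 * only because block-oriented file-system APIs expect it. */
int bd_erase(uint32_t block) {
    return block < BLOCK_COUNT ? 0 : -1;
}
```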
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Experiments</title>
      <p>We perform a series of experiments to assess the viability of MRAM as a suitable alternative, or complement, to current FLASH-based storage. First, the raw performance of the considered devices is evaluated, and then their performance is compared under key-value and relational database scenarios. For this purpose, a custom circuit board was designed and produced to interface the STM32 with the MRAM device.</p>
      <sec id="sec-3-1">
        <p>For the experimental setup, two devices were used: an STM32 MCU with MRAM memory, and an MPU, more specifically a Raspberry Pi 3B, with an SD card as its storage medium (i.e., NAND FLASH storage). The main characteristics of each are described in Table 2.</p>
        <p>The STM32H743ZI microcontroller (MCU) [34] is a single-core, 32-bit, 480MHz processing unit that comes with 2MB of NOR FLASH memory and 1MB of RAM memory. The MCU connects to an AS3004316 MRAM memory [24], with 4Mb of storage capacity and 35ns access time for both read and write operations of either 8 or 16 bits. This MCU has 16 Kilobytes of L1 cache for instructions, and 16 Kilobytes of L1 cache for data. By default, both caches are disabled. For the tests depicted here, the instruction cache is always enabled; however, the data cache is set depending on the test being run. Whenever the data cache is used, it is set as write-through, so that any write to the cache is immediately persisted to MRAM memory.</p>
        <p>The Raspberry Pi 3B is driven by a 64-bit BCM2837 microprocessor (MPU), boasting 4 cores at 1.2GHz. It has 1GB of RAM memory, and uses a SanDisk Extreme SD Card with 32GB of storage capacity.</p>
        <p>Notice that the MRAM uses between 10×-100× less energy than the SD Card (estimations based on [36, 37] and on [38, 39, 37]), and that the Raspberry Pi has considerably more computational power and memory resources than the STM32H743ZI MCU.</p>
        <p>For easy reference, the names STM32 (as well as MRAM) and RPi (or one of NAND FLASH or SD Card setup) are used throughout this section to describe the MCU and the MPU based setups, respectively.</p>
        <p>It is possible to interface both NAND and NOR FLASH, as well as MRAM, with both MPUs and MCUs. However, this specific setup was selected as it was the option with the greatest potential for success, given that a custom circuit board had to be designed and produced.</p>
        <p>5.1. Raw performance evaluation The read and write throughput capabilities of the storage mediums in each device are evaluated, both in sequential and random access scenarios. Furthermore, the relation between I/O block size and throughput performance is evaluated.</p>
        <p>Testing methodology For MRAM, a random string with length equal to the desired operation size was generated and written to the device, either to sequential or random addresses. As for reads, blocks of data of the desired size were read, from random or sequential addresses. The addresses were selected before the test was run. In the case of random addresses, duplicates are allowed, so a particular location may be overwritten multiple times. As for sequential addresses, if the maximum address is reached, operations wrap around to the initial address. All tests run until 500MB are read or written. The STM32’s L1 data cache is disabled for this test. In the case of the SD Card, fio, an open-source I/O tester [40], was used. Each test runs for 20 seconds, with a ramp-up time of 2 seconds. We chose the following settings for fio: the engine chosen was libaio; iodepth is set to 20; the direct option is set to 1; and there is only 1 job running at a time. Results were averaged over 5 independent runs. The direct option only allows operation sizes equal to or greater than the page size of the device, so operation sizes for the SD Card start at 512 bytes.</p>
        <p>[Figure 1: sequential and random read/write throughput of MRAM and of the RPi’s NAND FLASH, as a function of operation size in bytes (log scale up to 2^26).]</p>
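        <p>The sequential-address wrap-around used in the methodology can be sketched as a small helper; the constant and function name are our own, assuming the 4 Mb (512 KB) AS3004316 device:</p>

```c
#include <stdint.h>

#define MRAM_CAPACITY (512u * 1024u)  /* 4 Mb = 512 KB */

/* Next sequential test address: advance by the operation size, and wrap
 * around to the start once the next operation would run past the end of
 * the device, as in the methodology above. */
uint32_t next_seq_addr(uint32_t addr, uint32_t op_size) {
    addr += op_size;
    if (addr + op_size > MRAM_CAPACITY)
        addr = 0;
    return addr;
}
```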
        <p>Figure 1 shows the performance of both MRAM and the RPi’s NAND FLASH under read/write sequential/random workloads with varying request sizes. MRAM is able to achieve its maximum throughput with I/O blocks as small as 4 bytes for writes, and 512 bytes for reads, due to its byte-addressability, and it maintains that level of throughput as the block size increases. Maximum speeds of 34MB/s for writes and 29MB/s for reads were observed. Both random and sequential read/write patterns presented identical performance (notice the overlapping lines in Figure 1). The SD card achieved 22MB/s for reads, and 26MB/s for writes, at block sizes of multiple kilobytes. Furthermore, random accesses present lower performance than sequential accesses. In conclusion, the MRAM device is able to provide higher throughput than the SD Card storage on the Raspberry Pi for all block sizes, especially at I/O operation sizes under 4KB. We confirm that, in the case of MRAM, random or sequential accesses have no impact on performance, with the results for both types of accesses being almost exactly the same. However, we notice that there is a difference in performance between write and read operations, with write operations outperforming read operations. We leave determining the cause for this discrepancy to future work.</p>
        <p>In the case of the SD card, we note that the write speed is also higher than the read speed, which is uncommon for FLASH storage. This is, however, in line with the results found in previous tests of SD card performance with Raspberry Pis [41].</p>
        <sec id="sec-3-1-1">
          <title>5.2. Impact on key-value systems</title>
          <p>Key-value stores are one of the identified use cases for data management systems in IoT- and edge-related settings, where resource-constrained devices are used to store and process data, so the impact of using MRAM on key-value systems is evaluated. In this experiment, I/O operations of varying sizes are executed over different key-value systems. The objective of the experiment is to evaluate how the previously identified advantage in raw performance affects these systems. Both an LPHT and a CLHT, a hash table previously adapted to work with Intel Optane [42], are implemented on the STM32 over MRAM. Since data stored in MRAM is persistent, both hash tables provide a similar service to a persistent key-value store, although with less functionality. We compare them with RocksDB, a popular persistent key-value store, running on the RPi. We run RocksDB both with and without fsync, a configuration which, when turned on, guarantees persistence for each write operation. Single- and multi-threaded execution is also considered for the case of RocksDB. We acknowledge that RocksDB is a more complex system than a simple hash table, but the RPi’s MPU gives it a significant computational advantage over the hash tables running on the STM32. We also include results without fsync, giving RocksDB the advantage of not having to persist its write-ahead log (WAL) on every single write operation.</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <p>Testing methodology For the key-value scenarios, a series of string arrays was generated separately, in order to ensure that the operations submitted to each of the evaluated systems are identical, and that the data generation process does not affect performance estimation. Datasets composed of arrays of randomly generated 2-, 4-, 8-, 16-, 32-, 64-, 128-, and 256-byte strings were built. String deduplication was not performed, making it possible to have multiple put operations for the same key. The size of each dataset is equal to roughly 50% of the storage capacity of the MRAM memory (i.e., 2Mb). For the 2-, 4-, 8-, 16-, 32-, 64-, 128-, and 256-byte datasets, 65536, 32768, 16384, 8192, 4096, 2048, 1024, and 512 entries were generated, respectively. For each byte size, 5 different arrays were generated.</p>
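        <p>The entry counts follow directly from the 50% sizing rule: each entry is later stored as a key-value pair holding the string twice (key and value), so for a string size s the dataset contains (2 Mb in bytes) / (2 × s) entries. A quick check of this arithmetic (the helper name is ours):</p>

```c
#include <stdint.h>

/* Entries per dataset: half of the 4 Mb (512 KB) MRAM capacity, divided
 * by the footprint of one pair (the string is used as both key and
 * value, so a pair occupies 2 * string_size bytes). */
uint32_t dataset_entries(uint32_t string_size) {
    const uint32_t budget = (512u * 1024u) / 2u;  /* 2 Mb = 256 KB */
    return budget / (2u * string_size);
}
```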
        <p>In the case of RocksDB versus LPHT (Section 5.2.1)
the write experiment progresses as follows. One of the 5
datasets with 2 byte strings is selected. For each string A
in the dataset, an operation of the type put(A,A) is
performed, using the same string for both the key and value
fields. The performance of each system is then averaged
over the 5 different datasets for the same byte size. The
same procedure is followed for the remaining byte sizes,
and a similar procedure is followed for the read
workload, but with get(A) instead of put(A,A) operations. For
the case of RocksDB, different combinations of fsync (on
or off) and number of client threads are tested, as they
have a significant impact on system performance.</p>
        <p>[Figure 2 legend: MRAM Hashmap, RocksDB-nofsync-1thread,
RocksDB-fsync-1thread, RocksDB-nofsync-6threads,
RocksDB-fsync-6threads (write); MRAM Hashmap, RocksDB-1thread,
RocksDB-6threads (read); x-axis: key/value size (bytes), 2^1 to 2^8.]</p>
        <p>When
multiple client threads are used, the elements of each
dataset are split as equally as possible amongst them.
Since RocksDB with fsync turned on performs
significantly more slowly, tests targeting this setup are limited to
5000 put operations per data set.</p>
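<p>"Split as equally as possible" can be made precise: each of the T client threads receives &#x230A;N/T&#x230B; elements, and the remainder is handed out one element at a time. A small helper sketch (our own, for illustration):</p>

```c
/* Number of dataset elements assigned to 0-based thread t when n
 * elements are split as evenly as possible across nthreads. */
static unsigned share(unsigned n, unsigned nthreads, unsigned t) {
    return n / nthreads + (t < n % nthreads ? 1u : 0u);
}
```

<p>For the 6-thread RocksDB runs over the 4-byte dataset (32768 entries), two threads receive 5462 elements and four receive 5461.</p>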
        <p>In the case of RocksDB versus CLHT (Section 5.2.2),
the key and value field sizes are fixed to 4 bytes each, so
only the 4 byte data sets are used, since the size of buckets
must align with the size of a cache line. Furthermore,
each run is fixed to 20000 put operations, due to the added
space occupied by CLHT’s additional structures. Similar
to LPHT, CLHT is initialized with space to fit 2 times the
amount of data that is inserted in each test.</p>
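<p>The cache-line constraint mentioned above is the reason the key and value sizes are pinned: CLHT packs a bucket's lock word, keys, and values into a single cache line so that probing a bucket touches exactly one line. A hedged sketch of such a bucket for a 32-byte line (the STM32H7's L1 line size; the original x86 CLHT targets 64-byte lines with 8-byte fields, so the field counts here are illustrative):</p>

```c
#include <stdint.h>

#define CACHE_LINE 32u

/* One bucket filling exactly one 32-byte L1 cache line: a lock word
 * (unused in the single-threaded MCU port), three 4-byte keys, three
 * 4-byte values, and an overflow link. */
typedef struct {
    uint32_t lock;
    uint32_t key[3];
    uint32_t val[3];
    uint32_t next;   /* offset of an overflow bucket, 0 if none */
} __attribute__((aligned(CACHE_LINE))) clht_bucket;

_Static_assert(sizeof(clht_bucket) == CACHE_LINE,
               "bucket must fill one cache line exactly");
```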
        <p>For both LPHT and CLHT, at the end of each run, a
consistency check is performed, where each of the stored
values is retrieved from the Hash Table, and checked for
correctness. We highlight that for the specific case of
LPHT, we observed up to 0.002% of pairs missing
from the table when checking for consistency in the scope of
a run. We consider this to be due to a problem with our
circuit board design for the MRAM memory chip, since
slowing down the speed of the memory eliminates these
errors.</p>
        <p>Data L1 cache was enabled for all tests involving
key-value systems, with the cache policy set to write-through.
Since the experiments resulted in
disparities of multiple orders of magnitude, a logarithmic scale
is used for the y-axis of all diagrams, which represent
operations per second.
5.2.1. LPHT vs RocksDB
Figure 2 compares the number of operations per second
that RocksDB and the LPHT are able to perform, when
the size of a single key and corresponding value increases.
The size represented on the horizontal axis, in bytes,
corresponds to the size of a single key, or a single value.
This experiment’s conclusion is that the MRAM setup
outperforms the NAND FLASH alternative in almost all
scenarios. For write operations (left side of Figure 2),
MRAM outperforms all RocksDB setups. However, as
the key/value size increases, the difference between the
STM32 setup and the RocksDB setups where fsync is
turned off shrinks. At a key/value size of 4 bytes, MRAM
is able to perform 35× more operations per second than
RocksDB with 1 thread and no fsync. But, when the size
is increased to 256 bytes, the ratio between the two is
only 1.4× (still in favor of MRAM).</p>
        <p>The LPHT running on MRAM memory guarantees
persistence on each write operation, so the RocksDB
setups that more closely resemble it are the ones where
fsync is enforced. When guaranteeing persistence at each
operation, the multithreaded RocksDB setup is vastly
outperformed by the STM32’s Hash Table, with LPHT
performing between 134× and 3837× more operations
per second.</p>
        <p>For the case of read operations (right side of Figure 2),
LPHT is able to outperform the multithreaded RocksDB
for key/value sizes under 32 bytes, performing between
1.64× and 6.69× more operations per second. For the
case of the single threaded RocksDB, the Hash Table is
able to outperform RocksDB for key/value sizes under 128
bytes, with up to 20× more read operations per second.</p>
        <p>We conclude that the raw performance advantage of
MRAM over NAND Flash translates into a significant
advantage in key-value systems, especially for smaller
key-value sizes. For this use case, trading computational
power for storage performance is the correct approach,
indicating that the main bottleneck of these systems is
indeed the FLASH storage device.</p>
        <p>It is interesting to note that RocksDB's performance for
keys/values with 2 bytes is significantly better than for
the remaining sizes. The most likely reason for this is the
big number of duplicate values present in the dataset used
for this particular experiment, which allows RocksDB to
easily keep all values in its block cache. We also note that
for key/value sizes above the previously stated, RocksDB
is able to perform more read operations per second than
LPHT.</p>
        <p>In both cases, the performance of LPHT declines at
a faster rate than RocksDB as key/value size increases.
This can be due to the fact that as keys get bigger, the
computational effort to compute their hash value increases.
Since the STM32 has less computational power than the
RPi, this effect will be more noticeable.
5.2.2. CLHT vs RocksDB
Figure 3 shows how CLHT compares to different
configurations of RocksDB while inserting key/value pairs
with 4 bytes each. When compared to RocksDB running
without the fsync option and a single thread (the best
scenario for the no fsync configuration), CLHT is able
to perform 11× more write operations per second.
Compared to RocksDB's best scenario with fsync being
enforced, where RocksDB uses 6 threads, which is also the
scenario that most closely resembles the persistence that
MRAM provides with each write, CLHT is able to
perform 1827× more write operations per second. In terms
of reads, MRAM outperforms RocksDB with 6 threads by
9×.</p>
        <p>[Figure 3: Comparison between CLHT running on MRAM and
RocksDB running on NAND FLASH in RPi. Key/value size: 4 bytes;
series: CLHT on MRAM, RocksDB-nofsync-1 thread,
RocksDB-nofsync-6 threads, RocksDB-fsync-1 thread,
RocksDB-fsync-6 threads.]</p>
      </sec>
      <sec id="sec-3-3">
        <title>5.3. Impact on relational database system</title>
        <p>Finally, on a more complex scenario, SQLite's
performance is compared when running on the STM32 over
MRAM, and on the RPi. This allows for a comparison
of the same exact system across the two platforms. The
results of running SQLite on the STM32 with a custom
OS layer are compared against SQLite running on the RPi
with NAND FLASH (using the default UNIX OS layer).</p>
      </sec>
      <sec id="sec-3-4">
        <title>Testing methodology</title>
        <p>For SQLite, a schema consisting
of a single table representing a sensor is used. The
table consists of four columns of the integer type:
timestamp, device_id, zone, and pressure. Each insert operation
inserts a new record which increments the timestamp
of the previously inserted record, and generates random
values for the remaining columns. Each run reads or
writes a total of 5000 rows, but the number of inserted
values or selected rows per transaction varies. For
example, in the first test 5000 transactions are executed, with
a single read or write operation being performed in each
transaction. For the last test, however, only 50
transactions are performed, with 100 values being selected or
inserted in each transaction. Results depict an average
of five independent runs, and all SQLite files are deleted
between runs. SQL queries are generated prior to the
test, so that throughput estimation is not affected by the
time spent generating those queries. STM32's L1 data
cache is enabled for all SQLite experiments, enforcing a
write-through policy.</p>
        <p>[Figure 4: Comparison between SQLite running on STM32's
MRAM and RPi's NAND FLASH. Series: STM32-MRAM (insert),
STM32-MRAM (read), RPi-NAND (insert), RPi-NAND (read);
x-axis: rows/transaction (0–100); y-axis: rows/s, log scale.]</p>
        <p>Figure 4 shows the number of rows either inserted
or read per second, in relation to the number of rows
affected by a single transaction. Unlike in the case of
key-value stores, it is not enough to perform a direct swap of
the storage medium from NAND Flash to MRAM for the
STM32 to outperform the MPU-based device (RPi) in a relational
database scenario. That is because relational databases
impose a greater computational overhead, thus giving the
advantage to the more capable RPi. Even so, the greater
performance of MRAM for small write operations
enables the STM32 to achieve a performance that is close to
that of the RPi for insert transactions affecting very few
rows. For the experiment where each write transaction
performs only two insert operations, the RPi outperforms
the STM32 by only 1.02×. As the number of insert
statements per transaction increases, SQLite performs bigger
I/O operations, which decreases the performance gap of
the two storage media, and allows the RPi to perform
up to 1.48× more insert operations per second than the
STM32. In the case of select operations, the RPi is able
to read around 2.21× more rows per second than the
STM32 across all types of transactions. We conclude
that for relational databases MRAM can assist an MCU
to achieve a performance similar to an MPU's for basic
operations while consuming less energy and having fewer
computational resources. However, a direct substitution
of the storage media is not sufficient for the MCU to
outperform the MPU. One possible way to further improve
performance for SQLite in the MCU would be to shed
the additional computational overhead that is imposed
by the FLASH oriented mechanisms, such as the
wear-leveling mechanism in LittleFS, and the Write-Ahead Log
in SQLite.</p>
        <p>6. Discussion
The first conclusion to draw from this work is that MRAM
provides a big advantage in small I/O operations. MRAM
adoption can be particularly interesting for key-value
applications, such as edge Time-Series Databases (TSDBs)
and Key-Value Stores, which often handle small
key/value pairs [14]. Furthermore, MRAM can provide
strong consistency guarantees, since all write operations
are immediately persisted. As depicted in Figure 2, the
impact of using fsync (i.e., persisting every write) with
FLASH memory is significant. Thus, critical applications
in sensor networks, i.e., smart health care or industrial
IoT, might benefit from considering this technology.</p>
        <p>MRAM imposes less computational overhead on
systems, as it does not require the wear-leveling, batching, or
sequential ordering mechanisms which are often used by
FLASH based systems. This opens the way to lowering
systems' complexity when using MRAM.
This can be of special importance for relational database
systems in resource constrained devices. In such
computationally limited devices, MRAM allows forgoing FLASH
focused mechanisms, freeing computational capacity that
can instead be used by the DBMS's query engine. This
can help support the ongoing effort to enable more
features in MCU relational databases, since current options
have to severely limit the number of supported features
in order to fit available resources [7, 6]. Furthermore,
these MCUs provide additional resources such as Direct
Memory Access (DMA) controllers, which enable data to
be moved between storage devices without CPU
intervention, and dedicated hashing controllers, which calculate
hash values likewise without CPU intervention; both can be
explored to further increase database performance while
putting less load on the CPU. In the case of key-value
systems, MRAM could enable more functionality to be
shifted from MPU devices to MCU devices, while still
improving MCU battery lifetime. For example, in the
case of wearable sensors, data has to be uploaded in its
entirety to a more capable MPU to calculate statistics on
the gathered data (e.g., [11, 43]) due to lack of CPU power,
which coincidentally increases the amount of data
transmitted, increasing the rate at which the sensors' battery
is drained. By consuming a lower amount of
computational capacity, MRAM can allow the MCU to make those
calculations locally, thus only transmitting the already
processed data. This data will be smaller, and allow the
MCU to conserve more energy by moving the load on the
MCU towards computation in place of data transmission.
MRAM will be specifically appealing in scenarios where
key-values are small (i.e., small write operations), the
most common occurrence in key-value systems [14], and
where said data needs to be persisted. This may be a
requirement for critical systems such as those involving
medical scenarios or public services management (e.g.,
smart grid applications).</p>
        <p>Both solutions share the same price bracket; however,
in our approach CPU is traded for memory performance.
We believe this to be the correct choice for the case of
edge databases, since storage is the primary bottleneck.
However, we must take into account that MRAM has a
significantly lower storage capacity per chip (up to 8Gb
per chip [44]) and a greater price per space unit. In total,
the STM32 used in this work could directly support up
to 512Mb of MRAM memory. As such, the main
contribution given by MRAM to edge systems, at the moment,
is not in storage capacity, which is the case for FLASH,
but rather in performance, energy expenditure, and
endurance. As such, a hybrid approach could provide the
best of both worlds (i.e., MRAM and FLASH). MRAM
could be combined with more conventional FLASH
storage, e.g., an SD Card, to achieve both better performance
and durability, while still ensuring a large amount of
storage space. With the perspective of decreasing prices
(see Section 3), MRAM-only storage may also be a
possibility in the future. MRAM memory may also pave the
way to instant recoverability if used as an alternative
to non-persistent program memory. Energy-wise, the
considered MRAM setup has a power profile 10× smaller
when compared with the NAND FLASH, which provides
a positive impact for edge applications.</p>
        <p>We hypothesize two use cases for MRAM use, to
better clarify how this technology can benefit edge data
management systems.</p>
        <p>Relational database use case - Picture a scenario
where each sensor runs its own relational database over
FLASH (e.g., [6]). At any given moment, a sensor may
be queried for its data; however, it is limited to only a
few operations, such as select, update, delete and insert
operations, or simple join operations. More complex
operations, such as nested queries, are not supported, due
to a lack of CPU power which would make the time to
complete the query unacceptable. Thus, the client must
issue only the innermost select query, and process the
received data locally, possibly requiring further queries
to complete the original query. This means that more
data will be transmitted to the client than the data needed
to answer the original query, therefore more energy will
be used by the MCU.</p>
        <p>Now, replace the storage device with either MRAM only,
or a hybrid MRAM and FLASH solution. MRAM having a
lower management complexity frees up part of the
computational budget, which can now be used by the query
engine to support faster processing. Furthermore, faster
performance means less I/O waiting time, equating to
fewer unused processor cycles. With the extra
computational budget attributed to the query engine we are now
able to support nested queries. By executing the entire
query in one go, only the minimum required amount of
data is transmitted to the client, optimizing the amount
of energy used.</p>
        <p>Key-value use case - Picture a scenario where a
patient wears an MCU based and battery powered sensor
that takes heart related measurements. Storing and
processing the data locally using FLASH storage would be
too computationally expensive for the MCU, so instead
those measurements are transmitted in raw form to a
more capable MPU, where ECG data is extracted from
the raw data. This transmission of data drains the MCU's
battery, requiring frequent recharging of the medical
sensor device. If instead MRAM storage was used, the MCU
could potentially have enough processing power left to
extract the ECG data locally, and only relay relevant
information to the MPU, extending the operational lifetime of
the charge cycle.</p>
        <p>The conducted experiments used an M1 MRAM device
(Table 1), as it allowed us to create a prototype in a shorter
time frame. Employing faster M3 or M4 devices could
potentially increase the observed performance, which we
reserve for future work.</p>
        <p>Similar to MRAM, there are a series of other persistent
memory technologies which can be considered for use
with database systems. We consider the comparison of
MRAM against other types of persistent memory to be
outside the scope of this work, but we encourage
interested parties to check related work which provides
that analysis [45]. As for how previous work with the
popular Intel Optane persistent memory can be applied
to MRAM, we believe there are multiple reasons why
such work may not be applicable here. The Intel Optane
line is composed of more complex devices which
comprise multiple data storage chips, with non-persistent
caching mechanisms and capability for concurrent
operations. Related work in Intel Optane enabled key-value
stores, for example, focuses on providing consistency
guarantees given non-persistent write operations (i.e.,
involving caching) and maximizing concurrency related
performance [14, 42]. Some optimizations are also based
on optimizing the use of the libraries provided for Intel
Optane access. In contrast, databases for MCUs, as
analyzed here, have a single execution thread. Furthermore,
the targeted MRAM device does not support concurrent
operations and does not provide caching mechanisms.
The MRAM memory is accessed in the same way as
normal memory: through a pointer to a particular address
which is mapped to a location in the MRAM memory.
As such, the set of problems for systems targeting Intel
Optane is not the same as for MRAM systems.</p>
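<p>The pointer-based access model described above can be made concrete with a short sketch. The base address below is a placeholder (the real one depends on how the STM32's external memory controller maps the chip), and a RAM buffer stands in for the device so the snippet is host-runnable:</p>

```c
#include <stdint.h>

/* On the STM32 this would be the external-memory bank the MRAM is
 * wired to, e.g. (volatile uint16_t *)0x60000000u; that address is
 * illustrative, and here a plain RAM buffer stands in for the
 * mapped device. */
static uint16_t fake_mram[256];
#define MRAM ((volatile uint16_t *)fake_mram)

/* A store to the mapped region IS the durable write: there is no
 * page program, erase, flush, or fsync step, unlike with FLASH. */
static void mram_store(unsigned word, uint16_t v) { MRAM[word] = v; }
static uint16_t mram_load(unsigned word) { return MRAM[word]; }
```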
      </sec>
      <sec id="sec-3-5">
        <title>7. Conclusion</title>
        <p>Research into the use of persistent byte-addressable
memory for database systems has been focused on
data center-scale applications, namely supported by Intel
Optane products [46, 47]. Results show, however, that
byte-addressable persistent memory should also be explored for
use in resource constrained data management systems.
This paper shows that MRAM provides several
advantages over NAND FLASH alternatives. At the hardware
level, MRAM enables 5 orders of magnitude more write
operations per cell, thus making it practically
impervious to cell wear-out. Furthermore, random and
sequential accesses have identical performance, and maximum
throughput is achieved with writes as small as 4 bytes,
and reads of 512 bytes.</p>
        <p>MRAM shows a throughput advantage on all I/O block
sizes when compared to FLASH, particularly for block
sizes under 32KB. This was observed in the Raw
Performance tests, but also in the Hash Table tests, despite
the latter being a more complex workload, with the exception
that for key/value sizes greater than 32 bytes, RocksDB
on the NAND Flash alternative outperforms
MRAM's LPHT. The relational database test with SQLite
showed that although MRAM can help MCUs reach a
performance close to that of an MPU for a relational database,
a direct replacement of NAND FLASH with MRAM is not
sufficient for the MCU to outperform the MPU.
However, MRAM allows many of the mechanisms that are
currently used to accommodate FLASH to be avoided,
opening the way for new architectures, directed specifically
at MRAM, that could outperform MPUs.</p>
        <p>In a nutshell, MRAM presents a big advantage over
NAND FLASH in small I/O operations, being able to
achieve full throughput at operation sizes of just a few
bytes. Furthermore, performance is not affected by
random access patterns. The virtually infinite endurance of
MRAM memory avoids the need for any wear-leveling
mechanisms, and its low power consumption contributes
to extend the lifetime of battery powered MCUs. Nominal
values also point to MRAM being able to achieve a
significantly higher peak throughput than FLASH storage.</p>
        <p>Thus, MRAM can allow for systems which are simpler to
implement, have higher performance, and consume less
energy.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Acknowledgements</title>
      <sec id="sec-4-1">
        <p>This project has received funding from the European
Union's Horizon 2020 research and innovation programme
under grant agreement No 857237. The sole
responsibility for the content of this publication lies with the
authors. It does not necessarily reflect the opinion of
the European Commission (EC). The EC is not
responsible for any use that may be made of the information
contained therein. It is also funded by National Funds
through the FCT — Fundação para a Ciência e a
Tecnologia (Portuguese Foundation for Science and Technology)
PhD grant (PD/BD/151402/2021).</p>
        <p>spruj40c.pdf?ts=1687780680209, rev. C.</p>
        <p>[16] M. Li, D. Ganesan, P. Shenoy, Presto: Feedback-driven data management in sensor networks, IEEE/ACM Transactions on Networking 17 (2009) 1256–1269.</p>
        <p>[17] C. Wang, X. Huang, J. Qiao, T. Jiang, L. Rui, J. Zhang, R. Kang, J. Feinauer, K. McGrail, P. Wang, et al., Apache IoTDB: Time-series database for internet of things, Proc. VLDB Endow. 13 (2020) 2901–2904.</p>
        <p>[18] F. Wu, C. Qiu, T. Wu, M. Yuce, Edge-based hybrid system implementation for long-range safety and healthcare IoT applications, IEEE Internet of Things Journal 8 (2021) 9970–9980.</p>
        <p>[19] S. Alamouti, F. Arjomandi, M. Burger, Hybrid edge cloud: A pragmatic approach for decentralized cloud computing, IEEE Communications Magazine 60 (2022) 16–29.</p>
        <p>[20] AWS, What's the difference between microprocessors and microcontrollers?, 2023. URL: https://aws.amazon.com/pt/compare/the-difference-between-microprocessors-microcontrollers/.</p>
        <p>[21] M. Technology, MT29F128G08AJAAAWP-ITZ:A, NAND flash memory, rev. H, 2014. URL: https://pt.mouser.com/datasheet/2/671/micron_technology_micts06235-1-1759187.pdf.</p>
        <p>[22] Intel, 3D XPoint™: A breakthrough in non-volatile memory technology, 2015. URL: https://www.intel.com/content/www/us/en/architecture-and-technology/intel-micron-3d-xpoint-webcast.html.</p>
        <p>[23] J. Heidecker, MRAM Technology Status, Technical Report, NASA, 2013.</p>
        <p>[24] A. Technology, AS3004316, parallel persistent SRAM memory, rev. T, 2022. URL: https://pt.mouser.com/datasheet/2/1122/1Mb_32Mb_Parallel_x16_MRAM_2-1949428.pdf.</p>
        <p>[25] E. Technologies, MR4A16B, rev. 11.7, 2018. URL: https://pt.mouser.com/datasheet/2/144/MR4A16B_Datasheet-1511254.pdf.</p>
        <p>[26] E. Technologies, EMxxLX, expanded serial peripheral interface (xSPI) industrial STT-MRAM persistent memory, rev. 2.9, 2022. URL: https://www.everspin.com/supportdocs/all.</p>
        <p>[27] E. Technologies, EMD4E001GAS2, 1Gb non-volatile ST-DDR4 spin-transfer torque MRAM, rev. 1.2, 2020. URL: https://www.mouser.com/datasheet/2/144/EMD4E001GAS2_1_2_08252020-1923803.pdf.</p>
        <p>[28] M. Technology, MT28EW512ABA1HPC-0SIT TR, parallel NOR flash embedded memory, rev. I, 2018. URL: https://media-www.micron.com/-/media/client/global/documents/products/data-sheet/nor-flash/parallel/mt28ew_mt28fw/mt28ew_qlkp_512_aba_0sit.pdf.</p>
        <p>[29] A. Technology, Avalanche technology - products - space grade, 2023. URL: https://www.avalanche-technology.com/products/discrete-mram/space.</p>
        <p>[30] T. David, R. Guerraoui, V. Trigonakis, Asynchronized concurrency: The secret to scaling concurrent search data structures, ACM SIGARCH Computer Architecture News 43 (2015) 631–644.</p>
        <p>[31] E. Distributed Computing Laboratory, CLHT, 2013. URL: https://github.com/LPD-EPFL/CLHT.</p>
        <p>[32] SQLite, The SQLite OS interface or "VFS", 2023. URL: https://www.sqlite.org/vfs.html.</p>
        <p>[33] littlefs project, littlefs, 2023. URL: https://github.com/littlefs-project/littlefs.</p>
        <p>[34] STMicroelectronics, STM32H743ZI, 2023. URL: https://www.st.com/en/microcontrollers-microprocessors/stm32h743zi.html.</p>
        <p>[35] SanDisk, SanDisk Extreme® microSDXC™ UHS-I card, 2023. URL: https://www.westerndigital.com/products/memory-cards/sandisk-extreme-uhs-i-microsd#SDSQXAF-032G-GN6MA.</p>
        <p>[36] G. P. Perrucci, F. H. P. Fitzek, J. Widmer, Survey on energy consumption entities on the smartphone platform, in: 2011 IEEE 73rd Vehicular Technology Conference (VTC Spring), 2011, pp. 1–6. doi:10.1109/VETECS.2011.5956528.</p>
        <p>[37] SanDisk, SanDisk® industrial microSD card datasheet, 2016. URL: https://images-na.ssl-images-amazon.com/images/I/91tTtUMDM3L.pdf.</p>
        <p>[38] S. Crawford, How secure digital memory cards work, 2011. URL: https://computer.howstuffworks.com/secure-digital-memory-cards.htm.</p>
        <p>[39] Samsung, MicroSD PRO Endurance, 2023. URL: https://semiconductor.samsung.com/consumer-storage/memory-card/micro-sd-pro-endurance/.</p>
        <p>[40] J. Axboe, fio - flexible I/O tester, 2014. URL: https://github.com/axboe/fio.</p>
        <p>[41] A. Piltch, Best microSD cards for Raspberry Pi 2023, 2023. URL: https://www.tomshardware.com/best-picks/raspberry-pi-microsd-cards.</p>
        <p>[42] S. K. Lee, J. Mohan, S. Kashyap, T. Kim, V. Chidambaram, Recipe: Converting concurrent DRAM indexes to persistent-memory indexes, in: Proceedings of the 27th ACM Symposium on Operating Systems Principles, 2019, pp. 462–477.</p>
        <p>[43] H. Dubey, J. Yang, N. Constant, A. M. Amiri, Q. Yang, K. Makodiya, Fog data: Enhancing telehealth big data through fog computing, in: Proceedings of the ASE BigData &amp; SocialInformatics 2015, 2015, pp. 1–6.</p>
        <p>[44] A. Technology, AS308G208, space-grade high performance dual-quad serial persistent SRAM memory, rev. E, 2023. URL: https://www.avalanche-technology.com/wp-content/uploads/1G-8Gb-Dual-QSPI-Space-Grade-Serial-E-01_10_2023.pdf.</p>
        <p>[45] S. Kargar, F. Nawab, Challenges and future directions for energy, latency, and lifetime improvements in NVMs, Distributed and Parallel Databases (2022) 1–27.</p>
        <p>[46] A. Shanbhag, N. Tatbul, D. Cohen, S. Madden, Large-scale in-memory analytics on Intel® Optane™ DC persistent memory, in: Proceedings of the 16th International Workshop on Data Management on New Hardware, 2020, pp. 1–8.</p>
        <p>[47] Y. Wu, K. Park, R. Sen, B. Kroth, J. Do, Lessons learned from the early performance evaluation of Intel Optane DC persistent memory in DBMS, in: Proceedings of the 16th International Workshop on Data Management on New Hardware, 2020, pp. 1–3.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>