Data Structures and Lookup Algorithms Investigation for the
IEEE 802.15.4 Security Procedures Implementation
Viktor Melnyk a,b
a
    John Paul II Catholic University of Lublin, Konstantynow str., 1H/313, Lublin, 25-708, Poland
b
    Lviv Polytechnic National University, 12, S. Bandery str., Lviv, 79013, Ukraine


                 Abstract
                 This paper investigates data structures and lookup algorithms that can be used in the security
                 procedures executed in ZigBee and other IEEE 802.15.4-based Systems-on-Chip. The goal of
                 the study is to find an effective approach to executing the security lookup procedures in
                 resource-constrained embedded devices suitable for application in wireless personal area
                 networks. Alternative lookup solutions are evaluated and compared by the following criteria:
                 speed/time complexity, memory needs, scalability, and hardware realisation efficiency.
                 A functional analysis of the lookup procedures is performed, and the security data characteristics
                 are estimated. Three data structures are overviewed – trees, tries, and hash tables. Their usage
                 for keeping the data for lookups is analysed, and algorithms for implementing the lookup
                 procedures are proposed. The presented data structures are evaluated against the mentioned criteria.

                 Keywords 1
                 IEEE 802.15.4 Security, Data Structures, Lookup Algorithms

1. Introduction
    In ZigBee and other IEEE 802.15.4-based Systems on Chip (SoCs), the execution of security procedures is usually split between a programmable microprocessor or microcontroller and a specialized processor. While the latter is used to execute the most intensive security-related computations, i.e., AES-CCM* encryption and authentication, the former normally performs some less intensive but quite time-consuming legacy (or security) checks, which are executed in the form of lookup procedures. The information related to these checks is placed in dedicated tables stored in the SoC memory and has to be structured accordingly.
    Security checks in ZigBee and other IEEE 802.15.4-based SoCs have to be performed in a very short time, while also providing compactness and low power consumption, which is extremely important for resource-constrained devices operating in personal area networks (PANs). Since the security procedures are based on keys and data stored in various tables, a very fast and effective lookup algorithm is necessary to ensure that security checks do not have a significant negative impact on the performance and power consumption of the whole system. In this regard, the efficiency of lookup algorithms depends not only on their own construction but also on the way the data they operate on is organized. In fact, the data structures used set the first constraint on the best performance achievable by any algorithm using them. Therefore, prior to choosing a lookup algorithm, it is very important to select the most adequate data structure for the data to be acted upon.
    The paper is structured as follows. We start in Sect. 2 with an overview of related work. In Sect. 3 we describe the criteria that we used to evaluate the efficiency of the lookup algorithms. In Sect. 4 we provide an analytical description of the IEEE 802.15.4 security lookup procedures. Security data characteristics and contextual information are presented in Sect. 5. Further, in Sect. 6, we

IntelITSIS’2021: 2nd International Workshop on Intelligent Information Technologies and Systems of Information Security, March 24–26,
2021, Khmelnytskyi, Ukraine
EMAIL: email1@mail.com (V. Melnyk)
ORCID: 0000-0002-1176-2831 (V. Melnyk)
            © 2021 Copyright for this paper by its authors.
            Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
            CEUR Workshop Proceedings (CEUR-WS.org)
analyse the keys used in the security lookup procedures in order to understand which of the mentioned
data structures best serves our objectives. Moreover, we present trees, tries and hash tables – we do
not discuss arrays and lists, since they are basic structures easily outperformed by the former three –
and describe the lookup algorithms that can be used with them. A comparison based on our
criteria is given in Sect. 7 and conclusions are drawn at the end of the paper.

2. Related Work
   Numerous scientific works are devoted to the implementation of IEEE 802.15.4 transceivers in general, as well as to the information security systems in them. In particular, paper [1] considers software implementations of IEEE 802.15.4 for typical sensor node systems and discusses the available implementations with regard to the supported features, flexibility issues and implementation efficiency. In paper [2] the design and implementation of an IEEE 802.15.4 software stack in the embedded Linux kernel is proposed. Papers [3, 4] propose hardware implementations of the IEEE 802.15.4 protocol in FPGA, whereas [5, 6] propose implementations of an information security subsystem for this standard and of an AES-CCM* cryptographic engine. However, to the best of the author's knowledge, the literature contains no studies of the efficient implementation, in hardware or software, of the data retrieval operations in the IEEE 802.15.4 security procedures, which, in the author's opinion, based on his practical experience in developing security means for IEEE 802.15.4 systems, is quite an important topic.

3. Efficiency Criteria
    Below the criteria used for evaluating and comparing alternative lookup solutions are briefly de-
scribed.
    Speed/Time complexity
    Speed is the most important metric in the evaluation of the algorithms. Since prior to the implementation we can only have estimates of the speed, we use time complexity estimates available from the literature. These estimates are obtained theoretically or empirically, and usually in both ways. This metric cannot provide an absolute duration estimate for a lookup algorithm. It can, however, be used to estimate and compare the efficiency of the algorithms of interest prior to their implementation in universal or specialized processors [7].
    Memory needs
    Given that the algorithms will run on small embedded devices, the usable memory is also very limited. The lookup algorithms will use indices or arrange data in a particular way to speed up data access. Therefore, here we aim at minimizing the memory needed to store these indices and rearranged data. In our estimations we assume the usage of conventional Random Access Memory (RAM); however, the application of alternative memory types, like the one proposed in [8], is promising in this regard.
    Scalability
    The time that can be allocated to lookups is extremely short; therefore, it is important to know how the growing number of entries affects the performance of the algorithms.
    Hardware realization
    Hardware implementation of the algorithms in question may also be required. Therefore, it is im-
portant to know how appropriate each of these alternatives is for this purpose as well.

4. IEEE 802.15.4 Security Lookup Procedures
   Although the lookup algorithms and data structures we present in this paper have not been explicitly designed for security data, in order to select the most appropriate ones and to optimize their performance we need to take into account the data's characteristics. Thus, we need to know the type and size of the data used as a key for the lookups, as well as the number of keys (i.e., records) involved in the lookups at the same time. Furthermore, some knowledge about the hardware architecture, such as the availability of a cache memory beside the RAM and the size of these memories, is also necessary. While the key type and size are defined in the standard [9] with good precision, the hardware specification may differ from one SoC to another. Therefore, in this latter case we use some assumptions; corrections and adaptations should be made for a specific SoC architecture.
   There are four security lookup procedures [10] that need to be implemented in the IEEE 802.15.4-
based SoC. These are the key descriptor lookup, incoming security level checking, blacklist checking
and incoming key usage policy checking procedures.

4.1.    Lookup Procedures Functional Analysis
    Each device in the PAN can communicate with one or more other devices. Depending on which keying model [11] is used in the PAN, one or more encryption keys are used by a device for communication. The information determining how to provide the security is found in the security-related PIB (PAN Information Base), which contains the PIB security material. All keys are contained in the KeyDescriptor entries of the macKeyTable of the MAC PIB. Besides the key, a KeyDescriptor contains key-related information that is used during the frame securing/unsecuring process, and auxiliary information that is used for KeyDescriptor identification.
    The selection of the appropriate key and key-related information is performed by the KeyDescriptor lookup procedure. It compares the pair of input values KeyLookupData and KeyLookupSize with the pair of values contained in the KeyDescriptor. A match means that the communicating pair of devices uses exactly this KeyDescriptor (i.e. exactly this key and key-related information). The KeyDescriptor lookup procedure is applied during both securing and unsecuring of a frame. The number of KeyDescriptor entries that the macKeyTable contains is equal to the number of keys that the device uses. One key may be used by a device to communicate with one or more remote devices within the PAN.
    During the unsecuring of an input frame the device has to determine whether the sending remote device belongs to the devices that use the identified key. For this reason, each KeyDescriptor contains the list of devices for which this key is used. It is contained in the KeyDeviceList entry of the KeyDescriptor. The verification of the device against the KeyDeviceList entry is performed by the blacklist checking procedure (with an embedded DeviceDescriptor lookup procedure). Also, during the unsecuring of an input frame, the device has to determine whether the identified key may be used to unsecure a frame of the present type. The frame types for which unsecuring with the identified key is supported are determined in the KeyUsageList entry of the KeyDescriptor. The verification of the frame type against the KeyUsageList entry is performed by the key usage policy checking procedure.
    These three lookup procedures operate on the same MAC PIB security table: the macKeyTable. During the unsecuring of an input frame the device also has to check whether the frame to be unsecured meets the minimum security requirements that are applied to this type of MAC frame. For this reason, the MAC PIB contains the macSecurityLevelTable with a set of SecurityLevelDescriptor entries. Each SecurityLevelDescriptor determines the minimum security level used for the corresponding frame type. If the frame is a MAC command frame, the security minimum may differ depending on the command type. The verification of the minimum security requirements against the macSecurityLevelTable is performed by the incoming security level checking procedure. Each lookup procedure is described in the following subsections.

4.2.    KeyDescriptor Lookup Procedure
    The following actions shall be performed with regard to the macKeyTable attribute. The macKeyTable attribute represents a table of key descriptor entries, each containing keys and related information required for secured communications. Each key descriptor (KeyDescriptor) entry comprises the information shown in Table 1. The inputs of this procedure are the KeyLookupDataValue and KeyLookupSizeValue. The outputs of this procedure are the KDL_KeyDescriptor and KDL_STATUS values. In order to identify an appropriate KeyDescriptor, the KeyIdLookupList element of the KeyDescriptor is analyzed. Each KeyIdLookupList element contains a list of KeyIdLookupDescriptor entries used to identify the current KeyDescriptor. Each KeyIdLookupDescriptor, in turn, comprises the information shown in Table 2.
Table 1
Fields and estimated sizes of KeyDescriptor record

 Field name | Type | Size (bits) | Description
 KeyIdLookupList | List of KeyIdLookupDescriptor entries | 8*8+8=72 per entry or 4*8+8=40 per entry | List of KeyIdLookupDescriptor entries used to identify this KeyDescriptor, each containing a LookupData and a LookupSize field. LookupData (64 or 32 bits) and LookupSize (8 bits) are used as the key for the outgoing frame key retrieval lookup procedure.
 KeyIdLookupListEntries | Integer | 8 | The number of entries in KeyIdLookupList. Implementation specific; 8 bits are enough for 256 entries.
 KeyDeviceList | List of KeyDeviceDescriptor entries | 8+1+1=10 per entry | List of KeyDeviceDescriptor entries indicating which devices are currently using this key, including their blacklist status, each containing a DeviceDescriptorHandle (an implementation-specific integer), a UniqueDevice and a Blacklisted indicator, each of boolean type.
 KeyDeviceListEntries | Integer | 8 | The number of entries in KeyDeviceList. Implementation specific.
 KeyUsageList | List of KeyUsageDescriptor entries | 8+8=16 per entry | A list of KeyUsageDescriptor entries indicating the frame types this key may be used with, each containing a FrameType and a CommandFrameIdentifier.
 KeyUsageListEntries | Integer | 8 | The number of entries in KeyUsageList.
 Key | Set of 16 octets | 16*8=128 | The actual value of the key.


Table 2
Fields of KeyIdLookupDescriptor entry

 Name | Type | Description
 LookupData | Set of 5 or 9 octets | Data used to identify the key.
 LookupDataSize | Integer | A value of 0x00 indicates a set of 5 octets; a value of 0x01 indicates a set of 9 octets.


   A lookup process involving the macKeyTable, KeyDescriptor, KeyIdLookupList, and KeyIdLookupDescriptor is illustrated in Figure 1. The algorithm of KeyDescriptor determination is shown there as pseudo code; it compares the input values KeyLookupDataValue and KeyLookupSizeValue with the LookupData and LookupSize elements of each KeyIdLookupDescriptor, respectively.




Figure 1: KeyDescriptor lookup process
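
    For illustration, a minimal C sketch of the KeyDescriptor lookup procedure is given below. It mirrors the lookup logic of Figure 1 over the structures of Tables 1 and 2; the type names, the table bounds and the status values are assumptions introduced for this sketch rather than definitions from the standard, and KeyLookupSizeValue is taken directly as a length of 5 or 9 octets in line with Table 2.

    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>

    #define MAX_LOOKUP_DATA_OCTETS 9   /* LookupData is a set of 5 or 9 octets (Table 2)    */
    #define MAX_KEY_TABLE_ENTRIES  16  /* assumed bound on macKeyTableEntries (see Sect. 5) */
    #define MAX_LOOKUP_LIST_SIZE   4   /* assumed bound on KeyIdLookupListEntries           */

    typedef enum { KDL_PASSED, KDL_FAILED } kdl_status_t;

    typedef struct {                        /* KeyIdLookupDescriptor (Table 2) */
        uint8_t lookup_data[MAX_LOOKUP_DATA_OCTETS];
        uint8_t lookup_data_size;           /* 0x00 -> 5 octets, 0x01 -> 9 octets */
    } key_id_lookup_descriptor_t;

    typedef struct {                        /* KeyDescriptor (Table 1), lookup-related fields only */
        key_id_lookup_descriptor_t key_id_lookup_list[MAX_LOOKUP_LIST_SIZE];
        uint8_t key_id_lookup_list_entries;
        uint8_t key[16];                    /* the actual value of the key */
        /* KeyDeviceList and KeyUsageList are omitted in this sketch */
    } key_descriptor_t;

    typedef struct {                        /* macKeyTable of the MAC PIB */
        key_descriptor_t entries[MAX_KEY_TABLE_ENTRIES];
        uint8_t num_entries;
    } mac_key_table_t;

    /* KeyDescriptor lookup: compare KeyLookupDataValue/KeyLookupSizeValue with every
     * KeyIdLookupDescriptor of every KeyDescriptor in the macKeyTable. */
    kdl_status_t key_descriptor_lookup(const mac_key_table_t *tbl,
                                       const uint8_t *key_lookup_data,
                                       uint8_t key_lookup_size,          /* 5 or 9 octets */
                                       const key_descriptor_t **kdl_key_descriptor)
    {
        for (uint8_t i = 0; i < tbl->num_entries; i++) {
            const key_descriptor_t *kd = &tbl->entries[i];
            for (uint8_t j = 0; j < kd->key_id_lookup_list_entries; j++) {
                const key_id_lookup_descriptor_t *ld = &kd->key_id_lookup_list[j];
                uint8_t size = (ld->lookup_data_size == 0x00) ? 5 : 9;
                if (size == key_lookup_size &&
                    memcmp(ld->lookup_data, key_lookup_data, size) == 0) {
                    *kdl_key_descriptor = kd;   /* KDL_KeyDescriptor output */
                    return KDL_PASSED;
                }
            }
        }
        *kdl_key_descriptor = NULL;
        return KDL_FAILED;
    }

    The outer loop scans the macKeyTable and the inner loop scans the KeyIdLookupList of each KeyDescriptor; the indexing structures discussed in Sect. 6 aim at replacing exactly this linear scan.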
4.3.     Incoming Security Level Checking Procedure
    The following actions shall be performed with regard to the macSecurityLevelTable attribute of the MAC PIB. This attribute contains a table of SecurityLevelDescriptor entries, each with information about the minimum security level expected depending on the incoming frame type and subtype. The elements of the SecurityLevelDescriptor are shown in Table 3.

Table 3
Fields and estimated sizes of SecurityLevelDescriptor record

 Field name | Type | Range | Size (bits) | Description
 FrameType | Integer | 0x00–0x03 | 3 | The Frame Type subfield is 3 bits in length and shall be set to one of the nonreserved values (values 100–111 are reserved): “000” – beacon, “001” – data, “010” – acknowledgement, “011” – MAC command. Used as a key in the incoming security level checking procedure together with CommandFrameIdentifier and SecurityMinimum.
 CommandFrameIdentifier | Integer | 0x00–0x09 | 8 | The CommandFrameIdentifier subfield encodes the command name (defined in the standard [9]).
 SecurityMinimum | Integer | 0x00–0x07 | 8 | The minimal required/expected security level for incoming MAC frames with the indicated frame type and, if present, command frame type.
 DeviceOverrideSecurityMinimum | Boolean | TRUE or FALSE | 1 | Indication of whether originating devices, for which the Exempt element (an element of DeviceDescriptor) is set, may override the minimum security level indicated by the SecurityMinimum element. If TRUE, for originating devices with Exempt status the incoming security level zero is acceptable, in addition to the incoming security levels meeting the minimum expected security level indicated by the SecurityMinimum element.


    The inputs of this procedure are the SecurityLevelSubfield, the FrameTypeSubfield, and, if the frame is a MAC command frame, the first octet of the MAC payload (i.e. the Command Frame Identifier subfield). The output of this procedure is a status (ISLC_STATUS), which can take the value PASSED, CONDITIONALLY_PASSED, or FAILED.
    A lookup process in the macSecurityLevelTable is shown in Figure 2. It operates as shown by the accompanying pseudo code. The elements of a SecurityLevelDescriptor are referenced in parentheses; for example, SecurityLevelDescriptor(FrameType) denotes the FrameType element of a SecurityLevelDescriptor of the macSecurityLevelTable attribute.
   1. for i = 0 to macSecurityLevelTableEntries - 1 do
   2.    if (FrameTypeSubfield /= “011” and
   3.       FrameTypeSubfield = SecurityLevelDescriptor[i](FrameType)) then
   4.       execute sec_level_compare(SecurityLevelValue, SecurityLevelDescriptor[i](SecurityMinimum));
   5.       if sec_level_compare = FAILED then
   6.         if (SecurityLevelDescriptor[i](DeviceOverrideSecurityMinimum) = true
   7.            and SecurityLevelSubfield = “000”) then
   8.             ISLC_STATUS <= CONDITIONALLY_PASSED;
   9.         else ISLC_STATUS <= FAILED;
   10.        end if;
   11.      else ISLC_STATUS <= PASSED;
   12. else if (FrameTypeSubfield = “011” and FrameTypeSubfield = SecurityLevelDescriptor[i](FrameType)
   13.           and CommandFrameIdentifierSubfield = SecurityLevelDescriptor[i](CommandFrameIdentifier)) then
   14.      execute sec_level_compare(SecurityLevelValue, SecurityLevelDescriptor[i](SecurityMinimum));
   15.      if sec_level_compare = FAILED then
   16.        if (SecurityLevelDescriptor[i](DeviceOverrideSecurityMinimum) = true
   17.           and SecurityLevelSubfield = “000”) then
   18.            ISLC_STATUS <= CONDITIONALLY_PASSED;
   19.        else ISLC_STATUS <= FAILED;
   20.        end if;
   21.      else ISLC_STATUS <= PASSED;
   22. end if;
   23. end for;


Figure 2: SecurityLevelDescriptor lookup process
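
    A minimal C sketch corresponding to the pseudo code of Figure 2 is given below. The structure and status names follow Table 3 and the text; the plain ">=" comparison inside sec_level_meets_minimum is a simplification (the exact ordering of security levels is defined in the standard [9]), and returning FAILED when no matching SecurityLevelDescriptor is found is an assumption of this sketch.

    #include <stdint.h>
    #include <stdbool.h>

    typedef enum { ISLC_PASSED, ISLC_CONDITIONALLY_PASSED, ISLC_FAILED } islc_status_t;

    typedef struct {                 /* SecurityLevelDescriptor (Table 3) */
        uint8_t frame_type;          /* 0x00..0x03 */
        uint8_t command_frame_id;    /* relevant only for MAC command frames */
        uint8_t security_minimum;    /* 0x00..0x07 */
        bool    device_override_security_minimum;
    } security_level_descriptor_t;

    /* Simplified security level ordering: a plain ">=" is used here. */
    static bool sec_level_meets_minimum(uint8_t level, uint8_t minimum)
    {
        return level >= minimum;
    }

    islc_status_t incoming_security_level_check(const security_level_descriptor_t *tbl,
                                                uint8_t num_entries,
                                                uint8_t frame_type,
                                                uint8_t command_frame_id,
                                                uint8_t security_level)
    {
        for (uint8_t i = 0; i < num_entries; i++) {
            const security_level_descriptor_t *d = &tbl[i];
            if (d->frame_type != frame_type)
                continue;
            /* MAC command frames (frame type 0x03) must also match the command identifier */
            if (frame_type == 0x03 && d->command_frame_id != command_frame_id)
                continue;
            if (sec_level_meets_minimum(security_level, d->security_minimum))
                return ISLC_PASSED;
            if (d->device_override_security_minimum && security_level == 0x00)
                return ISLC_CONDITIONALLY_PASSED;
            return ISLC_FAILED;
        }
        return ISLC_FAILED;          /* no matching descriptor found (assumption) */
    }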
4.4.      Blacklist Checking Procedure
    The following actions shall be performed with regard to the KeyDescriptorValue, which comprises the information shown in Table 1. The procedure inputs are the KeyDescriptorValue, DeviceLookupDataValue, and DeviceLookupSizeValue. The procedure outputs are the BLC_DeviceDescriptor, BLC_KeyDeviceDescriptor, and BLC_STATUS values.
    In order to identify the appropriate DeviceDescriptor and KeyDeviceDescriptor, the KeyDeviceList element of the KeyDescriptorValue is analyzed. Each KeyDeviceList element contains a list of KeyDeviceDescriptor entries indicating which devices are currently using this key, including their blacklist status. Each KeyDeviceDescriptor contains a DeviceDescriptorHandle, which identifies the DeviceDescriptor corresponding to the current device. The DeviceDescriptor comprises the information shown in Table 4.

Table 4
Fields and estimated sizes of DeviceDescriptor record

 Field name | Type | Size (bits) | Description
 PAN ID | Device PAN ID | 16 | PAN identifier of the device in this DeviceDescriptor. May be used as a key in the blacklist checking procedure together with ShortAddress.
 ShortAddress | Device short address | 16 | Short address of the device in this DeviceDescriptor. A value of 0xFFFE indicates that this device is using only its extended address. A value of 0xFFFF indicates that this value is unknown. May be used as a key in the blacklist checking procedure together with PAN ID.
 ExtAddress | IEEE address | 64 | IEEE extended address of the device in this DeviceDescriptor. This element is also used in unsecuring operations on incoming frames. May be used as a key in the blacklist checking procedure.
 FrameCounter | Integer | 32 | The incoming frame counter of the device in this DeviceDescriptor. This value is used to ensure sequential freshness of frames.
 Exempt | Boolean | 1 | Indication of whether the device may override the minimum security level settings defined in Table 3.


    A lookup process involving the KeyDescriptorValue, KeyDeviceList, and KeyDeviceDescriptor is illustrated in Figure 3. The accompanying pseudo code shows the DeviceDescriptor and KeyDeviceDescriptor lookup process.
   1. for i = 0 to KeyDeviceListEntries – 1 do
   2.    BLC_DeviceDescriptorValue <=
   3.       KeyDeviceDescriptor[i](DeviceDescriptorHandle(DeviceDescriptor));
   4.   if KeyDeviceDescriptor[i](Blacklisted) = false then
   5.       if KeyDeviceDescriptor[i](UniqueDevice) = true then
   6.          BLC_KeyDeviceDescriptor <= KeyDeviceDescriptor[i];
   7.          BLC_DeviceDescriptor <= BLC_DeviceDescriptorValue;
   8.          BLC_STATUS <= PASSED;
   9.       else if KeyDeviceDescriptor[i](UniqueDevice) = false then
   10. -- DeviceDescriptor lookup
   11.          if ((DeviceLookupSizeValue = 4 and
   12.              DeviceLookupDataValue =
   13.                  BLC_DeviceDescriptorValue(PANID) &&
   14.                  BLC_DeviceDescriptorValue(ShortAddress)) or
   15.             (DeviceLookupSizeValue = 8 and
   16.              DeviceLookupDataValue =
   17.                  BLC_DeviceDescriptorValue(ExtAddress))) then
   18.             BLC_KeyDeviceDescriptor <= KeyDeviceDescriptor[i];
   19.             BLC_DeviceDescriptor <= BLC_DeviceDescriptorValue;
   20.             BLC_STATUS <= PASSED;
   21.          else BLC_STATUS <= FAILED;
   22.          end if;
   23.        end if;
   24.    else BLC_STATUS <= FAILED;
   25.    end if;
   26. end for;


Figure 3: DeviceDescriptor and KeyDeviceDescriptor lookup process
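
    The sketch below restates the blacklist checking procedure of Figure 3 in C. The DeviceDescriptorHandle is modelled as an index into an assumed macDeviceTable array, the byte order used when concatenating PAN ID and ShortAddress into DeviceLookupData is an assumption, and, unlike the pseudo code, blacklisted or non-matching entries are simply skipped and FAILED is reported only after the whole KeyDeviceList has been scanned.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    typedef enum { BLC_PASSED, BLC_FAILED } blc_status_t;

    typedef struct {                 /* DeviceDescriptor (Table 4) */
        uint16_t pan_id;
        uint16_t short_address;
        uint8_t  ext_address[8];
        uint32_t frame_counter;
        bool     exempt;
    } device_descriptor_t;

    typedef struct {                 /* KeyDeviceDescriptor (entry of KeyDeviceList, Table 1) */
        uint8_t  device_descriptor_handle;   /* index into an assumed macDeviceTable */
        bool     unique_device;
        bool     blacklisted;
    } key_device_descriptor_t;

    /* Blacklist checking with embedded DeviceDescriptor lookup (Figure 3).
     * DeviceLookupData is either PAN ID || ShortAddress (4 octets) or the
     * extended address (8 octets). */
    blc_status_t blacklist_check(const key_device_descriptor_t *key_device_list,
                                 uint8_t key_device_list_entries,
                                 const device_descriptor_t *device_table,
                                 const uint8_t *device_lookup_data,
                                 uint8_t device_lookup_size,               /* 4 or 8 */
                                 const device_descriptor_t **out_device)
    {
        for (uint8_t i = 0; i < key_device_list_entries; i++) {
            const key_device_descriptor_t *kdd = &key_device_list[i];
            const device_descriptor_t *dd = &device_table[kdd->device_descriptor_handle];
            if (kdd->blacklisted)
                continue;                    /* skipped here; the pseudo code latches FAILED instead */
            if (kdd->unique_device) {
                *out_device = dd;
                return BLC_PASSED;
            }
            if (device_lookup_size == 4) {
                /* big-endian concatenation assumed for PAN ID || ShortAddress */
                uint8_t pan_short[4] = { (uint8_t)(dd->pan_id >> 8), (uint8_t)dd->pan_id,
                                         (uint8_t)(dd->short_address >> 8), (uint8_t)dd->short_address };
                if (memcmp(device_lookup_data, pan_short, 4) == 0) {
                    *out_device = dd;
                    return BLC_PASSED;
                }
            } else if (device_lookup_size == 8 &&
                       memcmp(device_lookup_data, dd->ext_address, 8) == 0) {
                *out_device = dd;
                return BLC_PASSED;
            }
        }
        *out_device = NULL;
        return BLC_FAILED;
    }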
4.5.    Incoming Key Usage Policy Checking Procedure
    The following actions shall be performed with regard to the input KeyDescriptorValue, which comprises the information shown in Table 1. The procedure inputs are the KeyDescriptorValue, FrameTypeSubfield, and CommandFrameIdentifierSubfield values. The procedure output is the IKPC_STATUS value.
    In order to check the incoming key usage policy, the KeyUsageList element of the KeyDescriptorValue is analyzed. Each KeyUsageList element contains a list of KeyUsageDescriptor entries indicating which frame types this key may be used with. Each KeyUsageDescriptor, in turn, comprises the information shown in Table 5.

Table 5
Fields and estimated sizes of KeyUsageDescriptor record

 Field name | Type | Size (bits) | Description
 FrameType | Integer | 3 | See Table 3. Used as a key in the incoming key usage policy checking procedure together with CommandFrameIdentifier.
 CommandFrameIdentifier | Integer | 8 | 8-bit value encoding the MAC command name. Used as a key in the incoming key usage policy checking procedure together with FrameType.


    A lookup process involving the KeyDescriptorValue, KeyUsageList, and KeyUsageDescriptor is illustrated in Figure 4. The accompanying pseudo code explains the incoming key usage policy checking procedure.
   1. for i = 0 to KeyUsageListEntries – 1 do
   2.   if (FrameTypeSubfield /= 0x03 and
   3.       FrameTypeSubfield = KeyUsageDescriptor[i](FrameType)) then
   4.       IKPC_STATUS <= PASSED;
   5.   else if (FrameTypeSubfield = 0x03 and
   6.       FrameTypeSubfield = KeyUsageDescriptor[i](FrameType) and
   7.       CommandFrameIdentifierSubfield =
   8.            KeyUsageDescriptor[i](CommandFrameIdentifier)) then
   9.       IKPC_STATUS <= PASSED;
   10.    else IKPC_STATUS <= FAILED;
   11.    end if;
   12. end for;




Figure 4: KeyUsageDescriptor lookup process
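
    The same procedure can be sketched in C as follows; as in the previous sketches, the type and parameter names are illustrative, and FAILED is reported only after the whole KeyUsageList has been scanned rather than inside the loop as in the pseudo code.

    #include <stdint.h>

    typedef enum { IKPC_PASSED, IKPC_FAILED } ikpc_status_t;

    typedef struct {                 /* KeyUsageDescriptor (Table 5) */
        uint8_t frame_type;          /* 3-bit value, 0x00..0x03 */
        uint8_t command_frame_id;    /* relevant only for MAC command frames */
    } key_usage_descriptor_t;

    /* Incoming key usage policy checking (Figure 4): the identified key may be used
     * only with the frame types (and, for MAC command frames, command identifiers)
     * listed in its KeyUsageList. */
    ikpc_status_t key_usage_policy_check(const key_usage_descriptor_t *key_usage_list,
                                         uint8_t key_usage_list_entries,
                                         uint8_t frame_type,
                                         uint8_t command_frame_id)
    {
        for (uint8_t i = 0; i < key_usage_list_entries; i++) {
            const key_usage_descriptor_t *d = &key_usage_list[i];
            if (d->frame_type != frame_type)
                continue;
            if (frame_type != 0x03 || d->command_frame_id == command_frame_id)
                return IKPC_PASSED;
        }
        return IKPC_FAILED;          /* no permitting entry found */
    }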

5. Security Data Characteristics and Contextual Information
    Looking at the standard [9], the security data used in the lookup procedures are as presented in Tables 1–5. The keys are all bit strings, the longest being an extended device address of 64 bits (8 bytes).
    Let us now summarize the minimum total amount of memory needed to implement the lookup procedures.
    •     The minimum total amount of bits needed to store one key descriptor record (given in Ta-
    ble 1), when in each of its lists there is only 1 entry, is 72+8+10+8+16+8+128=250.
    •     The total amount of bits to store one SecurityLevelDescriptor record (see Table 3) is 20.
    •     The total amount of bits needed to store one DeviceDescriptor record (see Table 4) is 137.
    •     The total amount of bits needed to store one KeyUsageDescriptor record (see Table 5) is 11.
    As can be seen from the fields enumerated in Table 1, there are various lists involved in the lookup procedures. The number of entries in these lists adds to the memory occupation of the SoC, which is quite limited. Let us assume the size of the SoC's memory (i.e., RAM) is 4 KB. In order to estimate the number of keys that can be present in the security information database of each device (i.e., the macKeyTable) we make some simple computations, for which we assume that one quarter of the RAM (i.e., 1 KB = 1024 bytes) is used for this purpose. Assuming also that the lists in the KeyDescriptor (Table 1) contain only 1 entry each, the amount of memory used for 1 key in all of the lookup procedures is at least 250+20+137+11=418 bits=52.25 bytes; this results from adding up the total number of bits in Tables 1 and 3–5. Thus, about 19 such keys can fit into 1 KB of memory at the same time, without counting the memory used by the index data structure (if one is used). This is a very optimistic calculation; therefore, it may turn out that not even 19 key descriptors can be fitted in 1 KB of memory. Conversely, it may prove that in practice there is no need to store 19 keys at the same time. Be that as it may, for this research we reduce this number a little and fix it at 16, just to leave some space for the indexing data structure.
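    The above budget can be stated directly in code. The constants below merely restate the record sizes of Tables 1 and 3–5 and the assumed 1 KB budget; the identifier names are of course arbitrary.

    #include <assert.h>

    /* Minimum record sizes in bits, assuming one entry per list (see the bullets above) */
    enum {
        KEY_DESCRIPTOR_BITS       = 250,   /* Table 1 */
        SECURITY_LEVEL_DESC_BITS  = 20,    /* Table 3 */
        DEVICE_DESCRIPTOR_BITS    = 137,   /* Table 4 */
        KEY_USAGE_DESCRIPTOR_BITS = 11,    /* Table 5 */
        BITS_PER_KEY = KEY_DESCRIPTOR_BITS + SECURITY_LEVEL_DESC_BITS +
                       DEVICE_DESCRIPTOR_BITS + KEY_USAGE_DESCRIPTOR_BITS,   /* 418 bits */
        BUDGET_BITS  = 1024 * 8,           /* one quarter of the assumed 4 KB RAM */
        MAX_KEYS     = BUDGET_BITS / BITS_PER_KEY,                           /* 19 keys  */
        KEYS_USED    = 16                  /* leave room for the indexing data structure */
    };

    static_assert(MAX_KEYS == 19, "about 19 keys of 418 bits fit into 1 KB");
    static_assert(KEYS_USED <= MAX_KEYS, "key budget exceeded");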
    Depending on the model of the microcontroller unit (MCU) used, a cache memory may or may not be available during the lookup procedures, and the execution time of the lookup procedures depends heavily on this. Storing the indexing data structure in a fast-access cache memory constitutes a big advantage, because the lookups can then be performed almost exclusively with fast cache accesses. Only one access to the slower RAM is necessary to retrieve the data record from the position obtained from the index. If an MCU without a cache memory is considered (e.g., the Intel 8051 MCU), each operation on the indexing data structure will involve a slow memory access.
    The above considerations regarding the data that the lookup algorithms are to operate on as well as
the mentioned hardware issues should be taken into account in the selection of the data structure used,
in order to improve performance.

6. Data Structures and Algorithms
    In this section the three main data structures that we consider for the security data indexing – trees, tries and hash tables – are presented. Moreover, for each data structure we present the algorithms necessary for performing operations on it. These algorithms can be used to insert a node into the data structure, delete a node from it, and find a desired node in it.

6.1.    Trees
    A tree is a data structure consisting of nodes, containing data, linked in a particular way. There is a root node at the origin or top of the tree structure, and it has other nodes, called child nodes, linked to it. In turn, each child node has its own children and so on. This way the tree structure is constructed on many levels (a level meaning the nodes at the same number of links away from the root node). Each node has a subtree underneath, unless it has no children at all. Such nodes are called leaf nodes, while all the other nodes are called internal nodes. In many cases the useful information is stored in the leaf nodes and the internal nodes only serve for structuring the information.
    One of the most popular variants of trees is the binary tree. Binary trees comply with the additional rule that their nodes can have 0, 1 or 2 children only. This characteristic implies that when traversing the binary tree (i.e., inspecting/visiting its nodes one-by-one) a binary decision (i.e., ‘yes’ or ‘no’; ‘true’ or ‘false’; ‘0’ or ‘1’; ‘bigger’ or ‘smaller’, etc.) has to be made based on some criterion. This property renders binary trees very adequate for indexing and search activities, since every decision made while going down the levels of the tree structure reduces the possible locations of the searched key to the subtree below the current node. Often, when the tree is used for lookup purposes, data is organized so that all data in the left subtree of a node precedes any data in the right subtree of the same node, again based on some criterion. Such ordered trees are called binary search trees (BSTs). Figure 5 shows an example where the ordered list of numbers 2, 8, 10, 24, 56, 73, 74, 79, 84, 99 is structured into a BST.




Figure 5: Binary search tree
    In Figure 6 we present the pseudo code of the Insert(), Lookup() and Delete() algorithms that can be used with BSTs. In these algorithms each node is assumed to have a key field, containing a pointer (e.g., of 2 bytes, depending on hardware) to the data that uniquely identifies a record among all possible security information data that can be stored in the BST. Moreover, each node has a left and a right field containing pointers to the children of the node. The currentNode variable is used to indicate the node under inspection at any given time. It represents the point of access to the data structure. newNode is the node containing the key to insert into the tree. The ValueOf(pointer) function returns the data pointed to by the pointer passed to it as an argument. Remove(pointer) releases the memory occupied by the node pointed to by the function's argument. Next we describe the operation of each of these algorithms (see Figure 6).
    The Insert() function takes as arguments the node to insert into a tree and a currentNode pointer initialized with the root node. The algorithm recursively compares the key to insert with the key pointed to by currentNode. If the new key is smaller than the current key (line 4), the algorithm steps to the left child of currentNode (line 5), otherwise it continues with the right child (line 7). The insertion point is found when currentNode assumes a null value (line 2). This is where newNode is added to the tree.
    The Lookup() function operates similarly to Insert(). It takes as arguments the key to search for and the currentNode pointer initialized to the root. The function recursively compares the searched key with the key pointed to by currentNode.key; if the searched key is smaller, the lookup continues at the left child of currentNode (line 13), otherwise at the right child (line 15). When a match is found, the pointer to the security information belonging to the key is returned. If currentNode takes the null value, the key has not been found in the BST.

    Initialization: Each BST node has a key field and a left and right child. key is a pointer to the record of security infor-
mation. newNode is initialized with the key to insert and its left and right pointers are null. currentNode is initialized with
the root.

   1. Insert(newNode, currentNode) // Inserts a key into the tree structure
   2.     if (currentNode == null) then
   3.               currentNode ← newNode;
   4.     elseif (ValueOf(currentNode.key) > ValueOf(newNode.key)) then
   5.               Insert(newNode, currentNode.left);
   6.     else
   7.               Insert(newNode, currentNode.right);
   8.     endif;
   9. end Insert();

   10. Lookup(key, currentNode) // Searches a key into the tree structure
   11. if (currentNode == null) then return “Key not found!”;
   12. if (ValueOf(currentNode.key) > key) then
   13.             return Lookup(key, currentNode.left);
   14. elseif (ValueOf(currentNode.key) < key) then
   15.             return Lookup(key, currentNode.right);
   16. else // keys are equal, key found
   17.             return currentNode.key; // returns the pointer to the desired record
   18. endif;
   19. end Lookup();

   20. Delete(key, currentNode) // Deletes key from the tree
   21. find the node pertaining to key and assign it to currentNode. Now ValueOf(currentNode.key)==key;
   22. if (currentNode.left == null and currentNode.right == null) then
   23.              Remove(currentNode);
   24. elseif (currentNode.left == null) then
   25.              tmp ← currentNode;
   26.              currentNode ← currentNode.right;
   27.              Remove(tmp);
   28. elseif (currentNode.right == null) then
   29.              tmp ← currentNode;
   30.              currentNode ← currentNode.left;
   31.              Remove(tmp);
   32. else
   33.              tmp ← currentNode.left;
   34.              while (tmp.right != null) do tmp ← tmp.right; endwhile;
   35.             currentNode.key ← tmp.key;
   36.             Remove(tmp);
   37. endif;
   38. end Delete();


Figure 6: Algorithms for the binary search tree approach

   Finally, the Delete() function removes from the tree the node pointing to the security information belonging to its key argument. First the location of the node to remove is looked up and then, based on the node's links to its parent and child nodes, the appropriate operations are performed to exclude the node from the tree. There can be four different cases. First, both children are null (line 22); second and third, when one of the children is null (lines 24 and 28); and finally, when neither child is null (line 32). In this latter case, the rightmost node of the left subtree (i.e., the in-order predecessor) takes the eliminated node's place in the BST.
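
    The pseudo code of Figure 6 is recursive and leaves the actual linking of newNode to its parent implicit. Below is a minimal iterative C sketch of Insert() and Lookup() that makes the linking explicit by walking a pointer to the link to be updated; Delete() is omitted for brevity, and the use of a plain 64-bit integer as the key (e.g., an extended device address) is an assumption of this sketch.

    #include <stdint.h>
    #include <stddef.h>

    typedef struct bst_node {
        uint64_t key;                 /* e.g., an extended device address */
        void *record;                 /* pointer to the security information */
        struct bst_node *left, *right;
    } bst_node_t;

    /* Insert: walk down from the root, keeping a pointer to the link that will
     * have to be updated, so the new node is attached in place. Duplicate keys
     * are not expected here and would simply go to the right subtree. */
    void bst_insert(bst_node_t **root, bst_node_t *new_node)
    {
        bst_node_t **link = root;
        while (*link != NULL)
            link = (new_node->key < (*link)->key) ? &(*link)->left : &(*link)->right;
        new_node->left = new_node->right = NULL;
        *link = new_node;
    }

    /* Lookup: returns the record pointer, or NULL if the key is not present. */
    void *bst_lookup(const bst_node_t *root, uint64_t key)
    {
        while (root != NULL) {
            if (key == root->key)
                return root->record;
            root = (key < root->key) ? root->left : root->right;
        }
        return NULL;
    }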

6.2.    Tries
    A trie is a tree-shaped data structure in which keys are stored by the position of the nodes in the tree, instead of inside the nodes themselves as in the case of binary trees. If the nodes of a trie can have at most 2 children, for storing bits, the trie is called a binary digital search tree (BDS-tree). Thus, BDS-trees are appropriate for structuring bit strings. The bits are stored one-by-one on the links and not inside the nodes. Each bit string is encoded as a series of ‘0’ and ‘1’ links between level 0 of the trie (i.e., the root node) and a leaf node. Thus, all keys in a subtree of the trie share the same prefix. For example, in Figure 7a all keys in the right branch have the prefix 1. Therefore, the keys encoded in the right branch are 10001, 1011, and 11. If instead we wanted all of the keys with prefix 000, we would have 00000, 00001 and 0001. This structure is very efficient when it comes to looking up bit string keys. While looking up a key in a BST takes O(m*n) time (n being the number of levels in the tree and m the length, i.e. the number of bits, of the key), the same operation with a trie is just O(m) in the worst case. An additional advantage of tries is that comparisons are made on bits instead of strings, which makes operations cheaper to perform. This significantly increases the search speed. Tries have lower memory needs as well, since they share prefixes. Trie algorithms, though, can be somewhat more complex than BST ones.




Figure 7: Trie types
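
    Before turning to the compressed trie variants, the lookup in a plain pointer-based BDS-tree can be sketched in C as follows. Fixed-length keys traversed from the most significant bit are assumed; buckets and the compression techniques discussed next are not shown.

    #include <stdint.h>
    #include <stddef.h>

    typedef struct trie_node {
        struct trie_node *child[2];   /* links labelled '0' and '1' */
        void *record;                 /* non-NULL only in the node terminating a stored key */
    } trie_node_t;

    /* Follow the bits of the key (MSB first) from the root; the key is present
     * if the walk ends in a node that carries a record. */
    void *bds_lookup(const trie_node_t *root, uint64_t key, unsigned key_bits)
    {
        const trie_node_t *node = root;
        for (unsigned i = 0; node != NULL && i < key_bits; i++) {
            unsigned bit = (unsigned)((key >> (key_bits - 1 - i)) & 1u);
            node = node->child[bit];
        }
        return (node != NULL) ? node->record : NULL;
    }

    Each step inspects a single key bit instead of comparing whole keys, which is the property that gives tries their O(m) lookup time.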

    It can be noted in Figure 7a that some of the nodes of the trie have only a single child. If too many single-child nodes are present, the trie can become very deep, which causes performance loss in information retrieval systems [12]. There are two ways to avoid the formation of very deep tries. The first method consists of collecting keys with the same prefix into a group, called a bucket. When a key has to be looked up, the leaf node with the corresponding prefix is identified and the presence of the searched key is checked in the bucket pointed to by the prefix leaf node. If the prefix leaf node is missing, it means that the key has not been found. Note that by using buckets all subtrees that consist only of single-child nodes can be compressed into one node that points to the bucket containing the key. In other words, this way there will be no parent of a leaf node that has one child only.
    The second method for reducing the depth of tries is to eliminate the nodes that have only one child. These tries are called Patricia tries [13] and are discussed in more detail later in this section.
    Implementing tries with pointers is inefficient, since the pointers used to store each node in a tree would consume much more memory than the bits of the key themselves. Thus, Jonge et al. [14] proposed compressing the binary trie into a compact bit stream, called a CB tree.
6.2.1. CB Tree
    The basic idea behind CB trees is that the nodes of the trie are organized in a so-called tree map in the form of a bit stream corresponding to the pre-order traversal of the trie. This means that the trie is traversed inspecting the current node first, followed by the left and finally the right child, emitting ‘0’ for an internal node and ‘1’ for a leaf. In order for this representation method to be functional, the trie must be full, meaning that each internal node must have exactly 2 children. In any full (sub)tree the number of leaves is one more than the number of internal nodes, and this property is indispensable for the algorithms that operate on such a tree representation. For this purpose those nodes that have only one child are completed with an extra child, which contains no information. Such empty nodes are called dummy nodes. For keeping track of the dummy nodes, a second bit stream is employed, called the leaf map. In the leaf map a ‘0’ bit is emitted for each dummy node and a ‘1’ otherwise, again in pre-order. Finally, the number of ‘1’ bits in the leaf map is used to match the key to a bucket using a bucket table, BTBL. The bucket, in turn, should contain a pointer to the data record (e.g., security information) belonging to the key. In Figure 8a an example is given for transforming the trie in Figure 7b into a CB trie representation. In Figure 7 the bucket numbers are marked in square brackets beside the non-dummy leaf nodes. The procedure of matching bucket addresses to table positions can be perceived as a hashing operation (see the section on hash tables later in this paper). Indeed, this operation is known as trie hashing [15]. Although adding dummy nodes is very important for this compact trie representation, handling such nodes still adds an unwanted burden on the algorithms operating on CB (Compact Binary) digital search tries. This is why in [12] a method that eliminates dummy nodes is proposed. The proposed method is based on Patricia tries.

6.2.2. Patricia Tries
    Patricia (Practical Algorithm to Retrieve Information Coded in Alphanumeric) tries [13] were defined to eliminate one-child nodes from tries, so as to reduce their spread and depth. In order to keep track of the eliminated links, their values are stored in the remaining nodes. For example, the trie in Figure 7b can be converted into the Patricia trie in Figure 7c, where a total of 5 nodes have been eliminated from the tree structure. In order to avoid false matches in a search operation, the eliminated links are memorized at their parent nodes. Therefore, during a search operation these eliminated links have to be taken into consideration.
    Patricia tries are very efficient in indexing a small number of long keys, or keys with long shared prefixes, and the efficiency of key lookups does not depend on the number of keys, but on their length. Also, since they do not contain one-child nodes or dummy nodes, they have been used to reduce the size of CB tries [12]. CB tries based on Patricia tries are called CPat (Compact Patricia) tries.

6.2.3. CPat Tries
    CPat tries combine the compactness of Patricia and CB tries in two different ways. While Patricia tries save memory by reducing the number of nodes in the tree structure, CPat tries take advantage of the array-based representation, which uses a bit stream instead of pointers, for structuring keys. Similar to CB tries, CPat tries also use two bit streams. The tree map is the same as for CB trees, but the leaf map is not needed in this case because there are no dummy nodes in CPat tries. A node map is used instead, which accounts for the eliminated nodes. Thus, in the node map a ‘0’ is emitted for each node that contains no eliminated nodes. On the other hand, for each eliminated node contained in a node a ‘1’ is emitted, and the string of ‘1’s is terminated by a ‘0’. In the example of Figure 8b the Patricia trie from Figure 7c is represented in a compressed format. Note that in the node map only the internal nodes are represented. There is no need to include the leaf nodes as well, since a leaf node maps to the same bucket even if it contains eliminated nodes.
Figure 8: Compact trie representation

   This combined representation results in a greatly reduced memory requirement for CPat tries. However, in cases when the key is situated at the right-bottom side of a Patricia trie, lookup times are somewhat increased, because most of the bits in the bit stream representation have to be inspected. Furthermore, in the case of insert and delete operations a large number of bits have to be moved. This has a negative impact on the speed of algorithms operating on CPat tries. To overcome this problem, a hierarchical approach, called HCPat (Hierarchical Compact Patricia) tries, has been proposed [16].

6.2.4. HCPat Tries
    HCPat tries decompose the CPat trie into many smaller tries and link them together with pointers. In an HCPat trie the depth (i.e., the number of levels) of the tree is fixed, and if the data cannot fit into the desired number of levels, a new subtrie is created and the root node of the new subtrie is linked to one of the leaves of the old subtrie. This way each subtrie has its own tree map. Therefore, similar to a tree, as the search for a key progresses, some of the subtries are excluded from the search, reducing in this way the number of inspected bits. As an example, the HCPat trie composed of 3 subtries in Figure 7d is compressed in Figure 8c. In Figure 7d bucket numbers are marked in square brackets while internal nodes are numbered in round brackets. Negative numbers in the BTBLs (Figure 8c) are used as pointers to the node with the corresponding internal node number (i.e., the absolute value of the pointer). These pointers link a leaf node of one subtrie to the root of another.
    HCPat tries provide a good compromise between the performance of Patricia and CPat tries. While their memory requirements are about a third of those of Patricia tries, they are about 40 times faster than CPat tries [16]. This performance renders HCPat tries very attractive for the lookup of keys in large key sets.
    Each of the presented trie types provides some improvement over the performance of the basic BDS-tree. However, along with the performance, the computational complexity of these data structures increases as well. In this paper we aim at finding a data structure that can guarantee high-speed lookups with a relatively small number of keys and on devices with reduced computation resources. Therefore, while HCPat tries can provide a great compromise between speed and memory needs for many tens of thousands of keys, for our application with several tens of keys it may turn out that the additional computational complexity implied by this data structure outweighs the benefits. Furthermore, our application cannot benefit from the strong sequential key retrieval property of tries either, since our keys will be retrieved one-by-one. Thus, even though the various types of tries can be very effective in certain application domains, it should be noted that security key lookups for ZigBee and other IEEE 802.15.4-based SoCs do not really match the assumptions that tries have been designed for. Therefore, the usage of tries for these lookups is not very promising.
   In Figure 9 we present high-level pseudo code of the Insert(), Lookup() and Delete() functions for the CPat trie. In these algorithms we use the nodepos, treepos and keypos counter variables to indicate the position of the current inspection in the nodemap, treemap and key bit strings, respectively. Also, MAX_BUCKET_SIZE is a constant indicating the maximum number of keys that a bucket can hold at a time.
   In the Insert() function, first the position where the new key (the function's argument) has to be inserted is searched for. For this purpose the logic described in the Lookup() function can be used. When the bucket is identified, two cases can occur. Either the bucket is partially full – in which case the key is simply added to it (line 4) – or it is completely full. In the latter case the bucket has to be split into two new buckets and the contents of the old bucket must be distributed among the two new ones (line 7). The new key then has to be added to the bucket with the corresponding address and all the buckets in the trie renumbered (line 8), so as to maintain the correspondence between the bucket numbers and the number of ‘1’ bits in the node map.

   Initialization: nodepos ← 1; treepos ← 1; keypos ← 1; indices in nodemap, treemap and key;

   1. Insert(key) // Inserts a key into a CPat trie;
   2.     find insertion point (as in Lookup() );
   3.     if (required bucket partially filled) then
   4.                add key to bucket;
   5.     else // bucket full
   6.                split: convert bucket into an internal node with 2 buckets;
   7.                distribute the keys, including key, among the 2 new buckets;
   8.                renumber all buckets;
   9.     endif;
   10. end Insert();

   11. Lookup(key) // Finds a key in a CPat trie;
   12. repeat
   13.    if (key[keypos] == 1) then
   14.       Skipping left subtree in treemap: advance treepos until nr 1 bits in treemap is one more than nr 0 bits;
   15.       Skipping left subtree in nodemap: advance nodepos until nr 0 bits in nodemap passed the same times as the nr of 0 bits skipped in treemap;
   16.    else
   17.       treepos++;
   18.       if (treemap[treepos] == 0) then
   19.          keypos++; nodepos++;
   20.          if (nodemap[nodepos] == 1) then
   21.             Advance keypos and nodepos until nodemap[nodepos] == 0;
   22.          endif;
   23.       else
   24.          count nr 1 bits in treemap from position 1 to treepos and obtain bucket address;
   25.          if the key is found in the bucket, return ‘Success, return pointer!’, otherwise return ‘Key not found! Return null.’;
   26.       endif;
   27.    endif;
   28. until treemap[treepos] != 1;
   29. end Lookup();

   30. Delete(key) // deletes a key from the trie
   31. find bucket containing key to delete (as in Lookup() );
   32. remove pointer from bucket;
   33. if (nr keys in bucket < MAX_BUCKET_SIZE/2 and nr keys in sibling bucket < MAX_BUCKET_SIZE/2) then
   34.              merge bucket with its sibling;
   35. endif;
   36. end Delete();

Figure 9: Algorithms for the Patricia trie based approach

    The Lookup() function cycles through the bits of the key passed to it as an argument until a leaf node is reached in the trie. If the currently inspected bit of the key is ‘1’, the treepos and nodepos variables are incremented so as to point to the right subtrie. In the treemap this can be achieved by passing one more ‘1’ bit than ‘0’ bits, since the number of leaf nodes in a full subtrie is one more than the number of internal nodes, and this is encoded into the treemap by its creation based on the pre-order traversal of the trie (line 14). In the nodemap the number of ‘0’ bits indicates the nodes effectively represented in the trie. Therefore, the number of ‘0’ bits passed should be equal to the number of ‘0’ bits passed in the treemap (line 15). If, however, the inspected key bit is ‘0’ (line 16), the next bit in the treemap is evaluated. If the treemap bit is ‘0’, keypos and nodepos are advanced, accounting for eliminated nodes as well (lines 19-22), and the search continues with a subsequent cycle. If, however, the treemap bit is ‘1’ (line 23), the search has arrived at a leaf node and the corresponding bucket number is retrieved by counting the passed ‘1’ bits (leaf nodes) in the treemap. The bucket address is obtained from the BTBL (line 24) and the contents are examined. If the searched key is found in the bucket, a pointer to the data of interest is returned; otherwise the key is not in the trie and the search is unsuccessful (line 25).
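
    The bucket-address computation of line 24 is essentially a rank operation on the treemap bit stream. A minimal C helper could look as follows; storing the treemap MSB-first in an array of bytes is an assumption of this sketch.

    #include <stdint.h>

    /* Count the '1' bits (leaf nodes) in the treemap from bit position 1 up to
     * and including 'treepos' (positions are 1-based, as in Figure 9). */
    unsigned treemap_rank1(const uint8_t *treemap, unsigned treepos)
    {
        unsigned ones = 0;
        for (unsigned pos = 1; pos <= treepos; pos++) {
            unsigned idx = (pos - 1) / 8;          /* byte holding this bit   */
            unsigned bit = 7 - ((pos - 1) % 8);    /* MSB-first within a byte */
            ones += (treemap[idx] >> bit) & 1u;
        }
        return ones;                               /* used as index into the BTBL */
    }

    On MCUs that provide a population-count instruction, or with a small per-byte lookup table, whole bytes can be counted at once, which can noticeably shorten the bit-stream scan described above.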
    The Delete() function removes the key passed to it as an argument from the trie. The function starts by identifying the bucket that should contain the key (line 31) and removes the key's pointer from it (line 32). The deletion operation could end at this point; however, this would lead to the formation of buckets that contain too few keys compared to their maximum capacity, MAX_BUCKET_SIZE. This could increase the depth and spread of the trie unnecessarily. To avoid having almost empty buckets in the trie after the deletion, the number of keys in the bucket from which a key has just been removed and in its sibling bucket (i.e., the bucket sharing the same parent node) are counted. If both of these buckets are less than half full (line 33), they are merged (line 34). Merging involves copying the contents of the less full bucket into its sibling bucket and replacing their parent node with the latter. The two old buckets are then removed from the trie. This way an internal node and two buckets are compressed into one single bucket.
    Although the above functions were designed for CPat tries, they can easily be adapted for CB or HCPat tries as well, using [12, 14, 16] as reference.

6.3.    Hash Tables
    In many applications the data used to identify a particular record (i.e., the key) can have an extremely wide range of values, even though only a small subset is ever used at the same time. In such situations a function can be defined that converts the keys so that they fit into an acceptably sized range. This operation produces a table that matches some value from the narrow range to each useful key value in the wide range, resulting in pairs of narrow- and wide-range keys. We can look at this table as an array indexed by the narrow-range key and valued with the wide-range keys, or even valued with pointers to the records that contain the wide-range keys. Such tables and arrays are known as hash tables, and the function that performs the wide-to-narrow range translation is called a hash function. The verb ”to hash” is used for the translation operation.
    Hash tables provide a very fast way of looking up data. As a long key is fed into the hash function, the corresponding hash table index is obtained, and at the same time the searched record (or a pointer to its location in memory) is instantaneously found. Therefore, each lookup operation takes only the evaluation of the hash function. An additional advantage of this data structure is that it can maintain this performance even if the number of keys grows very high. Thus, its scalability is unparalleled.
    Although the lookup speed enabled by hash tables is hard to surpass, this data structure can
provide such high performance only under favourable conditions. This is not the case when the number
of keys that have to be stored in the hash table is unpredictable. First, since hash tables inherit all of
the arrays’ properties, after their dimension is fixed (i.e., the memory has been allocated) it is costly
to re-dimension them. Second, if the array is over-dimensioned to leave space for growth and fluctua-
tion, much of the memory is often wasted. Third, if keys have to be looked up sequentially,
extra time-consuming operations have to be performed in order to ensure that hashed and original
keys are ordered in the same way. Finally, given that hashing squeezes a wide range of keys into
a much smaller range, it can often happen that two different long keys are hashed into the same short
key. Such collisions can be resolved, but at a cost. Usually, hash tables filled up to 70% perform well, but as
their load factor (i.e., the ratio of the used to the total number of hash table positions) grows beyond that
point (i.e., 0.7), the performance starts degrading quickly.
    There are two widely used methods for resolving collisions: separate chaining and open address-
ing. With separate chaining, at each position of the hash table where a collision occurs a linked list is
created, and each key that hashes to that position is appended to the list. At lookup this list is searched
element by element. With open addressing, on the other hand, a collision is resolved by searching for an
empty position for the second key according to some rule. One of the simplest such rules is to assign
the next higher index in the hash table to the key. If that index is also taken, then the next one is tried,
and so on until a free index is found. This rule is known as linear probing. According to another rule,
known as quadratic probing, the new empty index is looked for at a distance from the collision
equal to the square of the step number. In other words, with linear probing the probes follow as x+1, x+2,
x+3, x+4 and so on, while with quadratic probing the rule is x+1, x+4, x+9, x+16, and so on. Quadrat-
ic probing is more efficient in avoiding the formation of clusters, which slow down lookups [17].
However, it still cannot prevent long probe sequences from forming, because the long keys that hash to the
same position will all follow the same probing sequence. To avoid this problem, double hashing (or
rehashing) can be used, which hashes the key a second time with a different function and uses
the result as the step size.
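    The three probing rules can be summarised by the probe-index generators below. This is a minimal C sketch: TABLE_SIZE, the argument names and the wrap-around behaviour are illustrative assumptions; home is the position produced by the primary hash function and i is the probe number.

    #define TABLE_SIZE 23u

    static unsigned linear_probe(unsigned home, unsigned i)
    {
        return (home + i) % TABLE_SIZE;        /* x+1, x+2, x+3, ...            */
    }

    static unsigned quadratic_probe(unsigned home, unsigned i)
    {
        return (home + i * i) % TABLE_SIZE;    /* x+1, x+4, x+9, x+16, ...      */
    }

    static unsigned double_hash_probe(unsigned home, unsigned i, unsigned step)
    {
        return (home + i * step) % TABLE_SIZE; /* 'step' comes from a second hash */
    }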
    In two of the four lookup procedures that we are targeting, we use either a 16-bit short address or a 64-
bit extended address. These addresses are randomly distributed and do not contain redundant
information; therefore, we hash them entirely. The range of these addresses is very wide, yet only 16
such keys will be used at the same time. For this purpose we use a hashTable of HTDIMENSION = 23
positions. We use 7 extra positions to reduce the probability of collisions and to ensure a maximum of
70% usage even when the maximum number of keys (i.e., 16) is present in the hash table. Moreover,
it is good practice to use a prime number for the hash table dimension, to avoid reducing the modulo
operation to the selection of the last several bits of the keys [18]. Thus, to squeeze the huge space of
keys into the 23-position hash table we use the modulo operation as follows:
                                      position = key mod 23,                                (1)
where position indicates the cell in the hash table that will hold a pointer to the security information
record containing key. In Figure 10 the value of position is obtained with the function HashFunc-
tion(). In order to prevent clustering and long probe sequences from forming due to repeated colli-
sions, we use double hashing. We use the following function for rehashing:
                                  stepSize = 5 − (key mod 5).                               (2)
    This type of rehashing function has been proved to work efficiently [19]. The constant can be any
prime number smaller than the hash table’s dimension. The value of stepSize is obtained with the
function DoubleHash() in Figure 10. With double hashing, if we denote the load factor by α =
16/23 ≈ 0.7, the expected number of probes for an unsuccessful lookup will be 1/(1 − α) ≈ 3.33, while for a
successful one it will be (1/α)·ln(1/(1 − α)) ≈ 1.7. This is a worst-case estimate.
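    Expressed in C, the two hash functions of Eqs. (1) and (2) could look as follows. This is a sketch only; the uint64_t key type is an assumption chosen so that both the 16-bit short and the 64-bit extended addresses fit, and the function names mirror those used in Figure 10.

    #include <stdint.h>

    #define HTDIMENSION 23u    /* prime hash table dimension used in the text */

    /* Eq. (1): primary hash squeezing an address into the 23-position table. */
    static unsigned HashFunction(uint64_t key)
    {
        return (unsigned)(key % HTDIMENSION);
    }

    /* Eq. (2): secondary hash giving the probe step size for double hashing.
     * The result is always in 1..5, never zero, so probing cannot get stuck,
     * and since 23 is prime every step size eventually visits all positions. */
    static unsigned DoubleHash(uint64_t key)
    {
        return 5u - (unsigned)(key % 5u);
    }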
    In Figure 10 we present the pseudo code of the Insert(), Lookup() and Delete() algorithms that can
be used with hash tables. In addition, Figure 10 also presents the pseudo code of two auxiliary functions,
ResolveInsertCollision() and ResolveLookupCollision(). Besides the functions and
constants described above, in these pseudo codes we use the position variable to indicate the hashTa-
ble index obtained through hashing, and newPosition for the probe index advanced, in case of collisions,
by the step size calculated with the DoubleHash() function. AddressOf() and ValueOf() are two functions that return the ad-
dress of a key and the record pointed to by a pointer, respectively.

    Initialization: Allocate memory for the array hashTable[HTDIMENSION], containing pointers initialized with null;
HTDIMENSION = 23, a constant, is the dimension of the hash table. After the insertion of a new key, the corresponding
element of hashTable will point to the address of the security information belonging to the key.
   1. Insert(key) // Inserts a key into the hashTable
   2.     position ← HashFunction(key);
   3.     if (hashTable[position] == null) then
   4.                hashTable[position] ← AddressOf(securityInfo.key);
   5.     else
   6.                hashTable[ResolveInsertCollision(position, key)] ← AddressOf(securityInfo.key);
   7.     endif;
   8. end Insert();

   9. Lookup(key) // Returns security info belonging to key if found; returns null otherwise;
   10. position ← HashFunction(key);
   11. if (hashTable[position] != null and ValueOf(hashTable[position]).key == key) then
   12.             return hashTable[position]; // ValueOf() could be returned as well
   13. else
   14.             collision ← ResolveLookupCollision(position, key);
   15.             if (collision == HTDIMENSION) then
   16.                         return “Key not found!”;
   17.             else
   18.                         return hashTable[collision]; // ValueOf() could be returned as well
   19.             endif;
   20. endif;
   21. end Lookup();

   22. Delete(key) // Deletes key from hashTable
   23. position ← HashFunction(key);
   24. if (hashTable[position] != null and ValueOf(hashTable[position]).key == key) then
   25.              hashTable[position] ← null; // the security data itself can be deleted here as well
   26. else
   27.              collision ← ResolveLookupCollision(position, key);
   28.              if (collision != HTDIMENSION) then
   29.                          hashTable[collision] ← null; // the security data itself can be deleted here as well
   30.              endif;
   31. endif;
   32. end Delete();

   33. ResolveInsertCollision(newPosition, key)
   34. repeat
   35.             newPosition ← (newPosition + DoubleHash(key)) mod HTDIMENSION;
   36. until hashTable[newPosition] == null;
   37. return newPosition;
   38. end ResolveInsertCollision();

   39. ResolveLookupCollision(newPosition, key)
   40. repeat
   41.            newPosition ← (newPosition + DoubleHash(key)) mod HTDIMENSION;
   42. until (hashTable[newPosition] == null or
   43.            ValueOf(hashTable[newPosition]).key == key);
   44. if (hashTable[newPosition] == null) then
   45.            return HTDIMENSION; // symbolic value returned: key not found;
   46. else
   47.            return newPosition;
   48. endif;
   49. end ResolveLookupCollision();


Figure 10: Algorithms for the hash table approach

    The Insert() function takes a key as an argument and inserts it into the hashTable. First the key is
hashed (line 2), then its insertion into the hashTable at the obtained index is attempted. If the index
position is empty (line 3), the address of the security information belonging to key is assigned to the
corresponding position of the table. Otherwise a collision has happened and ResolveInsertCollision() is
called to resolve it (line 6). ResolveInsertCollision() takes as arguments the position of the collision
and the original key, and all it does is repeatedly probe the hashTable for an empty position using
DoubleHash() (lines 34-36). When it succeeds, it returns the newPosition, which will then be used as the
index for the security data belonging to key.
    The Lookup() function also begins by hashing the key received as an argument, in order to obtain
the position of that key in the hashTable. If it finds the desired key at the obtained position (line 11),
Lookup() returns the pointer to the appropriate security data (line 12). Otherwise it calls the Re-
solveLookupCollision() function to find the position where the key had been stored. Re-
solveLookupCollision() uses the DoubleHash() function to calculate the step size used for probing
the hashTable (line 41). The probing continues until the desired key is found or an empty hashTable
position, meaning that the key is not present in the table, is encountered. (Note that probing happens
circularly in the hashTable, i.e., if the length of the table is exceeded, the probing continues at its be-
ginning.) To indicate that the key has not been found, HTDIMENSION is returned (line 45). This
is interpreted correspondingly in the calling Lookup() function (line 15). In case of success, the posi-
tion where the key is found is returned to the Lookup() function (line 47), which, in turn, returns the
pointer to the security data belonging to the key in question (line 18).
    The Delete() function works very similarly to Lookup(), except that when it finds the position of
the desired key in the hashTable, it empties that position (lines 25 and 29). If desired, an additional
instruction can be added to also delete the security data belonging to the key.
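    As an implementation sketch, the two core operations of Figure 10 could be written in C roughly as follows. The security_info_t record, the global table and the helper names are illustrative assumptions; Delete() and handling of a completely full table are omitted for brevity.

    #include <stdint.h>
    #include <stddef.h>

    #define HTDIMENSION 23u

    /* Hypothetical security-information record; only the key field matters here. */
    typedef struct {
        uint64_t key;
        /* ... remaining security material (frame counters, key descriptors, ...) */
    } security_info_t;

    static security_info_t *hashTable[HTDIMENSION];   /* initialised to NULL */

    static unsigned HashFunction(uint64_t key) { return (unsigned)(key % HTDIMENSION); }
    static unsigned DoubleHash(uint64_t key)   { return 5u - (unsigned)(key % 5u); }

    /* Insert: probe circularly with the double-hash step until a free slot is
     * found; the caller must keep the load factor below ~70% (at most 16 keys). */
    static void ht_insert(security_info_t *info)
    {
        unsigned pos = HashFunction(info->key);
        while (hashTable[pos] != NULL)
            pos = (pos + DoubleHash(info->key)) % HTDIMENSION;
        hashTable[pos] = info;
    }

    /* Lookup: returns the record holding 'key', or NULL when an empty slot is
     * reached, i.e. the key is not present (cf. Lookup() and
     * ResolveLookupCollision() in Figure 10). */
    static security_info_t *ht_lookup(uint64_t key)
    {
        unsigned pos = HashFunction(key);
        while (hashTable[pos] != NULL) {
            if (hashTable[pos]->key == key)
                return hashTable[pos];
            pos = (pos + DoubleHash(key)) % HTDIMENSION;
        }
        return NULL;
    }

    One design caveat worth noting: with open addressing, simply clearing a slot on deletion, as Delete() in Figure 10 does, can break the probe chain of keys inserted after the deleted one. If deletions are expected to be frequent, a tombstone marker or re-insertion of the keys that follow in the chain is a common remedy.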

7. Efficiency Comparison
    In order to understand which of the three presented data structures serves our needs best, in this
section we evaluate them based on the criteria enumerated in Sect. 2. Our evaluation reflects more of
a worst-case scenario, even though we provide some considerations on average values as well. To
help the understanding of the explanations on the CPat approach, we also include performance esti-
mates for the CB trie.
    Below, a brief summary of the five symbols generally used for comparing the rates of growth of
functions is provided [20]. For this purpose functions f(x) and g(x) are used, and the growth of f is ex-
pressed as a function of g. In this paper only the O, Ω and Θ symbols are used. f(x) = o(g(x))
means that f grows more slowly than g when x is very large. f(x) = O(g(x)) means that f certainly
does not grow at a faster rate than g. f(x) = Θ(g(x)) means that f and g have the same rate of
growth, only the multiplicative constants are uncertain. f(x) = Ω(g(x)) means that f grows at least
as fast as g. f(x) ≈ g(x) means that not only do f and g grow at
the same rate, but in fact f/g approaches 1 as x → ∞.
    To insert, look up or delete a key into/in/from a balanced BST, the number of keys that need to be
evaluated is one per level in the worst case and half the number of levels, h, on average. Thus, in
each of the three cases the time complexity is O(m·h), where m is the number of bits composing the
key and is used here to ease comparison with trie-related performance. However, if the tree is ex-
tremely unbalanced and takes the shape of a linked list, the number of evaluated nodes for these oper-
ations is equal to the number of nodes, n, in the worst case. Operations on such a tree have a time
complexity of Ω(n·m). Node evaluations are performed on bit strings and not on single bits as in the case
of tries; this also adds to the time complexity of the operations.
    When looking up a key in a CB trie, the worst case occurs when the rightmost bit of the tree map is searched
for. This requires scanning the whole tree map, hence the complexity is O(2^h), with h being the
maximum depth of the complete trie. Since in the case of CPat tries the dummy nodes are excluded
from the tree map, the same operation requires only about D×100% of the operations, where D ∈ [0, 1] is the
ratio of dummy leaves to leaf nodes in the CB trie [12]. Thus, the lookup time complexity with
CPat tries is O(D·2^h). The situation is similar for the insert and delete operations as well. The
worst-case scenario is when the leftmost bit of the tree map has to be acted upon, since then all of the
bits in the tree map have to be shifted. The time complexity in this case is O(2^h − h) for the CB trie
and O(D·(2^h − h)) for the CPat trie. Note that the complexities of the CB and CPat algorithms differ
only by the constant D. Thus, the growth of the CPat algorithm execution times with respect to the
growth of the CB algorithm times is Θ(D). Also note that node evaluations in tries mean operations
on bits, which are much simpler than operations on long bit-string keys. Therefore, if we want to
compare the performance of tree- and trie-based solutions, the number of nodes in trees must be mul-
tiplied by the length of the keys. However, while it is not straightforward to compare the time
complexities of trie- and tree-based algorithms, it can clearly be seen that hash table-based algorithms
promise the highest speeds, with reduced memory requirements as well. Since speed is the most im-
portant parameter of our security information lookup application, the hash table-based
solution is clearly the most attractive one for us. With the latter approach, any node can be looked up, inserted
or deleted by simply calculating the value of the hash function. This complexity is O(1) and it does not
depend on the number of keys, though it can degrade if collisions happen. However, if the hash table is di-
mensioned so that it never gets loaded over 70% of its capacity, the performance loss is not significant.
    The memory needs of BSTs can be calculated as follows. Each BST node is made of 3 pointers of
2 bytes each (i.e., key, left child, right child). This requires 48 bits to store each node. Thus, to store
the whole tree 48·k bits are necessary, where k is the number of keys in the BST. For the CB and
CPat tries the total number of bits required is given by the sizes of the tree map, the leaf or node map and the
BTBL. In a CB representation the tree map is 2^(h+1) − 1 bits, the leaf map is 2^h bits and the BTBL is K·16 bits
long, where h is the maximum depth of the complete trie and K is the maximum number of keys that
can be stored in the BTBL. This gives a total of 2^(h+1) − 1 + 2^h + 16K bits. On the other hand, in a CPat
representation the tree map is (1 − D)·2^(h+1) − 1 bits, the node map is 2^h − 1 bits and the BTBL is again K·16 bits
long, totalling (1 − D)·2^(h+1) + 2^h + 16K − 2 bits. Such size complexities can result in quite high
memory requirements. This is due to the fact that with tries not only is a 16-bit pointer stored for each
key, but the nodes themselves are stored bit by bit. Therefore, 64-bit keys would occupy four times more
memory here than in a tree. The memory needed to store a key in a hash table is 16 bits, since it repre-
sents a pointer. Thus, the total amount of memory needed to store the entire hash table depends on the
maximum number of keys, K, that it can store at the same time, and is 16·K bits.
    For example, for storing 16 keys of 64 bits each, setting D = 0.3, the above data structures would require the
following amounts of memory. With BSTs and hash tables, 48·16 = 768 bits
and 16·23 = 368 bits would suffice, respectively. With tries, the maximum depth of the trie is re-
quired for the calculation. If we assume that about 30% of all leaf nodes in the CB trie are dummies, then
16 keys (i.e., 16 real leaf nodes) imply about 8 dummies. We further assume that the trie has 23 internal
nodes. This means that the size of the tree map will be 24 + 23 = 47 bits, while the leaf map
for the CB trie will be of 24 bits and the node map for the CPat trie will be of 23 bits. The BTBL in
both cases will be of 16·16 = 256 bits. Thus the total memory used for the CB and CPat tries will be
47 + 24 + 256 = 327 bits and (1 − 0.3)·(47 + 23 + 256) = 228.2 bits, respectively. This is very good performance;
however, keep in mind that these calculations depend highly on the actual values of the keys, since
these have a major impact on the sparseness and depth of the tries.
    The scalability of BSTs depends on their depth, which depends on the number of nodes in
the tree (many nodes result in many levels), but also on the shape of the tree (an unbalanced tree can
have significantly more levels than a balanced one with the same number of nodes). Since balancing
the tree has its own cost as well and because operations are performed on bit strings and not on bits, on a
scale from 1 to 5 we assign the grade 3 to the scalability of BSTs, meaning medium scalability.
    Trie-based solutions are expected to provide higher scalability than tree-based ones for two rea-
sons. First, tries operate on bits and not bit strings as trees do. Second, in tries common prefixes are
represented only once, while in trees they are repeated each time they occur. On the other hand, trie-
based algorithms are significantly more complicated than those used with trees. Altogether, on a scale
from 1 to 5 we assign the grade 4 to the scalability of tries, meaning good scalability.
    Given that time complexity does not depend on the number of keys in the case of hash tables and
also because memory needs are low, the scalability of hash table-based algorithms is considered the
highest among the data structures presented.
    Hash tables are also favoured from the hardware implementation point of view, because the algo-
rithms are simple enough and the main characteristic operation that has to be performed is the compu-
tation of hash function values. Many hash functions have been specially designed to be effective for
hardware implementation. Tree-based algorithms are also simple; however, they require fast handling
of the bit strings representing keys. Although this may not be too difficult for a resource-constrained device
either, it may somewhat slow down lookup times. This is why on a scale from 1 to 5 we assign trees
the grade 4, meaning that the tree-based solution is good for hardware implementation.
    Finally, despite the fact that tries operate on bits instead of bit strings, we consider them the most
inappropriate for hardware implementation among the presented approaches, since the algorithms that
operate on them are quite complicated for a resource-limited embedded device. This is why on a scale
from 1 to 5 we assign them the grade 2, meaning that tries should preferably be avoided when it comes
to implementing them in hardware.
    Table 6 summarizes the above considerations on the efficiency of lookup solutions discussed in
this paper. Note though, that they should be compared with care because the primitives of the various
data structures are not equivalent. Thus, the number of levels in a tree should not be compared with
the number of levels in a trie. However, from the perspective delimited by the information provided in
this paper, the formulas of Table 6 can still provide solid grounds for understanding which of these
solutions can serve better the needs of the security information lookup application.
Table 6
Various data structures efficiency summary for lookup procedures execution
 Criteria                        BST Tree    CB Trie                    CPat                                   Hash Table
 Lookup Complexity               O(m·h)      O(2^h)                     O(D·2^h)                               O(1)
 Insert/Delete Complexity        O(m·h)      O(2^h − h)                 O(D·(2^h − h))                         O(1)
 Memory needs for index (bits)   48·k        2^(h+1) − 1 + 2^h + 16K    (1 − D)·2^(h+1) − 1 + 2^h − 1 + 16K    16·K
 Scalability                     Medium      Good                       Good                                   Very good
 Hardware implementation         Good        Not good                   Not good                               Very good


8. Conclusions
    In this paper an analysis of the possibilities for implementing lookup algorithms for the security lookup
procedures of ZigBee and other IEEE 802.15.4-based SoCs is presented. The security data, the
security lookup procedures and the criteria used for evaluating lookup
algorithms are described at the beginning of the paper. Based on this information, the actual needs of
the mentioned security application were loosely delimited and the search for lookup algorithms has
been conducted from this perspective.
    The search for lookup algorithms led to the recognition that, first of all, it is important to find an
adequate data structure for indexing the security information. Once the data structure has been selected,
lookup algorithms can be identified more easily, since they are bound to the data structures. Thus,
three main data structures have been identified: binary search trees, tries and hash tables. For each of
these data structures a general presentation is provided, followed by the pseudo code of the insert, lookup
and delete algorithms, and, finally, their efficiency is estimated based on the selected criteria. The
analysis shows that a hash table-based solution can greatly outperform the other two data structures,
especially on the most important metric, the lookup speed. Furthermore, the hash table-based ap-
proach has low memory needs as well, and it does not require complicated algorithms either. On the
other hand, tree-based approaches are simple, but they compare many bit-string keys until they find
the desired one, which slows them down. Tries aim at improving on this by requiring bit comparisons
only. However, they achieve this with additional lines of algorithm code, which renders them less
attractive for implementation. In conclusion, based on the analysis in this document we suggest that
the described hash table-based approach be implemented, and that the tree- and trie-based approaches
be considered for implementation only if execution time permits.

9. References
[1] T. Basmer, H. Schomann and S. Peter, Implementation analysis of the IEEE 802.15.4 MAC for
    wireless sensor networks, in: Proceedings of 2011 International Conference on Selected Topics
    in Mobile and Wireless Networking (iCOST), Shanghai, China, 2011, pp. 7-12. doi:
    10.1109/iCOST.2011.6085840.
[2] L. Feng, H. Chen, T. Li, J. Chiou and C. Shen, Design and Implementation of an IEEE 802.15.4
     Protocol Stack in Embedded Linux Kernel, in: Proceedings of 2010 IEEE 24th International
     Conference on Advanced Information Networking and Applications Workshops, Perth, WA,
     Australia, 2010, pp. 251-256. doi: 10.1109/WAINA.2010.72.
[3] G. Cornetta, A. Touhafi, D. J. Santos, J. M. Vázquez, Field-programmable gate array implemen-
     tation of an all-digital IEEE 802.15.4-compliant transceiver, International Journal of Electronics,
     97:12, (2010) 1361-1378, doi: 10.1080/00207217.2010.488909
[4] N. S. Bhat, Design and Implementation of IEEE 802.15.4 Mac Protocol on FPGA, in: IJCA Pro-
     ceedings on Innovative Conference on Embedded Systems, Mobile Communication and Compu-
     ting (ICEMC2), 2011, pp. 1-5.
[5] P. Hämäläinen, M. Hännikäinen, T. Hämäläinen, Efficient hardware implementation of security
     processing for IEEE 802.15.4 wireless networks, In: Proceedings of the 48th IEEE Int. Midwest
     Symp. on Circuits and Systems (MWSCAS 2005), Cincinnati, OH, USA, 2005, pp. 484–487.
[6] M. L. Chávez, F. R. Henríquez, E. L. Trejo, AES-CCM implementations for the IEEE 802.15.4
     devices, in: IFAC Proceedings Volumes, Volume 40, Issue 22, 2007. doi:10.3182/20071107-3-
     FR-3907.00030.
[7] M. Cherkaskyy and A. Melnyk, Complexity of specialized processors, The Experience of De-
     signing and Application of CAD Systems in Microelectronics, 2003. CADSM 2003. In: Proceed-
     ings of the 7th International Conference., Slavske, Ukraine, 2003, pp. 209-210, doi:
     10.1109/CADSM.2003.1255033.
[8] A. Melnyk. “Computer Memory with Parallel Conflict-Free Sorting Network-Based Ordered
     Data Access.” Recent Patents on Computer Science, volume 8, Issue1, 2015. pp. 67–77. doi:
     10.2174/2213275907666141021234845.
[9] IEEE Standard 802.15.4, “Wireless Medium Access Control (MAC) and Physical Layer (PHY)
     Specifications for Low-Rate Wireless Personal Area Networks (WPANs)”, Sept 2006.
[10] Viktor Melnyk. “Implementation Options of Key Retrieval Procedures for the IEEE 802.15.4
     Wireless Personal Area Networks Security Subsystem.” Advances in Cyber-Physical Systems, 4,
     1 (2019) 42–54.
[11] N. Sastry and D. Wagner, “Security considerations for IEEE 802.15.4 networks”, In: Proceedings
     of the 2004 ACM workshop on Wireless security, Philadelphia, PA, USA, 2004, pp. 32–42, doi:
     10.1145/1023646.1023654.
[12] M. Shishibori, M. Okuno, Kazuaki Ando and Jun-ichi Aoe, An Efficient Compression Method
     for Patricia Tries, in: IEEE International Conference on Systems, Man, and Cybernetics. Compu-
     tational Cybernetics and Simulation, Orlando, FL, USA, 1997, pp. 415-420 vol.1, doi:
     10.1109/ICSMC.1997.625785.
[13] Donald R. Morrison, PATRICIA – Practical Algorithm To Retrieve Information Coded in Alpha-
     numeric, Journal of the ACM, Vol. 15, No. 4, 1968, pp. 514-534.
[14] Wiebren De Jonge, Andrew S. Tanenbaum and Reind P. Van De Riet. “Two Access Methods
     Using Compact Binary Trees.” IEEE Transactions on Software Engineering, vol. 13, no. 7 (Jul
     1987), pp. 799-810, doi: 10.1109/TSE.1987.233491.
[15] W. Litwin, Trie Hashing, in: Proceedings of the ACM-SIGMOD International Conference on
     Management of Data, Ann Arbor, MI, 1981, pp.19-29.
[16] M. Jung, M. Shishibori, A. Tanaka and Jun-ichi Aoe, "A Dynamic Construction Algorithm for
     the Compact Patricia Trie Using the Hierarchical Structure." Information Processing & Manage-
     ment, Vol.38, No.2, 2002, pp.221-236, doi: 10.1016/S0306-4573(01)00031-0.
[17] M. Płaza, S. Deniziak, M. Płaza, R. Belka, P. Pięta, Analysis of parallel computational models
     for clustering, in: Proceedings SPIE 10808, Photonics Applications in Astronomy, Communica-
     tions, Industry, and High-Energy Physics Experiments 2018, 108081O (1 October 2018). doi:
     10.1117/12.2500795.
[18] Juris Viksna, Construction and Analysis of Efficient Algorithms, Course Handouts, 2020. URL:
     http://susurs.mii.lu.lv/juris/courses/alg2020.html.
[19] M. Waite, R. Lafore, Data Structures & Algorithms in Java, Waite Group Press, Corte Madera,
     CA, 1998.
[20] H. S. Wilf, Algorithms and Complexity, A K Peters/CRC Press, 2nd edition, 2002.