Decoding Superpositions of Bound Symbols Represented by Distributed Representations

Michael Hersche1,2, Zuzanna Opala1, Geethan Karunaratne1,2, Abu Sebastian1 and Abbas Rahimi1,*

1 IBM Research-Zurich, Säumerstrasse 4, 8803 Rüschlikon, Switzerland
2 ETH Zürich, Rämistrasse 101, 8092 Zürich, Switzerland

NeSy 2023: 17th International Workshop on Neural-Symbolic Learning and Reasoning, July 03–05, 2023, Siena, Italy
* Corresponding author: abr@zurich.ibm.com (A. Rahimi)

Abstract
Vector-symbolic architectures (VSAs) express data structures of arbitrary complexity and perform symbolic computations on them by exploiting high-dimensional distributed representations and their associated key operations. VSAs typically use dense random vectors, aka hypervectors, to represent atomic symbols that can be combined into compound symbols by multiplicative binding and additive superposition operators. For instance, a VSA-based neural encoder can bind two atomic symbols, and further superpose a set of such bound symbols—all with distributed vectors of the same dimension. Nevertheless, decoding such an additive-multiplicative vector into the atomic symbols from which it is built is not a trivial task. Recently, a solution based on resonator networks was proposed to iteratively factorize one of the bound symbols. After finding the factorization, it is explained away by subtracting it from the superposition. This explaining away, however, causes noise amplification that limits the number of symbols that can be reliably decoded in large problems. Here, we present novel methods that efficiently decode VSA-based data structures consisting of multiplicative bindings and additive superpositions of symbols. We extend the purely sequential explaining-away approach by performing multiple decodings in parallel using a dedicated query sampler. Compared to the baseline resonator network, this mix of sequential and parallel decoding retrieves up to 8× more additive components from larger problems in synthetic and real-world experiments.

Keywords
Vector-symbolic architectures, hyperdimensional computing, resonator networks, decoding neural structures, explaining away

1. Introduction

Vector-symbolic architectures (VSAs) [1, 2, 3, 4, 5] are a family of computational models that provide an elegant formalism to encode, manipulate, and, importantly, bind symbols while keeping the size of the distributed representation fixed, regardless of whether it represents a single entity or a nested structure of multiple entities. VSAs achieve this by encoding base codewords as high-dimensional random vectors, aka hypervectors. As such, they represent data in a distributed manner and often serve as a mediator between rule-based symbolic reasoning and connectionist models such as neural networks. Recent work [6] has shown how VSAs, as a common language between neural networks and symbolic AI, can overcome the binding problem of neural networks and the exhaustive search of symbolic approaches.

Figure 1: (a) We encode multiple objects as the superposition of the individual high-dimensional symbol bindings. (b) The commonly used, purely sequential explaining-away decoding with resonator networks results in noise amplification (indicated with grey arrows). (c) Our mixed decoding performs multiple decodings in parallel, which allows finding the correct solution.
While VSAs efficiently encode single or multiple objects into complex structures, decoding is often challenging. Let us focus on two core operations used by VSAs. Additive superposition (⊕), to which we will refer in short as superposition, creates a sum vector that is similar to all of the input vectors. In contrast, multiplicative binding (⊙), or simply binding, produces a product vector that is dissimilar to its factors. To check whether a given query vector is present in the sum vector, we can calculate the similarity between the sum vector and the query. For binding, however, we would need to check all possible combinations of factors. This makes the decoding of bound symbols a hard combinatorial problem [7].

An elegant solution is resonator networks [7, 8], which iteratively decode the binding by exploring only a fragment of the exponential search space. The resonator network has also been demonstrated to decode the superposition of more than one bound symbol by repeatedly invoking the resonator decoding and applying an explaining-away operation (subtraction) of the result from the query. The resonator network, however, has two limitations. First, its dynamical system suffers from a relatively low operational capacity (i.e., a low ratio between the exponential search space and the vector dimensionality required to represent bound symbols). This is mainly due to limit cycles, deterministic computation, and the lack of nonlinearity during each iteration. A stochastic sparsely-activated resonator [9] recently addressed this limitation by exploiting the stochasticity of memristive devices and by proposing nonlinear sparse activations. These new dynamics enhance the operational capacity by five orders of magnitude and reduce the spatial and time complexity associated with the factorization task.

The second limitation of the resonator network is the explaining-away decoding, which we address in this work (see Fig. 1). More precisely, when decoding the superposition by explaining away the resonator's product estimates, mistakes become more probable as more bound symbols are added. This is because a query that contains several superposed vectors can be seen as noisy: each added term acts as noise from the standpoint of decoding the others. After decoding a wrong bound symbol, the resonator network runs into a noise amplification problem: subtracting bound symbols that were not present adds more noise to the query, making it even harder to decode the remaining bound symbols.

In this work, we present a configurable decoding method to efficiently extract a set of bound symbols—each represented by a Hadamard product distributed vector—from their fixed-width superposition. Specifically, we detail how naive sequential decoding can be enhanced by generating multiple queries in parallel by means of a sampling mechanism (see Fig. 1c). By combining sequential and parallel decoding, our mixed decoding approach mitigates the risk of noise amplification and increases the number of bound symbols that can be successfully decoded by up to 8× at the same vector dimensionality.

2. Background

2.1. Vector Symbolic Architectures (VSAs)

Vector symbolic architectures (VSAs) represent base entities as random high-dimensional vectors. There are different VSA variants (see [10, 11] for a review). This work focuses on the Multiply-Add-Permute (MAP) coding scheme [2], where vectors are bipolar: x ∈ {−1, 1}^D. The similarity of two vectors is defined as their cosine similarity. Usually, we assume that the code vectors are generated randomly and stored in a lookup memory, often called a codebook. The algebra on such vectors is defined by three basic operations. Multiplicative binding (⊙) is the elementwise multiplication (i.e., the Hadamard product) of vectors. The resulting vector is dissimilar to the bound factors. The inverse operation is unbinding, which in the MAP architecture is also the elementwise multiplication. Bundling (⊕) is the additive superposition of a set of vectors. Optionally, the result can be bipolarized by setting positive values to "+1" and negative ones to "-1"; zero values (ties) are randomly set to "+1" or "-1." The resulting vector is similar to all the superposed terms. Finally, permutation (Π) shuffles the elements of a vector according to a given permutation. It is suitable for encoding sequences.
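To make these operations concrete, the following is a minimal NumPy sketch of the MAP primitives and the cosine similarity described above. The function names, the random seed, and the dimension D = 256 are illustrative choices for this example rather than anything prescribed by the MAP scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 256  # illustrative dimension

def random_hv(d=D):
    """Draw a random bipolar hypervector from {-1, +1}^d."""
    return rng.choice([-1, 1], size=d)

def bind(*vectors):
    """Multiplicative binding: elementwise (Hadamard) product; dissimilar to its factors."""
    return np.prod(np.stack(vectors), axis=0)

def bundle(*vectors, bipolarize=True):
    """Additive superposition; optionally bipolarized with random tie-breaking."""
    s = np.sum(np.stack(vectors), axis=0)
    if bipolarize:
        ties = rng.choice([-1, 1], size=s.shape)
        s = np.where(s == 0, ties, np.sign(s))
    return s

def permute(v, shift=1):
    """Permutation, here a cyclic shift, e.g., for encoding sequences."""
    return np.roll(v, shift)

def cos_sim(a, b):
    """Cosine similarity used to compare hypervectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Binding is invertible by rebinding, while bundling preserves similarity.
a, b = random_hv(), random_hv()
p = bind(a, b)
print(cos_sim(bind(p, b), a))    # ~1.0: unbinding recovers the factor
print(cos_sim(p, a))             # ~0.0: the product is dissimilar to its factors
print(cos_sim(bundle(a, b), a))  # clearly > 0: the sum is similar to its terms
```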
2.2. Encoding Nested Structures with VSAs

Here, we formalize the representation of nested data structures using VSAs. Let us first focus on the VSA-based symbol binding with F symbols. The encoding starts with defining F separate codebooks (X_1, X_2, ..., X_F). The codebooks are defined as X_f := {x^f_i}_{i=1}^{M_f}, where each x^f_i ∈ {−1, 1}^D, f ∈ {1, 2, ..., F}, is a randomly drawn bipolar vector. Each codebook contains as many codewords (M_f) as the number of values the corresponding symbol can have¹. For example, we could have one color codebook to store three random vectors representing red, blue, and green colors. In a second codebook, we could store the possible shapes with three code vectors for square, triangle, and circle. A product vector p = \bigodot_{f=1}^{F} x^f_{c_f} is then formed by binding the selected symbols from the codebooks. As a next step, the combination of binding and superposition can describe more complex structures:

q = \sum_{i=1}^{N} p_i = \sum_{i=1}^{N} \bigodot_{f=1}^{F} x^f_{c_{i,f}}.    (1)

Overall, we superpose N different products (i.e., p_i ≠ p_j, ∀ i ≠ j). In most applications, it is assumed that the number of superposed terms is known. The goal of the decoding is to find all indices ĉ_{i,f}, ∀ i ∈ {1, ..., N}, f ∈ {1, ..., F}.

¹ In most cases, we assume that all codebooks contain the same number of codewords, i.e., M_i = M_j ∀ i, j ∈ {1, 2, ..., F}.
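As an illustration of the encoding in Eq. (1), the following sketch builds F random codebooks, binds one codeword per factor into each product p_i, and superposes N such products into a query q. The concrete sizes (D = 256, M_f = 110, N = 3) and helper names are assumptions chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
D, F, N = 256, 2, 3          # dimension, number of factors, number of superposed products
M = [110, 110]               # codebook sizes M_1, ..., M_F (assumed equal here)

# Codebooks X_1, ..., X_F of random bipolar codewords, each of shape (M_f, D).
codebooks = [rng.choice([-1, 1], size=(M[f], D)) for f in range(F)]

# Randomly pick the indices c_{i,f} of the N products to encode.
indices = np.stack([rng.integers(0, M[f], size=N) for f in range(F)], axis=1)  # (N, F)

# Each product p_i is the Hadamard product of one codeword per factor (inner term of Eq. (1)).
products = np.stack([
    np.prod(np.stack([codebooks[f][indices[i, f]] for f in range(F)]), axis=0)
    for i in range(N)
])  # (N, D)

# The query q is the (unnormalized) sum of the N products, Eq. (1).
q = products.sum(axis=0)

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Sanity check: each encoded product remains similar to the query q.
print([round(cos_sim(q, p), 2) for p in products])
```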
2.3. Resonator networks

To decompose a single bound-symbol representation (p), we aim to find the estimated factors x̂^f ∈ X_f, ∀ f ∈ {1, 2, ..., F}, that satisfy

p = \bigodot_{f=1}^{F} \hat{x}^f.    (2)

A naive brute-force approach would compare the product vector (p) to all possible combinations spanned by the product space

P = \bigodot_{f=1}^{F} X_f := \{ x^1_1 \odot x^2_1 \odot \dots \odot x^F_1,\ x^1_2 \odot x^2_1 \odot \dots \odot x^F_1,\ \dots,\ x^1_{M_1} \odot x^2_{M_2} \odot \dots \odot x^F_{M_F} \}.    (3)

This results in a combinatorial search problem requiring \prod_{f=1}^{F} M_f similarity computations.

Alternatively, resonator networks [7, 8] reduce the decoding complexity by iteratively searching in superposition through all possible solutions. Recently, a nondeterministic and nonlinear variant [9] introduced nonlinear activations and stochastic behavior, notably improving the factorization convergence and the solvable problem size. The decoding begins with initializing the estimate factors (x̂^1(0), ..., x̂^F(0)) by bundling all vectors from the corresponding codebooks. Then, each estimate factor (x̂^f(t)) at iteration t is updated through the following steps. The estimate factors from the previous iteration are unbound from the product vector using the Hadamard product:

\tilde{x}^f(t) = p \odot \Big( \bigodot_{i=1, i \neq f}^{F} \hat{x}^i(t) \Big).    (4)

Next, we query the associative memory, which contains the codebook X_f for factor f, with the unbound factor estimate. At iteration t, this yields an M_f-dimensional vector of similarity scores (a_f(t)) for each factor f. The i-th element of a_f(t) is computed using the cosine similarity (cos):

a_f(t)[i] = \cos(\tilde{x}^f(t), x^f_i).    (5)

The nondeterministic and nonlinear resonator network variant [9] processes the similarity vector by adding random noise and passing it through a threshold function:

a'_f(t)[i] = \mathrm{thresh}(a_f(t)[i] + n;\ T),    (6)

where n is an i.i.d. Gaussian random variable with zero mean and standard deviation σ, and thresh(·; T) propagates similarity values that are larger than the threshold T, whereas lower ones are zeroed out. Finally, we generate the next factor estimate x̂^f(t+1) as the weighted bundling of the factor's codevectors followed by an elementwise sign activation (sign):

\hat{x}^f(t+1) = \mathrm{sign}\big( (X_f)^T a'_f(t) \big).    (7)

The iterative decoding is repeated until the resonator converges or a predefined maximum number of iterations is reached. The convergence detection mechanism is based on an additional, fixed threshold: decoding is stopped as soon as all similarity vectors contain an element that exceeds a predefined detection threshold value [9].
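The update rules of Eqs. (4)–(7) translate into a few lines of NumPy. The following is a simplified sketch of the stochastic sparsely-activated iteration for a single product vector; the noise level σ, the activation threshold T, and the detection threshold are illustrative placeholders rather than the tuned values of [9], so convergence on a given run is not guaranteed.

```python
import numpy as np

rng = np.random.default_rng(2)
D, F, M = 256, 2, 110
sigma, T, detect_T, max_iter = 0.02, 0.05, 0.8, 2000   # assumed, untuned hyperparameters

codebooks = [rng.choice([-1, 1], size=(M, D)) for _ in range(F)]
true_idx = [int(rng.integers(M)) for _ in range(F)]
p = np.prod(np.stack([codebooks[f][true_idx[f]] for f in range(F)]), axis=0)

def bipolarize(v):
    """Sign activation with random tie-breaking for zero entries."""
    out = np.sign(v).astype(float)
    zeros = out == 0
    out[zeros] = rng.choice([-1.0, 1.0], size=int(zeros.sum()))
    return out

# Initialization: bundle all codewords of each codebook.
x_hat = [bipolarize(X.sum(axis=0)) for X in codebooks]

for t in range(max_iter):
    sims, new_x_hat = [], []
    for f in range(F):
        # Eq. (4): unbind the other factor estimates from the product vector.
        others = np.prod(np.stack([x_hat[i] for i in range(F) if i != f]), axis=0)
        x_tilde = p * others
        # Eq. (5): cosine similarity to every codeword (bipolar rows have norm sqrt(D)).
        a = codebooks[f] @ x_tilde / (np.linalg.norm(x_tilde) * np.sqrt(D))
        sims.append(a)
        # Eq. (6): add Gaussian noise and zero out entries below the threshold T.
        a_noisy = a + rng.normal(0.0, sigma, size=M)
        a_prime = np.where(a_noisy > T, a_noisy, 0.0)
        # Eq. (7): weighted bundling of the codewords followed by a sign activation.
        new_x_hat.append(bipolarize(codebooks[f].T @ a_prime))
    x_hat = new_x_hat
    # Convergence detection: every similarity vector has an element above detect_T.
    if all(s.max() > detect_T for s in sims):
        break

decoded = [int(np.argmax(codebooks[f] @ x_hat[f])) for f in range(F)]
print(decoded == true_idx, "after", t + 1, "iterations")
```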
3. Related Works

The decoding of the nested structures in Eq. (1) can be posed as a pure superposition decoding task when unrolling the entire product space. A simple approach is to compute the similarity between the superposition (q) and each vector in the product space (P). Instead of the bipolar product matrix, an alternative readout matrix can be optimized to minimize the mean-squared error (MMSE) between the estimated and ground-truth indices, which enables the retrieval of more vectors from the superposition [12]. Another approach is iterative decoding with explaining-away steps, where the current estimate is subtracted from the superposition, effectively reducing the interference. The iterative decoder is beneficial in robust short-packet communication [13], and has been improved with a soft-feedback alignment and an MMSE readout for better retrieval, reaching a state-of-the-art capacity of 1.2 bits/dimension [14].

The approaches above are able to retrieve many elements from a superposition; however, they cannot leverage the implicit factor structure that the product space provides. As a result, they need to search through all possible combinations in each decoding step, which requires explicitly maintaining all product vectors in memory and computing the similarity between each product vector and the query. In contrast, resonator networks have been demonstrated to factorize not only single bound symbols but also superpositions of multiple bound symbols. In an experiment [7], colored MNIST digits were placed at nine different positions (three vertical and three horizontal). Each colored digit was then described as the binding of four symbols: vertical position with three values, horizontal position with three values, color with seven values, and digit with ten values. This yielded a product space of 630 combinations. A vector dimension of D = 500 was sufficient to distinguish between all combinations. The resonator network was queried in multiple sequential steps to retrieve multiple bound symbols. In each step, the resonator network's estimate was subtracted from the input query. This explaining-away approach reduced the interference from all the elements in the initial query. Despite the limited problem size (630), the resonator network struggled to reliably retrieve more than one bound symbol from the superposition. The main problem is the noise amplification in the purely sequential decoding: an additional vector is added to the superposition if a prediction is wrong (see Fig. 1b). This additional vector can be interpreted as noise. In this work, we enhance the sequential decoding by performing multiple decodings in parallel in each step. We generate multiple queries for the resonator network by means of a dedicated sampling. Our mixed decoding approach can retrieve up to 8× more bound symbols from a larger problem space (12,100) at a smaller dimensionality (D = 256).

4. Mixed Approach to Superposition Decoding

This section presents the main contribution of the paper: we propose a mixed decoding approach to retrieve superpositions of bound symbols. Fig. 2 depicts the architecture of our mixed decoding. It relies on sequential decoding with explaining-away steps, and extends it with parallel decodings in each step. The number of parallel decodings (denoted as the width W) and the number of sequential steps (depth L) can be configured depending on the problem at hand, as we will show in the experimental results (see Table 1).

Figure 2: Configurable mixed decoding of superpositions of bound symbols. The width (W) describes the number of parallel decodings per step using query sampling and resonator-based decoding. The explaining-away procedure is repeated for L steps.

4.1. Parallel Decoding

The parallel decoding makes W predictions per step by decoding W different sub-queries. The sub-queries can be generated by random sampling. For example, if the input query is a superposition of an even number of bound bipolar symbols, the zeros (ties) in the query vector can be randomly replaced with either a "+1" or a "-1". Suppose no ties are present in the superposition (e.g., due to an odd number of superposed terms). In that case, we use an alternative sampling strategy that samples each element based on its magnitude:

q_w[i] = \begin{cases} +1, & \text{with probability } (N + q[i])/(2N) \\ -1, & \text{otherwise,} \end{cases}    (8)

where N is the number of superposed terms and w is the index of the sub-query. At step l, the parallel decoding yields W possible solutions. Finally, we select the most promising solution based on the highest similarity between the predicted bound symbol (\hat{r}^w_l) and the original query. The selected predicted bound symbol (\hat{r}_l) is then passed to the next step for the explaining away.

4.2. Sequential Explaining Away

As the next step, the selected predicted bound symbol is subtracted from the input query, yielding

q_{l+1} = q_l − \mathrm{sim}(q, \hat{r}_l) \cdot \hat{r}_l,    (9)

where sim(·, ·) is the cosine similarity. We scale the subtraction with the similarity between the selected estimate and the original query, which serves as a confidence indicator [14]. The explaining-away step attenuates predictions with low confidence, whereas highly confident predictions are fully subtracted.
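The two building blocks of one decoding step can be sketched as follows. The `factorize` argument stands in for a resonator call (Section 2.3) and is only a placeholder signature here; the sub-query sampling follows Eq. (8) and the scaled subtraction follows Eq. (9).

```python
import numpy as np

rng = np.random.default_rng(3)

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def sample_subquery(q, n_superposed):
    """Draw one bipolar sub-query from the integer-valued query q.
    If ties (zeros) are present, only those are resolved at random;
    otherwise each element is sampled by its magnitude as in Eq. (8)."""
    q = q.astype(float)
    if np.any(q == 0):                       # even number of superposed terms
        ties = rng.choice([-1.0, 1.0], size=q.shape)
        return np.where(q == 0, ties, np.sign(q))
    prob_plus = (n_superposed + q) / (2 * n_superposed)
    return np.where(rng.random(q.shape) < prob_plus, 1.0, -1.0)

def mixed_decoding_step(q_l, q_orig, n_superposed, width, factorize):
    """One step: W parallel decodings on sampled sub-queries (Section 4.1),
    followed by the scaled explaining-away subtraction of Eq. (9)."""
    candidates = [factorize(sample_subquery(q_l, n_superposed)) for _ in range(width)]
    # Select the candidate bound symbol most similar to the *original* query.
    r_hat = max(candidates, key=lambda r: cos_sim(q_orig, r))
    confidence = cos_sim(q_orig, r_hat)
    q_next = q_l - confidence * r_hat        # Eq. (9)
    return r_hat, confidence, q_next
```

A complete decoder repeats this step L times and collects the selected bound symbols, as described in Section 4.3.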
4.3. Final Result Selection

After performing L decoding steps, we select the N distinct predictions with the highest confidence, based on the cosine similarity between their bound-symbol representation and the input query.

5. Experimental Results

We evaluate the proposed mixed decoding on synthetic superpositions of bound symbols and on query examples generated by a neural network. We use the state-of-the-art stochastic sparsely-activated resonator [9] for optimal factorization performance. Following the practice of [8], the number of iterations is limited to 1/1000 of the product space size, but to at least 2000.

Retrieval Capacity on Synthetic Queries. We start with the evaluation of our approach on synthetically generated data. We choose a problem with F = 2 factors, an equal codebook size of M_1 = M_2 = 110, and a dimension D = 256. This yields a total problem size of M_1 · M_2 = 12,100, which, e.g., could be encountered in the extreme-label text classification task of Amazon-12K [15]. Compared to [7], our proposed synthetic problem is notably larger (12,100 vs. 630) while the dimension is reduced (256 vs. 500). To evaluate the performance, we conduct 1000 experiments. In each experiment, we randomly select N symbol pairs, bind them, and superpose all the bound symbol pairs. The average decoding accuracy is determined as the average ratio between correctly factorized symbol pairs and the total number of superposed pairs (N). Finally, we define the retrieval capacity as the highest number of superposed symbol pairs that can be successfully decoded with >99% accuracy.

Table 1
Retrieval capacity on synthetic examples. The stochastic sparsely-activated resonator is configured with F = 2 factors, a codebook size of M_1 = M_2 = 110, dimension D = 256, and a maximum of 2000 iterations. N is the number of superposed bound symbols.

Method                 Width   Depth   Total decodings   Retrieval capacity
Explaining away [7]      1       N           N                   1
Sequential decoding      1      2N          2N                   5
                         1      4N          4N                   5
Parallel decoding       2N       1          2N                   1
Mixed decoding           2       N          2N                   5
                         4       N          4N                   8

Table 1 compares the retrieval capacity of the stochastic sparsely-activated resonator under different decoding approaches. Each approach can be described by its width and depth, which together determine the total number of queries the resonator has to decode. The standard explaining away [7] can only retrieve one bound symbol pair, i.e., it cannot decode any superposition. Increasing the depth to twice the number of superposed terms (2N) improves the retrieval capacity to 5. However, the benefit of increasing the depth rapidly saturates: the retrieval capacity stays the same for a depth of 2N and 4N. Even though parallel decoding is not effective in isolation, it notably improves the retrieval capacity when applied together with sequential decoding (hence the name mixed decoding). Increasing the width to 2 improves the retrieval capacity to 5, and further increasing the width to 4 yields the highest retrieval capacity (8). Compared to pure sequential decoding with a depth of 4N, our mixed decoding retrieves more bound symbol pairs (8 vs. 5) while requiring the same number of queries (4N).
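Putting Sections 4.1–4.3 together, the configurations in Table 1 differ only in the width W and depth L passed to a loop like the one below. This is a structural sketch under the assumption that `factorize` and `sample` implement the resonator call and the Eq. (8) sampler, respectively; it is not the exact implementation used for the reported numbers.

```python
import numpy as np

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def mixed_decode(q, n_superposed, width, depth, factorize, sample):
    """Mixed decoding with configurable width W and depth L (cf. Table 1).

    `factorize(sub_query)` is expected to return one decoded bound symbol
    (e.g., a resonator network call), and `sample(q, n)` draws a bipolar
    sub-query as in Eq. (8); both are passed in as callables here.
    """
    q_orig, q_l = q.astype(float), q.astype(float)
    candidates = []                                   # (bound symbol, confidence) per step
    for _ in range(depth):                            # L sequential explaining-away steps
        # Width W: decode several sampled sub-queries in parallel, keep the best one.
        preds = [factorize(sample(q_l, n_superposed)) for _ in range(width)]
        r_hat = max(preds, key=lambda r: cos_sim(q_orig, r))
        conf = cos_sim(q_orig, r_hat)
        candidates.append((r_hat, conf))
        q_l = q_l - conf * r_hat                      # Eq. (9): scaled explaining away
    # Final result selection (Section 4.3): N distinct predictions with highest confidence.
    candidates.sort(key=lambda rc: rc[1], reverse=True)
    selected = []
    for r_hat, conf in candidates:
        if not any(np.array_equal(r_hat, s) for s in selected):
            selected.append(r_hat)
        if len(selected) == n_superposed:
            break
    return selected
```

With width = 1 and depth = N, the loop reduces to the standard explaining away of [7]; the mixed configurations in Table 1 keep the depth at N and raise only the width.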
RAVEN. The RAVEN dataset [16] contains Raven's progressive matrices tests with gray-scale images at a resolution of 160 × 160. A test provides 8 context and 8 answer panels, each containing objects with the following attributes: position, type, size, and color. The objects can be arranged in 7 different constellations containing a variable maximum number of objects (see N_max in Table 2). Following [6], we combine the positions from the different constellations, yielding 22 unique positions. Overall, the dataset contains C = 6600 attribute combinations (22 positions × 5 types × 6 sizes × 10 colors). For every constellation, the dataset provides 6000 examples for training, 2000 for validation, and 2000 for testing.

This work focuses on recognizing all the objects in a given panel. We define four codebooks for position, type, size, and color. We then train a ResNet-18 to map an input image to a D-dimensional query (q), which approximates the superposition of all the bound symbols present in the image. This is achieved by maximizing the similarity between the query and each bound symbol using an additive cross-entropy loss [6]. Note that only the trainable weights of the ResNet-18 are updated during training, while the codebooks are frozen. We train the architecture for 100 epochs using stochastic gradient descent (SGD) with a batch size of 256.

As a decoding baseline, the brute-force approach computes the similarity between the query and each possible bound symbol in the product space. It selects the top-k bound symbols with the highest similarity if an oracle provides the number of objects; otherwise, it detects the bound symbols whose similarity exceeds a threshold. Our resonator-based approach supports the same two modes (oracle and threshold).

Table 2
Average object recognition accuracy on the RAVEN dataset. We set the decoding depth to N_max + 1 and the width to 1. The number of objects in each panel is either known (oracle) or not (threshold). C = Center; L-R = Left-right; U-D = Up-down; OI-C = Out-in center; 2x2 = 2x2 grid; OI-G = Out-in grid; 3x3 = 3x3 grid; Avg. = Average accuracy.

                                   C      L-R    U-D    OI-C   2x2    OI-G   3x3    Avg.
Max. number of objects (N_max)     1      2      2      2      4      5      9
Brute force [6]   threshold        100%   99.9%  100%   99.9%  99.5%  99.0%  83.6%  97.4%
                  oracle           100%   100%   100%   100%   100%   99.6%  95.3%  99.3%
Resonator (ours)  threshold        100%   100%   100%   100%   100%   99.0%  83.6%  97.5%
                  oracle           100%   100%   100%   100%   100%   99.9%  99.3%  99.9%

Table 2 compares the classification accuracy of the brute-force and our resonator-based decoding. Both approaches can recognize up to five objects with high accuracy (≥ 99%) using only the threshold. The main challenge, however, appears in the 3x3 grid constellation with up to nine objects. If the exact number of objects is unknown, the brute force and the resonator are on par (83.6%). However, the resonator outperforms the brute force in the oracle case by a margin of 4% (99.3% vs. 95.3%). At the same time, it reduces the computational cost by searching in superposition instead of explicitly searching through the entire product space.
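For completeness, the brute-force baseline in Table 2 amounts to a similarity readout over the full product space; the sketch below shows both the oracle (top-k) and the threshold mode. The product-space matrix and the threshold value 0.2 are assumed inputs for this illustration.

```python
import numpy as np

def brute_force_readout(q, product_space, k=None, threshold=0.2):
    """Decode a query by exhaustive similarity search over the product space.

    `product_space` is a (C, D) matrix holding every possible bound symbol.
    If the number of objects k is known (oracle), return the top-k matches;
    otherwise (threshold mode) return every row whose cosine similarity
    exceeds `threshold` (the value 0.2 is an illustrative placeholder).
    """
    sims = product_space @ q / (
        np.linalg.norm(product_space, axis=1) * np.linalg.norm(q) + 1e-12
    )
    if k is not None:                         # oracle: number of objects provided
        return np.argsort(sims)[::-1][:k]
    return np.flatnonzero(sims > threshold)   # threshold-based detection
```

The resonator-based decoder replaces this exhaustive readout with the factorized search of Sections 2.3 and 4, so the C = 6600 product vectors never need to be materialized.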
6. Conclusions

We present a new decoding approach for the efficient retrieval of hyperdimensional bound symbols from additive superpositions using stochastic sparsely-activated resonator networks. Our novel mixed decoding combines sequential and parallel decoding with configurable width and depth. When combined with a query sampling mechanism, our mixed approach retrieves up to 8× more additive components compared to pure sequential decoding. In future work, we plan to investigate real-world applications with larger problems, such as extreme-label classification with up to 670,000 combinations [15].

References

[1] R. W. Gayler, Multiplicative binding, representation operators & analogy, in: Advances in Analogy Research: Integration of Theory and Data from the Cognitive, Computational, and Neural Sciences, 1998.
[2] R. W. Gayler, Vector symbolic architectures answer Jackendoff's challenges for cognitive neuroscience, in: Proceedings of the Joint International Conference on Cognitive Science (ICCS/ASCS), 2003, pp. 133–138.
[3] T. A. Plate, Holographic reduced representations, IEEE Transactions on Neural Networks 6 (1995) 623–641.
[4] T. A. Plate, Holographic Reduced Representations: Distributed Representation for Cognitive Structures, Center for the Study of Language and Information, Stanford, 2003.
[5] P. Kanerva, Hyperdimensional computing: An introduction to computing in distributed representation with high-dimensional random vectors, Cognitive Computation 1 (2009) 139–159.
[6] M. Hersche, M. Zeqiri, L. Benini, A. Sebastian, A. Rahimi, A neuro-vector-symbolic architecture for solving Raven's progressive matrices, Nature Machine Intelligence (2023).
[7] E. P. Frady, S. J. Kent, B. A. Olshausen, F. T. Sommer, Resonator networks, 1: An efficient solution for factoring high-dimensional, distributed representations of data structures, Neural Computation 32 (2020) 2311–2331.
[8] S. J. Kent, E. P. Frady, F. T. Sommer, B. A. Olshausen, Resonator networks, 2: Factorization performance and capacity compared to optimization-based methods, Neural Computation 32 (2020) 2332–2388.
[9] J. Langenegger, G. Karunaratne, M. Hersche, L. Benini, A. Sebastian, A. Rahimi, In-memory factorization of holographic perceptual representations, Nature Nanotechnology (2023).
[10] D. Kleyko, D. Rachkovskij, E. Osipov, A. Rahimi, A survey on hyperdimensional computing aka vector symbolic architectures, part I: Models and data transformations, ACM Computing Surveys 55 (2022).
[11] D. Kleyko, D. Rachkovskij, E. Osipov, A. Rahimi, A survey on hyperdimensional computing aka vector symbolic architectures, part II: Applications, cognitive models, and challenges, ACM Computing Surveys 55 (2023).
[12] E. P. Frady, D. Kleyko, F. T. Sommer, A theory of sequence indexing and working memory in recurrent neural networks, Neural Computation 30 (2018) 1449–1513.
[13] H.-S. Kim, HDM: Hyper-dimensional modulation for robust low-power communications, in: 2018 IEEE International Conference on Communications (ICC), 2018.
[14] M. Hersche, S. Lippuner, M. Korb, L. Benini, A. Rahimi, Near-channel classifier: Symbiotic communication and classification in high-dimensional space, Brain Informatics 8 (2021) 16.
[15] A. Ganesan, H. Gao, S. Gandhi, E. Raff, T. Oates, J. Holt, M. McLean, Learning with holographic reduced representations, in: Advances in Neural Information Processing Systems, volume 34, 2021, pp. 25606–25620.
[16] C. Zhang, F. Gao, B. Jia, Y. Zhu, S.-C. Zhu, RAVEN: A dataset for relational and analogical visual reasoning, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.