<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Harnessing the Advantages of Binary Networks for Neural-Symbolic Computing</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nataliia Kunanets</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuriy Shcherbyna</string-name>
          <email>yshcherbyna@yahoo.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Volodymyr Karpiv</string-name>
          <email>volodymyr.karpiv@gmail.com</email>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff2">2</xref>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Lviv, Ukraine</string-name>
        </contrib>
        <aff id="aff0">
          <label>1</label>
          <institution>Ivan Franko National University of Lviv</institution>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff1">
          <label>2</label>
          <institution>Lviv Polytechnic National University</institution>
          ,
          <country country="UA">Ukraine</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Neural-Symbolic Computing</institution>
          ,
          <addr-line>Symbolic AI, Connectionist AI, Deep Learning</addr-line>
          ,
          <country>Binary Neural Networks</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>SoftServe</institution>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In the dynamic field of AI, this paper explores the fusion of Neural-Symbolic Computing with binary neural networks, aiming to unify the precise logic of Symbolic AI with the adaptability of Connectionist AI. Focusing on integrating logical reasoning, this approach seeks to overcome the constraints of conventional methodologies. Our study emphasizes the significance of binary networks in achieving computational efficiency and structured logic integration. Utilizing the MNIST dataset, we demonstrate the practicality of our framework, while acknowledging the need to extend our methods to more complex systems and a broader array of datasets. This research lays the groundwork for future AI models that harmoniously combine learning and reasoning, paving the way for enhanced capabilities in various AI applications.</p>
      </abstract>
      <kwd-group>
        <kwd>Neural-Symbolic Computing</kwd>
        <kwd>Symbolic AI</kwd>
        <kwd>Connectionist AI</kwd>
        <kwd>Deep Learning</kwd>
        <kwd>Binary Neural Networks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>In the evolving landscape of artificial intelligence (AI), the pursuit of effective computational
models has led to diverse philosophies and methodologies. Historically, the AI community has
oscillated between two dominant paradigms: Symbolic AI and Connectionist AI. This paper
argues for the importance of Neural-Symbolic Computing, a field that synergizes the strengths
of both approaches. We aim to establish a framework based on binary neural networks as a
foundation for integrating logic and logical operators, addressing a critical gap in current AI
methodologies.</p>
      <p>
        In the dawn of AI research, Symbolic AI reigned supreme. This paradigm, rooted in formal
logic and symbolic reasoning, was driven by the belief that intelligence could be emulated by
explicitly programming rules and symbols. Pioneers like Newell and Simon, with their General
Problem Solver [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], exemplified this belief. Symbolic AI excelled in domains with well-defined
rules and clear objectives, such as chess. However, it struggled with real-world scenarios that
required adaptive learning and handling of ambiguous data.
      </p>
      <p>
        The limitations of Symbolic AI led to the ascendance of Connectionist AI, marked by the
development of artificial neural networks. Inspired by biological neural networks, this approach
focuses on learning from data, allowing systems to adapt to new situations and recognize
patterns. Seminal works like Rumelhart, Hinton, and Williams’ backpropagation algorithm
[
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] catalyzed the deep learning revolution. However, this shift also led to skepticism towards
knowledge-based systems, such as knowledge graphs, which were seen as rigid and unable to
cope with the complexity and variability of real-world data.
      </p>
      <p>
        Neural-Symbolic Computing emerges as a promising paradigm that integrates the symbolic
reasoning of Symbolic AI with the adaptive learning capabilities of Connectionist AI. This
hybrid approach aims to leverage the interpretability and structured knowledge representation
of symbolic systems alongside the pattern recognition and learning efficiency of neural networks.
Notable works in this domain include the integration of logic programming with neural networks,
as demonstrated by Garcez et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], and the development of differentiable logic models.
      </p>
      <p>The primary advantage of Neural-Symbolic Computing lies in its potential to handle complex,
real-world problems that require both structured knowledge and adaptive learning. It offers
interpretability, a critical aspect in fields like healthcare and finance, where understanding
decision-making processes is crucial. However, challenges remain, particularly in integrating
these paradigms efficiently and ensuring that the hybrid models retain the strengths of both
parent domains.</p>
      <p>This paper contributes to the Neural-Symbolic Computing field by proposing a binary neural
network framework. This framework aims to serve as a robust basis for integrating logic and
logical operators, addressing a gap in current methodologies. By focusing on binary neural
networks, we aim to enhance computational efficiency and provide a more structured approach
to logic integration in neural networks. The development of this framework is a step towards
more sophisticated AI models that can seamlessly incorporate both learning and reasoning, a
crucial advancement for complex problem-solving in various domains.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Related Works</title>
      <p>
        Symbolic AI, a foundational pillar in artificial intelligence, has significantly evolved through
contributions emphasizing logic, symbols, and rule-based processing. The General Problem
Solver (GPS) by Newell and Simon was a pioneering development [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], showcasing AI’s ability
to replicate human problem-solving using symbolic representations. This was further advanced
by John McCarthy’s work on formal logic and knowledge representation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ],
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ], leading to a deeper understanding of how machines manipulate abstract concepts. Terry
Winograd’s SHRDLU program [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] extended Symbolic AI into natural language processing,
demonstrating machines’ capability to interpret and respond to human language in structured
environments. The practical application of Symbolic AI was further exemplified in Feigenbaum
and Barr’s DENDRAL project [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ], applying it to chemistry, and Nilsson’s STRIPS system for
robotics [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ], showcasing the versatility of Symbolic AI across various domains.
      </p>
      <p>
        The emergence of Connectionist AI, with its focus on artificial neural networks and
data-driven learning, marked a significant shift from Symbolic AI’s rule-based approach. The
development of the backpropagation algorithm by Rumelhart, Hinton, and Williams [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] laid the
foundation for modern deep learning, emphasizing adaptive learning’s value. Yann LeCun’s
convolutional neural networks (CNNs) [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] revolutionized pattern recognition in image
classification, demonstrating neural networks’ practical capabilities in visual data processing. The
introduction of Long Short-Term Memory (LSTM) networks by Hochreiter and Schmidhuber
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] expanded these applications to sequential data processing, like language understanding,
showcasing Connectionist AI’s versatility. Bengio, LeCun, and Hinton’s overview of deep
learning [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ] and Krizhevsky, Sutskever, and Hinton’s AlexNet [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] further underscored the
adaptability and efficacy of neural network-based AI approaches.
      </p>
      <p>
        Neural-Symbolic Computing, an integrative field blending neural networks’ learning
capabilities with Symbolic AI’s structured reasoning, emerged as AI research progressed. The
integration of logic programming with neural networks by d’Avila Garcez [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] laid the
groundwork for combining adaptive learning with logical reasoning. Further approaches were introduced
by França et al. [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ], and Harnad’s exploration of the Symbol Grounding Problem [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. These
studies collectively illustrate Neural-Symbolic Computing’s potential in addressing complex
problems requiring both structured knowledge and adaptive learning.
      </p>
      <p>
        Recent reviews in the field of neuro-symbolic computing have illuminated its evolving
landscape and critical aspects. Wang and Yang [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] offer a systematic overview of
neuro-symbolic computing advancements, emphasizing its role in merging symbolic reasoning with
neural network learning for future AI development. Garcez and Lamb [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ] discuss the integration
of deep learning with logical reasoning in neuro-symbolic computing, stressing the need for AI to
be safe and interpretable. These papers collectively underscore the significance of neuro-symbolic
computing in achieving trustworthy and advanced AI systems.
      </p>
      <p>
        In the neuro-symbolic computing domain, key studies have focused on the synergy of symbolic
reasoning and neural learning to enhance AI development. Smolensky [
        <xref ref-type="bibr" rid="ref17">17</xref>
        ] and [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ], in two
distinct papers, emphasizes neurocompositional computing, advocating for the integration of
Compositionality and Continuity to facilitate advanced AI systems with human-like cognition.
Hitzler [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] offers a broad survey of the neuro-symbolic field, noting the blend of machine
learning and symbolic AI as a significant trend. Van Krieken [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ] delves into differentiable fuzzy
logic’s role in neural network training, incorporating symbolic knowledge for improved learning
outcomes. Hoernle [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ] presents MultiplexNet, a method that integrates logical formulas to
refine neural network training and decision-making. Silver [
        <xref ref-type="bibr" rid="ref22">22</xref>
        ] explores the application of
neuro-symbolic approaches in robotics, particularly in task and motion planning. Giunchiglia
[
        <xref ref-type="bibr" rid="ref23">23</xref>
        ] investigates the use of logical constraints to enhance deep learning models, focusing on
performance and safety. Collectively, these works highlight the importance of merging symbolic
and neural methods to create AI systems that are more effective, interpretable, and closely
aligned with human cognitive processes.
      </p>
      <p>
        Exploring the forefront of neuro-symbolic computing, recent studies have innovated in
architecture, logic, and reasoning frameworks to enhance AI’s capabilities. Karpas [
        <xref ref-type="bibr" rid="ref24">24</xref>
        ] presents
the MRKL system, a neuro-symbolic architecture combining large language models with discrete
reasoning, addressing the limitations of conventional language models. Stehr [
        <xref ref-type="bibr" rid="ref25">25</xref>
        ] proposes
a Probabilistic Approximate Logic for neuro-symbolic learning, facilitating the integration
of domain knowledge and neural computation. Pryor [
        <xref ref-type="bibr" rid="ref26">26</xref>
        ] introduces NeuPSL, an
energy-based neuro-symbolic framework that significantly improves performance in low-data settings
by integrating neural and symbolic learning. Aditya [
        <xref ref-type="bibr" rid="ref27">27</xref>
        ] discusses PyReason, a software
for open-world temporal logic, enhancing reasoning over graphical structures and providing
explainable inference. Hersche [
        <xref ref-type="bibr" rid="ref28">28</xref>
        ] proposes a neuro-vector-symbolic architecture (NVSA) that
addresses the binding problem and rule-search inefficiencies, demonstrating high accuracy in
cognitive tasks. Lastly, Li [
        <xref ref-type="bibr" rid="ref29">29</xref>
        ] introduces a softened symbol grounding approach for
neuro-symbolic systems, improving the interaction between neural training and symbolic reasoning.
These contributions collectively advance the neuro-symbolic field, pushing AI towards greater
efficiency, interpretability, and integrated reasoning capabilities.
      </p>
      <p>
        In the realm of visual data analysis, neuro-symbolic computing is revolutionizing the way AI
interprets and interacts with imagery. Yu and Yang [
        <xref ref-type="bibr" rid="ref30">30</xref>
        ] develop a bi-level probabilistic graphical
reasoning framework, BPGR, enhancing Visual Relationship Detection (VRD) by integrating
symbolic knowledge with deep learning, improving performance and interpretability. Gupta
and Kembhavi [
        <xref ref-type="bibr" rid="ref31">31</xref>
        ] introduce VISPROG, a neuro-symbolic system for compositional visual tasks
using natural language instructions, bypassing task-specific training by generating modular
programs for interpretable solutions. Surís [
        <xref ref-type="bibr" rid="ref32">32</xref>
        ] presents ViperGPT, a framework combining
vision-and-language models into executable subroutines for visual query answering, improving
interpretability and task generalization without further training. Li [
        <xref ref-type="bibr" rid="ref33">33</xref>
        ] proposes LOGICSEG, a
visual semantic parser that merges neural learning and logic reasoning, structuring semantic
concepts hierarchically for improved segmentation and cognition-mimetic reasoning. These
innovative approaches demonstrate neuro-symbolic computing’s potential to advance AI’s
capabilities in visual data processing, offering more efficient, interpretable, and adaptable
solutions.
      </p>
      <p>
        In the sphere of reinforcement learning applications, neuro-symbolic computing is enhancing
AI’s problem-solving capabilities. Jin [
        <xref ref-type="bibr" rid="ref34">34</xref>
        ] introduces a deep reinforcement learning framework
with symbolic options, addressing challenges of data efficiency, interpretability, and
transferability. Their framework, validated in both game and real-world scenarios, shows improved performance
by integrating symbolic knowledge to guide policy enhancement through planning and learning
from interactive trajectories. Tian [
        <xref ref-type="bibr" rid="ref35">35</xref>
        ] proposes a weakly supervised neural symbolic learning
model, WS-NeSyL, for cognitive tasks, leveraging logical reasoning. This model enhances
learning efficiency and accuracy by using a back search algorithm to generate pseudo labels for
supervision and incorporating probabilistic logic regularization. These approaches demonstrate
how embedding symbolic reasoning into reinforcement learning can significantly improve AI’s
ability to learn and adapt across different domains and tasks.
      </p>
      <p>
        In the quest for efficient AI systems, the field of Binary Networks has emerged as a promising
avenue for integrating logic into AI. Courbariaux et al.’s BinaryConnect [
        <xref ref-type="bibr" rid="ref36">36</xref>
        ] introduced the
concept of training neural networks with binary weights, significantly reducing computational
complexity and memory requirements. Rastegari et al.’s study on Binary-Weight-Networks [
        <xref ref-type="bibr" rid="ref37">37</xref>
        ]
applied binary weights to large-scale image processing tasks, demonstrating these networks’
practicality in complex applications. Hubara et al.’s comprehensive study on Binarized Neural
Networks [
        <xref ref-type="bibr" rid="ref38">38</xref>
        ] extended the binary concept to both weights and activations, enhancing
efficiency in network computation and storage. Lin et al.’s research on reducing multiplication
operations in neural networks [
        <xref ref-type="bibr" rid="ref39">39</xref>
        ] highlighted the importance of computational efficiency in
AI deployment, especially in resource-constrained environments. Zhou et al.’s development of
DoReFa-Net [
        <xref ref-type="bibr" rid="ref40">40</xref>
        ], proposing a method for training neural networks with low bit-width weights
and activations, offered insights into balancing efficiency and accuracy in neural architectures.
These advancements in Binary Networks represent a significant step towards creating more
efficient AI systems, making them highly suitable for integrating logic into AI, particularly in
sectors where computational resources are limited.
      </p>
      <p>
        In the realm of binary neural networks, significant strides have been made in various
applications as evidenced by several notable papers. Zhuang et al. [
        <xref ref-type="bibr" rid="ref41">41</xref>
        ] introduce an innovative
approach for detecting similarity in binary code, emphasizing the importance of semantic
awareness in neural networks. Martinez et al. [42] explore techniques to enhance the training
of binary neural networks, leveraging real-to-binary convolutions for improved efficiency and
performance. Bai et al. [43] present a breakthrough in natural language processing, pushing
the boundaries of BERT model quantization to achieve efficient, yet powerful, binary
representations. Lastly, Qin et al. [44] present a specialized binary neural network design tailored
for efficient keyword spotting, showcasing the adaptability and potential of binary networks
in audio processing tasks. Together, these works by Zhuang, Martinez, Bai, and Qin highlight
the versatility and advancing capabilities of binary neural networks in diverse domains of AI
research.
      </p>
      <p>This literature review encapsulates the evolution from Symbolic AI’s structured
problem-solving to Connectionist AI’s data-driven learning models, the unifying efforts in
Neural-Symbolic Computing, and the efficiency-driven innovations in Binary Networks. Each field,
with its unique contributions, demonstrates the multifaceted nature of AI research and its
continuous progression towards more sophisticated, efficient, and integrated AI systems.</p>
    </sec>
    <sec id="sec-4">
      <title>3. Methods</title>
      <sec id="sec-4-1">
        <title>3.1. Challenges of Integrating Logic in Learning-Based Algorithms</title>
        <p>The integration of logical reasoning into learning-based algorithms faces the fundamental
challenge of reconciling two inherently different paradigms: the symbolic, rule-based approach
and the connectionist, data-driven approach. Symbolic models excel in structured
problem-solving and explicit reasoning, whereas connectionist models thrive on pattern recognition
and implicit learning. Merging these models requires a robust framework that can seamlessly
accommodate the discrete, structured nature of logical rules within the fluid, statistical nature
of neural networks.</p>
        <p>Another significant challenge is preserving the interpretability of logic-based systems when
integrated with learning-based algorithms. Neural networks, especially deep learning models,
are often seen as “black boxes” due to their complex and opaque decision-making processes.
Integrating logic into these models demands a methodology that enhances their transparency,
ensuring that the decision-making process remains understandable and justifiable, which is
vital for applications in critical domains like healthcare and law.</p>
        <p>Efficiency and scalability pose a third challenge. Traditional logic-based systems are
computationally intensive and do not scale well with the increasing size and complexity of data,
unlike neural networks. Integrating logic into learning-based algorithms requires an approach
that can handle large-scale data without compromising on computational efficiency and speed,
ensuring that the integrated system is both practical and effective for real-world applications.</p>
        <p>A key method for integrating logic into learning-based algorithms involves the creation of
hybrid models that blend the strengths of both symbolic and connectionist approaches. In
these models, neural networks are typically utilized for their strength in pattern recognition
and data-driven inference, while symbolic systems are employed for rule-based reasoning and
decision-making. The central aim is to design architectures where these two distinct paradigms
can work in harmony, thus leveraging the adaptability and learning prowess of neural networks
alongside the structured and clear logical reasoning of symbolic systems.</p>
        <p>Addressing the challenge of interpretability in neural networks requires the development of
transparent mechanisms that can effectively map and elucidate their decision-making processes.
This entails devising methods that allow for the visualization and explanation of the neural
network’s inferences in a manner that is coherent with logical reasoning. Employing techniques
such as attention mechanisms can shed light on the specific aspects of data that neural networks
focus on during decision-making. Moreover, the use of explainable AI (XAI) methods can help
in making the decisions of neural networks more transparent, ensuring they are in alignment
with the principles of logical reasoning.</p>
        <p>To enhance computational efficiency in the integration of logical reasoning with
learning-based algorithms, a dual approach of algorithm optimization and hardware acceleration can
be pursued. Developing efficient training algorithms that are less demanding in terms of
computational power and memory is essential. Concurrently, utilizing specialized hardware
designed for neural network processing, like Graphics Processing Units (GPUs) or Tensor
Processing Units (TPUs), can significantly boost the efficiency and scalability of these integrated
systems. This optimization of both software algorithms and hardware resources is crucial for a
seamless and effective integration of logical reasoning into learning-based algorithms.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Potential of Binary Networks in Integrating Logic</title>
        <p>Binary Networks present a simplified computational model compared to traditional neural
networks, which significantly benefits the integration of logical reasoning. Their ability to
represent weights and activations in binary form greatly reduces computational complexity.
This reduction in complexity is particularly harmonious with the structured nature of logical
operations, potentially easing and streamlining the process of integrating logic into these
networks. The simplicity of binary representation in Binary Networks aligns well with the
discrete nature of logical reasoning, suggesting a more natural and effective pathway for merging
these two paradigms.</p>
        <p>Moreover, the binary architecture of these networks drastically cuts down memory
requirements and computational overhead, a crucial advantage when melding them with logic-based
systems. Logic systems, with their symbolic nature, tend to be memory-intensive. The efficiency
of Binary Networks, therefore, makes them ideally suited for scenarios where computational
resources are limited, yet there is a need for robust logical reasoning capabilities. This efficiency
not only reduces the strain on resources but also enhances the feasibility of deploying complex
AI systems in various real-world applications.</p>
        <p>Inherent in their design, Binary Networks operate with discrete values, closely mirroring the
binary nature of logical operations. This intrinsic compatibility suggests that Binary Networks
could serve as an efficient medium for embedding logical reasoning within a neural framework.
Such alignment facilitates a more natural integration of logical reasoning in learning-based
algorithms, potentially leading to AI systems that are both computationally efficient and logically
coherent.</p>
        <p>The architectural eficiency of Binary Networks extends to the processing of logical rules and
operations. When these rules are represented in a binary format, they can be processed more
rapidly and seamlessly within the network’s binary architecture. This synergy enhances the
system’s overall efficiency and effectiveness, making Binary Networks a promising candidate
for developing AI systems that seamlessly blend learning with logical reasoning.</p>
        <p>A particularly noteworthy advantage of Binary Networks is their potential to eliminate the
need for traditional backpropagation, which is a computationally intensive aspect of training
conventional neural networks. The simplified learning process inherent in Binary Networks
allows for the exploration of alternative learning mechanisms that could be more in tune with
the processes of logical reasoning. This capability of integrating logic without relying on
backpropagation represents a significant leap forward in creating more efficient and streamlined
AI systems.</p>
        <p>Lastly, Binary Networks take a step towards the realm of neuromorphic computing, where
the goal is to develop computing architectures that mimic the neural structures of the human
brain. This advancement holds considerable promise for the integration of logical reasoning,
as neuromorphic designs could potentially provide a more intuitive framework for combining
logical reasoning with learning-based approaches. By aligning AI systems more closely with
human-like reasoning processes, Binary Networks could play a pivotal role in the evolution of
artificial intelligence.</p>
      </sec>
      <sec id="sec-4-3">
        <title>3.3. Dataset Preparation and Preprocessing</title>
        <p>The MNIST dataset is a large database of handwritten digits, widely used for training and testing
in the field of machine learning. This dataset contains 70,000 images, split into a training set of
60,000 examples and a test set of 10,000 examples. Each image in the MNIST dataset is a 28x28
pixel grayscale representation of a digit (from 0 to 9). The simplicity and size of the MNIST
dataset make it ideal for experiments in machine learning and neural network architectures.</p>
        <p>In our implementation, the MNIST dataset is loaded using the HDF5 file format, a versatile
data model that can efficiently handle large, complex data. The dataset is then converted into a
floating-point format, which is more suitable for processing with neural networks. Specifically,
the pixel values are normalized to aid in the convergence of the training process. The target
variable, which is the actual digit each image represents, is converted into a one-hot encoded
format. One-hot encoding transforms the categorical data into a binary matrix representation,
which is essential for classification tasks in neural networks.</p>
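        <p>For concreteness, the loading and preprocessing steps above can be sketched in Python as follows. This is a minimal illustration rather than our exact implementation; in particular, the HDF5 file name and dataset names ("images", "labels") are assumptions made for the example:</p>
        <preformat>
import h5py
import numpy as np

def load_mnist(path="mnist.h5"):
    # Read raw images and integer labels from an HDF5 file.
    # The dataset names used here are illustrative placeholders.
    with h5py.File(path, "r") as f:
        images = np.asarray(f["images"], dtype=np.float32)  # shape (N, 28, 28)
        labels = np.asarray(f["labels"], dtype=np.int64)    # shape (N,)
    # Flatten each 28x28 image to a 784-dimensional vector and
    # normalize pixel values to [0, 1] to aid training convergence.
    x = images.reshape(len(images), -1) / 255.0
    # One-hot encode the 10 digit classes into a binary matrix.
    y = np.zeros((len(labels), 10), dtype=np.float32)
    y[np.arange(len(labels)), labels] = 1.0
    return x, y
        </preformat>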
        <p>Once loaded, the dataset is divided into training and test sets. The training set consists
of 60,000 images, while the test set comprises 10,000 images. This separation is crucial for
evaluating the performance of the neural network model; the training set is used to train the
model, and the test set is used to evaluate its performance on unseen data.</p>
        <p>The training data undergoes shuffling to ensure that the training process does not get biased
by the order of the data. Shuffling the data helps in reducing variance and ensures that
models remain general and overfit less. The data is then divided into batches. Batching is a
crucial process in neural network training, particularly for large datasets like MNIST. It involves
dividing the dataset into smaller, manageable batches, which are then used to train the model
iteratively. This approach is not only computationally efficient but also helps in optimizing the
neural network more effectively.</p>
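        <p>A minimal sketch of the shuffling and batching procedure described above (the batch size of 600 matches the training setup reported later; the function name is our own):</p>
        <preformat>
import numpy as np

def iterate_batches(x, y, batch_size=600, rng=None):
    # Reshuffle once per epoch so batch composition is not biased
    # by the order of the data, then yield mini-batches.
    rng = rng or np.random.default_rng()
    order = rng.permutation(len(x))
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        yield x[idx], y[idx]
        </preformat>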
      </sec>
      <sec id="sec-4-4">
        <title>3.4. Binary Network Model and Initialization</title>
        <p>Unlike traditional neural networks that operate with high-precision weights, Binary Networks
simplify these elements to binary values (-1 or 1). This architecture leads to a significant
reduction in memory requirements and computational overhead, making them particularly
suitable for applications where eficiency is a priority. The inherent simplicity of Binary
Networks also aligns well with logical operations, potentially facilitating the integration of
logical reasoning within a neural framework.</p>
        <p>In our implementation, the Binary Network is initialized with binary values for weights
and biases. Weights and biases are randomly assigned either -1 or 1. This binary initialization
is crucial to maintain the network’s binary nature and aligns with the overall computational
efficiency goal. The architecture of the network can vary depending on the specified number
of layers. For instance, a network with two layers would consist of one hidden layer and one
output layer. However, the model’s architecture is flexible and can be adapted with more layers,
depending on the complexity of the task at hand.</p>
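        <p>A sketch of this binary initialization for a simple fully connected architecture (the layer sizes shown are illustrative, not prescribed):</p>
        <preformat>
import numpy as np

def init_binary_params(layer_sizes=(784, 64, 10), rng=None):
    # Assign every weight and bias a random value of -1 or +1,
    # preserving the network's binary nature from the start.
    rng = rng or np.random.default_rng(0)
    params = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        w = rng.choice([-1.0, 1.0], size=(fan_in, fan_out))
        b = rng.choice([-1.0, 1.0], size=(fan_out,))
        params.append((w, b))
    return params
        </preformat>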
        <p>Each layer in the Binary Network is designed to perform specific transformations on the
input data. The first layer (input layer) receives the raw input data (in the case of the MNIST
dataset, this would be the pixel values of the images). Subsequent hidden layers, if present, are
responsible for extracting and processing features from this input. The final layer (output layer)
produces the classification result, which, for the MNIST dataset, corresponds to the identified
digit.</p>
      </sec>
      <sec id="sec-4-5">
        <title>3.5. Activation Functions</title>
        <p>In our Binary Network, the Rectified Linear Unit (ReLU) function is the primary activation
function, selected for its effectiveness in introducing non-linearity while maintaining computational
simplicity. ReLU is especially suitable for binary network architectures, facilitating efficient
learning of complex patterns. However, our framework is designed with inherent flexibility,
allowing for easy application of alternative activation functions depending on specific task
requirements or desired network characteristics.</p>
        <p>The network architecture supports several other activation functions, each of which is
already implemented and can be seamlessly integrated into the model. These include the
sigmoid function, binary step functions (binary01, binary11), and the scaled exponential linear
unit (SeLU).</p>
        <p>The sigmoid function, known for its smooth gradient, is particularly useful in scenarios where
a probabilistic output is required:
$\sigma(x) = \dfrac{1}{1 + e^{-x}}$</p>
        <p>On the other hand, binary step functions, including binary01 and binary11, align closely with
the binary nature of the network, making them ideal for tasks that benefit from a clear, decisive
output. Here binary01 maps to {0, 1} and binary11 maps to {-1, +1}:
$\mathrm{binary01}(x) = \mathbb{1}[x \ge 0], \quad \mathrm{binary11}(x) = 2\,\mathbb{1}[x \ge 0] - 1$</p>
        <p>The SeLU function introduces self-normalizing properties, which can be advantageous for
maintaining stable gradients in deeper network architectures. In the SeLU function, $\lambda$ and $\alpha$
are predefined constants, typically $\lambda \approx 1.0507$ and $\alpha \approx 1.67326$, chosen to ensure
that the mean and variance of the inputs are preserved between layers during training:
$\mathrm{SeLU}(x) = \lambda x$ for $x > 0$, and $\mathrm{SeLU}(x) = \lambda\alpha(e^{x} - 1)$ otherwise.</p>
        <p>This range of activation functions makes the framework adaptable
to a wide range of applications. Whether the task requires smooth probability distributions,
clear binary outputs, or stable training dynamics in deep networks, the framework can easily
accommodate these needs by simply switching the activation function. This feature enhances
the network’s versatility, making it suitable for various machine learning tasks and experimental
setups.</p>
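        <p>These activation functions can be written compactly as follows; treating binary01 as a step into {0, 1} and binary11 as a step into {-1, +1} is our reading of the names:</p>
        <preformat>
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary01(z):
    # Step function into {0, 1}.
    return (z >= 0).astype(np.float32)

def binary11(z):
    # Step function into {-1, +1}, matching the binary weight alphabet.
    return np.where(z >= 0, 1.0, -1.0)

def selu(z, lam=1.0507, alpha=1.67326):
    # Self-normalizing exponential linear unit with the standard constants.
    return lam * np.where(z > 0, z, alpha * (np.exp(z) - 1.0))
        </preformat>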
      </sec>
      <sec id="sec-4-6">
        <title>3.6. Loss Functions</title>
        <p>In our Binary Network framework, while cross-entropy is the default loss function, we have
integrated two loss functions to provide different options. Cross-entropy, known for its
effectiveness in classification tasks, measures the difference between the predicted probabilities and
the actual distribution of labels. It is particularly valuable in guiding the optimization of neural
networks, especially for multi-class tasks like digit recognition in the MNIST dataset. With M
as the number of classes, the cross-entropy loss is defined:
$L_{\mathrm{CE}} = -\sum_{c=1}^{M} y_{o,c} \log(p_{o,c})$</p>
        <p>However, recognizing the need for versatility in handling different types of problems, our
framework also includes the root mean square error (RMSE) as an alternative loss function.
RMSE is readily available and can be easily applied for tasks where the focus is on the magnitude
of errors. This loss function is especially suitable for regression tasks or scenarios where
assessing the accuracy of predictions in a quantitative manner is more relevant than evaluating
probabilistic differences. With N as the number of samples, the RMSE loss function is defined:
$\mathrm{RMSE} = \sqrt{\dfrac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2}$ (6)</p>
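        <p>Both loss functions are straightforward to implement; a sketch, assuming predicted class probabilities p and one-hot or real-valued targets y:</p>
        <preformat>
import numpy as np

def cross_entropy(p, y, eps=1e-12):
    # Mean categorical cross-entropy over a batch; eps avoids log(0).
    return -np.mean(np.sum(y * np.log(p + eps), axis=1))

def rmse(p, y):
    # Root mean square error over all outputs.
    return np.sqrt(np.mean((y - p) ** 2))
        </preformat>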
        <p>The integration of both cross-entropy and RMSE in our framework offers users the flexibility
to choose the most appropriate loss function based on their specific task. Whether it’s a
classification problem requiring a probabilistic assessment or a regression task needing a
quantitative error evaluation, the framework allows for seamless switching between these loss
functions. This adaptability makes the Binary Network framework a versatile tool, capable of
addressing a wide range of machine learning challenges effectively.</p>
      </sec>
      <sec id="sec-4-7">
        <title>3.7. Training Parameters and Sampling Methods</title>
        <p>In our Binary Network framework, we diverge from the traditional backpropagation training
approach, familiar in neural network training. Instead, we employ a variety of statistical
sampling methods. These methods, well-established in statistical analysis, provide an alternative
and potentially more efficient pathway for training neural networks, particularly apt for the
unique characteristics of a binary architecture.</p>
        <p>Sampling methods bring a different perspective to network training, allowing for exploration
and optimization in a manner distinct from gradient-based approaches. This shift is particularly
advantageous in the context of Binary Networks, where the discrete nature of the parameters
aligns well with the sampling-based exploration of the solution space. By leveraging these
statistical techniques, we aim to harness their robustness and efficiency for effective training in
the specialized environment of binary neural networks.</p>
        <p>The network undergoes training over 50 epochs, allowing for gradual and thorough learning
from the MNIST dataset. The MNIST dataset comprises 60,000 training images of handwritten
digits, each converted into a 784-dimensional input vector to fit the network’s input layer. The
network architecture can include several hidden layers with 64 units each,
followed by a softmax output layer for classification.</p>
        <p>The training process involves batch processing with a size of 600 samples per batch. The batch
size and the data-to-invert ratio (200) are calibrated to balance between computational efficiency
and training efectiveness. The choice of the sampling method for weight updates plays a
crucial role in the network’s training dynamics. The current setup uses global Gibbs Sampling,
which is effective for global optimization in binary networks. However, other sampling methods
like Local Random Sampling, Global Random Sampling, and Global Metropolis–Hastings are
implemented and can be potential alternatives, each offering different benefits in terms of
exploration and exploitation in the weight space.</p>
        <p>In summary, the current Binary Network implementation is designed with flexibility in mind,
allowing for various activation and loss functions to suit diferent requirements. The choice
of ReLU and cross-entropy aligns well with the network’s architecture and the nature of the
MNIST dataset. The training process, characterized by specific hyperparameters and a chosen
sampling method, is geared towards efficient and effective learning.</p>
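        <p>To make the backpropagation-free training concrete, the following sketch implements a greedy flip-and-accept rule in the spirit of the Gibbs-style sampling described above. It reuses the helpers from the earlier sketches, and the interpretation of the data-to-invert ratio as the number of weights flipped per proposal is our assumption:</p>
        <preformat>
import numpy as np

def forward(params, x, act=relu):
    # Fully connected forward pass with a softmax output layer.
    h = x
    last = len(params) - 1
    for i, (w, b) in enumerate(params):
        z = h @ w + b
        if i == last:
            e = np.exp(z - z.max(axis=1, keepdims=True))  # stable softmax
            h = e / e.sum(axis=1, keepdims=True)
        else:
            h = act(z)
    return h

def train_flip_accept(params, make_batches, epochs=50, n_flips=200, rng=None):
    # Propose flipping a random subset of binary weights in one layer
    # and keep the proposal only if the batch loss decreases.
    rng = rng or np.random.default_rng(0)
    for _ in range(epochs):
        for x, y in make_batches():
            loss = cross_entropy(forward(params, x), y)
            w, _ = params[rng.integers(len(params))]
            idx = (rng.integers(w.shape[0], size=n_flips),
                   rng.integers(w.shape[1], size=n_flips))
            w[idx] *= -1.0                       # propose: flip weights
            if cross_entropy(forward(params, x), y) >= loss:
                w[idx] *= -1.0                   # reject: undo the flips
    return params
        </preformat>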
      </sec>
      <sec id="sec-4-8">
        <title>3.8. Logical AND Operator Mechanism</title>
        <p>In our methodology, we integrate the output features of three distinct binary neural networks
to enhance prediction accuracy and reliability through a specialized logical operation. This
process is not akin to conventional model ensembling techniques; instead, it involves a unique
training and inference setup tailored for binary networks. The core of our approach lies in
the application of a logical AND operator, designed to make final predictions based on the
agreement between the outputs of the three networks.</p>
        <p>The logical AND operator functions under the principle that if at least two out of the three
networks concur on a prediction, this consensus dictates the final output of the operator. This
method leverages the collective intelligence of the networks, ensuring that the prediction reflects
a majority agreement, thereby increasing the confidence in the decision made. In scenarios
where each network outputs a different prediction, indicating a complete disagreement, the
methodology defaults to the prediction made by the first network. This decision rule is predicated
on the premise that each network, while trained under the same overarching framework, may
possess subtle variations in specialization due to differences in initialization or training nuances.
By prioritizing the first network’s output, we acknowledge its potential slight edge in capturing
the essential features necessary for the task at hand.</p>
        <p>Let us denote the predictions as $y_1$, $y_2$, and $y_3$ respectively. Each prediction $y_i$ for $i = 1, 2, 3$ can
be either 0 or 1, representing the binary output class of each network. The output of the AND
operator, denoted as $y_{\mathrm{AND}}$, is then the majority vote:
$y_{\mathrm{AND}} = 1$ if $\sum_{i=1}^{3} y_i \ge 2$, and $y_{\mathrm{AND}} = 0$ otherwise. (7)</p>
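        <p>For example, the decision rule, including the fallback to the first network under complete disagreement on multi-class outputs, can be expressed as follows (a sketch; the function name and the use of argmax over class probabilities are ours):</p>
        <preformat>
import numpy as np

def and_operator(p1, p2, p3):
    # Per-sample class predictions of the three networks.
    preds = np.stack([p.argmax(axis=1) for p in (p1, p2, p3)])
    # Default to network 1; this already covers the cases where network 1
    # is part of a majority and where all three networks disagree.
    out = preds[0].copy()
    # If networks 2 and 3 agree, they form a majority and outvote network 1.
    agree23 = preds[1] == preds[2]
    out[agree23] = preds[1][agree23]
    return out
        </preformat>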
        <p>It is crucial to underline that this strategy diverges fundamentally from traditional model
ensembling. While ensembles typically combine models post-training to leverage their individual
strengths during inference, our approach intertwines the combination logic within the training
phase itself. This ensures that the networks are not only trained to perform their respective
tasks effectively but also to do so in a manner that is synergistic, considering the logical AND
operator’s requirements during the decision-making process.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Experimental Results</title>
      <sec id="sec-5-1">
        <title>4.1. Network Training and Sampling Method</title>
        <p>Monte-Carlo sampling and its variants, such as Gibbs sampling and Metropolis-Hastings, have
emerged as efficient methods for navigating high-dimensional spaces, which can be effectively
applied to deep neural networks. Traditionally, the training of binary networks has involved
either utilizing backpropagation while maintaining a full-precision version of the network
or adopting Bayesian learning methodologies. However, Monte-Carlo methods present an
alternative approach that complements the unique characteristics of binary networks.</p>
        <p>In our training approach, we implemented distinct strategies for the Gibbs Sampling Net
(GSNet) and Metropolis-Hastings Net (MHNet), each involving the flipping of a subset of weights.
In the GSNet architecture, the process involves selectively
flipping a set of weights. The acceptance of this new configuration of weights is contingent
on a specific criterion: it must lead to a reduction in the loss over a given batch of data. An
interesting aspect of GSNet’s training methodology is the progressive increase in batch size
relative to the number of epochs. This gradual scaling allows the network to initially focus
on learning from smaller data segments, gradually adapting to larger and more varied data as
training progresses.</p>
        <p>MHNet, short for Metropolis-Hastings Net, also engages in flipping a random subset of
weights. However, MHNet distinguishes itself from GSNet in its acceptance criterion for new
weight configurations. Unlike GSNet, MHNet may accept a new weight configuration that
increases the loss, albeit with a small probability. This approach allows MHNet to explore a
broader range of solutions in the weight space, potentially avoiding local minima and discovering
more optimal configurations. In MHNet, the batch size remains constant over time, and the
number of units flipped per batch is determined randomly, adding an element of variability and
exploration to the training process.</p>
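        <p>The key difference between the two schemes is the acceptance rule. A sketch of a Metropolis-Hastings-style criterion as described for MHNet follows; the temperature parameter is an assumption made for illustration, not a value from our experiments:</p>
        <preformat>
import numpy as np

def mh_accept(old_loss, new_loss, temperature=0.01, rng=None):
    rng = rng or np.random.default_rng()
    if new_loss > old_loss:
        # Accept a worsening proposal only with small probability
        # exp(-(new - old) / T), which lets the sampler escape local minima.
        return np.exp(-(new_loss - old_loss) / temperature) > rng.random()
    return True  # always accept an improvement
        </preformat>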
        <p>Our networks were benchmarked against the classification task on the MNIST dataset. While
BinaryConnect has maintained state-of-the-art (SOTA) results for some time, more recent
binary network architectures have focused on improving precision on larger and more complex
datasets like ImageNet. However, it’s observed that these networks often do not perform as
well on MNIST when compared to BinaryConnect.</p>
        <p>Initially, we conducted 100 experimental runs with 2000 epochs each, varying the number of
hidden layers (1, 2, 3, 5, 10), the number of units (10, 20, 64, 128, 256), and testing sigmoid against
ReLU nonlinearities, as well as cross-entropy against RMSE loss functions. The most promising
configurations from these initial experiments were then subjected to extended training, spanning
tens of thousands of epochs, to optimize for top precision.</p>
        <p>The training process is analyzed with respect to network depth, number of units, epochs,
activation functions, and choice of loss function. It’s noted that training binary networks with
either Gibbs Sampling or Metropolis-Hastings requires significantly more epochs to converge.
This observation is evident in our results as shown in Figure 1 and Figure 2, where various fully
connected architectures trained over 2000 epochs are compared. Interestingly, adding more
hidden layers resulted in lower precision compared to architectures with fewer or no hidden
layers.</p>
        <p>In our study, we observed that increasing the number of hidden units in the network initially
leads to improved precision up to a certain point, after which the precision begins to decline,
as illustrated in Figure 2. This trend suggests a nuanced relationship between the network’s
complexity and its performance. Our hypothesis is that while architectures with a greater
number of neurons have a higher capacity for learning and abstraction, they also demand a
more extended period of training to fully leverage this increased capacity. This extended training
requirement might be necessary to optimize the more complex parameter space efectively, thus
benefiting from the network’s higher capability.</p>
        <p>In our experiments, the sigmoid activation function outperformed ReLU in simpler network
designs, especially in networks with more hidden layers and units where sigmoid showed less
saturation. Regarding loss functions, while RMSE led to better initial precision, cross-entropy
loss yielded slightly improved results after extended training.</p>
        <p>Although binary inputs and activation functions slightly reduced performance, their potential
for computational efficiency is notable. By replacing multiplication with bitwise operations,
they could greatly accelerate network inference, making them an intriguing area for further
research with various configurations and extended training periods.</p>
      </sec>
      <sec id="sec-5-2">
        <title>4.2. Benchmarking Results</title>
        <p>In our study, we conducted a comprehensive comparison of the precision across different
binary network configurations, see Table 1. It’s important to note that adding more layers
and units generally improves the precision of neural networks. However, training networks
with significantly more hidden layers and units poses challenges, often leading to convergence
issues due to numerical limitations. Implementing techniques like batch normalization could
potentially facilitate the training of deeper networks.</p>
        <p>Given the challenges and variances in training, we evaluate precision for networks either
without hidden layers or for networks with a single hidden layer comprising 64 units. These
binary networks were benchmarked against a full-precision network with decimal weights,
trained for 50 epochs using Stochastic Gradient Descent (SGD) with momentum (see Table 2 for
details).</p>
        <p>BinaryConnect, a notable approach in binary networks, involves using a decimal network
for weight updates followed by binarization. This method, thanks to robust backpropagation,
achieves high precision in just 250 epochs, showing only a 1% precision drop compared to the
decimal network. The binarization process in BinaryConnect involves setting negative weights
to -1 and positive weights to +1.</p>
        <p>An extension to basic weight binarization includes an additional scaling parameter per layer,
ensuring the norm of weights per layer remains consistent post-binarization. This technique
showed improved precision in networks with one hidden layer, while causing a slight precision
decrease in networks without hidden layers.</p>
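        <p>A sketch of the two binarization variants discussed here; reading "the norm of weights per layer remains consistent" as preserving each layer's Frobenius norm is our interpretation:</p>
        <preformat>
import numpy as np

def binarize(w):
    # BinaryConnect-style rule: negative weights to -1, positive to +1.
    return np.where(w >= 0, 1.0, -1.0)

def binarize_scaled(w):
    # Add one scaling factor per layer so the binarized weights keep
    # the same Frobenius norm as the original decimal weights.
    b = binarize(w)
    alpha = np.linalg.norm(w) / np.linalg.norm(b)
    return alpha * b
        </preformat>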
        <p>BinaryConnect slightly outperforms GSNet, as shown in Table 1, and this gap could potentially
be reduced by implementing batch normalization, an important factor for binary networks.
However, it is worth noting that GSNet required a significantly longer training period, 20,000
epochs, as detailed in Table 2, indicating a need for accelerated training methods.</p>
        <p>On the other hand, MHNet demonstrated very competitive precision with substantially fewer
training epochs compared to GSNet. Direct weight binarization yielded promising results
(79.67% and 73.76%, see Table 1), and the addition of a scale factor per layer further improved
precision for networks with one hidden layer (76.83%, see Table 1).</p>
      </sec>
      <sec id="sec-5-3">
        <title>4.3. Integration of Logical Reasoning</title>
        <p>The training process for individual binary neural networks, as evidenced by the systematic
reduction in training loss across epochs, indicates a successful optimization trajectory.
Experiment 1 through Experiment 5, each representing a standalone binary network, shows a
consistent decline in loss values, signifying that the networks are effectively learning from the
data over time, see Figure 3. This pattern is characteristic of a well-tuned training regimen,
where the network parameters are being refined in response to the given training stimuli,
leading to improved performance on the training dataset.</p>
        <p>However, an interesting divergence is observed when these binary networks are combined
using the proposed AND operator. The resultant system, which integrates the output features
of the three networks and adjudicates the final prediction based on a majority rule, exhibits a
less favorable optimization pattern. The training loss for this ensemble does not decrease as
expected, suggesting that the joint training under the AND constraint is less effective. This
could imply that the coupling of outputs in this manner introduces complexity that hinders
the learning process, possibly due to conflicting gradients or a loss surface that is difficult to
navigate. The intricacies of this combined operation warrant further investigation to understand
the underlying causes and to explore potential modifications that could lead to more stable and
eficient training dynamics.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Discussions</title>
      <p>In our investigation, we have identified a current limitation within our framework: the training
of the AND operator. Despite the individual binary networks showing promising results, the
AND operator, which aims to combine the predictions from multiple networks, is not being
trained effectively at this stage. A notable limitation in our current research is the challenge
of applying our binary network framework to more complex neural architectures. Despite
achieving high precision with simpler structures, particularly on the MNIST dataset, scaling up
to networks with an increased number of layers and units poses significant challenges. These
complex architectures demand more advanced training strategies and heightened computational
requirements. Future work will be dedicated to developing methodologies that can efficiently
integrate our binary network approach into these more intricate architectures. This will involve
exploring new training techniques, possibly incorporating alternative activation functions, and
considering innovative layer types tailored for binary networks.</p>
      <p>Another area for improvement is the extension of our binary network framework to tasks
beyond the MNIST dataset. While MNIST serves as a fundamental benchmark in machine
learning, it lacks the complexity found in other datasets used for more advanced tasks like
natural language processing or detailed image recognition. Our future research aims to apply
the binary network framework to a diverse array of datasets and tasks. This expansion is crucial
not only to test the versatility of our approach but also to refine the network’s ability to process
various types of data. Such an expansion could lead to new insights and improvements in how
binary networks are structured and trained, potentially opening up new applications in AI.</p>
      <p>Finally, a significant area for future exploration is the eficient integration of logical reasoning
into the binary network framework. While binary networks are inherently well-suited for
logical operations, embedding complex logical reasoning within these networks efficiently
remains challenging. Future phases of our research will focus on discovering methods to
incorporate sophisticated logical reasoning into binary networks more effectively. This could
include developing novel training algorithms, experimenting with hybrid neural-symbolic
models, or creating specialized layers for logic processing. The ultimate goal is to enhance
binary networks not only in terms of computational efficiency but also to equip them with
advanced reasoning and decision-making capabilities.</p>
      <p>This research represents an early-stage exploration into integrating logical reasoning with
binary networks. Currently, its applicability is limited, reflecting the developing nature of this
innovative approach. However, with continued development and refinement, this framework has
significant potential to scale across a broader range of datasets and tasks in the future. Success
with this approach could pave the way for binary networks that not only perform standard
computational functions but also possess the capability for advanced reasoning. The prospect
of binary networks effectively conducting logical operations opens up exciting possibilities for
their application in more complex, real-world scenarios, ultimately enhancing the scope and
functionality of AI systems.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusions</title>
      <p>In this paper, we explored the novel integration of Neural-Symbolic Computing with binary
neural networks, an innovative approach that merges structured reasoning with the dynamic
learning capabilities of Connectionist AI. This pioneering synthesis is designed to forge a
cutting-edge framework that seamlessly blends logical operators within AI systems, thereby significantly
boosting computational efficiency and adaptability. Our exploration marks a contribution to the
field, highlighting the potential of binary networks to revolutionize Neural-Symbolic Computing
by offering a more efficient, logical, and adaptable AI architecture.</p>
      <p>Through our research, we demonstrated that binary neural networks could effectively embody
this integration, as evidenced by our experiments with the MNIST dataset. These networks
offer a promising avenue for AI applications, especially in scenarios demanding both logical
processing and learning adaptability. However, our findings also underscore the challenges in
scaling these networks for more complex architectures and broader datasets.</p>
      <p>Looking ahead, our research opens several pathways for further exploration. The potential
expansion of binary network applications to more sophisticated tasks, and their adaptation to
handle a wider variety of data types, stand out as promising future endeavors. Additionally, the
efficient integration of logical reasoning into these networks remains a pivotal area for ongoing
development.</p>
      <p>In conclusion, our work contributes to the broader field of AI by proposing a novel approach
that leverages the strengths of both traditional symbolic systems and modern neural networks.
The development of this binary neural network framework marks a step towards creating
more advanced AI models capable of seamless learning and reasoning. It is a step towards the
realization of AI systems that are not only efficient and powerful but also interpretable and
adaptable, capable of tackling complex problems across various domains.</p>
      <p>[42] B. Martinez, J. Yang, A. Bulat, G. Tzimiropoulos, Training binary neural networks with
real-to-binary convolutions, arXiv preprint (2020). doi:10.48550/arXiv.2003.11535.
[43] H. Bai, W. Zhang, L. Hou, L. Shang, J. Jin, X. Jiang, I. King, BinaryBERT: Pushing the limit
of BERT quantization, arXiv preprint (2021). doi:10.48550/arXiv.2012.15701.
[44] H. Qin, X. Ma, Y. Ding, X. Li, Y. Zhang, Y. Tian, X. Liu, BiFSMN: Binary neural network for
keyword spotting, arXiv preprint (2022). doi:10.48550/arXiv.2202.06483.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>A.</given-names>
            <surname>Newell</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. C.</given-names>
            <surname>Shaw</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. A.</given-names>
            <surname>Simon</surname>
          </string-name>
          ,
          <article-title>Report on a general problem-solving program</article-title>
          ,
          <source>Proceedings of the International Conference on Information Processing</source>
          (
          <year>1959</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>D. E.</given-names>
            <surname>Rumelhart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. J.</given-names>
            <surname>Williams</surname>
          </string-name>
          ,
          <article-title>Learning representations by back-propagating errors</article-title>
          ,
          <source>Nature</source>
          (
          <year>1986</year>
          ). doi:10.1038/323533a0.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>A. S.</given-names>
            <surname>Garcez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Lamb</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. M.</given-names>
            <surname>Gabbay</surname>
          </string-name>
          ,
          <source>Neural-symbolic cognitive reasoning</source>
          ,
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>J.</given-names>
            <surname>McCarthy</surname>
          </string-name>
          ,
          <article-title>Programs with common sense</article-title>
          ,
          <year>1959</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>J.</given-names>
            <surname>McCarthy</surname>
          </string-name>
          ,
          <article-title>Situations, actions, and causal laws</article-title>
          ,
          <source>Comtex Scientific</source>
          ,
          <year>1963</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>T.</given-names>
            <surname>Winograd</surname>
          </string-name>
          ,
          <article-title>Procedures as a representation for data in a computer program for understanding natural language</article-title>
          (
          <year>1971</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>B. G.</given-names>
            <surname>Buchanan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. A.</given-names>
            <surname>Feigenbaum</surname>
          </string-name>
          ,
          <article-title>Dendral and meta-dendral: Their applications dimension</article-title>
          ,
          <source>Readings in artificial intelligence</source>
          (
          <year>1981</year>
          ). doi:10.1016/B978-0-934613-03-3.50026-X.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Fikes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. J.</given-names>
            <surname>Nilsson</surname>
          </string-name>
          ,
          <article-title>Strips: A new approach to the application of theorem proving to problem solving</article-title>
          ,
          <source>Artificial intelligence</source>
          (
          <year>1971</year>
          ). doi:10.1016/0004-3702(71)90010-5.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Boser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. S.</given-names>
            <surname>Denker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Henderson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. E.</given-names>
            <surname>Howard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Hubbard</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. D.</given-names>
            <surname>Jackel</surname>
          </string-name>
          ,
          <article-title>Backpropagation applied to handwritten zip code recognition</article-title>
          ,
          <source>Neural computation</source>
          (
          <year>1989</year>
          ). doi:10.1162/neco.1989.1.4.541.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>S.</given-names>
            <surname>Hochreiter</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Schmidhuber</surname>
          </string-name>
          ,
          <article-title>Long short-term memory</article-title>
          ,
          <source>Neural computation</source>
          (
          <year>1997</year>
          ). doi:10.1162/neco.1997.9.8.1735.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>Y.</given-names>
            <surname>LeCun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Hinton</surname>
          </string-name>
          ,
          <article-title>Deep learning</article-title>
          ,
          <source>Nature</source>
          (
          <year>2015</year>
          ). doi:10.1038/nature14539.
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>A.</given-names>
            <surname>Krizhevsky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Sutskever</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. E.</given-names>
            <surname>Hinton</surname>
          </string-name>
          ,
          <article-title>Imagenet classification with deep convolutional neural networks</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          (
          <year>2012</year>
          ). doi:10.1145/3065386.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>M. V.</given-names>
            <surname>França</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Zaverucha</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. S.</given-names>
            <surname>d'Avila Garcez</surname>
          </string-name>
          ,
          <article-title>Fast relational learning using bottom clause propositionalization with artificial neural networks</article-title>
          ,
          <source>Machine learning</source>
          (
          <year>2014</year>
          ). doi:10.1007/s10994-013-5392-1.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>S.</given-names>
            <surname>Harnad</surname>
          </string-name>
          ,
          <article-title>The symbol grounding problem</article-title>
          ,
          <source>Physica D: Nonlinear Phenomena</source>
          (
          <year>1990</year>
          ). doi:10.1016/0167-2789(90)90087-6.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Towards data- and knowledge-driven artificial intelligence: A survey on neuro-symbolic computing</article-title>
          ,
          <source>arXiv preprint</source>
          (
          <year>2022</year>
          ). doi:10.48550/arXiv.2210.15889.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>A.</given-names>
            <surname>d'Avila Garcez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. C.</given-names>
            <surname>Lamb</surname>
          </string-name>
          ,
          <article-title>Neurosymbolic AI: The 3rd wave</article-title>
          ,
          <source>Artificial Intelligence Review</source>
          (
          <year>2023</year>
          ). doi:10.48550/arXiv.2012.05876.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>P.</given-names>
            <surname>Smolensky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. T.</given-names>
            <surname>McCoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Goldrick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>Neurocompositional computing in human and machine intelligence: A tutorial</article-title>
          ,
          <source>Microsoft Technical Report</source>
          (
          <year>2022</year>
          ). doi:10.48550/arXiv.2205.01128.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>P.</given-names>
            <surname>Smolensky</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>McCoy</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Fernandez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Goldrick</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gao</surname>
          </string-name>
          ,
          <article-title>Neurocompositional computing: From the central paradox of cognition to a new generation of ai systems</article-title>
          ,
          <source>AI Magazine</source>
          (
          <year>2022</year>
          ). doi:10.1002/aaai.12065.
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>P.</given-names>
            <surname>Hitzler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Eberhart</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Ebrahimi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. K.</given-names>
            <surname>Sarker</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <article-title>Neuro-symbolic approaches in artificial intelligence</article-title>
          ,
          <source>National Science Review</source>
          (
          <year>2022</year>
          ). doi:10.1093/nsr/nwac035.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>E.</given-names>
            <surname>van Krieken</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Acar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>van Harmelen</surname>
          </string-name>
          ,
          <article-title>Analyzing differentiable fuzzy logic operators</article-title>
          ,
          <source>Artificial Intelligence</source>
          (
          <year>2022</year>
          ). doi:10.1016/j.artint.2021.103602.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hoernle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R. M.</given-names>
            <surname>Karampatsis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Belle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Gal</surname>
          </string-name>
          ,
          <article-title>Multiplexnet: Towards fully satisfied logical constraints in neural networks</article-title>
          ,
          <source>Artificial Intelligence</source>
          (
          <year>2022</year>
          ). doi:10.48550/arXiv.2111.01564.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>T.</given-names>
            <surname>Silver</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Athalye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. B.</given-names>
            <surname>Tenenbaum</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lozano-Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. P.</given-names>
            <surname>Kaelbling</surname>
          </string-name>
          ,
          <article-title>Learning neurosymbolic skills for bilevel planning</article-title>
          ,
          <source>Robot Learning</source>
          (
          <year>2022</year>
          ). doi:10.48550/arXiv.2206.10680.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>E.</given-names>
            <surname>Giunchiglia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stoian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Lukasiewicz</surname>
          </string-name>
          ,
          <article-title>Deep learning with logical constraints</article-title>
          ,
          <source>IJCAI/AAAI Press</source>
          (
          <year>2022</year>
          ). doi:10.24963/ijcai.2022/767.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>E.</given-names>
            <surname>Karpas</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Abend</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Belinkov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Lenz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lieber</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ratner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Shoham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Bata</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Levine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Leyton-Brown</surname>
          </string-name>
          ,
          <article-title>Mrkl systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning</article-title>
          ,
          <source>arXiv preprint</source>
          (
          <year>2022</year>
          ). doi:10.48550/arXiv.2205.00445.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>M.-O.</given-names>
            <surname>Stehr</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Talcott</surname>
          </string-name>
          ,
          <article-title>A probabilistic approximate logic for neuro-symbolic learning and reasoning</article-title>
          ,
          <source>Log Algebr Methods Program</source>
          (
          <year>2022</year>
          ). doi:10.1016/j.jlamp.2021.100719.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>C.</given-names>
            <surname>Pryor</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Dickens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Augustine</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Albalak</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. N.</given-names>
            <surname>Getoor</surname>
          </string-name>
          ,
          <article-title>Neural probabilistic soft logic</article-title>
          ,
          <source>IJCAI-23</source>
          (
          <year>2022</year>
          ). doi:10.48550/arXiv.2205.14268.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>D.</given-names>
            <surname>Aditya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Mukherji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Balasubramanian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Chaudhary</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P. P.</given-names>
            <surname>Shakarian</surname>
          </string-name>
          ,
          <article-title>Software for open world temporal logic</article-title>
          ,
          <source>arXiv preprint</source>
          (
          <year>2023</year>
          ). doi:10.48550/arXiv.2302.13482.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>M.</given-names>
            <surname>Hersche</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zeqiri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Benini</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sebastian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Rahimi</surname>
          </string-name>
          ,
          <article-title>A neuro-vector-symbolic architecture for solving raven's progressive matrices</article-title>
          ,
          <source>Nature Machine Intelligence</source>
          (
          <year>2023</year>
          ). doi:10.48550/arXiv.2203.04571.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Lü</surname>
          </string-name>
          ,
          <article-title>Softened symbol grounding for neurosymbolic systems</article-title>
          ,
          <source>The Eleventh International Conference on Learning Representations</source>
          (
          <year>2023</year>
          ). doi:10.48550/arXiv.2403.00323.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>D.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pan</surname>
          </string-name>
          ,
          <article-title>A probabilistic graphical model based on neural-symbolic reasoning for visual relationship detection</article-title>
          ,
          <source>CVPR</source>
          (
          <year>2022</year>
          ). doi:10.1109/CVPR52688.2022.01035.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>T.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kembhavi</surname>
          </string-name>
          ,
          <article-title>Visual programming: Compositional visual reasoning without training</article-title>
          ,
          <source>IEEE Conf. Comput. Vis. Pattern Recognit</source>
          . (
          <year>2023</year>
          ). doi:10.48550/arXiv.2211.11559.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>D.</given-names>
            <surname>Surís</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Menon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Vondrick</surname>
          </string-name>
          ,
          <article-title>Vipergpt: Visual inference via python execution for reasoning</article-title>
          ,
          <source>ICCV</source>
          (
          <year>2023</year>
          ). doi:10.48550/arXiv.2303.08128.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>L.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <article-title>Logicseg: Parsing visual semantics with neural logic learning and reasoning</article-title>
          ,
          <source>ICCV</source>
          (
          <year>2023</year>
          ). doi:10.1109/ICCV51070.2023.00381.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>M.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H. H.</given-names>
            <surname>Zhuo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <article-title>Creativity of ai: Automatic symbolic option discovery for facilitating deep reinforcement learning</article-title>
          ,
          <source>AAAI Conference on Artificial Intelligence</source>
          (
          <year>2022</year>
          ). doi:10.1609/aaai.v36i6.20663.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>J.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <article-title>Weakly supervised neural symbolic learning for cognitive tasks</article-title>
          ,
          <source>AAAI</source>
          (
          <year>2022</year>
          ). doi:10.1609/aaai.v36i5.20533.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>M.</given-names>
            <surname>Courbariaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-P.</given-names>
            <surname>David</surname>
          </string-name>
          ,
          <article-title>Binaryconnect: Training deep neural networks with binary weights during propagations</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          (
          <year>2015</year>
          ). doi:10.48550/arXiv.1511.00363.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rastegari</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ordonez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Redmon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Farhadi</surname>
          </string-name>
          ,
          <article-title>Xnor-net: Imagenet classification using binary convolutional neural networks</article-title>
          ,
          <source>European conference on computer vision</source>
          (
          <year>2016</year>
          ). doi:10.1007/978-3-319-46493-0_32.
        </mixed-citation>
      </ref>
      <ref id="ref38">
        <mixed-citation>
          [38]
          <string-name>
            <given-names>I.</given-names>
            <surname>Hubara</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Courbariaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Soudry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>El-Yaniv</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Binarized neural networks</article-title>
          ,
          <source>Advances in neural information processing systems</source>
          (
          <year>2016</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref39">
        <mixed-citation>
          [39]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Courbariaux</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Memisevic</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Bengio</surname>
          </string-name>
          ,
          <article-title>Neural networks with few multiplications</article-title>
          ,
          <source>arXiv preprint</source>
          (
          <year>2015</year>
          ). doi:10.48550/arXiv.1510.03009.
        </mixed-citation>
      </ref>
      <ref id="ref40">
        <mixed-citation>
          [40]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Ni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <article-title>Dorefa-net: Training low bitwidth convolutional neural networks with low bitwidth gradients</article-title>
          ,
          <source>arXiv preprint</source>
          (
          <year>2016</year>
          ). doi:10.48550/arXiv.1606.06160.
        </mixed-citation>
      </ref>
      <ref id="ref41">
        <mixed-citation>
          [41]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Cao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <article-title>Order matters: Semantic-aware neural networks for binary code similarity detection</article-title>
          ,
          <source>Proceedings of the AAAI conference on artificial intelligence</source>
          (
          <year>2020</year>
          ). doi:10.1609/aaai.v34i01.5466.
        </mixed-citation>
      </ref>
      <ref id="ref42">
        <mixed-citation>
          [42] B. Martinez, J. Yang, A. Bulat, G. Tzimiropoulos,
          <article-title>Training binary neural networks with real-to-binary convolutions</article-title>
          ,
          <source>arXiv preprint</source>
          (
          <year>2020</year>
          ). doi:10.48550/arXiv.2003.11535.
        </mixed-citation>
      </ref>
      <ref id="ref43">
        <mixed-citation>
          [43] H. Bai, W. Zhang, L. Hou, L. Shang, J. Jin, X. Jiang, I. King,
          <article-title>Binarybert: Pushing the limit of bert quantization</article-title>
          ,
          <source>arXiv preprint</source>
          (
          <year>2021</year>
          ). doi:10.48550/arXiv.2012.15701.
        </mixed-citation>
      </ref>
      <ref id="ref44">
        <mixed-citation>
          [44] H. Qin, X. Ma, Y. Ding, X. Li, Y. Zhang, Y. Tian, X. Liu,
          <article-title>Bifsmn: Binary neural network for keyword spotting</article-title>
          ,
          <source>arXiv preprint</source>
          (
          <year>2022</year>
          ). doi:10.48550/arXiv.2202.06483.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>