<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>An Accuracy-Controllable Approximate Adder for FPGAs</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Masaki Sano</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hiroki Nishikawa</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Xiangbo Kong</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hiroyuki Tomiyama</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tongxin Yang</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tomoaki Ukezono</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Toshinori Sato</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Electronics Engineering and Computer Science, Fukuoka University</institution>
          ,
          <addr-line>Fukuoka</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Graduate School of Information Science and Technology, Osaka University</institution>
          ,
          <addr-line>Osaka</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>Graduate School of Science and Engineering, Ritsumeikan University</institution>
          ,
          <addr-line>Shiga</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Sony Semiconductor Solutions Corporation</institution>
          ,
          <addr-line>Kanagawa</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this paper, we propose an accuracy-controllable approximate adder for FPGAs. The proposed adder has a special input to dynamically change the accuracy in addition to two operands. When accurate computation is required, the adder computes accurately. On the other hand, when accurate computation is not required, the adder computes inaccurately but quickly at low power. The important feature of our adder is that it utilizes carry-chain modules which are built in FPGAs. By using the carry chains, our approximate adder computes much faster at lower power than an existing approximate adder.</p>
      </abstract>
      <kwd-group>
        <kwd>approximate computing</kwd>
        <kwd>approximate adder</kwd>
        <kwd>FPGA</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        In recent years, approximate computing has attracted attention as a way to achieve higher
performance and lower power consumption by tolerating a certain degree of computational error.
Approximate computing techniques are used especially in image processing and machine
learning, since these workloads are computationally expensive and error-tolerant to some extent [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ][
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Research on approximate computing circuits has been conducted at various design levels from the
transistor to the architecture levels [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ][
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. This paper focuses on approximate adders since addition is
one of the most fundamental arithmetic operations. There are many studies on approximate adders
which improve power-performance efficiency by disconnecting carry propagation [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]-[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. The work in
[
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] proposes an approach to accurately calculate errors of approximate adders, and the work in [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]
provides a detailed analysis of the trade-off between accuracy and resource efficiency. The authors of
[
        <xref ref-type="bibr" rid="ref12">12</xref>
        ][
        <xref ref-type="bibr" rid="ref13">13</xref>
        ] state that circuit design with variable computational accuracy is desirable for designing
systems that meet diverse requirements. An approximate adder circuit, named carry-maskable adder
(CMA), that can change the computational accuracy is proposed in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. Since CMA enables dynamic
control of accuracy, it is possible to perform approximate operations within a tolerable error range.
      </p>
      <p>
        Based on CMA in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], in this paper, we propose an accuracy-controllable approximate adder for
FPGAs. If the original CMA is implemented in FPGAs in a straightforward manner, lookup tables
(LUTs) are connected in series, and hence, the delay and power increase significantly. Our proposed
adder, named carry-chain based carry-maskable adder (CC-CMA), takes advantage of fast carry-chain
modules which are built in FPGAs.
      </p>
      <p>This paper is organized as follows. Section 2 presents CC-CMA, and Section 3 analyzes the
hardware cost, delay, computational error and power consumption of CC-CMA. Finally, Section 4
summarizes this paper and discusses future work.</p>
      <p>This work was done while the author was with Fukuoka University, Japan.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Carry-Chain based Carry-Maskable Adder</title>
      <p>
        In this section, we first explain the carry-maskable adder [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], and then, we propose a carry-chain
based carry-maskable adder for FPGAs.
      </p>
    </sec>
    <sec id="sec-3">
      <title>2.1. Carry-Maskable Adder</title>
      <p>
        This work is based on an approximate adder, named carry-maskable adder (CMA), which was
originally proposed in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. CMA can dynamically change the accuracy level according to a special
input signal. Figure 1 (a) shows the diagram of an 8-bit CMA. The 8-bit CMA consists of a
carry-maskable half adder (CMHA) and seven carry-maskable full adders (CMFAs) connected in series, as
shown in Figure 1 (b). In addition to the three inputs x, y and carry-in (denoted cin), a CMFA has a special
input named mask that dynamically controls the accuracy. If mask is 0, the CMFA performs exact
addition. If mask is 1, the carry-out signal cout is 0, and the sum s is the logical sum (OR) of x and y,
assuming that the carry-in signal from the lower bit is 0. This computation is approximate rather than
accurate. However, since carry signals are not propagated from lower bits to upper bits, the delay
and power consumption are reduced. By setting the masks of the lower bits to 1 and those of the upper bits to 0,
the upper (more significant) bits are computed accurately, and the lower (less significant) bits are
computed approximately. Thus, by controlling the masks, we can explore the trade-off between accuracy,
delay, and power consumption, depending on the requirements of the application.
      </p>
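      <p>The behavior described above can be illustrated with a small bit-level software model. This is only a sketch: the function names cmha, cmfa and cma_add are hypothetical, the per-bit mask word is an assumed encoding, and the masked behavior of the half adder is assumed to mirror that of the full adder. It is not the Verilog-HDL design evaluated in this paper.</p>
      <preformat><![CDATA[
```python
def cmha(x, y, mask):
    """Carry-maskable half adder (assumed behavior): returns (sum, carry_out)."""
    if mask:
        return x | y, 0          # approximate: OR of the inputs, carry-out forced to 0
    return x ^ y, x & y          # exact half adder


def cmfa(x, y, cin, mask):
    """Carry-maskable full adder for one bit: returns (sum, carry_out)."""
    if mask:
        return x | y, 0          # approximate: carry-in ignored, carry-out forced to 0
    s = x ^ y ^ cin
    cout = (x & y) | (cin & (x ^ y))
    return s, cout


def cma_add(x, y, mask, width=8):
    """Ripple through the cells; bit i of `mask` selects approximate mode for bit i."""
    result, carry = 0, 0
    for i in range(width):
        xi, yi, mi = (x >> i) & 1, (y >> i) & 1, (mask >> i) & 1
        if i == 0:
            s, carry = cmha(xi, yi, mi)
        else:
            s, carry = cmfa(xi, yi, carry, mi)
        result |= s << i
    return result | (carry << width)   # final carry-out kept as bit `width`
```
]]></preformat>
      <p>For example, cma_add(183, 77, 0x0F) approximates the four lower bits and yields 255, whereas the exact sum, obtained with a mask of 0, is 260.</p>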
      <p>
        Figure 1. Carry-maskable adder [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]: (a) 8-bit CMA; (b) carry-maskable full adder.
      </p>
    </sec>
    <sec id="sec-4">
      <title>2.2. Carry-Chain Based CMA</title>
      <p>
        Originally, CMA was designed for ASICs, and how to implement CMA on FPGAs is not presented
or discussed in [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] despite the widespread use of FPGAs. If the Boolean expressions of CMA are given
to FPGA synthesis tools, each CMFA is mapped to one or two lookup tables (LUTs), and the LUTs are
connected in series, as shown in Figure 1 (a). This implementation is not efficient in terms of delay and
power consumption since LUTs are slow and power-consuming.
      </p>
      <p>Recent FPGAs are equipped with carry-chain modules for fast addition. Figure 2 shows a schematic
diagram of an accurate 4-bit adder using a built-in carry-chain module. The carry-chain module consists
of four multiplexers and four EXOR gates, and is driven by four LUTs. For accurate addition,
the LUT of bit i is configured to compute the EXOR of the operand bits, x<sub>i</sub> ⊕ y<sub>i</sub>.</p>
      <p>As seen in Figure 2, carry signals go through the carry-chain module. In other words, the carry
signals do not go through LUTs, which are slow and power-consuming. Thanks to the built-in
carry-chain module, addition is computed fast at low power.</p>
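      <p>As a rough illustration of how such a carry chain computes an exact sum, the following sketch models one multiplexer and one EXOR gate per stage. The function names are hypothetical, and the choice of the operand bit x as the multiplexer data input is an assumption (it is the usual arrangement, but the text does not state it).</p>
      <preformat><![CDATA[
```python
def carry_chain4(select_bits, data_bits, cin=0):
    """Four-stage carry chain: one multiplexer and one EXOR gate per stage."""
    sums, carry = [], cin
    for sel, data in zip(select_bits, data_bits):
        sums.append(sel ^ carry)            # EXOR gate produces the sum bit
        carry = carry if sel else data      # multiplexer: pass the carry or take the data input
    return sums, carry


def accurate_add4(x, y, cin=0):
    """Exact 4-bit addition with LUTs that compute x EXOR y."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]
    lut = [a ^ b for a, b in zip(xb, yb)]   # LUT outputs drive the multiplexer selects
    sums, cout = carry_chain4(lut, xb, cin) # assumed: operand bit x is the data input
    return sum(bit << i for i, bit in enumerate(sums)) | (cout << 4)
```
]]></preformat>
      <p>When the two operand bits differ, the stage passes the incoming carry; when they are equal, the carry-out equals the operand bit itself, so no LUT lies on the carry path.</p>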
      <p>In this work, we take advantage of the carry-chain modules in the design of approximate adders.
Figure 3 shows our proposed approximate adder, named carry-chain based carry-maskable adder
(CC-CMA). LUTs labeled P0-P3 and G0-G3 compute Equations (2) and (3), respectively.</p>
      <p>Similar to the accurate adder shown in Figure 2, carry signals in CC-CMA do not go through LUTs
but go through fast carry-chain modules when mask is 0. When mask is 1, carry signals do not propagate
to upper bits, which achieves faster and lower-power computation than the accurate adder at the expense
of computational inaccuracy.</p>
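      <p>A behavioral sketch of this idea is given below. The LUT functions used here are an assumption, since the equations for the P and G LUTs are not reproduced in this text: the P LUT is assumed to output x OR y for a masked bit and x EXOR y otherwise, and the G LUT is assumed to output 0 for a masked bit and x otherwise. With the chain carry-in tied to 0, this choice reproduces the described behavior only when the masked bits are contiguous from the least significant bit. The name cc_cma_add and the parameter n_masked are hypothetical.</p>
      <preformat><![CDATA[
```python
def cc_cma_add(x, y, n_masked, width=32):
    """Add x and y with the n_masked least significant bits approximated."""
    result, carry = 0, 0                          # chain carry-in tied to 0
    for i in range(width):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        masked = i < n_masked
        p = (xi | yi) if masked else (xi ^ yi)    # assumed P LUT: multiplexer select
        g = 0 if masked else xi                   # assumed G LUT: multiplexer data input
        result |= (p ^ carry) << i                # EXOR gate of the carry chain
        carry = carry if p else g                 # multiplexer of the carry chain
    return result | (carry << width)              # final carry-out kept as bit `width`
```
]]></preformat>
      <p>In the masked region the carry stays at 0 in every stage, so each sum bit is simply x OR y, and the first unmasked bit starts an exact addition with a carry-in of 0.</p>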
    </sec>
    <sec id="sec-5">
      <title>3. Evaluation</title>
      <p>
        We have designed 32-bit and 64-bit CC-CMAs in Verilog-HDL and synthesized them for a Xilinx
Artix-7 device with Xilinx Vivado 2019.2. For comparison, we have also designed accurate adders and
original CMAs [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ] in Verilog-HDL. The synthesized accurate adders utilize built-in carry-chain
modules, as shown in Figure 2, while the synthesized CMAs do not. The three adders are compared in
terms of hardware resources, delay, power consumption, maximum error and average error. Delay,
average error and power consumption are obtained by post-synthesis simulation using the Vivado
toolkit. For the 32-bit and 64-bit adders, 100,000 and 1,000,000 random simulations are performed,
respectively. The delay, error and power consumption of CMA and CC-CMA depend on the values of
the masks. Recall that CMA and CC-CMA compute accurately when the masks of all bits are 0. When the
masks of the n least significant bits are set to 1, the lower n bits are added approximately, and the upper
bits are added accurately. In our experiments, we vary the number of lower bits whose masks are set to 1.
      </p>
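      <p>The error part of such an experiment can be mimicked in software. The sketch below is not the post-synthesis simulation flow used in this work. It is a word-level model that assumes the error metric is the absolute difference from the exact sum, that operands are uniformly random, and that the final carry-out is kept. The names approx_add and error_stats are hypothetical.</p>
      <preformat><![CDATA[
```python
import random


def approx_add(x, y, n_masked):
    """Word-level model: the n_masked low bits are OR-ed, the upper bits are added exactly."""
    low = (1 << n_masked) - 1
    upper = ((x >> n_masked) + (y >> n_masked)) << n_masked
    return upper | ((x | y) & low)


def error_stats(width, n_masked, trials, seed=0):
    """Return (maximum error, average error) over random operand pairs."""
    rng = random.Random(seed)                 # fixed seed for repeatable runs
    worst, total = 0, 0
    for _ in range(trials):
        x, y = rng.getrandbits(width), rng.getrandbits(width)
        err = abs((x + y) - approx_add(x, y, n_masked))
        worst = max(worst, err)
        total += err
    return worst, total / trials


if __name__ == "__main__":
    for n in (4, 8, 16):                      # number of masked lower bits
        print(n, error_stats(32, n, 100_000))
```
]]></preformat>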
    </sec>
    <sec id="sec-6">
      <title>3.1. Hardware Resources</title>
      <p>Table 1 compares hardware resources in terms of the number of 6-input LUTs. The original CMAs
use twice as many LUTs as the accurate adders and CC-CMAs. CC-CMAs are as small as the accurate
adders, although the functionality of CC-CMAs is more complex.</p>
      <p>Table 1. Hardware resources (number of 6-input LUTs) of the 32-bit and 64-bit adders.</p>
    </sec>
    <sec id="sec-7">
      <title>3.3. Power Consumption</title>
      <p>Figure 5. Power consumption (µW). (b) 64-bit adders</p>
      <p>leading to lower power consumption. It is also observed in Figure 5 that the power consumption of
CMA and CC-CMA decreases as more bits are masked and approximated.</p>
    </sec>
    <sec id="sec-8">
      <title>3.4. Computational Errors</title>
      <p>So far, we have seen that CC-CMA computes faster and at lower power than the accurate adders. These
advantages come at the cost of computational error. Table 2 shows the computational errors
of CMA and CC-CMA. Recall that CMA and CC-CMA implement exactly the same function;
therefore, the errors of the two adders are identical.
random simulation. As more bits are masked, the computational error increases.</p>
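      <p>The growth of the error with the number of masked bits can be made explicit with a short derivation. It is a sketch under stated assumptions: the n least significant bits are masked, the final carry-out is kept, the error E is the absolute difference between the exact sum and the approximate result, and the operand bits are independent and uniformly random. Here x<sub>L</sub> and y<sub>L</sub> denote the lower n bits of the operands, x<sub>H</sub> and y<sub>H</sub> the remaining upper bits, and the approximate result is 2<sup>n</sup>(x<sub>H</sub> + y<sub>H</sub>) + (x<sub>L</sub> OR y<sub>L</sub>).</p>
      <disp-formula><tex-math><![CDATA[
\begin{aligned}
a + b &= (a \lor b) + (a \land b) \quad \text{(bitwise OR and AND of non-negative integers)},\\
E &= (x + y) - \bigl(2^{n}(x_H + y_H) + (x_L \lor y_L)\bigr) = (x_L + y_L) - (x_L \lor y_L) = x_L \land y_L,\\
0 &\le E \le 2^{n} - 1,\\
\mathbb{E}[E] &= \sum_{i=0}^{n-1} 2^{i}\,\Pr[x_i = y_i = 1] = \sum_{i=0}^{n-1} \frac{2^{i}}{4} = \frac{2^{n} - 1}{4}.
\end{aligned}
]]></tex-math></disp-formula>
      <p>Under these assumptions both the maximum and the average error roughly double with each additional masked bit.</p>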
      <p>Table 2. Computational errors of CMA and CC-CMA (columns: mask, max, average). (a) 64-bit adders; mask values: 16, 20, 24, 28, 32, 48, 52, 56, 60, 64.</p>
    </sec>
    <sec id="sec-9">
      <title>3.5. Trade-off between Error, Delay and Power</title>
      <p>From Figure 4 and Table 2, the trade-off between delay and computational error for the 32-bit CC-CMA
can be derived as shown in Figure 6. The figure shows that the delay can be shortened at the cost of
computational error, but the cost is not low. Also, dynamically changing the delay of an adder is
beneficial only if the adder lies on the critical path of the entire circuit.</p>
      <p>From Figure 5 and Table 2, the trade-off between power consumption and computational error for
32-bit CC-CMA can be derived as shown in Figure 7. A significant amount of power can be saved at
the expense of computational error.</p>
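      <p>One way an application might exploit this trade-off is to mask as many lower bits as its error budget allows. The sketch below assumes that masking the n least significant bits replaces their sum by the OR of the operand bits, so that the worst-case absolute error is 2<sup>n</sup> - 1, reached when both operands have all n lower bits set. The function names and the notion of an absolute error budget are hypothetical and are not part of the evaluation above.</p>
      <preformat><![CDATA[
```python
def max_masked_bits(error_budget, width=32):
    """Largest n such that the assumed worst-case error 2**n - 1 fits in error_budget."""
    if error_budget < 0:
        raise ValueError("error_budget must be non-negative")
    n = (error_budget + 1).bit_length() - 1   # floor(log2(error_budget + 1))
    return min(n, width)


def mask_word(n_masked):
    """Mask input with the n_masked least significant bits set to 1."""
    return (1 << n_masked) - 1
```
]]></preformat>
      <p>For instance, a budget of 255 permits eight masked bits, while a budget of 254 permits only seven.</p>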
    </sec>
    <sec id="sec-10">
      <title>4. Conclusions</title>
      <p>In this paper, we have proposed an approximate adder for FPGAs, named carry-chain based carry-maskable
adder (CC-CMA). CC-CMA has a special input signal named mask to dynamically control
the computational accuracy. CC-CMA takes advantage of the fast carry-chain modules that are built
into modern FPGAs. By using the built-in carry chains, CC-CMA computes fast at low power. The
experimental results demonstrate the efficiency of CC-CMA compared with an accurate adder and an
existing carry-maskable adder. In the future, we plan to evaluate CC-CMA using real-world applications.
We also plan to develop accuracy-controllable multipliers for FPGAs based on CC-CMA.</p>
    </sec>
    <sec id="sec-11">
      <title>Acknowledgments</title>
      <p>This work is supported partly by KAKENHI 20H00590, 19H04081 and 21K19776.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Esmaeilzadeh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Sampson</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Ceze</surname>
          </string-name>
          and
          <string-name>
            <given-names>D.</given-names>
            <surname>Burger</surname>
          </string-name>
          ,
          <article-title>Neural acceleration for general-purpose approximate programs</article-title>
          ,
          <source>IEEE/ACM International Symposium on Microarchitecture</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>V.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Mohapatra</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. P.</given-names>
            <surname>Park</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Raghunathan</surname>
          </string-name>
          and
          <string-name>
            <given-names>K.</given-names>
            <surname>Roy</surname>
          </string-name>
          ,
          <article-title>IMPACT: IMPrecise adders for low-power approximate computing</article-title>
          ,
          <source>IEEE/ACM International Symposium on Low Power Electronics and Design</source>
          ,
          <year>2011</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>S.</given-names>
            <surname>Mittal</surname>
          </string-name>
          ,
          <article-title>A survey of techniques for approximate computing</article-title>
          ,
          <source>ACM Computing Surveys</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>Q.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Mytkowicz</surname>
          </string-name>
          , and
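          <string-name>
            <given-names>N. S.</given-names>
            <surname>Kim</surname>
          </string-name>
          ,
          <article-title>Approximate computing: A survey</article-title>
          ,
          <source>IEEE Design &amp; Test</source>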
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>R.</given-names>
            <surname>Gollu</surname>
          </string-name>
          and
          <string-name>
            <surname>P.</surname>
          </string-name>
          <article-title>carry maskable adder using modified full s</article-title>
          <source>Journal of Physics Conference Series</source>
          , vol.
          <volume>1921</volume>
          , no.
          <issue>1</issue>
          , article
          <elocation-id>012049</elocation-id>
          ,
          <year>2021</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>P. K.</given-names>
            <surname>Sujit</surname>
          </string-name>
          , G. Bharat, and
          <string-name>
            <surname>R. K.</surname>
          </string-name>
          <article-title>A power and area efficient approximate carry skip adder for error-resilient applications</article-title>
          ,
          <source>Turkish Journal of Electrical Engineering and Computer Sciences</source>
          , vol.
          <volume>28</volume>
          , no.
          <issue>1</issue>
          , pp.
          <fpage>443</fpage>
          -
          <lpage>457</lpage>
          ,
          <year>2019</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>J.</given-names>
            <surname>Babita</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Vishesh</surname>
          </string-name>
          , and
          <string-name>
            <surname>S.</surname>
          </string-name>
          <article-title>EFCSA: An efficient carry speculative approximate adder with rectification</article-title>
          ,
          <source>IEEE 23rd International Symposium on Quality Electronic Design</source>
          ,
          <year>2022</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>L.</given-names>
            <surname>Jungwon</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hyoju</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Yerin</surname>
          </string-name>
          , and
          <string-name>
            <surname>K.</surname>
          </string-name>
          <article-title>Approximate adder design with simplified lower-part approximation</article-title>
          ,
          <source>IEICE Electronics Express</source>
          , vol.
          <volume>17</volume>
          , no.
          <issue>15</issue>
          , pp.
          <fpage>1</fpage>
          -
          <lpage>3</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Kanani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Mehta</surname>
          </string-name>
          and
          <string-name>
            <surname>N.</surname>
          </string-name>
          <article-title>ACA-CSU: A carry selection based accuracy configurable approximate adder design</article-title>
          ,
          <source>IEEE Computer Society Annual Symposium on VLSI</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezaalipour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Rezaalipour</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Dehyadegari</surname>
          </string-name>
          and
          <string-name>
            <surname>M. N.</surname>
          </string-name>
          <article-title>AxMAP: Making approximate adders aware of input patterns</article-title>
          ,
          <source>IEEE Transactions on Computers</source>
          , vol.
          <volume>69</volume>
          , no.
          <issue>6</issue>
          , pp.
          <fpage>868</fpage>
          -
          <lpage>882</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>D.</given-names>
            <surname>Catelan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Santos</surname>
          </string-name>
          and
          <string-name>
            <given-names>L.</given-names>
            <surname>Duenha</surname>
          </string-name>
          ,
          <article-title>Accuracy and physical characterization of approximate arithmetic circuits</article-title>
          ,
          <source>XXI Simpósio em Sistemas Computacionais de Alto Desempenho</source>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Venkataramani</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V. K.</given-names>
            <surname>Chippa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. T.</given-names>
            <surname>Chakradhar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Roy</surname>
          </string-name>
          , and
          <string-name>
            <given-names>A.</given-names>
            <surname>Raghunathan</surname>
          </string-name>
          ,
          <article-title>Quality programmable vector processors for approximate computing</article-title>
          ,
          <source>International Symposium on Microarchitecture</source>
          ,
          <year>2013</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>A. B.</given-names>
            <surname>Kahng</surname>
          </string-name>
          and
          <string-name>
            <given-names>S.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <article-title>Accuracy-configurable adder for approximate arithmetic designs</article-title>
          ,
          <source>Design Automation Conference</source>
          ,
          <year>2012</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Ukezono</surname>
          </string-name>
          and
          <string-name>
            <given-names>T.</given-names>
            <surname>Sato</surname>
          </string-name>
          ,
          <article-title>An accuracy-configurable adder for low-power applications</article-title>
          ,
          <source>IEICE Trans. on Electronics</source>
          , vol. E103-C, no.
          <issue>3</issue>
          , pp.
          <fpage>68</fpage>
          -
          <lpage>76</lpage>
          ,
          <year>2020</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>