<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Low Power FFT to Support OFDM Telecommunications in IoT Applications</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Nikos Petrellis</string-name>
          <email>npetrellis@uop.gr</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Electrical and Computer Engineering, University of Peloponnese</institution>
          ,
          <addr-line>Patra</addr-line>
          ,
          <country country="GR">Greece</country>
        </aff>
      </contrib-group>
      <fpage>145</fpage>
      <lpage>152</lpage>
      <abstract>
        <p>The minimum acceptable word length for the representation of real numbers in Fast Fourier Transform (FFT) with undersampled inputs is studied in this paper. FFT is a critical module in Orthogonal Frequency Division Multiplexing (OFDM) telecommunication infrastructure also used in Internet of Things (IoT) environments. The FFT input/output values, the twiddle factors and its intermediate results are complex numbers that can be represented either in fixed point or floating-point format. A tradeoff has to be made between rounding error and complexity. We focus on FFT with sparse inputs such as the values generated by many IoT sensors, surveillance cameras, etc. A configurable new FFT architecture has been developed to test various FFT sizes, word lengths and Quadrature Amplitude Modulations (QAM). The simulation results show that 8 bits FFT word length is sufficient in the employed undersampling procedure.</p>
      </abstract>
      <kwd-group>
        <kwd>FFT</kwd>
        <kwd>sparse</kwd>
        <kwd>OFDM</kwd>
        <kwd>word length</kwd>
        <kwd>rounding error</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1 Introduction</title>
      <p>
        Discrete Fourier Transform (DFT) is used for the representation of periodic
functions as a sum of sine and cosine functions. Fourier transform is used in several
scientific domains of mathematics (e.g., for the analysis of differential equations) and
signal processing (filtering, spectroscopy, etc.) [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. Fast Fourier Transform (FFT)
implementations avoid repeating the same operations [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. One option is to implement
real numbers in Floating Point format. This format supports wide dynamic range and
avoids issues like scaling and overflow/underflow. In floating point format standards
like IEEE-754 [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] a real number is described by one bit for the sign (sgn), a number of
bits for the significand (c) and a signed exponent (e) for a base b such as 2 or 10. More
specifically, the real number can be expressed as (-1)sgn×c×be. Floating point format is
implemented with complicated hardware. In Fixed Point format multiplications and
divisions by 2 can be implemented with simpler circuits but the outcome of these
operations may have scaling and overflow/underflow issues. In fixed point format, if
for example, base b=2 is used, 8 bits are allocated for the whole number and 5 of the 8
bits are used for the fraction then 10101101 corresponds to the real number:
1×22+0×21+1×20+0×2-1+1×2-2+1×2-3+0×2-4+1×25=4+1+1/4+1/8+1/32= 5.40625.
      </p>
      <p>
        The representation of the real numbers is very important to computationally
intensive operations like FFT. Due to the FFT’s high complexity, the real numbers
have to be implemented preserving a minimum word length, otherwise severe
rounding errors occur. The limited precision of fixed-point arithmetic for different FFT
algorithms is studied in [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] where radix-2 Decimation-In-Time (DIT) FFT is examined
due to its higher accuracy in term of signal-to-quantization-noise ratio. In [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] the
round-off error of fixed point FFT is investigated while the results of the classic paper
[
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] are reproduced and an error in the results presented in [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] is demonstrated. The
consequences of the violation of the assumption for almost pure sine waves is also
investigated in [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]. Radix-4 FFT is used in [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ] where input quantization and coefficient
accuracy is ignored. The error in Radix-2 butterflies is analyzed in [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
      <p>
        The analysis for the FFT word length presented here concerns an OFDM transceiver
that has been recently presented in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In this OFDM transceiver undersampling is
applied when sparse information is exchanged i.e., some FFT input samples are not
obtained by the Analog Digital Converter (ADC) on the receiver side and are replaced
by others that have been previously retrieved. A configurable FFT has been developed
in synthesizable Very high-speed IC Hardware Description Language (VHDL) in the
context of this paper in order to test the effect of limited word length in the
representation of the FFT parameters. These parameters include inputs, outputs,
twiddles and the intermediate results. The following combinations were studied here:
FFT size N of 256 or 1024 points, q=16 or q=4 QAM modulation, two subsampling
frequencies (R2=N/4 and R8=N/16) and sparseness levels up to 10%. Fixed point word
lengths of 8 or 6 bits are tested. The Normalized Mean Square Error (NMSE) and the
Symbol Error Rate (SER) are measured. The NMSE and SER are severely degraded if
6 bits are used. However, if 8 bits are used as FFT word length, the quantization error
is negligible compared with the one caused by the undersampling procedure.
      </p>
      <p>The OFDM and FFT architectures as well as the proposed undersampling method
are described in Section 2. The simulation results are discussed in Section 3.</p>
    </sec>
    <sec id="sec-2">
      <title>2 OFDM and FFT Architectures</title>
      <p>
        In [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], wired and wireless OFDM transceivers are described and undersampling is
supported during sparse information exchange: some input samples of the receiver FFT
can be omitted and replaced by others. In this work, we use the wired OFDM transceiver
model in order to study the rounding effect in the FFT. The input of the OFDM
transmitter is encoded generating a parity bit stream. If q-QAM modulation is used,
log2(q) bits from the interleaved parity/data bit streams are mapped to the corresponding
constellation symbol Xk (0≤k&lt;N). At the input of the N-point IFFT, q-QAM symbols
are arranged in a proper order and the symbols xn (0≤n&lt;N) are generated at the IFFT
output. The reverse procedure is followed on the receiver where the symbols yn=xn+zn
form the N-pointt FFT input. The symbol zn is Additive White Gaussian Noise (AWGN)
noise. The output of the FFT are N, Yk symbols. An appropriate error decoder such as
Viterbi corrects a number of errors using the parity bits, avoiding packet retransmission
      </p>
      <p>The FFT input yk is split in two parts: yk_a and yk_b. In Radix-4 butterflies, 4 pairs of
inputs are used and radices that are not powers of 2 (such as Radix-3, Radix-5) can
also be employed. Decimation in Frequency (DIF) differs from DIT in the fact that the
order of the butterfly inputs and outputs is bit reversed. The FFT software
implementations are slow and thus, most of the modern telecommunication systems are
based on hardware implementations.</p>
      <p>There are log2(N) stages in the N-point FFT implemented in this paper. The real and
imaginary parts of the stage p inputs are stored in a pair of buffers. A pair of operands
are accessed through ports rp1 and rp2 while a pair of values can be stored to the buffer
through write ports wp1, wp2. Each one of these read or write ports has size d. The
twiddle factors w, are retrieved from a ROM with N/2p+1 size at stage p. Each Butterfly
block performs the following calculations:
1(3+, ), 2 = 13, 2+ 13#2∙ {} − {3#} ∙ {}.
1(3+, ), 2 = 13, 2+ 13#2∙ {} + {3#} ∙ {}.
1(3+, )#2 = 13, 2− 13#2∙ {} + {3#} ∙ {}.</p>
      <p>1(3+, )#2 = 13, 2− 13#2∙ {} − {3#} ∙ {}.</p>
      <p>The sparseness level S is the fraction of the non-zero bits in the input data stream
and is expected to be small (e.g., 1%-2%) if the input data are sparse. Many q-QAM
symbols are expected to be equal (Xc) due to data sparsity and they form an IFFT input
set with appropriate structure. A number of samples in this FFT input set are expected
to be equal (due to the input sparseness) and they can replace each other. The ADC in
does not need to sample all the values since some of them can be replaced by others
already received. The ADC sampling rate can thus be reduced on the receiver for lower
power. The OFDM undersampling method proposed for wired channels is based on
the following DFT property:</p>
      <p>
        The parameters yn, Yk are the N, DFT input, output symbols and Xk, xn are the N,
Inverse DFT (IDFT) input, output symbols respectively. The twiddle factors w are
defined as !" = #$"/!. The xn symbols of the OFDM transmitter IDFT output are sent
over the communication channel and they are received as yn at the input of the receiver
DFT module. FFT reduces the O(N2) operations required by DFT to O(N∙logN). FFT
building blocks are the Radix-r butterflies. Radix-2 butterflies operate on a pair of inputs
(y1 and y2) that are defined as follows [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ] for Decimation in Time (DIT) FFT:
* = !, (∑2!-+.#,#,.. 2!2* + ∑2!"-+,, ,6..(2 − 20!" )!2*).
(1)
(2)
(3)
(4)
(5)
(6)
(7)
*0!" = ! (∑2!-+.#,#, 2!2* − ∑2!"-+,, ,6,(2 − 20!" )!2*).
      </p>
      <p>,
(8)
n+
2</p>
      <p>n+
The outputs xn and x</p>
      <p>N of the IDFT are equal, only if X k = X
k +</p>
      <p>N and k is odd.
2</p>
      <sec id="sec-2-1">
        <title>In this case x</title>
      </sec>
      <sec id="sec-2-2">
        <title>N (and thus: y</title>
        <p>n+</p>
        <p>2 2
half of the yn symbols placed in odd positions with n&gt;N/2, can be substituted by others.
The ADC on the receiver can operate at half speed, 50% of the time in this case. The
maximum number of substituted samples is R2=N/4 but this value would make sense
only if all inputs are 0, otherwise a high error floor occurs. If the value of R is smaller,
the error may be acceptable.</p>
        <p>N ) can be replaced by xn (yn). On the receiver, up to</p>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>4 Simulation</title>
      <p>The effect of the rounding error caused by the limited FFT word length is studied
through simulations. The two metrics used, are the Normalized Mean Square Error
(NMSE) and the Symbol Error Rate (SER). If it is assumed that an FFT output Y
corresponds to IFFT input X, its NMSE error can be expressed as  = ‖ − ‖##/‖‖##.
There is an area on the QAM constellation plane where Y is recognized as X due to
shorter distance compared with the neighboring constellations. Although the NMSE is
not 0 in this case this error does not affect SER. If e.g., Y is recognized as a different
16QAM symbol than X, only one of the 4 data bits that correspond to this constellation
would be inverted. Thus, the effect of this error on the Bit Error Rate (BER) would be
4 times lower than on the SER.</p>
      <p>(a)
(b)</p>
      <p>
        If two numbers of nb bits are multiplied, the rounding error is between ±
2
ds = 2-(nb-1) and if the error probability is uniform (1/ds), then the error variance is: ds2 /12
"
. The rounding noise for N-point FFT is up to [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ]: 78 = 9# (# − 2). The
:
undersampling error for 16QAM as explained in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] is ;8 = ( ∙  ∙ √2DE −
1G#)#. The simulation results presented in this section highlight the cases where
selecting appropriate number of bits nb makes PRE&lt;&lt;PUE. Three representative
combinations are examined: C1) N=256, 16QAM, C2) N=256, 4QAM (Quadrature
Phase Shift Keying-QPSK) and C3) N=1024, 16QAM. Data are taken from sparse
images like the ones shown in Fig. 1. The IFFT input packets formed have sparseness
S between 5% and 30%. This is too high for the proposed undersampling procedure as
explained in [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] but these IFFT input packets will merge with several others consisting
of only Xc symbols (S=0). Thus, a projection is also used to display how NMSE and
SER changes according to S.
      </p>
      <p>Fig. 2 shows the simulation results for combination C1. When 6 bits (with 3 bits
fraction) are used, truncation (“Trunc”) caused by limited word length plus
undersampling error makes the quantization error much higher than the undersampling
error alone (“NoTrunc” cases). However, the error is almost identical and equal to the
undersampling error when 8 bits (with 5 bits fraction) are used i.e., the quantization
error is small in this case.
2-(nb -1)</p>
      <p>. If</p>
      <p>Similar conclusions can be drawn from Fig. 3 and 4 for combinations C2 and C3,
respectively. It should be noticed that the undersampling error is quite small with
QPSK (C2) especially with R8, making the quantization error relatively high even
when 8-bit word length is used. In the C3 combination, different NMSE values lead to
similar SER behavior. NMSE can be drastically reduced when 8-bit word length is
selected. Fig. 5 shows how the NMSE, SER can be projected in the range S=[0..10%]
for one of the cases (C3). Similar plots can be retrieved for C1 and C2.</p>
    </sec>
    <sec id="sec-4">
      <title>5 Conclusions</title>
      <p>The effect of the selected FFT word length in an OFDM transceiver supporting
undersampling when sparse information is exchanged as in the case of IoT sensors,
surveillance cameras, etc. was studied. Simulation on three combinations of FFT size
and QAM modulations, showed that the use of 8-bits in the FFT word length was
adequate in order to keep the power consumption and complexity low.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bracewell</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          (
          <year>1999</year>
          )
          <article-title>The Fourier Transform</article-title>
          &amp;
          <article-title>Its Applications</article-title>
          . 3rd ed.
          <source>McGrawHill.</source>
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Cooley</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Tukey</surname>
            ,
            <given-names>J.W.</given-names>
          </string-name>
          (
          <year>1965</year>
          )
          <article-title>An Algorithm for the Machine Calculation of Complex Fourier Series</article-title>
          .
          <source>Mathematical Computer</source>
          ,
          <volume>19</volume>
          , p.
          <fpage>297</fpage>
          -
          <lpage>301</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <article-title>3. IEEE Standard for Floating Point Arithmetic</article-title>
          , ANSI/IEEE Standard 754/
          <year>2008</year>
          , Aug.
          <year>2008</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <issue>4</issue>
          .
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>W-H</given-names>
          </string-name>
          and Nguyen,
          <string-name>
            <surname>T.</surname>
          </string-name>
          (
          <year>2008</year>
          )
          <article-title>On the Fixed-Point Accuracy Analysis of FFT Algorithms</article-title>
          .
          <source>IEEE Trans. On Signal Processing</source>
          ,
          <volume>56</volume>
          (
          <issue>10</issue>
          ), p.
          <fpage>4673</fpage>
          -
          <lpage>4682</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Pálfi</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kollár</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (
          <year>2009</year>
          )
          <article-title>Roundoff Errors in Fixed-Point FFT</article-title>
          .
          <source>Proc. of the IEEE International Symposium on Intelligent Signal Processing</source>
          , Budapest, Hungary.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Welch</surname>
            ,
            <given-names>P.D.</given-names>
          </string-name>
          (
          <year>1969</year>
          )
          <article-title>A fixed-point fast Fourier transform error analysis</article-title>
          .
          <source>IEEE Transactions on Audio and Electroacoustics</source>
          ,
          <volume>17</volume>
          (
          <issue>2</issue>
          ), p.
          <fpage>151</fpage>
          -
          <lpage>157</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Prakash</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Rao</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          (
          <year>1981</year>
          )
          <article-title>Fixed-Point Error Analysis of Radix-4 FFT</article-title>
          . Signal Processing, North Holland Publishing Company,
          <volume>3</volume>
          , p.
          <fpage>123</fpage>
          -
          <lpage>133</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Liu</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Thong</surname>
            ,
            <given-names>T</given-names>
          </string-name>
          (
          <year>1976</year>
          )
          <article-title>Fixed-Point Fast Fourier Transform Error Analysis</article-title>
          .
          <source>IEE Trans. Acoustics</source>
          , Speech, Signal Processing, ASSP-
          <volume>24</volume>
          , p.
          <fpage>563</fpage>
          -
          <lpage>573</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Petrellis</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          (
          <year>2016</year>
          )
          <article-title>Optimal Reconstruction of Sub-sampled Time-Domain Sparse Signals in Wired/Wireless OFDM Transceivers</article-title>
          .
          <source>Springer EURASIP Journal on Wireless Communications and Networking</source>
          ,
          <volume>122</volume>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          10.
          <string-name>
            <surname>Löfgren</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Nilsson</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2011</year>
          )
          <article-title>On hardware implementation of radix 3 and radix 5 FFT kernels for LTE systems</article-title>
          .
          <source>Proc of the IEEE NORCHIP conference, Lund</source>
          , Sweden.
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          11.
          <string-name>
            <surname>Widrow</surname>
            ,
            <given-names>B.</given-names>
          </string-name>
          and
          <string-name>
            <surname>Kollár</surname>
            ,
            <given-names>I.</given-names>
          </string-name>
          (
          <year>2008</year>
          )
          <article-title>Quantization Noise: Roundoff Error in Digital Computation</article-title>
          , Signal Processing, Control, and
          <string-name>
            <surname>Communications</surname>
          </string-name>
          . Cambridge University Press, Cambridge, UK.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>