Design and FPGA Implementation of a Low Power OFDM Transmitter for Narrow-Band IoT

Gian Carlo Cardarilli, Luca Di Nunzio, Rocco Fazzolari, Riccardo La Cesa and Marco Re

University of Rome Tor Vergata, Via del Politecnico 1, Rome, 00133, Italy

SYSTEM 2021 @ Scholar's Yearly Symposium of Technology, Engineering and Mathematics. July 27–29, 2021, Catania, IT
Email: g.cardarilli@uniroma2.it (G. C. Cardarilli); di.nunzio@ing.uniroma2.it (L. D. Nunzio); fazzolari@ing.uniroma2.it (R. Fazzolari); riccardo.lacesa@alumni.uniroma2.eu (R. L. Cesa); re@ing.uniroma2.it (M. Re)
URL: https://dspvlsi.uniroma2.it/ (G. C. Cardarilli, L. D. Nunzio, R. Fazzolari, M. Re)
ORCID: 0000-0002-7444-876X (G. C. Cardarilli); 0000-0002-4312-7939 (L. D. Nunzio); 0000-0002-7383-2663 (R. Fazzolari); 0000-0001-9046-1318 (M. Re)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073, pages 60–65.

Abstract

5G technology is now globally widespread. One of its most interesting applications is the "Internet of Things" (IoT), which has given rise to several new technologies such as Narrow-Band IoT (NB-IoT). These technologies provide a communication standard for wide areas, and their main feature is low power consumption. In this paper, the design and FPGA implementation of a low-power OFDM transmitter for NB-IoT applications is proposed. It is mainly composed of a QPSK mapper, a 12-point IFFT and a cyclic prefix module. The whole system has been implemented on a Xilinx Spartan-7 device and characterized in terms of hardware resources, timing, and power consumption.

Keywords

5G, FPGA, OFDM, NB-IoT

1. Introduction

In recent decades, IoT has been expanding into various fields, allowing the creation of intelligent environments such as smart cities and smart buildings, as well as autonomous vehicles [1]. This has been made possible by the fifth generation of mobile telephony technology, also known as 5G, which targets greater efficiency and versatility through improved mobile device management, higher speed, and lower latency between the sent signal and the available output [2, 3]. 5G technology makes it possible to connect a massive number of devices [4] and, as a consequence, produces a huge amount of data to manage. For this purpose, several communication architectures have been proposed in the literature [5]. Nevertheless, the problem of the big data generated by IoT devices is often addressed with the help of Artificial Intelligence (AI) algorithms. Usually, for the development of intelligent environments [6, 7, 8], the IoT uses low-speed data transmission services known as LPWAN (Low-Power Wide-Area Network). NB-IoT is an LPWAN technology proposed by 3GPP, the international standards organization [9]. NB-IoT supports massive device connections while guaranteeing ultra-low power consumption, wide area coverage and bidirectional triggering between the signaling plane and the data plane. The main features of this technology [10] are shown in Fig. 1 and discussed below.

Figure 1: Main transmission features of NB-IoT.

The bandwidth of the physical layer is 180 kHz. In the down-link, it adopts Orthogonal Frequency-Division Multiplexing (OFDM) with Quadrature Phase-Shift Keying (QPSK) sub-carriers. BPSK or QPSK modulation is adopted with a sub-carrier spacing of 15 kHz or 3.75 kHz. To cover the 180 kHz bandwidth, 12 sub-carriers are defined when they are spaced by 15 kHz, and 48 sub-carriers must be used when they are spaced by 3.75 kHz. This paper addresses the 12 sub-carrier transmission [11].
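The relation between bandwidth, sub-carrier spacing and sub-carrier count stated above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the function and variable names are hypothetical and are not part of the proposed design.

```python
# Bandwidth and sub-carrier spacings quoted in the text (NB-IoT physical layer).
BANDWIDTH_HZ = 180e3
SPACINGS_HZ = (15e3, 3.75e3)


def subcarrier_count(bandwidth_hz: float, spacing_hz: float) -> int:
    """Number of sub-carriers that fill the band at the given spacing."""
    return round(bandwidth_hz / spacing_hz)


# Hypothetical helper table: spacing in Hz -> number of sub-carriers.
counts = {spacing: subcarrier_count(BANDWIDTH_HZ, spacing) for spacing in SPACINGS_HZ}

for spacing, n in counts.items():
    print(f"{spacing / 1e3:g} kHz spacing -> {n} sub-carriers")
```

The 15 kHz case is the one that fixes the IFFT size to 12 points in the rest of the paper.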
The aim of our work is the FPGA implementation of a low-power, low-area OFDM modulator for NB-IoT. Since the critical part of the transmitter is the 12-point IFFT, we developed an architecture that is efficient in terms of power and area, starting from the algorithm proposed in [12]. OFDM digital modulation is widely used for ADSL, DVB-T, WiFi and WiMAX transmission, and in the 802.11a, 802.11n and 802.11ac standards. This modulation uses a series of orthogonal sub-carriers at different frequencies, each one carrying a part of the information. Each sub-carrier is modulated with the common BPSK or QPSK modulation. For NB-IoT OFDM, the symbol time is fixed to $T_{sy} = 66.7\,\mu s$. Sampling is carried out with a period of $T_{sampling} = 1/\mathrm{Bandwidth} \approx 5.5\,\mu s$. Two important issues in OFDM modulation are the Inter-Carrier Interference (ICI) [13] and the Inter-Symbol Interference (ISI) [14]. To overcome these problems, techniques that add redundancy, such as the Cyclic Prefix (CP), are used [15]. The CP used in OFDM modems for NB-IoT is the "normal" cyclic prefix used for LTE transmissions. In this case, the bandwidth is so narrow that the CP is made of few samples, since there will be low inter-carrier interference. For LTE technologies the CP has a duration of 5.5 µs, which is exactly the sample time; this means that it is sufficient to add a single sample to each OFDM symbol.

2. OFDM modem with a 12-points IFFT

2.1. IFFT Butterfly Diagram

One of the characteristics that allowed OFDM modulation to develop and spread to most transmission systems is the efficiency of its digital implementation. The output modulation is proportional to the IFFT (Inverse Fast Fourier Transform) of the input components. An IFFT-based OFDM transmitter is formed by the succession of a mapper, a block that performs the IFFT, and a block that adds the cyclic prefix [16]. The most popular FFT algorithm is the Cooley-Tukey algorithm, which is based on the divide-and-conquer principle and recursively breaks a DFT of size $N$ into smaller DFTs [17]. Usually this is done on sample sequences whose length is a power of 2, that is, with $N = 2^m$. As discussed in the introduction, NB-IoT requires an OFDM modulation with 12 sub-carriers, and therefore an $N = 12$ point IFFT. Since $N = 12$ is not a power of 2, the traditional Cooley-Tukey algorithm cannot be used. To solve this issue, the authors of [12] propose an architecture that combines two algorithms, the Cooley-Tukey algorithm and the split-radix algorithm for length $6m$ IFFT, realizing a custom butterfly diagram as a sequence of radix-2, radix-4 and radix-3 IFFTs, from which the subsequent architecture is derived (see Fig. 2) [18].

Figure 2: Butterfly diagram for the realization of a 12-point FFT, complete with the operations to be carried out and the multiplicative coefficients, and explaining all the stages.

2.2. IFFT Stage Structure

The parallel 12-point architecture shown in Fig. 2 is not necessarily the best choice for an NB-IoT OFDM modulator. This is due to the slow data rate of the standard: as a consequence, there is no need for a parallel IFFT implementation. Parallel implementations are very useful at high data rates; conversely, at low data rates, serial architectures are preferred in order to reduce hardware resources. Moreover, IoT devices such as sensor nodes typically acquire data serially from an ADC. For this reason, starting from the architecture shown in Fig. 2, we developed a serial IFFT. Serialization has been obtained by inserting dual-port RAMs between the IFFT/FFT stages. These dual-port RAMs implement the double buffering (ping-pong) operation on data coming from the previous stage. In accordance with the NB-IoT standard, the transmitter was realized by inserting a QPSK modulator at the input of the IFFT. At the output of the IFFT, another dual-port RAM implementing the double buffering technique has been inserted to realize the cyclic prefix required by the standard. The block diagram of the proposed architecture is shown in Fig. 4.

It is interesting to focus on the realization of the individual stages of the IFFT. Being a series of 2, 4 and 3-point IFFTs, each stage has to be custom-made, following the basic architecture shown in Fig. 3 [19]. Each stage consists of a dual-port RAM in which the samples are saved while, at the same time, two samples are read and used for processing. Simultaneous reading and writing are carried out through the double buffering technique, and the read and write addresses are stored in a ROM. The read data are used to perform complex multiplication with the IFFT twiddle factors stored in a ROM. Finally, the samples enter an adder and are serialized with a multiplexer to be saved in the RAM of the next stage. Outside this scheme, there are the control signals generated by an FSM.

Figure 3: Base structure of a single stage of the 12-point IFFT.

Figure 4: The proposed NB-IoT OFDM modulator.

3. Experimental Results

The proposed architecture shown in Fig. 4 has been simulated in SIMULINK. A fixed-point analysis has been performed to size all the arithmetic elements of the system (multipliers, adders, etc.). We sized the entire system to ensure a certain MER (Modulation Error Ratio). In fact, the quantization error due to the truncation of the arithmetic operators implies a spreading of the QPSK constellation points; this effect can be treated as MER degradation. The MER is the measure of the signal-to-noise ratio (SNR) in digital modulation applications. We sized the system so that quantization noise introduces a MER degradation of no more than 20 dB. Fixed-point simulation results show that 8 bits for every multiplier and adder are sufficient to obtain a MER of 20 dB, as depicted in Fig. 5.

Figure 5: Obtained MER with 8-bit fixed-point simulation.

Considering this reduced number of bits required for the multiplications, and considering that all products are performed with constant values (the IFFT twiddle factors), it is possible to avoid the use of the FPGA internal DSP blocks by implementing multiplications with shifts and additions. Thanks to this optimization, power consumption is reduced and DSP blocks are not wasted. This latter aspect is very important because it preserves DSP blocks for other applications, for example machine learning and, in general, applications demanding high-performance computing [20, 21, 22, 23, 24, 25], which are increasingly used in IoT nodes and require a great number of multiplications.

In this paper, we present the results obtained on the Spartan-7 xc7s6cpga196-2 FPGA, which is one of the cheapest Xilinx devices and, consequently, one of the most interesting for the realization of low-cost IoT nodes. The transmitter has been characterized in terms of resource utilization and power consumption, which are crucial aspects for IoT nodes [26]. Power analysis has been performed initially without any Switching Activity Interchange Format (SAIF) file in order to have a coarse power consumption estimation on all the FPGAs involved in our experiments. In a second step, we performed a more accurate power consumption estimation on the xc7s6cpga196 using the SAIF files containing information about the switching activity. The system has been tested by generating random 2-bit symbols at its input. Implementation results are shown in Tab. 1 and Tab. 2. These results refer to the implementation with a timing constraint of 5.5 µs, which corresponds to the minimum clock frequency (180 kHz) that meets the NB-IoT specifications.

Table 1: Simulation results

| Parameter | Value |
| FPGA | Spartan-7 xc7s6cpga196-2 |
| Clock period | 5.5 µs |
| Frequency | 180 kHz |
| Total power on chip | 0.018 W |
| Energy per symbol | 1.287 × 10^-6 J |

Table 2: Utilization

| Resource | Utilization | Available | Utilization (%) |
| LUT | 1050 | 3750 | 28.00 |
| LUTRAM | 259 | 2400 | 10.79 |
| FF | 648 | 7500 | 8.64 |
| IO | 23 | 100 | 23.00 |
| BUFG | 1 | 16 | 6.25 |

Fig. 6 shows the hierarchical power report, which provides dynamic power consumption information for each stage of the proposed OFDM modulator. Results are shown as percentages, taking the total dynamic power dissipated as 100%. The third IFFT stage is the one with the greatest dynamic power consumption. This is an expected result, as it is the stage that contains the greatest number of multiplications and consequently it is the most complex in terms of area.

Figure 6: Hierarchical Dynamic Dissipation Power.

3.1. Power and Energy Trend in Frequency

Because power consumption represents one of the most important aspects of IoT nodes, the proposed OFDM transmitter has been characterized in terms of energy. Several implementations using different clock constraints have been realized. In this way, it has been possible to characterize the energy consumption of every implementation. Energy consumption has been estimated in terms of energy per OFDM symbol according to Eq. 1,

$$E_{sym} = P_{ave} \cdot N \cdot T_c \qquad (1)$$

where $P_{ave}$ is the power estimated through post-implementation simulations, taking into account the real switching activity of the nodes contained in the SAIF files provided to the power estimator, $N$ is the number of clock cycles required to obtain an OFDM symbol, which in our case is 13 (12 for the IFFT computation and 1 for the cyclic prefix), and $T_c$ is the clock period. The frequency range was chosen to start from the maximum frequency that allows the correct operation of the circuit, which turned out to be 125 MHz, and to decrease progressively. We started from the highest frequency to observe the trend of the maximum power and energy used by the transmitter. The results obtained are shown in Tab. 3 and in the graph in Fig. 7. Results show dynamic power increasing linearly with frequency, in accordance with Eq. 2,

$$P_{dyn} = a \cdot C \cdot f \cdot V_{DD}^{2} \qquad (2)$$

where $a$ is the switching activity, $C$ is the switching capacitance, $f$ is the clock frequency and $V_{DD}$ is the supply voltage.

Note that, as the clock frequency varies, the energy per OFDM symbol remains about the same. This suggests that the same architecture is synthesized by the tool, without the need to introduce or duplicate hardware to reach high frequencies.

Figure 7: Dynamic Power trend.

Table 3: Power and Energy Trend in Frequency

| Clk Frequency | Power (tot) | Dynamic Power | Sym. Energy |
| 35.71 MHz | 0.032 W | 0.013 W | 4.732 × 10^-9 J |
| 41.67 MHz | 0.034 W | 0.016 W | 4.992 × 10^-9 J |
| 50 MHz | 0.037 W | 0.019 W | 4.94 × 10^-9 J |
| 62.50 MHz | 0.042 W | 0.023 W | 4.784 × 10^-9 J |
| 83.33 MHz | 0.05 W | 0.031 W | 4.836 × 10^-9 J |
| 100 MHz | 0.056 W | 0.038 W | 4.94 × 10^-9 J |
| 125 MHz | 0.068 W | 0.049 W | 5.096 × 10^-9 J |

4. Conclusion

The paper proposes a low-power FPGA implementation of an NB-IoT OFDM transmitter. The proposed architecture has been developed in VHDL and implemented on a Xilinx FPGA. Results are provided in terms of resource utilization and power consumption. Regarding the latter, power characterization has been provided taking into account the energy dissipated for an OFDM symbol transmission. Results show a very low utilization of resources and a very low power consumption. These two aspects are very important for IoT nodes, which are characterized by strict energy consumption requirements and low cost.

5. Acknowledgments

The authors would like to thank Xilinx Inc. for providing FPGA hardware and software tools through the Xilinx University Program.

References

[1] I. Benedetti, R. Giuliano, C. Lodovisi, F. Mazzenga, 5G wireless dense access network for automotive applications: Opportunities and costs, in: 2017 International Conference of Electrical and Electronic Technologies for Automotive, IEEE, 2017, pp. 1–6.
[2] D. Wang, D. Chen, B. Song, N. Guizani, X. Yu, X. Du, From iot to 5g i-iot: The next generation iot-based intelligent algorithms and 5g technologies, IEEE Communications Magazine 56 (2018) 114–120.
[3] G. Capizzi, S. Coco, G. Sciuto, C. Napoli, A new iterative fir filter design approach using a gaussian approximation, IEEE Signal Processing Letters 25 (2018) 1615–1619. doi:10.1109/LSP.2018.2866926.
[4] R. Giuliano, F. Mazzenga, A. Vizzarri, Satellite-based Capillary 5G-mMTC Networks for Environmental Applications, IEEE Aerospace and Electronic Systems Magazine 34 (2019) 40–48.
[5] F. Mazzenga, R. Giuliano, F. Vatalaro, FttC-based fronthaul for 5G dense/ultra-dense access network: Performance and costs in realistic scenarios, Future Internet 9 (2017) 71.
[6] M. Chen, Y. Miao, Y. Hao, K. Hwang, Narrow band internet of things, IEEE access 5 (2017) 20557–20577.
[7] G. Capizzi, C. Napoli, S. Russo, M. Woźniak, Lessening stress and anxiety-related behaviors by means of ai-driven drones for aromatherapy, volume 2594, 2020, pp. 7–12.
[8] P. Caponnetto, et al., The effects of physical exercise on mental health: From cognitive improvements to risk of addiction, International Journal of Environmental Research and Public Health 18 (2021). doi:10.3390/ijerph182413384.
[9] Y. Miao, W. Li, D. Tian, M. S. Hossain, M. F. Alhamid, Narrowband internet of things: Simulation and modeling, IEEE Internet of Things Journal 5 (2017) 2304–2314.
[10] Y. Zou, X. Ding, Q. Wang, Key technologies and application prospect for nb-iot, ZTE Technology Journal 23 (2017) 43–46.
[11] Y. Wu, W. Y. Zou, Orthogonal frequency division multiplexing: A multi-carrier modulation scheme, IEEE Transactions on Consumer Electronics 41 (1995) 392–399.
[12] W. Zheng, K. Li, Split radix algorithm for length 6m dft, IEEE signal processing Letters 20 (2013) 713–716.
[13] P. Tan, N. C. Beaulieu, Reduced ici in ofdm systems using the "better than" raised-cosine pulse, IEEE Communications Letters 8 (2004) 135–137.
[14] X. Wang, P. Ho, Y. Wu, Robust channel estimation and isi cancellation for ofdm systems with suppressed features, IEEE Journal on Selected Areas in Communications 23 (2005) 963–972.
[15] A. A. Al-jzari, I. Kostanic, K. H. M. Mabrok, Effect of variable cyclic prefix length on ofdm system performance over different wireless channel models, Univers. J. Commun. Networ 3 (2015) 7–14.
[16] N. Chide, S. Deshmukh, P. Borole, Implementation of ofdm system using ifft and fft, International Journal of Engineering Research and Applications (IJERA) 3 (2013) 2009–2014.
[17] M. Puschel, Cooley-tukey fft like algorithms for the dct, in: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings (ICASSP'03), volume 2, IEEE, 2003, pp. II–501.
[18] E. Dubois, A. Venetsanopoulos, A new algorithm for the radix-3 fft, IEEE Transactions on Acoustics, Speech, and Signal Processing 26 (1978) 222–225.
[19] M. Garrido, S.-J. Huang, S.-G. Chen, O. Gustafsson, The serial commutator fft, IEEE Transactions on Circuits and Systems II: Express Briefs 63 (2016) 974–978.
[20] G. Capizzi, G. Lo Sciuto, C. Napoli, E. Tramontana, A multithread nested neural network architecture to model surface plasmon polaritons propagation, Micromachines 7 (2016) 110.
[21] F. Fallucchi, M. Gerardi, M. Petito, E. W. D. Luca, Blockchain framework in digital government for the certification of authenticity, timestamping and data property, in: Proceedings of the 54th Hawaii International Conference on System Sciences | 2021, University of Hawai'i at Manoa, Honolulu, HI, 2021, pp. 2307–2316. doi:10.24251/HICSS.2021.282, http://hdl.handle.net/10125/70895.
[22] G. C. Cardarilli, L. Di Nunzio, R. Fazzolari, M. Panella, M. Re, A. Rosato, S. Span, A parallel hardware implementation for 2-d hierarchical clustering based on fuzzy logic, IEEE Transactions on Circuits and Systems II: Express Briefs 68 (2020) 1428–1432.
[23] G. Capizzi, F. Bonanno, C. Napoli, Hybrid neural networks architectures for soc and voltage prediction of new generation batteries storage, in: 2011 International Conference on Clean Electrical Power (ICCEP), IEEE, 2011, pp. 341–344.
[24] F. Fallucchi, M. Coladangelo, R. Giuliano, E. William De Luca, Predicting employee attrition using machine learning techniques, Computers 9 (2020) 86.
[25] G. Capizzi, G. Lo Sciuto, C. Napoli, R. Shikler, M. Wozniak, Optimizing the organic solar cell manufacturing process by means of afm measurements and neural networks, Energies 11 (2018). doi:10.3390/en11051221.
[26] F. Silvestri, S. Acciarito, G. C. Cardarilli, G. M. Khanal, L. Di Nunzio, R. Fazzolari, M. Re, Fpga implementation of a low-power qrs extractor, in: International Conference on Applications in Electronics Pervading Industry, Environment and Society, Springer, 2017, pp. 9–15.
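As an illustrative companion to Eq. (1) and to the transmitter chain of Section 2 (mapper, 12-point IFFT, one-sample cyclic prefix), the following behavioural sketch may help the reader reproduce the reported figures. It is not the authors' VHDL design: the Gray-coded QPSK bit mapping, the direct O(N^2) inverse DFT used in place of the staged radix-2/4/3 IFFT, and all function names are assumptions made for illustration only.

```python
import cmath
import math

N_SUBCARRIERS = 12   # IFFT size used in the paper
CP_SAMPLES = 1       # the paper adds a single cyclic-prefix sample

# Hypothetical Gray-coded QPSK map (the paper does not state its bit mapping).
QPSK = {
    0b00: complex(1, 1) / math.sqrt(2),
    0b01: complex(-1, 1) / math.sqrt(2),
    0b11: complex(-1, -1) / math.sqrt(2),
    0b10: complex(1, -1) / math.sqrt(2),
}


def idft(freq):
    """Direct inverse DFT; a reference model, not the staged hardware IFFT."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def ofdm_symbol(two_bit_symbols):
    """Map 12 two-bit symbols to QPSK, take the IDFT, prepend the cyclic prefix."""
    if len(two_bit_symbols) != N_SUBCARRIERS:
        raise ValueError("expected 12 two-bit symbols")
    time_samples = idft([QPSK[s] for s in two_bit_symbols])
    # The cyclic prefix is a copy of the last sample(s) placed in front.
    return time_samples[-CP_SAMPLES:] + time_samples


def energy_per_symbol(power_w, clock_period_s, cycles=N_SUBCARRIERS + CP_SAMPLES):
    """Eq. (1): E_sym = P_ave * N * T_c, with N = 13 clock cycles per symbol."""
    return power_w * cycles * clock_period_s


samples = ofdm_symbol([0, 1, 3, 2] * 3)
e_table1 = energy_per_symbol(0.018, 5.5e-6)     # Table 1 operating point
e_125mhz = energy_per_symbol(0.049, 1 / 125e6)  # Table 3, 125 MHz, dynamic power
print(len(samples), e_table1, e_125mhz)
```

With these inputs the sketch yields 13 samples per OFDM symbol, and the two energy values agree with the entries of Table 1 and with the 125 MHz row of Table 3 when the dynamic power column is used as $P_{ave}$.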