=Paper=
{{Paper
|id=Vol-2694/paper7
|storemode=property
|title=An efficient HDL IP-core Generator for OFDM
modulators
|pdfUrl=https://ceur-ws.org/Vol-2694/p7.pdf
|volume=Vol-2694
|authors=Roberta Avanzato,Gabriele Nicotra
|dblpUrl=https://dblp.org/rec/conf/system/AvanzatoN20
}}
==An efficient HDL IP-core Generator for OFDM modulators==
Roberta Avanzato<sup>a</sup>, Gabriele Nicotra<sup>b</sup>

<sup>a</sup> Department of Electrical, Electronic and Computer Engineering, University of Catania, 95125 Catania, Italy

<sup>b</sup> Department of Mathematics and Computer Science, University of Catania, 95125 Catania, Italy

SYSTEM 2020: Symposium for Young Scientists in Technology, Engineering and Mathematics, Online, May 20 2020. Contact: roberta.avanzato@phd.unict.it. © 2020 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073.

===Abstract===
In this paper, we propose an HDL IP generator for Orthogonal Frequency-Division Multiplexing (OFDM) modulators. This modulation is used in many telecommunication standards. However, each standard requires a specific OFDM modulator, characterized by a different number of carriers and a different cyclic prefix. These differences in the OFDM parameters have a negative impact on RTL hardware design, because they make it difficult to reuse a modulator already designed for a project based on a different communication standard. For this reason, the authors propose an automatic HDL IP generator capable of generating the RTL code, in VHDL or Verilog, of OFDM modulators whose number of carriers and cyclic prefix can be set by the user. The generated IPs have been characterized in terms of maximum frequency, hardware resources, and power consumption. The hardware implementations were performed on a XILINX xc7z030 FPGA.

'''Keywords:''' OFDM, FPGA

===1. Introduction===
In recent years, digital electronics have been increasingly used in several fields. This is essentially due to the capability of modern integrated digital circuits to provide high computational power, allowing the realization of complex Digital Signal Processing (DSP) circuits [1, 2]. Digital systems can be developed using two main technologies: Application Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs). Nowadays FPGAs and digital ASICs are used in several fields, such as machine learning [3], [4], [5], health [6], [7], [8], communication systems [9], [10], [11], audio [12], [13], and others [14], [15]. Modern digital communication systems require high computation capabilities and, for this reason, FPGAs are nowadays a well-suited solution for their implementation. For example, in [16], [17] implementations of digital transmitters are presented, and in [18] the authors use an FPGA to implement a spacecraft tracking system. Similar approaches can be used for modems in current and future wired Digital Subscriber Line technologies [19] or in satellite links [20].

The hardware implementation of digital communication systems, on both ASICs and FPGAs, requires a very complex design flow. Such a flow is extremely slow compared with the one used for software implementation. It can be divided into two steps, called front-end and back-end. The front-end phase consists of RTL design using HDLs such as VHDL, Verilog, or SystemVerilog. The back-end phase involves the physical design (the circuit layout).

A hardware description language (HDL) is a language used to describe the architecture and behavior of electronic circuits, usually digital logic circuits. HDLs were born with the intent of helping engineers describe circuits. Later, with the advent of hardware synthesizers, HDLs started to be used for both simulation and synthesis. Hardware synthesizers are software tools able to transform HDL files into a netlist of electronic circuits and connections. A netlist is a specification of physical electronic components and of how they are connected together.

A hardware description language looks much like a programming language such as C, but differs from it in several respects. An important difference is that HDLs explicitly include the notion of time. A second important difference is that HDLs describe parallel processes.

Due to the exploding complexity of digital electronic circuits since the 1970s (see Moore's law), synthesis through HDLs became a necessity. There are two major hardware description languages: VHDL and Verilog.

The front-end phase is very slow because HDL code is complex to develop and verify. In order to speed up the RTL design phase, HDL Intellectual Property (IP) cores are increasingly proposed in the literature and used by RTL engineers. IPs are reusable blocks of HDL code that can either be taken from internal design libraries or be purchased from third-party vendors. Thanks to their reusability and reconfigurability, IP cores speed up the RTL design phase, also for specific device designs such as those of the Internet of Things [21], [22].

In this paper, the authors propose an IP generation tool for OFDM modulators. The IP generator has been developed in MATLAB/SIMULINK. It allows users to select the number of carriers and the length of the cyclic prefix. In addition, users can provide fixed-point information, namely the number of bits of the inputs and of the outputs. The IP generator is capable of generating both VHDL and Verilog.

The paper is organized as follows. In Sect. 2 the OFDM modulation is described. In Sect. 3 the IP generator and the implemented OFDM hardware architectures are described. In Sect. 4 the experimental results in terms of area, speed, and power consumption are provided. Finally, in Sect. 5 conclusions are drawn.

Figure 1: OFDM signals: subcarrier signals (a); subcarriers (b).
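The user-selectable parameters listed in the Introduction (number of carriers, cyclic prefix length, input and output word lengths, target HDL) can be pictured as a small configuration record. The following Python sketch is purely illustrative: the class name, the field names, and the method names are hypothetical and do not reflect the interface of the MATLAB tool, and the validity checks (power-of-two carrier count, prefix shorter than the carrier count) are assumptions suggested by the power-of-two carrier counts used in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OfdmIpConfig:
    """Hypothetical parameter record; names are illustrative, not the tool's API."""

    n_carriers: int       # number of OFDM carriers (equals the IFFT size)
    cp_length: int        # cyclic prefix length, in samples
    in_bits: int          # fixed-point word length of the inputs
    out_bits: int         # fixed-point word length of the outputs
    hdl: str = "VHDL"     # target language, "VHDL" or "Verilog"

    def validate(self) -> None:
        """Raise ValueError if the (assumed) constraints are violated."""
        n = self.n_carriers
        # Assumption: only power-of-two carrier counts are supported.
        if n < 2 or (n & (n - 1)) != 0:
            raise ValueError("n_carriers must be a power of two")
        # Assumption: the prefix is shorter than the useful symbol.
        if not 0 <= self.cp_length < n:
            raise ValueError("cp_length must lie in [0, n_carriers)")
        if self.in_bits < 1 or self.out_bits < 1:
            raise ValueError("word lengths must be positive")
        if self.hdl not in ("VHDL", "Verilog"):
            raise ValueError("hdl must be 'VHDL' or 'Verilog'")

    @property
    def samples_per_symbol(self) -> int:
        """Samples emitted per OFDM symbol, cyclic prefix included."""
        return self.n_carriers + self.cp_length
```

For a modulator with 2048 carriers and a cyclic prefix of 256 samples, `samples_per_symbol` evaluates to 2304; any word lengths passed to such a record would be arbitrary example values, since the paper does not report them.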
===2. The OFDM modulation===
Nowadays several communication applications require high data-rate transmission over mobile or wireless channels [23], [24]. In single-carrier modulation, as in the time-division multiple access (TDMA) scheme of the Global System for Mobile Communications (GSM), the symbol duration shrinks as the data rate increases, so that the delay spread and fading of wireless channels cause more severe intersymbol interference (ISI). In order to reduce the effect of ISI, the symbol duration must be much larger than the delay spread of the wireless channel. The Orthogonal Frequency-Division Multiplexing (OFDM) modulation divides the entire channel into many narrowband subchannels [25], [26]. These subchannels are transmitted in parallel in order to maintain a high data-rate transmission and, at the same time, to increase the symbol duration. In this way, the ISI effects are drastically reduced [27].

Figure 1 reports an example of the subchannel division of the OFDM modulation and the corresponding signal transmitted on a single subcarrier. In Figure 1 the overall bandwidth is divided into <math>N</math> subcarriers, where each subcarrier has a reduced bit rate <math>R_s = R_b / N</math>, where <math>T</math> is the OFDM signal duration in the subcarrier and <math>R_b</math> is the bit rate of the total bandwidth [28].

The principle of operation is based on the orthogonality of the subcarriers, which is expressed in (1):

:<math>\int_{t}^{t+T} s_k(t)\, s_i^{*}(t)\, dt = A \quad \text{if } i = k \qquad (1)</math>

and 0 otherwise (i.e. for <math>i \neq k</math>). Waveforms satisfying (1) are those reported in Figure 1(a). Thus, a generic OFDM signal can be written as:

:<math>s(t) = \sum_{k} d_k(t)\, e^{j 2\pi f_k t} \cdot e^{j 2\pi f_0 t} \qquad (2)</math>

where <math>f_0</math> is the Radio Frequency (RF) translation. When we sample the OFDM signal in (2) at <math>t = nT/N</math>, with <math>f_k = k/T</math>, we obtain:

:<math>s(nT/N) = \sum_{k} d_k(t)\, e^{j 2\pi \frac{kn}{N}} \qquad (3)</math>

which is the inverse Fourier transform (<math>FFT^{-1}</math>) of the transmitted symbols before the RF translation. Figure 2 reports the classical scheme of the transmitter of an OFDM signal.

Figure 2: OFDM transmitter block diagram.

In the demodulator at the receiver, considering for example the subcarrier of frequency <math>f_i = i/T</math>, the received signal after removing the RF component is:

:<math>\int_{(n-1)T}^{nT} s(t)\, e^{-j 2\pi f_i t}\, dt = \sum_{k} \int_{0}^{T} d_k(t)\, e^{j 2\pi f_k t}\, e^{-j 2\pi f_i t}\, dt = \sum_{k} \int_{0}^{T} d_k(t)\, e^{j 2\pi \frac{k-i}{T} t}\, dt = d_i\, T \qquad (4)</math>

Due to the subcarrier orthogonality, the receiver is able to extract the corresponding complex symbol transmitted on the <math>i</math>-th subcarrier.

In a wireless channel, multipath can affect the orthogonality, since delays and reflections provide the receiver with delayed replicas of the signal. It is possible to introduce a guard time interval <math>T_g</math> in order to properly start the reception, thus integrating between <math>T_g</math> and <math>T + T_g</math>. Unfortunately, this is not enough, since different terminals can transmit simultaneously. In order to avoid inter-carrier interference, a cyclic prefix is added to the transmitted signal, as described in Figure 3.

Figure 3: Example of cyclic prefix in an OFDM signal.

The cyclic prefix ensures that an integer number of oscillations of the basic waveform <math>e^{j 2\pi f_k t}</math> falls within the integration interval even when a replica arrives late. Although the replica partially leaves the integration interval, thanks to the cyclic prefix its missing part re-enters the interval of interest without affecting the receiver. The only difference is that the duration of the OFDM symbol is now <math>T + T_g</math>, while the integration lasts <math>T</math> seconds, thus reducing the collected energy by a factor <math>T / (T + T_g)</math>. Of course, the guard time should be selected properly, depending on the channel characteristics: <math>T_g</math> should be greater than the maximum delay introduced by the wireless channel (at least greater than the channel delay spread), but as small as possible because of the loss in the collected energy.

Figure 4 reports the generic scheme of the OFDM receiver. After the quadrature mixer, which extracts the real and imaginary parts, the serial-to-parallel stage provides the inputs to the FFT. Then, the symbols are extracted and passed to the decoder.

Figure 4: Principle scheme of the OFDM receiver.

===3. OFDM modulator architecture===
The proposed IP generator has been developed in MATLAB. Using a graphical interface, users can customize and generate the VHDL or Verilog code of the OFDM modulators. The IP is provided with nine I/O ports, divided into five control ports and four data ports. A detailed description of these ports, with their data size and direction, is given in the following:

* clock: a one-bit input port used to provide the clock to the circuit.
* reset: a one-bit input port used for the global reset. The reset is asynchronous and active high.
* enable: a one-bit input port used for the global enable.
* ready: a one-bit input port. If ready is low, the IP ignores the input. This port must be 1 when input data are available.
* done: a one-bit output port. This port goes from zero to one when data are available at the output of the circuit.
* real-data: an N-bit input port (N is selected by the user) used to provide the real input samples.
* imag-data: an N-bit input port (N is selected by the user) used to provide the imaginary input samples.
* real-out: an N-bit output port (N is selected by the user) used to provide the real output samples.
* imag-out: an N-bit output port (N is selected by the user) used to provide the imaginary output samples.

Fig. 6 shows the block diagram of the IP core. It is composed of four main blocks:

* a QAM Mapper
* an IFFT Core
* a Cyclic Prefix engine
* a dual-port RAM

Figure 6: IP Core Block Diagram. The real and imaginary parts of the input and output signals have been fused to simplify the schematic.

The QAM Mapper maps the I/Q inputs onto a QAM constellation. In this first version of the core generator, only the 4QAM modulation is available. However, future releases will also include other QAM modulation schemes. Fig. 5 shows the 4QAM constellation implemented in the proposed IP core.

Figure 5: 4QAM constellation implemented in the proposed IP core.

The IFFT core is the most complex element of the IP core in terms of hardware. It is composed of <math>\log_2 N</math> Processing Elements (PEs), where <math>N</math> is the number of OFDM carriers and, consequently, the number of IFFT bins. Each processing element consists of a dual-port RAM used for the ordering of the samples, a ROM containing the IFFT twiddle factors, a complex multiplier, and an address generator. The address generator and the dual-port RAM order the input using the double buffering technique. A detailed description of this block is provided in Fig. 7.

Figure 7: Processing Element (PE) Block Diagram. The real and imaginary parts of the input and output signals have been fused to simplify the schematic.

In order to reduce the number of multipliers, the complex multiplication has been implemented using three multiplications instead of four, as shown in [29]. In addition to these main blocks, the system is provided with a Finite State Machine (FSM) for the generation of the control signals "ready" and "done".

Fig. 8 shows the timing of the core. It supports a streaming mode, providing output without any interruption. When ready goes from zero to one, the IP core can receive data, which must be provided at every clock cycle. In order to simplify the diagram, the real and imaginary parts are fused into a single signal. After a latency that depends on the number of carriers, the IP provides the results at the output port; when this occurs, the output-valid signal (done) goes from zero to one. The real and imaginary parts of the output signals are also fused in the diagram. When ready goes from one to zero, the input data stream must be interrupted. The time interval during which this signal remains at zero is required for the cyclic prefix computation.

Figure 8: IP Core Timing diagram. The real and imaginary parts of the input and output signals have been fused to simplify the schematic. The system works on positive clock edges.

===4. Experimental Results===
In this Section, experimental results are provided. We used the proposed IP generator to generate the VHDL code of nine OFDM modulators. These modulators differ from each other in the number of carriers and in the cyclic prefix. In order to verify the correct behavior of the circuits, we ran several test benches on the RTL simulation models. Simulations were performed by feeding the IP with sinusoidal waves and chirps. The simulation results were compared with the theoretical results obtained from a MATLAB model developed for this purpose.

The generated VHDL files have been synthesized using the XILINX Vivado toolchain. Synthesis and Place and Route have been performed with a clock constraint of 200 MHz. Implementation results are shown in Table 1. We varied the number of carriers from 8 to 2048 (considering only powers of two). The second column of the table shows the cyclic prefix adopted for each test case. Results are reported in terms of LUTs, LUTRAM, FF, BRAM, and DSP.

{| class="wikitable"
|+ Table 1: Implementation results on a XILINX xc7z030 device with a 200 MHz clock constraint
! Carriers (CN) !! Cyclic prefix (CP) !! LUT !! LUTRAM !! FF !! BRAM !! DSP
|-
| 8 || 2 || 730 || 152 || 1494 || 64 || 6
|-
| 16 || 4 || 1017 || 203 || 1941 || 64 || 9
|-
| 32 || 8 || 1462 || 293 || 2626 || 64 || 12
|-
| 64 || 16 || 1998 || 328 || 3244 || 64 || 15
|-
| 128 || 16 || 2437 || 495 || 3950 || 65 || 18
|-
| 256 || 32 || 2904 || 551 || 4600 || 66 || 21
|-
| 512 || 32 || 3504 || 736 || 5512 || 68 || 24
|-
| 1024 || 128 || 4301 || 1048 || 6563 || 70 || 27
|-
| 2048 (5G) || 256 || 5459 || 1792 || 8039 || 73 || 30
|}

Fig. 10 shows the dynamic power consumption required for the computation of the test cases. Power consumption is nowadays a crucial aspect of digital circuit design, especially for embedded systems. Such systems are usually powered by batteries and, for this reason, power consumption must be reduced as much as possible in order to extend the battery life. Circuits must therefore be designed to minimize the area, since the power consumption is correlated with the circuit area [30], [31]. There are three power dissipation components in CMOS digital circuits:

# Switching Power
# Short-Circuit Power
# Static Power

Among these contributions, the switching power is the most important one and is defined in Eq. (5), where <math>a</math> is the switching activity, <math>C</math> is the switching capacitance, <math>f</math> is the clock frequency, and <math>V_{dd}</math> is the supply voltage.

:<math>P = a \cdot C \cdot f \cdot V_{dd}^{2} \qquad (5)</math>

The second contribution is related to the short-circuit currents flowing through the MOS transistors. It is strongly dependent on switching activity, clock frequency, and supply voltage, but it also depends on the design (for example the transistor ratios and the node waveforms). The third component, the static power, depends on the leakage currents and is related to the circuit design, the technology, and the supply voltage. The first two contributions are usually considered together under the name of Dynamic Power. Because our experiments are performed on FPGAs, we did not consider the static power dissipation but only the dynamic one. Static power consumption on an FPGA can be considered negligible when the FPGA is almost full in terms of hardware resource usage. This is generally the case, since the size of the FPGA is selected according to the target project.

Figure 10: Power consumption characterization of the proposed IP core in terms of Dynamic Power. The dynamic power consumption increases with the number of carriers, because of the increase in the required hardware resources.

Finally, Fig. 9 shows the implemented circuit layout for the 2048-carrier case. The results show that the hardware resources required for the IP implementation are modest and that the power consumption increases with the area, as expected from theory. The choice of implementing the complex product using only three multipliers reduces the number of DSPs involved in the implementation.

Figure 9: IP Core mapped on the target FPGA. The figure refers to a 2048-carrier OFDM modulator compatible with the 5G standard.

===5. Conclusions===
In this paper, we presented an OFDM modulator IP generator suitable for communication standards requiring a power-of-two FFT-based OFDM.

The proposed tool allows RTL designers to design flexible OFDM modulators, offering the possibility to customize the number of carriers, the cyclic prefix, and the fixed-point word lengths. The IP has been characterized in terms of area, speed, and power consumption on a XILINX xc7z030 FPGA. The results show an efficient implementation requiring a reduced number of hardware resources. In the future, additional characterizations will be performed; in particular, we will synthesize the VHDL code generated by the proposed IP generator on ASIC. The synthesis will be performed using Synopsys tools. In addition, we will introduce other modulation schemes for the OFDM carriers. In order to further improve the performance of future releases of the IP generator, we are considering implementing the IFFT architecture presented in [32]. This solution would reduce the hardware resources, in particular the number of multipliers. This hardware simplification would also bring a reduction in power consumption.

===References===
* [1] G. Capizzi, S. Coco, G. L. Sciuto, C. Napoli, A new iterative fir filter design approach using a gaussian approximation, IEEE Signal Processing Letters 25 (2018) 1615–1619.
* [2] M. Wózniak, D. Połap, R. K. Nowicki, C. Napoli, G. Pappalardo, E. Tramontana, Novel approach toward medical signals classifier, in: 2015 International Joint Conference on Neural Networks (IJCNN), IEEE, 2015, pp. 1–7.
* [3] G. Capizzi, C. Napoli, L. Paternò, An innovative hybrid neuro-wavelet method for reconstruction of missing data in astronomical photometric surveys, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 7267 LNAI (2012) 21–29.
* [4] S. Han, J. Kang, H. Mao, Y. Hu, X. Li, Y. Li, D. Xie, H. Luo, S. Yao, Y. Wang, et al., Ese: Efficient speech recognition engine with sparse lstm on fpga, in: Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017, pp. 75–84.
* [5] D. Anguita, A. Boni, S. Ridella, A digital architecture for support vector machines: theory, algorithm, and fpga implementation, IEEE Transactions on neural networks 14 (2003) 993–1009.
* [6] L. Quitadamo, M. Abbafati, G. Cardarilli, D. Mattia, F. Cincotti, F. Babiloni, M. Marciani, L. Bianchi, Evaluation of the performances of different p300 based brain–computer interfaces by means of the efficiency metric, Journal of neuroscience methods 203 (2012) 361–368.
* [7] L. R. Quitadamo, M. G. Marciani, G. C. Cardarilli, L. Bianchi, Describing different brain computer interface systems through a unique model: a uml implementation, Neuroinformatics 6 (2008) 81–96.
* [8] G. C. Cardarilli, L. Di Nunzio, R. Fazzolari, M. Re, F. Silvestri, Improvement of the cardiac oscillator based model for the simulation of bundle branch blocks, Human Health Engineering (2020) 165.
* [9] C. Napoli, G. Pappalardo, E. Tramontana, An agent-driven semantical identifier using radial basis neural networks and reinforcement learning, in: Proceedings of the XV Workshop "Dagli Oggetti agli Agenti", volume 1260, CEUR-WS, 2014. URL: http://ceur-ws.org/Vol-1260/.
* [10] D. Połap, M. Woźniak, C. Napoli, E. Tramontana, Real-time cloud-based game management system via cuckoo search algorithm, International Journal of Electronics and Telecommunications 61 (2015) 333–338.
* [11] F. Beritelli, A. Gallotta, C. Rametta, A dual streaming approach for speech quality enhancement of voip service over 3g networks, in: 2013 18th International Conference on Digital Signal Processing (DSP), IEEE, 2013, pp. 1–5.
* [12] F. Beritelli, A. Spadaccini, The role of voice activity detection in forensic speaker verification, in: 2011 17th International Conference on Digital Signal Processing (DSP), IEEE, 2011, pp. 1–6.
* [13] F. Beritelli, A. Spadaccini, A statistical approach to biometric identity verification based on heart sounds, in: 2010 Fourth International Conference on Emerging Security Information, Systems and Technologies, IEEE, 2010, pp. 93–96.
* [14] G. Iazeolla, A. Pieroni, A. D'Ambrogio, D. Gianni, A distributed approach to wireless system simulation, in: 2010 Sixth Advanced International Conference on Telecommunications, IEEE, 2010, pp. 252–262.
* [15] G. Iazeolla, A. Pieroni, Power management of server farms, in: Applied Mechanics and Materials, volume 492, Trans Tech Publ, 2014, pp. 453–459.
* [16] P. N. Whatmough, M. R. Perrett, S. Isam, I. Darwazeh, Vlsi architecture for a reconfigurable spectrally efficient fdm baseband transmitter, IEEE Transactions on Circuits and Systems I: Regular Papers 59 (2012) 1107–1118.
* [17] K. Elango, K. Muniandi, Vlsi implementation of an area and energy efficient fft/ifft core for mimo-ofdm applications, Annals of Telecommunications (2019) 1–13.
* [18] G. C. Cardarilli, L. Di Nunzio, R. Fazzolari, D. Giardino, M. Matta, M. Re, L. Iess, F. Cialfi, G. De Angelis, D. Gelfusa, et al., Hardware prototyping and validation of a w-δdor digital signal processor, Applied Sciences 9 (2019) 2909.
* [19] F. Mazzenga, R. Giuliano, F. Vatalaro, Effective strategies for gradual copper-to-fiber transition in access networks, Computer Networks (2020) 107225.
* [20] S. Mukherjee, M. De Sanctis, T. Rossi, E. Cianca, M. Ruggieri, R. Prasad, Mode switching algorithms for dvb-s2 links in w band, in: 2010 IEEE Global Telecommunications Conference GLOBECOM 2010, IEEE, 2010, pp. 1–5.
* [21] S. Singh, N. Singh, Internet of things (iot): Security challenges, business opportunities & reference architecture for e-commerce, in: 2015 International Conference on Green Computing and Internet of Things (ICGCIoT), IEEE, 2015, pp. 1577–1581.
* [22] R. Giuliano, F. Mazzenga, A. Neri, A. M. Vegni, Security access protocols in iot networks with heterogenous non-ip terminals, in: 2014 IEEE International Conference on Distributed Computing in Sensor Systems, IEEE, 2014, pp. 257–262.
* [23] M. Jamal, B. Horia, K. Maria, I. Alexandru, Study of multiple access schemes in 3gpp lte ofdma vs. sc-fdma, in: 2011 International Conference on Applied Electronics, IEEE, 2011, pp. 1–4.
* [24] F. Mazzenga, R. Giuliano, F. Vatalaro, Fttc-based fronthaul for 5g dense/ultra-dense access network: Performance and costs in realistic scenarios, Future Internet 9 (2017) 71.
* [25] I. Pasya, T. Kobayashi, A. Khalid, N. A. Wahab, A. Rashid, Z. Awang, et al., Target localization in mimo ofdm radars adopting adaptive power allocation among selected sub-carriers, International Journal on Advanced Science, Engineering and Information Technology 7 (????) 291–298.
* [26] B. Prasetya, A. Kurniawan, A. Fahmi, Joint power loading and phase shifting on signal constellation for transmit power saving on ofdm/ofdma systems, Int. J. Adv. Sci. Eng. Inf. Technol. 8 (2018) 2039–2045.
* [27] T. Hwang, C. Yang, G. Wu, S. Li, G. Y. Li, Ofdm and its wireless applications: A survey, IEEE transactions on Vehicular Technology 58 (2008) 1673–1694.
* [28] T. Hwang, C. Yang, G. Wu, S. Li, G. Y. Li, Ofdm and its wireless applications: A survey, IEEE transactions on Vehicular Technology 58 (2008) 1673–1694.
* [29] D. E. Knuth, Art of computer programming, volume 2: Seminumerical algorithms, Addison-Wesley Professional, 2014.
* [30] S. Spanò, G. C. Cardarilli, L. Di Nunzio, R. Fazzolari, D. Giardino, M. Matta, A. Nannarelli, M. Re, An efficient hardware implementation of reinforcement learning: The q-learning algorithm, Ieee Access 7 (2019) 186340–186351.
* [31] F. Silvestri, S. Acciarito, G. C. Cardarilli, G. M. Khanal, L. Di Nunzio, R. Fazzolari, M. Re, Fpga implementation of a low-power qrs extractor, in: International Conference on Applications in Electronics Pervading Industry, Environment and Society, Springer, 2017, pp. 9–15.
* [32] Y. S. Algnabi, F. A. Aldaamee, R. Teymourzadeh, M. Othman, M. S. Islam, Novel architecture of pipeline radix 2^2 sdf fft based on digit-slicing technique, in: 2012 10th IEEE International Conference on Semiconductor Electronics (ICSE), IEEE, 2012, pp. 470–474.
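As a companion to the modulator data path described in the paper (4QAM mapping, IFFT, cyclic prefix insertion, and a complex product built from three real multiplications), the following Python sketch gives a small software reference model. It is a sketch under stated assumptions, not the generated HDL: the function names are hypothetical, the bit-to-constellation mapping is assumed (the paper's 4QAM constellation figure is not reproduced here), a naive inverse DFT stands in for the pipelined IFFT core, floating point replaces fixed point, and no output scaling is applied.

```python
import cmath
import math


def qam4_map(i_bit: int, q_bit: int) -> complex:
    """Assumed 4QAM mapping: each bit selects the sign of one axis (0 -> +1, 1 -> -1)."""
    return complex(1 - 2 * i_bit, 1 - 2 * q_bit)


def mul3(a: complex, b: complex) -> complex:
    """Complex product using three real multiplications instead of four."""
    k1 = b.real * (a.real + a.imag)
    k2 = a.real * (b.imag - b.real)
    k3 = a.imag * (b.real + b.imag)
    return complex(k1 - k3, k1 + k2)


def idft(symbols):
    """Naive inverse DFT: s[n] = sum_k d_k * exp(j*2*pi*k*n/N), unnormalised."""
    n_carriers = len(symbols)
    out = []
    for n in range(n_carriers):
        acc = 0j
        for k, d in enumerate(symbols):
            twiddle = cmath.exp(2j * math.pi * k * n / n_carriers)
            acc += mul3(d, twiddle)
        out.append(acc)
    return out


def add_cyclic_prefix(samples, cp_length: int):
    """Prepend a copy of the last cp_length samples to the symbol."""
    if cp_length == 0:
        return list(samples)
    return list(samples[-cp_length:]) + list(samples)


def ofdm_symbol(bit_pairs, cp_length: int):
    """Map (I, Q) bit pairs to 4QAM, apply the inverse DFT, add the cyclic prefix."""
    symbols = [qam4_map(i, q) for i, q in bit_pairs]
    return add_cyclic_prefix(idft(symbols), cp_length)
```

For the smallest configuration in Table 1 (8 carriers, cyclic prefix of 2), `ofdm_symbol` returns 10 samples whose first two entries repeat the last two. A model of this kind could serve the same role as the MATLAB model used by the authors to check the RTL simulations, although the two are not claimed to be equivalent.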