# OFDM Baseband Transmitter Implementation Compliant IEEE Std 802.16d on FPGA

Shahid Abbas, Student Member, IEEE, Waqas Ali Khan, Talha Ali Khan and Saba Ahmed

Abstract — Broadband Wireless Access (BWA) is a promising technology that can offer high-speed voice, video and internet connections. The leading candidate for BWA is WiMAX, a technology that complies with the IEEE 802.16 family of standards. This paper focuses on the hardware implementation of the WirelessMAN-OFDM physical layer transmitter of IEEE Std 802.16d on an FPGA. The design was written in RTL-style Verilog HDL, which gives a high-level design flow for developing and validating communication system protocols and leaves room for future modifications to meet real-world performance requirements. The proposed design fully supports the adaptive modulation schemes described in IEEE Std 802.16d and is equipped with soft interfaces for the MAC layer and the RF front end, so that future work can build on it to deploy a complete WiMAX CPE IP core.

*Keywords* — WiMAX, IEEE Std 802.16d, OFDM, PHY Layer.

# I. INTRODUCTION

This paper elaborates the hardware design strategies used to model the OFDM baseband transmitter PHY of IEEE Std 802.16d, approved by the IEEE Standards Association on June 24, 2004 to consolidate IEEE Std 802.16-2001, IEEE Std 802.16a<sup>TM</sup>-2003, and IEEE Std 802.16c<sup>TM</sup>-2002 [1]. The revised standard describes the MAC and multiple physical layer specifications of fixed BWA systems [2]. WiMAX is the name given to products based on the IEEE Std 802.16 protocol. The arrival of WiMAX has made long-range wireless network communication (up to 40 km) a reality. Its high-speed voice, video and data services offer an alternative to 3G [3], achieving a compromise between high-speed data transmission and mobility.

This paper covers the details of the OFDM PHY and an architectural view for modeling it in real-time hardware. The design fully supports the adaptive modulation and coding schemes described in the OFDM WiMAX standard, with channel bandwidth selections of 1.75, 3.5, 7 and 14 MHz and a CP length of 16 samples. The minimum and maximum data rates achieved with these specifications were 5.64 Mbps and 50.82 Mbps respectively, which require a maximum processing clock of 6.55 MHz. An Altera Cyclone II EP2C35F672C6 FPGA on the DE2 development kit was used for hardware synthesis.

Section II gives an overview of the OFDM PHY of IEEE Std 802.16d and its FPGA implementation. Section III presents the data rate calculations, the required system clock and the hardware synthesis results.

#### II. WIMAX OFDM PHY & FPGA IMPLEMENTATION

The OFDM PHY of IEEE 802.16d consists of three processes, namely channel coding, modulation and OFDM, as shown in Fig. 1. Each of these comprises certain internal processes that depend on the specific coding scheme. This sequence of steps is applied at the transmitter and in the reverse order at the receiver [4].



Fig. 1. PHY Processing Sequence [1].

## A. Channel Coding

Channel coding is a set of processes that protect the signal against errors while it is transmitted through a physical channel. In the proposed design, channel coding comprises three steps [1], as shown in Fig. 2.



Fig. 2. Channel Coding [5].

#### 1) Data Scrambling

A 15-bit PRBS generator with polynomial $1+x^{14} + x^{15}$ was implemented to produce the scrambled data bits. The seed value shown in Fig. 3 is used to calculate the scrambled bits [1]. The DIUC value is simply the Rate ID value given in Table 224 of Section 8.3.3.4.3 of IEEE Std 802.16d.



Fig. 3. Scrambler Downlink Initialization Vector [1].

The FPGA implementation of the scrambler processes 8 bits at a time, which makes the system about 7 times faster with no extra hardware, because the next state of the LFSR is defined by XORing and shifting the LSBs into the eight MSB positions. The hardware architecture, including the I/O interfaces of the module, is shown in Fig. 4.
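As an illustration of the randomizer described above, the following Python sketch models the $1+x^{14}+x^{15}$ PRBS as a 15-bit shift register and consumes 8 PRBS bits per data byte, loosely mirroring the 8-bit-wide data path. It is a behavioural model under stated assumptions, not the Verilog module of this design: the seed `EXAMPLE_SEED` is an arbitrary placeholder (the actual initialization vector is the one in Fig. 3), and the MSB-first bit order and the names `prbs_bits` and `scramble` are choices made here for illustration.

```python
EXAMPLE_SEED = 0b100101010000000  # placeholder seed, NOT the Fig. 3 vector


def prbs_bits(seed, count):
    """Yield `count` bits of the 1 + x^14 + x^15 PRBS.

    Stage i of the shift register is held in bit i-1 of `state`.
    """
    state = seed & 0x7FFF
    for _ in range(count):
        feedback = ((state >> 13) ^ (state >> 14)) & 1  # stage 14 XOR stage 15
        state = ((state << 1) | feedback) & 0x7FFF      # feedback enters stage 1
        yield feedback


def scramble(data, seed=EXAMPLE_SEED):
    """XOR each data byte with 8 PRBS bits (assumed MSB first)."""
    gen = prbs_bits(seed, 8 * len(data))
    out = bytearray()
    for byte in data:
        key = 0
        for _ in range(8):
            key = (key << 1) | next(gen)
        out.append(byte ^ key)
    return bytes(out)


if __name__ == "__main__":
    payload = bytes(range(12))
    scrambled = scramble(payload)
    # Scrambling with the same seed a second time restores the payload.
    print(scramble(scrambled) == payload)
```

Because scrambling is a plain XOR with the PRBS output, the same routine serves as the descrambler at the receiver, provided both ends start from the same seed.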

Shahid Abbas, Waqas Ali Khan and Talha Ali Khan are final-year students in the Department of Electronic Engineering, NED University of Engineering and Technology, Karachi-75270, Pakistan (email: shahid.nedian@hotmail.com, talha080@hotmail.com).

Saba Ahmed is an Assistant Professor in the Telecommunications Department, NED University of Engineering and Technology, Karachi-75270, Pakistan (email: sabaa@neduet.edu.pk).



# 2) FEC

FEC is the process employed to detect and correct errors without retransmission [6]. For the OFDM PHY, FEC is accomplished by concatenating a Reed-Solomon outer code with a rate-compatible convolutional inner code, as shown in Fig. 5. WiMAX supports an adaptive PHY, so different modulation and coding schemes can be employed depending on channel conditions [7]; the details are given in Table 1.



Fig. 5. OFDM PHY FEC Scheme [5].

## a) Reed-Solomon Coding

RS encoding adds parity symbols to the original data packet. If the data packet contains k symbols of m bits each, the encoded version is described by RS(n, k, t), where n is the length of the coded block and t is the maximum number of symbols that can be corrected. RS encoding is governed by relations (1), (2) and (3) [8]:

$$n = 2^m - 1 \tag{1}$$

$$k = 2^m - 1 - 2t \tag{2}$$

$$t = (n - k)/2 \tag{3}$$

In the proposed design, the RS code is derived from a systematic RS (N = 255, K = 239, T = 8) code over GF($2^8$), with the primitive and generator polynomials given in (4) and (5) respectively [1].

$$p(x) = x^8 + x^4 + x^3 + x^2 + 1 \tag{4}$$

$$g(x) = (x + \Lambda^0)(x + \Lambda^1)\cdots(x + \Lambda^{2T-1}), \quad \Lambda = 02_{HEX} \tag{5}$$
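To make (4) and (5) concrete, the sketch below builds the generator polynomial over GF($2^8$) in Python. It assumes the usual bit-vector representation of field elements (so that $p(x)$ becomes the constant `0x11D`), and the helper names `gf_mul`, `rs_generator` and `poly_eval` are introduced here for illustration only; it is not the Altera megafunction used in the design.

```python
PRIMITIVE_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, as in (4)


def gf_mul(a, b):
    """Multiply two GF(2^8) elements modulo the primitive polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIMITIVE_POLY
        b >>= 1
    return result


def rs_generator(t):
    """Coefficients of g(x) from (5), highest degree first."""
    g = [1]
    root = 1  # Lambda^0, with Lambda = 0x02
    for _ in range(2 * t):
        nxt = [0] * (len(g) + 1)
        for i, coeff in enumerate(g):
            nxt[i] ^= coeff                     # coeff * x
            nxt[i + 1] ^= gf_mul(coeff, root)   # coeff * root
        g = nxt
        root = gf_mul(root, 2)
    return g


def poly_eval(poly, x):
    """Evaluate a polynomial (highest degree first) at x using Horner's rule."""
    acc = 0
    for coeff in poly:
        acc = gf_mul(acc, x) ^ coeff
    return acc


if __name__ == "__main__":
    g = rs_generator(8)
    print("degree:", len(g) - 1)
    print(" ".join(f"{c:02X}" for c in g))
```

For T = 8 the polynomial has degree 2T = 16, which corresponds to the 16 parity symbols of the native (255, 239, 8) code.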

In this project the RS encoder was generated using the Altera Quartus II 9.0 megafunction wizard and was wrapped in a top-level file to introduce the variable coding rates listed in Table 1. The I/O data path is 8 bits wide. The numcheck signal specifies the desired number of parity symbols. The operation completes in a minimum of 14 and a maximum of 124 clock cycles. The high-level hardware design is shown in Fig. 6.



Fig. 6. Reed-Solomon Top-Level Module Design.

# b) Convolutional Encoding

In the OFDM PHY, each RS-encoded block is further encoded by a convolutional encoder with a native rate of 1/2 and a constraint length of 7. The output code bits are generated by the generator polynomials defined in (6) and (7):

$$G_1 = 171_{oct} \ \text{for } X \tag{6}$$

$$G_2 = 133_{oct} \ \text{for } Y \tag{7}$$

Four coding rates are listed in Table 1; they are obtained with the puncturing patterns given in Table 2. Four convolutional encoders were therefore implemented, one per code rate, with input data paths of 8, 16, 24 and 40 bits respectively.

TABLE 2. PUNCTURING PATTERNS FOR CC-CODE RATES [1].

| CC code rate | 1/2      | 2/3         | 3/4            | 5/6                  |
|--------------|----------|-------------|----------------|----------------------|
| X            | 1        | 10          | 101            | 10101                |
| Y            | 1        | 11          | 110            | 11010                |
| XY           | $X_1Y_1$ | $X_1Y_1Y_2$ | $X_1Y_1Y_2X_3$ | $X_1Y_1Y_2X_3Y_4X_5$ |

These four encoders were enclosed in a top-level file that contains the input and output data buffers and selects only one encoder at a time, as indicated by the rate\_id input. The input data path is 8 bits wide and the output is 48 bits wide. The whole process takes at least 16 and at most 244 clock cycles. The hardware view is shown in Fig. 7.
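The rate-1/2 mother code and the puncturing patterns of Table 2 can be sketched bit-serially as follows. This is a behavioural Python model, not the parallel 8/16/24/40-bit hardware encoders: it assumes a zero initial encoder state and the common convention that the most significant bit of each octal generator taps the current input bit, and the names `G_X`, `G_Y`, `PUNCTURE` and `cc_encode` are illustrative.

```python
G_X = 0o171  # generator for output X
G_Y = 0o133  # generator for output Y

# Puncturing patterns from Table 2: (X pattern, Y pattern) per code rate.
PUNCTURE = {
    "1/2": ("1", "1"),
    "2/3": ("10", "11"),
    "3/4": ("101", "110"),
    "5/6": ("10101", "11010"),
}


def parity(value):
    """Return the XOR of all bits of `value`."""
    return bin(value).count("1") & 1


def cc_encode(bits, rate="1/2"):
    """Encode a bit list with the K = 7 code and puncture to `rate`."""
    x_pat, y_pat = PUNCTURE[rate]
    period = len(x_pat)
    shift_reg = 0  # 7-bit window: bit 6 = newest input, bit 0 = oldest
    out = []
    for i, bit in enumerate(bits):
        shift_reg = ((bit & 1) << 6) | (shift_reg >> 1)
        if x_pat[i % period] == "1":      # keep X only where the pattern has a 1
            out.append(parity(shift_reg & G_X))
        if y_pat[i % period] == "1":      # keep Y only where the pattern has a 1
            out.append(parity(shift_reg & G_Y))
    return out


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
    for rate in PUNCTURE:
        print(rate, len(cc_encode(message, rate)))
```

Emitting X before Y for each input bit and skipping the positions marked 0 reproduces the output orderings in the XY row of Table 2, for example $X_1Y_1Y_2X_3$ for rate 3/4.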



Fig. 7. Convolutional-Encoder Top-Level Module

TABLE 1. MANDATORY CHANNEL CODING PER MODULATION [1].

| Modulation | Un-coded block<br>size (bytes) | Coded block<br>Size (bytes) | Overall rate | RS-Code     | CC Code rate |
|------------|--------------------------------|-----------------------------|--------------|-------------|--------------|
| BPSK       | 12                             | 24                          | 1/2          | (12,12,0)   | 1/2          |
| QPSK       | 24                             | 48                          | 1/2          | (32,24,4)   | 2/3          |
| QPSK       | 36                             | 48                          | 3/4          | (40,36,2)   | 5/6          |
| 16-QAM     | 48                             | 96                          | 1/2          | (64,48,8)   | 2/3          |
| 16-QAM     | 72                             | 96                          | 3/4          | (80,72,4)   | 5/6          |
| 64-QAM     | 96                             | 144                         | 2/3          | (108,96,6)  | 3/4          |
| 64-QAM     | 108                            | 144                         | 3/4          | (120,108,6) | 5/6          |
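The entries of Table 1 can be cross-checked with a few lines of Python: the overall rate is the product of the RS rate K/N and the CC rate, and the coded block size is the RS block length divided by the CC rate. The rows below are copied from Table 1; the names `TABLE_1` and `check_row` are illustrative.

```python
from fractions import Fraction as F

# (modulation, uncoded bytes, coded bytes, overall rate, (N, K, T), CC rate)
TABLE_1 = [
    ("BPSK", 12, 24, F(1, 2), (12, 12, 0), F(1, 2)),
    ("QPSK", 24, 48, F(1, 2), (32, 24, 4), F(2, 3)),
    ("QPSK", 36, 48, F(3, 4), (40, 36, 2), F(5, 6)),
    ("16-QAM", 48, 96, F(1, 2), (64, 48, 8), F(2, 3)),
    ("16-QAM", 72, 96, F(3, 4), (80, 72, 4), F(5, 6)),
    ("64-QAM", 96, 144, F(2, 3), (108, 96, 6), F(3, 4)),
    ("64-QAM", 108, 144, F(3, 4), (120, 108, 6), F(5, 6)),
]


def check_row(row):
    """Return True if one Table 1 row is internally consistent."""
    _, uncoded, coded, overall, (n, k, t), cc_rate = row
    return (
        k == uncoded                      # RS input is the uncoded block
        and n - k == 2 * t                # relation (3)
        and F(k, n) * cc_rate == overall  # concatenated code rate
        and F(n) / cc_rate == coded       # bytes after the CC encoder
    )


if __name__ == "__main__":
    for row in TABLE_1:
        print(row[0], row[3], check_row(row))
```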

# 3) Interleaving

Let $N_{cpc}$ be the number of coded bits per subcarrier, i.e., 1, 2, 4 or 6 for BPSK, QPSK, 16-QAM or 64-QAM respectively. Let $s = ceil(N_{cpc}/2)$. Within a block of $N_{cbps}$ bits at transmission, let $k$ be the index of a coded bit before the first permutation, $m_k$ the index of that coded bit after the first and before the second permutation, and $J_k$ the index after the second permutation. The first and second permutations are defined by (8) and (9) respectively:

$$m_k = \left(\frac{N_{cbps}}{12}\right) \cdot k_{mod(12)} + floor\left(\frac{k}{12}\right) \tag{8}$$

$$J_k = s \cdot floor\left(\frac{m_k}{s}\right) + \left(m_k + N_{cbps} - floor\left(\frac{12 \cdot m_k}{N_{cbps}}\right)\right)_{mod(s)} \tag{9}$$

where $N_{cbps}$ = 192, 384, 768 and 1152 is the number of coded bits per OFDM symbol for BPSK, QPSK, 16-QAM and 64-QAM respectively [1].
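The two permutations (8) and (9) translate directly into integer arithmetic. The following Python sketch computes the index $J_k$ for every $k$ and applies it to a block of bits; the function names and the dictionary `N_CBPS` are illustrative, and the model ignores the 48-bit bus and register-array organisation of the hardware.

```python
import math

N_CBPS = {1: 192, 2: 384, 4: 768, 6: 1152}  # keyed by N_cpc


def interleave_indices(n_cpc):
    """Return the list of J_k for k = 0 .. N_cbps - 1."""
    n_cbps = N_CBPS[n_cpc]
    s = math.ceil(n_cpc / 2)
    indices = []
    for k in range(n_cbps):
        m_k = (n_cbps // 12) * (k % 12) + k // 12                          # (8)
        j_k = s * (m_k // s) + (m_k + n_cbps - (12 * m_k) // n_cbps) % s   # (9)
        indices.append(j_k)
    return indices


def interleave(bits, n_cpc):
    """Place input bit k at output position J_k."""
    indices = interleave_indices(n_cpc)
    if len(bits) != len(indices):
        raise ValueError("block length must equal N_cbps")
    out = [0] * len(bits)
    for k, j_k in enumerate(indices):
        out[j_k] = bits[k]
    return out


if __name__ == "__main__":
    for n_cpc in N_CBPS:
        print(n_cpc, interleave_indices(n_cpc)[:6])
```

The first permutation spreads adjacent coded bits over non-adjacent subcarriers, while the second alternates them between more and less significant bits of the constellation; for BPSK and QPSK ($s = 1$) the second permutation leaves the index unchanged.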

The interleaver was designed as an array of 1152 1-bit registers. The input data bus is 48 bits wide, so loading takes a minimum of 4 and a maximum of 24 clock cycles. The write address of the array is generated inside the module according to the value of the rate\_id input. There are four output data buses, which feed the BPSK, QPSK, 16-QAM and 64-QAM modulation mappers and have widths of 1, 2, 4 and 6 bits respectively. Since there are 192 data carriers, 192 clock cycles are needed for output, so the interleaver operation takes a minimum of 196 and a maximum of 216 cycles. The hardware architecture is presented in Fig. 8. All the channel coding modules are wrapped in a top-level file.



Fig. 8. High-Level View of Interleaver.

#### B. Digital Modulation

The OFDM PHY is adaptive and supports BPSK, QPSK, 16-QAM and 64-QAM for modulation of the data carriers. The constellation diagrams are Gray mapped and give the magnitudes of the I and Q components for each group of incoming bits, as given in (10). Table 3 lists the Q1.14 representation of the I and Q magnitudes computed with (10), which are 16 bits wide [9]:

$$\text{Magnitude of carrier} = (\text{value of the carrier at that point}) \times C \tag{10}$$

where $C$ is the normalization factor, equal to 1, $1/\sqrt{2}$, $1/\sqrt{10}$ and $1/\sqrt{42}$ for BPSK, QPSK, 16-QAM and 64-QAM respectively [1].

The modulation mappers were modeled as ROMs that store 16-bit hex numbers and output them for the respective combinations of input bits. Two ROMs were employed for each mapper, one for the I part and the other for the Q part. Thus a total of 8 ROMs were wrapped in 4 sets, each set forming the complete mapper of one scheme; the sets have different input data bus widths and numbers of locations but the same 16-bit wide output path.

TABLE 3. Q1.14 FIXED POINT FORMAT OF I & Q MAGNITUDES

| MODULATION     | CARRIER<br>VALUE | MAGNITUDE    | HEX (16-BIT) |
|----------------|------------------|--------------|--------------|
| BPSK           | 1                | 1            | 4000         |
|                | -1               | -1           | C000         |
| QPSK           | 1                | 0.707106781  | 2D41         |
|                | -1               | -0.707106781 | D2BF         |
| 16-QAM         | 1                | 0.316227766  | 143D         |
|                | -1               | -0.316227766 | EBC3         |
|                | 3                | 0.948683298  | 3CB7         |
|                | -3               | -0.948683298 | C349         |
| 64-QAM         | 1                | 0.15430335   | 09E0         |
|                | 3                | 0.46291005   | 1DA0         |
|                | 5                | 0.77151675   | 3160         |
|                | 7                | 1.08012345   | 4520         |
|                | -1               | -0.15430335  | F620         |
|                | -3               | -0.46291005  | E260         |
|                | -5               | -0.77151675  | CEA0         |
|                | -7               | -1.08012345  | BAE0         |
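The hex codes in Table 3 follow from scaling each magnitude by $2^{14}$ and storing the result as a 16-bit two's-complement word. The Python sketch below reproduces that conversion; truncation toward zero (rather than rounding) is assumed here because it matches the tabulated values, and the names `NORMALIZATION`, `q1_14_hex` and `table_entry` are illustrative.

```python
import math

NORMALIZATION = {
    "BPSK": 1.0,
    "QPSK": 1 / math.sqrt(2),
    "16-QAM": 1 / math.sqrt(10),
    "64-QAM": 1 / math.sqrt(42),
}


def q1_14_hex(value):
    """Return the 16-bit two's-complement Q1.14 code of `value` as hex."""
    scaled = int(value * (1 << 14))  # int() truncates toward zero
    return format(scaled & 0xFFFF, "04X")


def table_entry(modulation, carrier_value):
    """Apply (10) and convert the magnitude to its Q1.14 hex code."""
    return q1_14_hex(carrier_value * NORMALIZATION[modulation])


if __name__ == "__main__":
    for modulation, levels in [("BPSK", (1, -1)), ("QPSK", (1, -1)),
                               ("16-QAM", (1, -1, 3, -3)),
                               ("64-QAM", (1, 3, 5, 7, -1, -3, -5, -7))]:
        for level in levels:
            print(modulation, level, table_entry(modulation, level))
```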

The rate\_id input selects one of the four output lines. The top-level design is shown in Fig. 9.



Fig. 9. Hardware Realization of Mapper

The I and Q parts of each of the 192 data subcarriers are fed into the OFDM symbol assembler, which inserts pilot, DC and guard carriers to make a total of 256 carriers for OFDM realization.

# C. Orthogonal Frequency Division Multiplexing

An IFFT of size N, applied to N symbols, realizes an OFDM signal [10]. The IFFT takes the frequency-domain spectrum X(k) and converts it to the time-domain signal x(n) by successively multiplying it by a range of sinusoids, as given by (11):

$$x(n) = \sum_{k=0}^{N-1} X(k) \cos\left(\frac{2\pi kn}{N}\right) + j \sum_{k=0}^{N-1} X(k) \sin\left(\frac{2\pi kn}{N}\right) \tag{11}$$

where n = 0 to *N*-1 and *N* = 256 [11]. A constant scaling factor of 1/*N* is omitted from (11).
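For reference, the inverse DFT can be evaluated directly from its definition. The Python sketch below uses the standard convention $x(n) = \frac{1}{N}\sum_{k} X(k)\,e^{j2\pi kn/N}$, including the 1/N scaling; it is an O(N²) reference model intended for checking small cases, whereas the design itself uses an FFT core. The function name `idft` is illustrative.

```python
import cmath


def idft(spectrum):
    """Direct inverse DFT of a list of complex frequency-domain samples."""
    n_points = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * n / n_points)
            for k in range(n_points)) / n_points
        for n in range(n_points)
    ]


if __name__ == "__main__":
    # A single active subcarrier (k = 1) produces one complex sinusoid cycle.
    tone = [0j] * 8
    tone[1] = 8 + 0j
    for sample in idft(tone):
        print(f"{sample.real:+.3f} {sample.imag:+.3f}j")
```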

#### 1) OFDM Symbol Assembler

An OFDM symbol is made up of three basic types of subcarriers [3]:

- 192 data subcarriers: frequency indices -100 to -1 and +1 to +100 (except at pilot positions).
- 8 pilot subcarriers (BPSK modulated): frequency indices -88, -63, -38, -13, +13, +38, +63 and +88.
- 56 null subcarriers: DC carrier at frequency index 0; lower guard from -128 to -101; upper guard from +101 to +127.

Pilots are generated by a PRBS generator with polynomial $x^{11} + x^9 + 1$, which produces a sequence $W_k$ at its LSB, where $k$ denotes the OFDM symbol number in the current frame [1]. Pilots are BPSK modulated and their I magnitudes are given by $1 - 2W_k$ for carrier indices -88, -38, 63 and 88, and by $1 - 2\overline{W}_k$ for carrier indices -63, -13, 13 and 38. The DC and guard carriers are null carriers with zero magnitude. The OFDM symbol is assembled inside a module that comprises a pilot generator module and two 256x16 RAMs, one each for the I and Q parts, in which the carriers are stored at the locations corresponding to their frequency indices [12]. The input data line of each RAM selects either data, pilot, DC or guard carriers. The system is designed so that the pilot, DC and guard carriers are already present in the RAMs, and the output is available just 2 clock cycles after the first input data sample.
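The carrier allocation described above can be sketched as a function that fills a 256-entry array. This Python model makes two assumptions that are not stated in the text: frequency index f is stored at array position f mod 256 (natural FFT ordering), and the 192 data values arrive in order of increasing frequency index. The pilot bit `w_k` is passed in rather than generated, and all names are illustrative.

```python
N_FFT = 256
PILOT_PLUS = (-88, -38, 63, 88)    # I magnitude 1 - 2*w_k
PILOT_MINUS = (-63, -13, 13, 38)   # I magnitude 1 - 2*(complement of w_k)


def data_indices():
    """Frequency indices of the data subcarriers, in increasing order."""
    pilots = set(PILOT_PLUS) | set(PILOT_MINUS)
    return [f for f in range(-100, 101) if f != 0 and f not in pilots]


def assemble_symbol(data, w_k):
    """Build one 256-carrier OFDM symbol from 192 data values and pilot bit w_k."""
    positions = data_indices()
    if len(data) != len(positions):
        raise ValueError("expected 192 data subcarriers")
    symbol = [0j] * N_FFT                  # DC and guard carriers stay at zero
    for f, value in zip(positions, data):
        symbol[f % N_FFT] = value          # assumed address mapping
    for f in PILOT_PLUS:
        symbol[f % N_FFT] = complex(1 - 2 * w_k, 0)
    for f in PILOT_MINUS:
        symbol[f % N_FFT] = complex(1 - 2 * (1 - w_k), 0)
    return symbol


if __name__ == "__main__":
    symbol = assemble_symbol([1 + 1j] * 192, w_k=0)
    print(sum(1 for c in symbol if c == 0), "null carriers")
```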

# 2) The IFFT Module

The IFFT module is fed by the OFDM symbol assembler with 256 complex samples. The IFFT module was generated using the Altera Quartus II 9.0 megafunction wizard and wrapped in a top-level file that produces the sop, eop and source ena signals internally, in the same way as for the RS encoder. It takes a minimum of 512 clock cycles to complete the processing. The output data is fed to the CP generator, which adds redundancy to the OFDM symbol and thus acts as protection against inter-symbol interference.

# 3) CP Generator

A copy of the last $N_{IFFT} \times G$ samples is added to the beginning of the symbol; this copy is termed the CP. It increases the symbol duration and thereby provides robustness against multipath [12]. The value of G is taken as 1/16, which represents a moderate channel, so the term $N_{IFFT} \times G$ comes out to be 16 and the overall symbol length becomes 272 complex samples. The hardware structure is simply two 256x16 RAMs, into which all 256 complex samples are first written; reading then runs over RAM locations 240 to 255 followed by 0 to 255. The whole process takes about 532 clock cycles.
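The CP insertion amounts to reading the tail of the IFFT output before the full block. A minimal Python sketch, with G expressed as an integer ratio and a hypothetical function name `add_cyclic_prefix`, is:

```python
def add_cyclic_prefix(samples, g_num=1, g_den=16):
    """Prepend the last len(samples) * G samples, with G = g_num / g_den."""
    cp_len = len(samples) * g_num // g_den
    if cp_len == 0:
        return list(samples)
    # Tail first, then the whole block, as in the RAM read sequence.
    return list(samples[-cp_len:]) + list(samples)


if __name__ == "__main__":
    ifft_output = list(range(256))  # stand-in for 256 complex samples
    extended = add_cyclic_prefix(ifft_output)
    print(len(extended), extended[:3], extended[16:19])
```

With 256 samples and G = 1/16 the prefix is 16 samples long, giving the 272-sample symbol stated in the text.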

This completes the description of the hardware implementation of the IEEE Std 802.16d OFDM PHY on FPGA; the next section presents the results of the project.

#### **III. RESULTS**

#### A. Data Rate Calculation

$$\text{Data rate} = \frac{\text{un-coded bits}}{\text{OFDM symbol duration}} \tag{12}$$

$$T_{s} = \left[\frac{1}{(n \cdot BW)/N_{IFFT}}\right](1+G) \tag{13}$$

where $T_s$ is the OFDM symbol duration;

n = 8/7 is the sampling factor;

BW is the channel bandwidth, taken in this design as 1.75, 3.5, 7 and 14 MHz;

$N_{IFFT} = 256$ is the number of IFFT points;

G = 1/16 is the ratio of CP time to useful symbol time.

Using (12) and (13) for each modulation and coding scheme listed in Table 1, the minimum and maximum data rates were found to be 5.64 Mbps and 50.82 Mbps for BPSK 1/2 and 64-QAM 3/4 respectively. These rates require a 6.55 MHz clock, which was easily obtained by scaling down the 50 MHz oscillator on the Altera DE2 FPGA development kit. Timing analysis suggests 109.26 MHz as the maximum attainable speed for the implemented design.
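Equations (12) and (13) can be evaluated with exact fractions, as in the Python sketch below. The un-coded block sizes are taken from Table 1 (in bytes, hence the factor of 8); the function names are illustrative. For the 14 MHz channel the symbol duration evaluates to 17 µs, which yields the quoted data rates for BPSK 1/2 (12 bytes) and 64-QAM 3/4 (108 bytes).

```python
from fractions import Fraction

N_IFFT = 256
SAMPLING_FACTOR = Fraction(8, 7)  # n
G = Fraction(1, 16)               # CP time / useful symbol time


def symbol_duration(bw_hz):
    """T_s from (13), in seconds, as an exact fraction."""
    carrier_spacing = SAMPLING_FACTOR * bw_hz / N_IFFT
    return (1 / carrier_spacing) * (1 + G)


def data_rate(uncoded_bytes, bw_hz):
    """Data rate from (12), in bit/s."""
    return float(8 * uncoded_bytes / symbol_duration(bw_hz))


if __name__ == "__main__":
    for bw_hz in (1_750_000, 3_500_000, 7_000_000, 14_000_000):
        low = data_rate(12, bw_hz) / 1e6
        high = data_rate(108, bw_hz) / 1e6
        print(f"{bw_hz / 1e6:5.2f} MHz: {low:.3f} to {high:.3f} Mbps")
```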

#### B. Hardware Resources

Synthesis of the design using Altera Quartus II 9.0 Web Edition software gives the device resource summary for the Altera Cyclone II EP2C35F672C6 FPGA shown in Table 4.

| Module           | LC<br>Combination | LC<br>Registers | Memory<br>Bits | DSP<br>Multipliers   |
|------------------|-------------------|-----------------|----------------|----------------------|
| Channel<br>coder | 3007              | 2123            | 1152           | 0                    |
| Mapper           | 7                 | 8               | 2048           | 0                    |
| OFDM             | 3864              | 4386            | 56577          | 18                   |
| Total            | 6878              | 6517            | 59777          | 18                   |

#### IV. CONCLUSION

Simulation results from ModelSim-Altera 6.4a (Quartus II 9.0) Starter Edition were fully consistent with the IEEE test data provided in the standard. For hardware prototyping of the design, a Visual Basic application was developed that provides a GUI for sending data to and receiving data from the FPGA through a UART interface. Because the FPGA platform provides a flexible design approach, this work could open the door to many other projects: a complete WiMAX CPE could be deployed if the MAC and the RF front end for Rx and Tx were properly designed and integrated.

#### V. REFERENCES

- [1] IEEE Standard for Local and Metropolitan Area Networks. Part 16: Air Interface for Fixed Broadband Wireless Access Systems. New York, USA : s.n., October 1, 2004.
- [2] Jordan Douglas Guffey. OFDM Physical Layer Implementation for the Kansas University Agile Radio. University of Kansas. Kansas : s.n., 2008. Technical Report.
- [3] Wimax-speed. wikimedia.org. [Online] [Cited: September 5, 2009.] http://commons.wikimedia.org/wiki/File:Wimax-speed.jpg.
- [4] Lili Zhang. A study of IEEE 802.16a OFDM-PHY Baseband. Electrical Engineering, Linköping Institute of Technology. Linköping : s.n., 2005. Master thesis in Electronics Systems. LiTH-ISY-EX--05/3627--SE.
- [5] Loutfi Nuaymi. WiMAX: Technology for Broadband Wireless Access. ENST Bretagne : John Wiley & Sons Ltd, 2007. p. 310.
- [6] Andy Bateman. Digital Communication - Design for the Real World. s.l.: Addison Wesley Longman Ltd., 1999.
- [7] Mohammad Azizul Hasan. Performance Evaluation of WiMAX/IEEE 802.16 OFDM Physical Layer. Department of Electrical and Communications Engineering, Helsinki University of Technology. Espoo : s.n., 2007. Master Thesis.
- [8] Bernard Sklar. Digital Communications Fundamentals and Applications. 2nd. Los Angeles : Pearson Education, Inc.
- [9] MATLAB 7.8.0 (R2009a), Help. s.l. : MathWorks, Inc., 2009.
- [10] van Nee, R. and Prasad, R. OFDM for Wireless Multimedia Communications. s.l. : Artech House, 2000.
- [11] Charan Langton. OFDM. Intuitive Guide to Principles of Communications. [Online] http://www.complextoreal.com/
- [12] Amalia Roca Persiva. Implementation of a WiMAX simulator in Simulink. Institute of Communications & Radio-Frequency Engineering, Vienna University of Technology. Vienna : s.n., 2007. Master Thesis.