# FPGA resource allocation of a DS-CDMA indoor system

X. Revés; A. Gelonch; F. Casadevall

Universitat Politècnica de Catalunya, Dept. of Signal Theory and Communications

Jordi Girona 1-3, 08034 Barcelona (Spain)

{xreves, antoni, ferran}@xaloc.upc.es

Software Defined Radio is the trend most likely to change the way personal radio terminals are understood, and it still demands a great deal of architectural, technological and even philosophical work. For physical layer implementation, devices that combine high computational power with the capacity to modify their function are of particular interest. Following this idea, this letter presents a low cost DS-CDMA mobile communications system able to operate in picocellular (e.g. indoor) environments. It has been developed using an FPGA approach and designed to support video, voice and low speed data transfers. Most of the system, including all the digital signal processing from intermediate frequency to base band, has been implemented on a SHaRe platform with nine Xilinx 4K family FPGA chips, for a maximum gate count beyond 2 million logic gates.

## I. Introduction

Third generation mobile systems will soon be deployed, most likely with reduced-feature terminals rather than full-featured ones. The first steps will be taken alongside the mature second generation systems (GSM, DECT, TETRA, IS95, PHS, PDC, etc.), so a transition phase in which multimode terminals are common can be expected. Third generation systems (UMTS in Europe) tend to use highly flexible and adaptive transmission schemes in order to apply the most suitable one to each user's requirements, in contrast with the second generation, where terminals were oriented to one specific scheme and service. These two characteristics of future terminals, multimode operation and a high capacity for adaptation, together with other advanced features, lead the concept of the radio terminal towards the so-called Software Radio [Taylor97]. Software radio technology is enabled by the advent of fast digital signal processing structures able to process signals that until recently could only be handled by analog devices. By modifying the parameters of the digital processors, many different reception/transmission structures can be implemented through a "simple" software download to the terminal, even over the air interface.

In a receiver, once analog to digital conversion (a key point in software radio terminals) has been performed, the raw digital data must be processed. Several kinds of device are available on the market, ranging from partially programmable ASICs (Application-Specific Integrated Circuits), which offer a limited set of configurable blocks to cope with different radio interface formats, to digital signal processors (DSPs) based on a central processing unit (CPU). Each has its own area of application: the former are mainly devoted to translating the signal from IF to base band with the corresponding channel selection, while the latter are used for typical base band tasks such as demodulation and decoding. A third approach exists that can handle both IF processing and base band processing. It uses FPGAs (Field Programmable Gate Arrays) [Cummings99], which can handle the high speed operations of channel selection and frequency translation as well as the generally more complex tasks of base band processing, working in close co-operation with a DSP. Whatever the type of processing algorithm, FPGAs offer flexibility and high performance at the same time, which makes them a good choice for Software Radios.

This paper deals with the implementation of the digital stages of an indoor DS-CDMA system, focusing on architectures well suited to FPGA devices and providing significant figures on resource utilisation. The main goal is to identify adequate solutions for each constituent part of the system described, in order to minimise resource allocation.

## **II. General System Description**

The DS-CDMA [Adachi98] indoor system presented here works in the 2.4 GHz frequency band and has a star architecture in which every terminal communicates with its base station. The base station provides the transmission, reception and control capabilities, such as user control, channel assignment, code distribution and power control. Three user terminals are distinguished: a voice terminal using DPCM (Differential Pulse Code Modulation) at 32 kb/s, similar to that used in DECT (Digital European Cordless Telephone); a data terminal at 9.6 kb/s with delays below 30 ms; and a video terminal using the H.261 standard, able to adapt its rate up to 2 Mbit/s depending on the image quality desired.

The CDMA radio link uses traffic channels (TCH) to carry data, voice and video. These are complemented by associated control channels (ACCH) for power control, channel information, status of communication, etc. The other signalling channels are the pilot channel (PICH), which allows the mobile to synchronise correctly to the network; the broadcast paging channel (BPCH), for paging, network access and handover; and the random access channel (RACH), which allows the mobile to request a traffic channel.

The parts of the system implemented with the FPGA approach range from the physical channels to intermediate frequency. Channel coding, puncturing, interleaving, etc. of the information coming from upper layers are not included. The FPGA-based layer therefore provides a simple raw transport service for data over the radio channel. From the functional point of view, two kinds of connection are distinguished: Down Link (from the Base Station to the Mobile) and Up Link (from the Mobile to the Base Station).

#### **II.I. Down Link Structure**

The Down Link is based mainly on orthogonal multiplexing of the information and control channels by means of spreading sequences. The sequences used to multiplex channels are Walsh (or Hadamard) sequences, and those used to isolate adjacent cells are Gold sequences. Each user has a maximum of 4 QPSK-CDMA channels, each at 64 kbit/s, separated by different Walsh sequences at 1024 kchips/s. This provides a maximum of 256 kbit/s per user, which is the limit assumed in the system. The base station applies two additional PN sequences (one for the inphase channel and another for the quadrature channel) to reduce intercell interference. These sequences repeat in frames of 10 ms. A further PN sequence is sent continuously without modulation on the PICH channel using BPSK. The resulting chip rate is 4096 kchips/s. Since only 32 Walsh sequences are used, each base station can bear up to 64 different channels of 32 kbit/s, simultaneously assigned to a maximum of 16 different users.
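The orthogonal multiplexing described above can be illustrated with a minimal sketch. It assumes 32-chip Walsh sequences built with the Sylvester construction and two hypothetical channels on code indices 3 and 17; the paper gives neither the construction nor the code assignment, and the function names are illustrative only. If each 64 kbit/s QPSK channel is assumed to carry 32 kbit/s on each of its I and Q branches, a 32-chip code is consistent with the 1024 kchips/s quoted.

```python
def walsh_matrix(n):
    """Return the n x n Walsh-Hadamard matrix (n a power of two), entries +1/-1."""
    rows = [[1]]
    while len(rows) < n:
        # Sylvester construction: H_2n = [[H, H], [H, -H]]
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def spread(symbols, code):
    """Replace every +1/-1 symbol by the symbol multiplied by the whole code."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each code-length block of chips with the code."""
    n = len(code)
    return [sum(x * c for x, c in zip(chips[i:i + n], code)) // n
            for i in range(0, len(chips), n)]


W = walsh_matrix(32)
data_a = [1, -1, -1, 1]    # hypothetical symbols of channel A
data_b = [-1, -1, 1, 1]    # hypothetical symbols of channel B

# Both channels share one chip stream and are separated only by their codes.
tx = [a + b for a, b in zip(spread(data_a, W[3]), spread(data_b, W[17]))]

print(despread(tx, W[3]))   # recovers data_a
print(despread(tx, W[17]))  # recovers data_b
```

Because any two distinct rows of the matrix have zero inner product, each correlator sees only its own channel; this is what lets the base station add the per-user streams into a single I and a single Q stream.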

In transmission, two separate antennas transmit the same signal twice, with a fixed delay between them that is longer than the CDMA chip resolution. At the receiver, a RAKE structure can take advantage of this transmission diversity. To simplify the receiver, a pre-RAKE [Esmailzadeh93] structure is included in the transmitter (base station) for each user.
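The idea behind pre-RAKE can be sketched as follows, assuming a hypothetical two-path channel sampled at chip rate with made-up complex gains. The transmitter filters its signal with the time-reversed complex conjugate of the channel estimate, so the paths add coherently at a single instant in the receiver. This illustrates the principle in [Esmailzadeh93], not the structure actually mapped onto the FPGAs.

```python
def convolve(a, b):
    """Plain linear convolution of two complex sequences."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def pre_rake_taps(channel):
    """Time-reversed complex conjugate of the channel estimate."""
    return [h.conjugate() for h in reversed(channel)]


# Hypothetical channel: direct path plus a second path three chips later.
channel = [0.8 + 0.2j, 0j, 0j, 0.3 - 0.5j]

taps = pre_rake_taps(channel)
composite = convolve(taps, channel)      # what a single-finger receiver sees
centre = composite[len(channel) - 1]     # instant where both paths align

print(centre)                            # real-valued, equals the channel energy
print([round(abs(c), 3) for c in composite])
```

The centre tap equals the sum of the squared path magnitudes, so the terminal can demodulate with one correlator while still collecting the energy of both antennas.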

Figure 1 gives a general view of the blocks implemented with the FPGA approach for the base station transmitter and the mobile receiver. In the transmission section (at the base station), the first step comprises spreading and pre-RAKE for each user. The resulting data flows are then added with their corresponding weights to form a single data stream for the inphase channel and another for the quadrature channel. After the PICH and BPCH channels have been added, the data are filtered to produce the correct pulse shape and, finally, an I/Q up converter translates the signal to an intermediate frequency of 8192 kHz at a sample rate of 32768 ksps. In the reception section, after the signal has been sampled again at fs = 32768 ksps and placed at an intermediate frequency of fs/4, its complex components are separated. A frequency correction is then applied by an Automatic Frequency Control (AFC) that operates directly at base band. The last step is pulse shaping to obtain the final Nyquist pulse (raised cosine with a 0.313 roll-off factor). The output of the filtering stage is used to generate the synchronism signals for the receiver. At base band, frame synchronisation, channel estimation (for the transmitter pre-RAKE) and Frequency Error Detection (FED) are performed, in addition to the CDMA demodulation that inverts the spreading applied in transmission.
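One reason an intermediate frequency of fs/4 is attractive in hardware is that the local oscillator samples reduce to the sequence 1, -j, -1, +j, so separating the complex components needs no multipliers. The sketch below illustrates that property; whether the implemented down converter exploits it in exactly this way is an assumption, and `mix_fs4` is a hypothetical name.

```python
import cmath
import math


def mix_fs4(samples):
    """Multiply real samples by exp(-j*pi*n/2) using only sign selection."""
    i_out, q_out = [], []
    for n, x in enumerate(samples):
        phase = n % 4
        # exp(-j*pi*n/2) cycles through 1, -j, -1, +j
        i_out.append((x, 0, -x, 0)[phase])
        q_out.append((0, -x, 0, x)[phase])
    return i_out, q_out


samples = [3, -1, 4, 1, -5, 9, 2, -6]    # arbitrary real IF samples
i_out, q_out = mix_fs4(samples)

# Compare against a conventional complex multiplication.
for n, x in enumerate(samples):
    reference = x * cmath.exp(-1j * math.pi * n / 2)
    assert abs(complex(i_out[n], q_out[n]) - reference) < 1e-9

print(i_out, q_out)
```

Half of the I and Q samples are zero by construction, which the filters that follow can exploit to save further logic.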



Figure 1. Down Link Transmitter and Receiver

#### **II.II. Up Link Structure**

In this case a simple QPSK scheme is used for transmission, with a maximum rate of 128 kb/s on each of the inphase and quadrature channels. This data rate is spread with Gold sequences up to 4096 kchips/s, giving a processing gain of 32 when transmitting at the maximum rate (256 kb/s in total). To simplify the design, asynchronous access was implemented, so a synchronism stage is required in the base station receiver. Figure 2 shows the FPGA sections of the Up Link transmitter and receiver.
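A short sketch of Gold sequence generation and of the processing gain arithmetic follows. The paper does not state the generator polynomials or the code length, so the sketch assumes the classic degree-5 preferred pair x^5 + x^2 + 1 and x^5 + x^4 + x^3 + x^2 + 1 (31-chip codes), chosen only to keep the example small.

```python
def lfsr_sequence(taps, degree):
    """One period of the m-sequence a[n+degree] = XOR of a[n+t] for t in taps."""
    state = [1] + [0] * (degree - 1)     # any non-zero seed works
    out = []
    for _ in range(2 ** degree - 1):
        out.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + [feedback]
    return out


def gold_code(u, v, shift):
    """XOR one m-sequence with a cyclic shift of the other."""
    n = len(u)
    return [u[i] ^ v[(i + shift) % n] for i in range(n)]


def cross_correlation(a, b, lag):
    """Periodic correlation of two 0/1 sequences mapped to +1/-1."""
    n = len(a)
    return sum((1 - 2 * a[i]) * (1 - 2 * b[(i + lag) % n]) for i in range(n))


u = lfsr_sequence((0, 2), 5)             # x^5 + x^2 + 1
v = lfsr_sequence((0, 2, 3, 4), 5)       # x^5 + x^4 + x^3 + x^2 + 1
code = gold_code(u, v, shift=7)          # one member of the Gold family

values = {cross_correlation(u, v, lag) for lag in range(31)}
print(sorted(values))                    # bounded, few-valued cross-correlation

processing_gain = 4096 // 128            # chip rate / bit rate per branch
print(processing_gain)
```

The bounded cross-correlation between family members is what allows several asynchronous terminals to share the Up Link with controlled mutual interference.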

The mobile terminal provides two independent data channels, I and Q, at 128 kb/s each, and each is spread by its own pseudorandom (Gold) sequence. The outgoing signal passes through the shaping filter which, together with the matched filter in reception, produces a Nyquist pulse (raised cosine with a 0.5 roll-off factor). After filtering, QPSK modulation produces a signal sampled at 32768 kHz and located at an intermediate frequency of 8192 kHz. This frequency can be adjusted precisely through the numerically controlled oscillator that generates the modulating tones, which provides a simple way to compensate the frequency drifts and offsets introduced by the RF stages.
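The frequency adjustment offered by a numerically controlled oscillator can be sketched with a phase accumulator. The 32-bit accumulator width and the 500 Hz correction are assumptions made for illustration; the paper does not give the NCO word length.

```python
SAMPLE_RATE_HZ = 32_768_000
ACC_BITS = 32                            # assumed accumulator width


def tuning_word(freq_hz):
    """Phase increment per sample that yields freq_hz at the NCO output."""
    return round(freq_hz * (1 << ACC_BITS) / SAMPLE_RATE_HZ)


def nco_phases(word, count):
    """First `count` accumulator values; the top bits would address a sine table."""
    mask = (1 << ACC_BITS) - 1
    acc, out = 0, []
    for _ in range(count):
        out.append(acc)
        acc = (acc + word) & mask        # wraps around like a hardware adder
    return out


nominal = tuning_word(8_192_000)               # IF of fs/4
corrected = tuning_word(8_192_000 + 500)       # hypothetical 500 Hz offset
step_hz = SAMPLE_RATE_HZ / (1 << ACC_BITS)     # frequency resolution

print(nominal == 1 << 30)                # a quarter turn per sample
print(corrected - nominal, step_hz)
```

With these assumptions the resolution is below 0.01 Hz, so an offset of a few hundred hertz from the RF stages is corrected simply by writing a new tuning word.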

Figure 2 also shows the base station receiver for a single user. As many as 16 such receivers are required for the whole base station. The scheme includes the synchronisation and tracking process between the received signal and the locally generated pseudonoise sequence. Moreover, since the indoor channel does not introduce multipath phenomena, the CDMA demodulator can be simplified and a RAKE structure is not necessary. The reception chain consists of QPSK demodulation followed by a filtering process (matched filter) and the final stages of synchronisation, tracking and CDMA demodulation.
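Chip synchronisation between the received signal and the local sequence can be illustrated with a full-search correlator. The sketch makes several simplifying assumptions that are not in the paper: one sample per chip, a 31-chip m-sequence standing in for the Gold code, a noiseless channel with a single chip error, and an exhaustive search over all offsets.

```python
def m_sequence():
    """31-chip +1/-1 m-sequence from the recurrence a[n+5] = a[n+2] XOR a[n]."""
    bits = [1, 0, 0, 0, 0]
    for n in range(26):
        bits.append(bits[n + 2] ^ bits[n])
    return [1 - 2 * b for b in bits]


def acquire(received, code):
    """Return (offset, score) of the strongest periodic correlation."""
    n = len(code)
    scores = [abs(sum(received[i] * code[(i - off) % n] for i in range(n)))
              for off in range(n)]
    best = max(range(n), key=scores.__getitem__)
    return best, scores[best]


code = m_sequence()
delay = 11                                   # unknown to the receiver
received = [code[(i - delay) % 31] for i in range(31)]
received[4] = -received[4]                   # inject one chip error

print(acquire(received, code))               # (11, 29)
```

At the correct offset every chip but the corrupted one agrees, giving 29, while all other offsets score at most 3; a tracking loop would then keep the local sequence aligned around that offset.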

The CDMA access technique requires tight control of the power received from each user terminal, because they must all show a constant average power. The base station receiver measures the power arriving from each terminal and sends it a message to correct its power. This makes an AGC unnecessary in the base station receiver to adapt the incoming signal to the ADC dynamic range. The receiver scheme developed also assumes that all frequency offsets and frame synchronisation problems have been corrected in the Down Link. As for antenna diversity, in order to limit the complexity of the user terminal, a conventional selection procedure is used to take the signal from one antenna or the other.



Figure 2. Up Link Transmitter and Receiver

## **III. Physical Implementation**

The system outlined above, although simplified with respect to commercial products, is an example that incorporates all the functions with the highest signal processing demand. These are intermediate frequency processing and the synchronisation algorithms [Chapman00]. The blocks that perform these tasks must use resources optimally, both to reduce the amount of resources required and to reduce the power consumption of the devices. This has been the main goal of the design.

Resource allocation is reported in terms of Logic Elements (LEs). Here one LE is defined as the combination of one 4-input look-up table (LUT), equivalent to a RAM one bit wide and 16 addresses deep, and one flip-flop. The figures given are approximate because they are extracted from CAD tool reports in which several blocks are mixed and/or other support functions, such as the microprocessor access interface, are included. Moreover, the hardware platform used to validate the system is not tuned for this specific design; it is a pre-constructed platform with enough flexibility to accommodate different applications. This platform, called SHaRe (**Re**configurable **Ha**rdware **S**ystem) [Revés99], provides up to 8 user-programmable FPGAs interconnected in a flexible way, with access to additional resources such as RAM, ROM, FIFOs and programmable clocks. Each FPGA can be (re)programmed individually at any time simply by performing write cycles to a memory address over the host bus (VME bus). The main processor on the bus, which controls the whole system, is a SPARC CPU running Solaris. This host processor provides a connection to an IP network, which extends the functionality of the system.
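The LE definition given above can be made concrete with a small behavioural model: a 16 x 1 bit memory addressed by four inputs, followed by a flip-flop. The class and function names are hypothetical, and the 4-input XOR is only an example function.

```python
def make_lut(func):
    """Pack a 4-input boolean function into a 16-bit truth table."""
    table = 0
    for addr in range(16):
        inputs = [(addr >> k) & 1 for k in range(4)]
        if func(*inputs):
            table |= 1 << addr
    return table


def lut_read(table, a, b, c, d):
    """Address the 16 x 1 bit memory with the four inputs."""
    return (table >> (a | b << 1 | c << 2 | d << 3)) & 1


class LogicElement:
    """One LE as defined in the text: a 4-input LUT feeding a flip-flop."""

    def __init__(self, table):
        self.table = table
        self.q = 0                       # registered output

    def clock(self, a, b, c, d):
        self.q = lut_read(self.table, a, b, c, d)
        return self.q


parity = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d)
print(hex(parity))                       # 0x6996

le = LogicElement(parity)
print(le.clock(1, 0, 1, 1))              # 1, an odd number of ones
```

Any function of four bits costs exactly one LE under this definition, which is why architectures that decompose arithmetic into 4-input slices tend to map efficiently.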

Table 1 summarises the LE utilisation of the whole system. Each value concerns only the specific part that gives its name to the block. The figures show that the complete terminal would fit into a state-of-the-art FPGA with about 7000 LEs. This is quite a large FPGA, but that number of LEs, as defined here, is available in a Xilinx Virtex XCV400 [Xilinx00], which has 9600 LEs and 81920 BlockRAM bits that can be used in this design. The base station part for a single user would need fewer than 9000 LEs, which are also available in that FPGA, although the increase for more users is not linear, as the figures of Table 1 show.

The actual implementation presented is based not on the Xilinx Virtex family but on the Xilinx 4K family [Xilinx99]. The basic elements of both families are quite similar, but Virtex includes more RAM bits, more and faster interconnect logic, more input/output standards, etc., as corresponds to a more modern family. As stated before, the SHaRe platform has 8 user-programmable FPGAs with up to 50176 LEs in total (with 8 XC4085 FPGAs on the board), 2 Mbytes of SRAM distributed among the devices, up to 512 Kbytes of FIFOs and up to 512 Kbytes of ROM. The mobile terminal should fit on one board, and the base station with one user receiver and 16 transmitters should fit on another. Every additional base station receiver requires a new board. In this way the maximum amount of resources required on a single board is about 9000 LEs. This amount can be obtained by populating a SHaRe board with the smallest devices it accepts, the XC4013, for a total of 9216 LEs. Although the numbers fit, the design cannot be packed that tightly: some room is needed for accessories and controls, and some resources should be left free to ease the mapping of the blocks onto the devices. All the boards involved in the design, together with the main system controller (a diskless SPARC board for VME bus), are connected over a VME bus to allow correct configuration and management of the application.
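The board capacities quoted above can be reproduced from the device geometry. The sketch assumes the CLB array sizes of the two devices (24 x 24 for the XC4013 and 56 x 56 for the XC4085) and counts two LEs per CLB, since each XC4000 CLB holds two 4-input LUTs and two flip-flops; these figures are not stated in the paper itself.

```python
CLBS = {"XC4013": 24 * 24, "XC4085": 56 * 56}   # assumed CLB array sizes
LES_PER_CLB = 2                                  # two 4-input LUTs + two flip-flops
FPGAS_PER_BOARD = 8


def board_capacity(device):
    """LEs available on a SHaRe board fully populated with `device`."""
    return CLBS[device] * LES_PER_CLB * FPGAS_PER_BOARD


for device in CLBS:
    print(device, board_capacity(device))

# Utilisation if about 9000 LEs are mapped onto the smallest configuration.
print(round(9000 / board_capacity("XC4013"), 3))
```

A utilisation close to 98% on the XC4013 configuration shows why the text advises leaving room: placement and routing tools rarely succeed with so little slack.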

| Down Link block                                            | LEs  | Up Link block                                               | LEs   |
|------------------------------------------------------------|------|-------------------------------------------------------------|-------|
| Base station 16 users spreading and pre-RAKE + PICH + BPCH | 1140 | Terminal spreading and TX matched filters                   | 400   |
| Base station I/Q shaping filter and IF translation         | 1090 | Terminal NCO and IF translation                             | 620   |
| Terminal I/Q down conversion                               | 1060 | Base station 1 user I/Q down conversion and matched filters | 4390  |
| Terminal NCO + FED                                         | 830  | Base station 1 user chip synchronism and tracking           | 1180  |
| Terminal RX matched filters                                | 1450 | Base station 1 user CDMA demodulation                       | 500   |
| Terminal chip, bit and frame synchronism                   | 1520 |                                                             |       |
| Terminal channel estimation and CDMA demodulation          | 500  |                                                             |       |
| Total Down Link                                            | 7590 | Total Up Link                                               | 7090  |
| Total Base Station 16 users                                |      |                                                             | 99350 |
| Total Mobile Terminal                                      |      |                                                             | 6380  |

Table 1. Number of LEs required to implement system blocks
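The totals of Table 1 can be cross-checked from the individual block figures. The grouping below is an interpretation of the table: the base station transmitter blocks are counted once because they already serve 16 users, while the three base station receiver blocks are replicated per user.

```python
BS_TX = {"spreading + pre-RAKE + PICH + BPCH": 1140,
         "I/Q shaping filter and IF translation": 1090}
BS_RX_PER_USER = {"I/Q down conversion and matched filters": 4390,
                  "chip synchronism and tracking": 1180,
                  "CDMA demodulation": 500}
TERMINAL_RX = {"I/Q down conversion": 1060, "NCO + FED": 830,
               "RX matched filters": 1450,
               "chip, bit and frame synchronism": 1520,
               "channel estimation and CDMA demodulation": 500}
TERMINAL_TX = {"spreading and TX matched filters": 400,
               "NCO and IF translation": 620}


def total(*blocks):
    """Sum the LE counts of one or more block dictionaries."""
    return sum(sum(b.values()) for b in blocks)


def base_station(users):
    """Shared transmitter plus one receiver chain per user."""
    return total(BS_TX) + users * total(BS_RX_PER_USER)


print("Down Link:", total(BS_TX, TERMINAL_RX))             # 7590
print("Up Link:", total(TERMINAL_TX, BS_RX_PER_USER))      # 7090
print("Base station, 1 user:", base_station(1))            # 8300
print("Base station, 16 users:", base_station(16))         # 99350
print("Mobile terminal:", total(TERMINAL_RX, TERMINAL_TX)) # 6380
```

Each additional user adds 6070 LEs of receiver logic on top of a fixed 2230 LEs of shared transmitter logic, so the base station total is not simply proportional to the number of users.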

In general the blocks have been implemented by trading area against speed. Reducing area at the cost of a higher clock frequency does not reduce power consumption; power consumption is minimised only when a function is mapped optimally onto the FPGA resources. The power demanded by FPGA devices depends strongly on their physical properties, but the more advanced the technology, the less power is required. Of course, the flexibility that FPGAs provide over ASICs is paid for with higher power consumption. On the other hand, the LE counts stated here are relatively low compared with what commercial FPGAs offer, so a different structure with higher sampling rates can be considered simply by increasing area. Note that in the digital implementation of a radio receiver only the very first filtering stages after sampling run at high rates. As soon as the channel of interest has been selected, the sampling rate is reduced and/or converted to suit the required demodulation.

## **IV. Conclusions And Future Work**

There are several ways to implement a re-configurable radio terminal, but the one based on FPGAs offers both wide application flexibility, ranging from computationally intensive tasks to algorithmically complex but relatively low speed tasks, and the flexibility to modify its behaviour. Their increasing capacity and speed, together with the progressive reduction of their power consumption, make FPGAs a good candidate for relevant positions in future radio terminals designed under the Software Radio methodology. A DS-CDMA system that takes advantage of the peculiarities of FPGAs has been designed to check their suitability for this kind of application and to explore the pros and cons of designing specific parts. As in many digital systems, a trade-off appears between speed and logic resources: the higher the speed required, the more resources are needed to implement the functions. Designing with the pre-defined structures of an FPGA tends to change the way the different blocks are implemented so that resources are used efficiently, which in general affects the complexity of the solution. It is therefore important to identify clearly the blocks that require more careful tuning and to obtain a set of possible solutions that may fit many different applications. It is also important to investigate which architectures of FPGA arrays plus complements (RAM, I/O, etc.) are the most interesting for extending their flexibility and reusability.

An interesting line of investigation emerges when typical signal processing structures are to be implemented on FPGAs. Finding the right mapping mechanism will improve the final design in terms of die area and power consumption. This is particularly relevant to the restrictions of fully digital radio terminals, because a better adaptation to the FPGA structure will allow more computationally intensive tasks to be implemented. A UMTS physical layer implementation based on FPGAs is a good testbed for analysing which solutions are the most appropriate for a re-configurable, fully digital terminal. Of course AD and DA technology has a lot to say about that, but it is clear that the optimum implementation of the digital processing blocks (whatever algorithm is used) is a cornerstone.

## References

- [Taylor97] C. Taylor, "Using Software Radio in 3<sup>rd</sup> Generation Communications System", ACTS Mobile Communications Summit, Aalborg, Denmark, Oct. 1997.
- [Cummings99] M. Cummings, S. Haruyama, "FPGA in the Software Radio", IEEE Communications Magazine, February 1999.
- [Adachi98] F. Adachi, M. Sawahashi, H. Suda, "Wideband DS-CDMA for Next-Generation Mobile Communications Systems", IEEE Communications Magazine, September 1998.
- [Chapman00] K. Chapman, P. Hardy, A. Miller, M. George, "CDMA Matched Filter Implementation in Virtex Devices", Xilinx XAPP212 (v1.0), March 2000.
- [Esmailzadeh93] R. Esmailzadeh, M. Nakagawa, "Pre-RAKE Diversity Combination for Direct Sequence Spread Spectrum Mobile Communications Systems", IEICE Transactions on Communications, vol. E76-B, no. 8, pp. 1008-1015, August 1993.
- [Revés99] X. Revés, A. Gelonch, F. Casadevall, "Reconfigurable Hardware Platform for Software Radio Applications (SHaRe) in Mobile Communications Environments", Proc. ACTS Mobile Communication Summit, Sorrento, Italy, June 1999.
- [Xilinx00] Xilinx, "Virtex<sup>™</sup> 2.5 V Field Programmable Gate Arrays", 2000.
- [Xilinx99] Xilinx, "XC4000E and XC4000X Series Field Programmable Gate Arrays", 1999.

Acknowledgement: This work has been supported by CICYT (Spanish National Science Council) under grant TIC98-0684.