Steve Parker and Edward Fry, RF Engines Ltd
Newport, Isle of Wight, United Kingdom
Abstract
This paper discusses the relative merits of the various digital signal processing techniques used to channelise signals. ChannelCore Flex (CCF) exploits the strengths of each of these techniques to provide a flexible channeliser architecture that is capable of supporting thousands of independently defined channels in a single FPGA. The CCF core can be tailored at build-time to support the user’s generic channel plan and required level of flexibility. The precise channel plan can then be loaded and updated at run-time. The FPGA resources required to implement CCF in a Xilinx Spartan-6 LX100 are presented for an example channel plan with 1024 channels of various bandwidths.
1. INTRODUCTION
Modern communication signals are restricted to tightly regulated channels to maximise the social and economic benefits of the radio spectrum. Figure 1 shows how the spectrum in the USA is partitioned (defined 2003, [1]), supporting a plethora of wireless communication services. In the future, the radio spectrum is likely to become increasingly congested with active transmissions.
Figure 1: the complexity of radio spectrum partitioning (USA, defined 2003 [1])
The bandwidth of many communication signals is increasing with each emerging new standard, to achieve ever higher data rates. These high bandwidth signals may co-exist with lower rate transmissions that occupy adjacent regions of the spectrum. Furthermore, signal bandwidths may be dynamically reconfigured, governed by the particular services requested by a user. In the future, the dynamics of the spectrum may become increasingly complicated by devices that monitor transmissions and transmit opportunistically as regions of the spectrum become inactive. For example, cognitive radio has been proposed as a potential technology for LTE Advanced [2].
The complexity and dynamics of how signals are organised within the radio spectrum dictate the receiver technology needed to recover them. A contemporary digital radio receiver must channelise the received band to isolate the frequencies of interest, thereby minimising noise and interference, prior to demodulation and decoding.
Surveillance systems, and emerging communication transceivers that support services carried by multiple wireless standards, need to independently channelise a large number of signals that potentially have very different properties. Dedicating a receiver chain to each channel is prohibitively expensive in hardware resources. Consequently, there is a need for a flexible channeliser architecture that can channelise an input bandwidth, spanning potentially in excess of a gigahertz, according to an arbitrary channel plan consisting of as many as several thousand channels.
An ideal channeliser should be capable of being configured according to a channel plan that defines the individual channels by their centre frequencies, signal bandwidths, output sample rates, required filter characteristics and gains. In addition, the channeliser should be designed so that the stop-band performance and bit-width are sufficient to guarantee a specified spurious-free dynamic range (SFDR). For some applications this may be as large as 80 dB.
2. CHANNELISER TECHNOLOGY
2.1 Processing a Single Channel
Many legacy wireless terminals operate using a single channel. These systems adopt a simple receiver architecture that mixes the RF or IF signal down to complex baseband, consisting of in-phase (I) and quadrature (Q) components. Typically the receiver architecture consists of an analogue RF down-converter, analogue-to-digital converter (ADC) and a digital down-converter (DDC).
An input signal may be real valued, such as from a single ADC, or complex valued if it represents both I and Q components from a previous mixing stage. A real valued input carrier signal s_{r} may be defined mathematically by Equation 1, where a is the signal amplitude, f_{c} is the carrier frequency, and an arbitrary phase offset has been ignored for these illustrative purposes.
Equation 1: real input signal
A mixing signal m, of frequency f_{m} and amplitude b, is given by Equation 2, where
Equation 2: complex mix signal
It may be shown that the mixed signal is given by Equation 3.
Equation 3: result of mix on real input
In contrast, if the input signal is complex (Equation 4) then the mixed signal is given by Equation 5.
Equation 4: complex input signal
Equation 5: result of mix on complex input
Consequently, when a real input is applied then an additional term is generated that is located at the sum of the input and mix frequencies. This unwanted term must be removed by digital filtering.
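The bodies of Equations 1 to 5 are not reproduced in this text, so the following is a reconstruction from the surrounding definitions rather than the authors' original typesetting. It assumes continuous time t, a signed mix frequency f_{m} (so that a down-conversion uses a negative f_{m}, as in the -f_{s}/4 mix discussed later) and j as the imaginary unit; with the opposite sign convention in the exponent the two terms of Equation 3 would be labelled differently.

```latex
\begin{align}
s_r(t) &= a\cos(2\pi f_c t) \tag{1}\\
m(t) &= b\,e^{\,j 2\pi f_m t} \tag{2}\\
s_r(t)\,m(t) &= \frac{ab}{2}\left[e^{\,j 2\pi (f_m + f_c) t} + e^{\,j 2\pi (f_m - f_c) t}\right] \tag{3}\\
s_c(t) &= a\,e^{\,j 2\pi f_c t} \tag{4}\\
s_c(t)\,m(t) &= ab\,e^{\,j 2\pi (f_m + f_c) t} \tag{5}
\end{align}
```

Equation 3 follows from writing the cosine as half the sum of two complex exponentials. Under this convention, with f_{c} close to f_{s}/4 and f_{m} = -f_{s}/4, the first term of Equation 3 lies near DC, while the second lies near -f_{s}/2, a distance from DC equal to the sum of the magnitudes of the input and mix frequencies. The complex input of Equation 4 produces only the single wanted term of Equation 5.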
A further complication that must be considered is that the signal bandwidth must not exceed f_{s}/2 to prevent aliasing, where f_{s} is the sampling rate. It is therefore common to arrange that f_{c}, which is nominally the centre of the channelised band, is located close to f_{s}/4 or 3f_{s}/4. A selectable mix frequency of -f_{s}/4 (first Nyquist zone) or -3f_{s}/4 (second Nyquist zone) may then be used to centre the channelised band on DC. The choice of these particular mix frequencies facilitates conversion to complex baseband using a very simple poly-phase filter structure.
Figure 2 shows how the DDC can be implemented for a real input, by multiplying by a complex exponential of frequency -f_{s}/4 (or -3f_{s}/4).
Figure 2: DDC architecture
A low-pass filter, which is integrated within the optimised poly-phase structure, is used at the output of the DDC to remove the unwanted term of Equation 3. The filter bandwidth is typically 80% of the Nyquist zone, i.e. 0.8 × f_{s}/2 = 0.4 f_{s}, which is matched to the pass-band of the front-end anti-aliasing filter. The resulting selectable alias-free bandwidth is therefore either 0.05 f_{s} to 0.45 f_{s}, or 0.55 f_{s} to 0.95 f_{s}. The filter is designed so that its stop-band response guarantees that the pass-band is alias-free to the specified SFDR.
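The DDC steps described above can be illustrated numerically. The following is a minimal sketch only, not the optimised poly-phase structure of Figure 2: it assumes a normalised sample rate, a hypothetical carrier at 0.27 f_{s}, and a 63-tap Hamming-windowed sinc low-pass filter chosen purely for illustration. The helper name `tone_level_db` is also an assumption.

```python
import numpy as np

fs = 1.0                                   # normalised sample rate (assumption)
n = np.arange(4096)
fc = 0.27 * fs                             # hypothetical carrier in the first Nyquist zone
x = np.cos(2 * np.pi * fc / fs * n)        # real input signal

# A -fs/4 mix is exp(-j*pi*n/2), which only takes the values 1, -j, -1, +j,
# so no multipliers are needed for this stage.
lo = np.array([1, -1j, -1, 1j])[n % 4]
mixed = x * lo                             # terms at fc - fs/4 and at -(fc + fs/4)

# Illustrative low-pass filter with its cutoff at fs/4 (windowed-sinc design).
taps = 63
k = np.arange(taps) - (taps - 1) / 2
h = 0.5 * np.sinc(0.5 * k) * np.hamming(taps)
h /= h.sum()

filtered = np.convolve(mixed, h, mode="valid")
baseband = filtered[::2]                   # decimate by 2: complex output at fs/2


def tone_level_db(sig, f_norm):
    """Level (dB) of a complex tone at f_norm cycles/sample within sig."""
    idx = np.arange(len(sig))
    return 20 * np.log10(np.abs(np.mean(sig * np.exp(-2j * np.pi * f_norm * idx))))


wanted_db = tone_level_db(filtered, fc / fs - 0.25)       # term near DC
unwanted_db = tone_level_db(filtered, -(fc / fs + 0.25))  # sum-frequency term
print(f"wanted {wanted_db:.1f} dB, unwanted {unwanted_db:.1f} dB")
```

Because the local oscillator takes only four trivial values, the mix reduces to sign changes and swaps of the real and imaginary parts, which is one reason the f_{s}/4 choice suits a simple hardware structure.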
With a single channel system, the centre of the down-converted channel is unlikely to be aligned to the signal of interest. The complex valued signal may therefore be mixed, as shown by Equation 5, to perform a fine frequency alignment, without the danger of introducing additional terms that need filtering.
Finally, the data stream is filtered to the required bandwidth and resampled to align the output sample rate to that required by subsequent processing stages, such as a demodulator. For example, in a communications system the sample rate is often an integer multiple of the symbol rate.
2.2 Generating a Large Number of Channels
2.2.1 Channelising by a Parallel Bank of DDCs
A small number of channels (<<100) may be generated within a single FPGA using a parallel bank of DDCs. Figure 3 illustrates the process performed by one of these DDCs on a set of channelised input signals. A DDC architecture provides full control of the low-pass filter response and therefore each channel can have a bespoke bandwidth and pass-band ripple as shown in Figure 4.
Figure 3: channelisation by a bank of DDCs
Figure 4: channelisation using DDCs gives full control over the channel bandwidth and ripple
2.2.2 DFT and FFT Channelisers
It is often impractical to generate the required number of channels using a bank of DDCs. Orthogonal frequency division multiplexing (OFDM), for example, typically uses at least 64 sub-carriers: wireless local area networks conforming to IEEE 802.11a/g use 64.
The Discrete Fourier Transform (DFT), defined by Equation 6, provides a means of generating a potentially large number of equi-spaced channels X(k) from a discretely sampled time-domain input x(n) . Inspection of Equation 6 reveals that the DFT is essentially a correlation of the signal against a finite number of complex exponentials of equally spaced frequencies. If a narrowband signal is of infinite extent, and has a frequency matching one of these exponentials, then a strong output is produced in only one of the output channels. In practice, a signal is of finite extent, and/or has a frequency different from the complex exponentials. Consequently, spectral leakage occurs and outputs are produced in multiple DFT channels.
Equation 6: discrete Fourier transform
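The body of Equation 6 is not reproduced in this text. The sketch below assumes the usual forward-DFT convention, X(k) = sum over n of x(n) exp(-j 2π k n / N) for n = 0 to N-1, with no normalisation. It implements the DFT directly as a bank of correlations and then shows the leakage effect described above for a tone that falls midway between two channel centres. The function name `dft` and the test tones are illustrative choices.

```python
import numpy as np


def dft(x):
    """Direct DFT: correlate x against N equally spaced complex exponentials."""
    N = len(x)
    n = np.arange(N)
    k = n.reshape(-1, 1)
    basis = np.exp(-2j * np.pi * k * n / N)   # one row per output channel
    return basis @ x


N = 32
n = np.arange(N)
on_bin = np.exp(2j * np.pi * 5.0 * n / N)     # tone exactly on channel 5
off_bin = np.exp(2j * np.pi * 5.5 * n / N)    # tone midway between channels 5 and 6

X_on = dft(on_bin)
X_off = dft(off_bin)

print("channels excited (on-bin): ", int(np.sum(np.abs(X_on) > 1e-9)))
print("channels above 1.5 (off-bin):", int(np.sum(np.abs(X_off) > 1.5)))
```

The on-bin tone appears in a single channel, whereas the off-bin tone spreads across many channels, which is the spectral leakage referred to in the text.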
The DFT is prohibitively complex to implement for a large number of channels. However, in 1965 Cooley and Tukey [3] identified that, for powers-of-two numbers of channels, the symmetry of the DFT can be exploited to minimise the number of processing operations needed. Their radix-2 algorithm, known as the Fast Fourier Transform (FFT), can be implemented efficiently in an FPGA.
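As an illustration of how the symmetry is exploited, the following is a textbook recursive radix-2 decimation-in-time FFT. It is a sketch of the general Cooley-Tukey idea only, and says nothing about how a pipelined FPGA implementation would be organised; the function name `fft_radix2` is an assumption.

```python
import numpy as np


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT for power-of-two lengths."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 0 or (N & (N - 1)) != 0:
        raise ValueError("length must be a power of two")
    if N == 1:
        return x
    even = fft_radix2(x[0::2])                 # N/2-point DFT of even samples
    odd = fft_radix2(x[1::2])                  # N/2-point DFT of odd samples
    twiddled = np.exp(-2j * np.pi * np.arange(N // 2) / N) * odd
    # Each half-size result is reused for two outputs (k and k + N/2).
    return np.concatenate([even + twiddled, even - twiddled])


rng = np.random.default_rng(0)
samples = rng.standard_normal(64) + 1j * rng.standard_normal(64)
spectrum = fft_radix2(samples)
```

Each level of recursion halves the transform size, so an N-point transform needs of the order of N log2 N operations rather than the N squared of the direct DFT.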
In general, a high radix DFT may be factored into several smaller radices, which can be implemented using a cascade of smaller radix DFTs. When a highly parallelised pipelined architecture is used, so that several spectral outputs are computed on every FPGA clock cycle, multi giga-sample/s channelised output sample rates can be achieved.
Despite the elegance of the FFT as an efficient means of computing a DFT, channelisers based solely on this concept have several limitations:
1. The pass-band channel response is not flat and signals ‘leak’ into adjacent channels.
2. Channels are equi-spaced.
3. The channels have an equal bandwidth, which decreases with the number of channels.
The first of these failings may be addressed by windowing and use of the Weighted OverLap and Add (WOLA) pre-processor (see sections 2.2.3 and 2.2.4). The other two limitations may be addressed by the Pipelined Frequency Transform (PFT), which is presented in section 2.2.5.
2.2.3 Windowed FFT
The pass-band droop of a channel is a consequence of the finite extent of the time-domain ‘analysis window’. The analysis window is naturally rectangular in shape (a Dirichlet window) and, as shown by the convolution theorem [4], the multiplication of this window with the signal leads to convolution of the required spectrum with a ‘sinc’ function, sin(x)/x, which is the Fourier transform of the rectangular window. Hence, when a sinusoidal carrier is processed by an FFT channeliser, the response obtained is dependent on its frequency relative to the centres of the FFT channels.
The red chained line in Figure 5 shows how the frequency response of a channel varies with the frequency of an injected carrier. The passband ‘droop’ across FFT channel bin 0, as measured at bin 0.5, is an appreciable 3.92 dB. Furthermore, there is spectral leakage of -13.3 dB into adjacent channels. Clearly this response is far from the ideal ‘brick wall’ response desired for analysing narrowband signals and is much worse than achieved with a bank of DDCs (Figure 4).
Figure 5: impact of windowing on the frequency response of an FFT channeliser
The channel pass-band and roll-off characteristics may be improved by weighting the data in the analysis window by an explicit window function. The window function may be designed so that its frequency response, which governs its pass-band and stop-band characteristics, is functionally closer to a brick wall response. The blue line in Figure 5 shows that a Hamming window can improve the response, with the pass-band droop reduced to 1.68 dB and the stop-band now at less than -40 dB. However, leakage into adjacent channels has increased. In conclusion, the performance of a windowed FFT is not as good as a bank of DDCs and may not be good enough for many applications.
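The droop and leakage figures discussed in this section can be estimated numerically from the window alone. The sketch below is an illustration under stated assumptions: a 1024-point window, NumPy's symmetric `np.hamming` definition, droop measured half a bin from the channel centre, and leakage taken as the peak side-lobe level. The exact values depend on the window length and definition, so they need not match the figures quoted from Figure 5 to the last digit. The function names are illustrative.

```python
import numpy as np


def response_db(window, offset_bins):
    """Window response (dB, relative to DC) at an offset measured in FFT bins."""
    n = np.arange(len(window))
    probe = np.exp(-2j * np.pi * offset_bins * n / len(window))
    return 20 * np.log10(np.abs(np.sum(window * probe)) / np.sum(window))


def peak_sidelobe_db(window, pad=64):
    """Highest side lobe (dB, relative to DC), found on a zero-padded FFT grid."""
    size = pad * len(window)
    spec = np.abs(np.fft.fft(window, size))[: size // 2]
    spec_db = 20 * np.log10(spec / spec[0] + 1e-30)
    i = 1
    while spec_db[i + 1] < spec_db[i]:     # walk down the main lobe to its first minimum
        i += 1
    return spec_db[i:].max()


N = 1024
results = {}
for name, w in (("rectangular", np.ones(N)), ("hamming", np.hamming(N))):
    results[name] = (-response_db(w, 0.5), peak_sidelobe_db(w))
    print(f"{name}: droop {results[name][0]:.2f} dB, "
          f"peak side lobe {results[name][1]:.1f} dB")
```

The comparison shows the trade described in the text: the Hamming window lowers both the half-bin droop and the side lobes, at the cost of a wider main lobe and therefore more overlap with the adjacent channels.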
Despite these limitations, the FFT has been widely adopted in hardware for implementing the now ubiquitous OFDM communication systems. This application is attractive because the orthogonal sub-carriers are aligned to the centres of channels and therefore the impact of pass-band droop and stop-band leakage is negligible.
2.2.4 WOLA FFT Channeliser
Conventional windowing prior to the FFT offers only a limited improvement in the pass-band droop. An analysis window of duration T has a response whose extent in the frequency domain is of the order of 1/T, which limits the steepness of the roll-off, irrespective of the function used.
The roll-off may be steepened, and the pass-band flattened, by increasing the duration over which the window acts through the use of a WOLA ‘pre-processor’. The principles of the WOLA are illustrated in Figure 6 and a theoretical explanation is given in [5].
Figure 6: operation of the WOLA FFT
In the WOLA architecture, a window function is used that spans several analysis periods (frames) of a traditional FFT. The time-domain samples are weighted by this window and then the corresponding samples of the multiple analysis frames are summed as shown in Figure 6. A composite input, the length of a single analysis frame, is produced that is input to a ‘conventional’ FFT.
From the perspective of the FFT, which acts over the duration of a single analysis frame, the frames that are overlapped and summed are time-domain aliases. The output of the WOLA is a set of equi-spaced channels of identical frequency resolution to a standard FFT, but with a channel response associated with a much longer analysis window. Figure 7 illustrates the channel response possible with a WOLA, where a 32-point FFT has been used and several adjacent channels are shown to illustrate overlap. Comparison with Figure 4 shows that the WOLA delivers a channel response on a par with a DDC, but with the potential of serving thousands of channels.
Figure 7: WOLA/DFT channel responses
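The weight, overlap and add steps described above can be sketched in a few lines. This is an illustrative, critically sampled version only: the prototype window (a Hamming-windowed sinc spanning eight frames), the 32-channel size and the function name `wola_channelise` are assumptions, not parameters taken from the CCF design.

```python
import numpy as np


def wola_channelise(x, n_chan, n_frames, proto):
    """Weight a block of n_frames * n_chan samples, fold it to n_chan, then FFT."""
    span = n_chan * n_frames
    assert len(proto) == span
    n_out = (len(x) - span) // n_chan + 1
    out = np.empty((n_out, n_chan), dtype=complex)
    for m in range(n_out):
        block = x[m * n_chan : m * n_chan + span] * proto       # weight
        folded = block.reshape(n_frames, n_chan).sum(axis=0)     # overlap and add
        out[m] = np.fft.fft(folded)                              # conventional FFT
    return out


n_chan, n_frames = 32, 8
span = n_chan * n_frames
t = np.arange(span) - (span - 1) / 2
proto = np.sinc(t / n_chan) * np.hamming(span)   # cutoff at half the channel spacing
proto /= proto.sum()

n = np.arange(4096)
x = np.exp(2j * np.pi * (5.25 / n_chan) * n)     # tone a quarter-bin off channel 5

wola = wola_channelise(x, n_chan, n_frames, proto)
plain = np.fft.fft(x.reshape(-1, n_chan), axis=1) / n_chan

wola_db = 20 * np.log10(np.mean(np.abs(wola), axis=0))
plain_db = 20 * np.log10(np.mean(np.abs(plain), axis=0))
print("leakage into channel 7, plain FFT:", round(plain_db[7] - plain_db[5], 1), "dB")
print("leakage into channel 7, WOLA FFT: ", round(wola_db[7] - wola_db[5], 1), "dB")
```

The FFT size, and hence the channel spacing, is the same in both cases; only the length of the window acting on the data has changed, which is what steepens the roll-off.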
2.2.5 Pipelined Frequency Transform
The WOLA FFT is a powerful architecture for generating potentially thousands of equi-spaced channels, with identical bandwidths and a common frequency response that can be tailored to an application. However, despite these merits, the WOLA FFT is a uniformly spaced filter bank that does not generate channels of arbitrary bandwidth. For example, the WOLA FFT may produce 512 x 1 MHz channels, but a requirement might be 512 x 250 kHz, 64 x 3 MHz and 4 x 30 MHz channels. The aggregate bandwidth of the required channels is less than 512 MHz, but the channel plan requires a much more flexible architecture than offered by the WOLA FFT.
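The aggregate bandwidth quoted for the example requirement can be checked with a short calculation; the variable names are illustrative.

```python
# (number of channels, bandwidth of each channel in MHz), from the example above
required_plan = [(512, 0.25), (64, 3.0), (4, 30.0)]

aggregate_mhz = sum(count * bw for count, bw in required_plan)
total_channels = sum(count for count, _ in required_plan)
uniform_mhz = 512 * 1.0                      # the uniform 512 x 1 MHz WOLA FFT plan

print(total_channels, "channels,", aggregate_mhz, "MHz aggregate")
```

The mixed plan occupies less total bandwidth than the uniform plan, yet no single uniform channel spacing can serve all three channel widths at once.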
A PFT channelises signals in a completely different manner to an FFT and therefore has its own advantages and disadvantages. The PFT channelises signals hierarchically, with each successive stage performing both an up-mix and a down-mix of f_{s}/8, relative to a two-times oversampled input of sample rate f_{s}. Figure 8 illustrates the band-splitting of a PFT, where Fs = f_{s}/2 refers to a critically sampled input.
Figure 8: hierarchical band splitting of a PFT
The result of cascading the band-splitters is that a tree of channels is produced, with the important property that access is provided to every level of the channelised hierarchy. The symmetry of the PFT architecture, and the underlying mathematics, mean that it may be implemented efficiently in hardware. Consequently, while not as efficient as the FFT for large numbers of channels, the PFT does provide many of the advantages associated with a bank of DDCs without the associated hardware complexity.
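The hierarchical band-splitting idea can be illustrated with a generic tree-structured splitter. This is a sketch under stated assumptions and not the PFT implementation itself, whose filter design and hardware structure are not given here: each stage mixes the two-times oversampled input by -f_{s}/8 and +f_{s}/8, applies an illustrative 63-tap low-pass filter with its cutoff at f_{s}/4, and decimates by two, so that each output is again two-times oversampled. All function names are hypothetical.

```python
import numpy as np


def lowpass_taps(num_taps=63):
    """Illustrative Hamming-windowed sinc low-pass filter, cutoff at fs/4."""
    k = np.arange(num_taps) - (num_taps - 1) / 2
    h = 0.5 * np.sinc(0.5 * k) * np.hamming(num_taps)
    return h / h.sum()


def band_split(x, h):
    """Split one band into its lower and upper halves, each decimated by 2."""
    n = np.arange(len(x))
    down = x * np.exp(-2j * np.pi * n / 8)   # -fs/8 mix: upper half moves to DC
    up = x * np.exp(+2j * np.pi * n / 8)     # +fs/8 mix: lower half moves to DC
    upper = np.convolve(down, h, mode="valid")[::2]
    lower = np.convolve(up, h, mode="valid")[::2]
    return lower, upper


def band_tree(x, depth):
    """Return every level of the hierarchy; level d holds 2**d bands."""
    h = lowpass_taps()
    levels = [[x]]
    for _ in range(depth):
        next_level = []
        for band in levels[-1]:
            next_level.extend(band_split(band, h))
        levels.append(next_level)
    return levels


n = np.arange(4096)
tone = np.exp(2j * np.pi * 0.2 * n)          # tone at +0.2 fs, in the upper half-band
levels = band_tree(tone, depth=2)
lower, upper = levels[1]
print("level sizes:", [len(level) for level in levels])
```

Keeping every level of the tree is what allows wide and narrow channels to be taken from the same structure: a wide channel is read from an early level and a narrow one from a later level.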
3. CHANNELCORE FLEX
3.1 Concept and Implementation
The CCF channeliser is the hardware efficient fusion of the best aspects of all of the channeliser technologies discussed in section 2, partnered with a resource optimised resampler. CCF is not a fixed architecture, but rather a range of potential architectures that are configured at build-time and run-time to yield a channeliser that minimises resource usage for a customer’s specific requirement. A functionally equivalent diagram of CCF is shown in Figure 9.
Figure 9: functionality of ChannelCore Flex
If the customer requires a channeliser that serves a static channel plan then the design can be highly optimised and the number of supported channels can be maximised. Conversely, if the customer requires considerable run-time flexibility then an architecture is auto-designed at build-time that is more flexible, but may be slightly less efficient.
The CCF architecture is controlled by a MATLAB design script that configures the VHDL modules at build-time according to the user’s generic channel plan. At run-time, the core accepts the user’s specific channel plan, which may be updated continuously. The channel plan is expressed in terms of channel centre frequency, bandwidth, sampling rate, gain and filter requirements. The channeliser automatically configures the core at run-time to generate a time-multiplexed output of these channels. The only restriction is that the aggregate sample rate of the channels must not exceed the clock rate of the core. Extra parallelism can be used to relax this restriction, provided that the chosen FPGA has sufficient resources.
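The aggregate-rate restriction lends itself to a simple check on a candidate channel plan. The sketch below is hypothetical: the `Channel` fields mirror the plan parameters listed above, but the class, the `check_plan` function, the 250 MHz clock and the per-channel bandwidth check are illustrative assumptions and not part of the CCF interface.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    centre_hz: float
    bandwidth_hz: float
    sample_rate_hz: float
    gain_db: float = 0.0
    filter_index: int = 0          # hypothetical selector for a programmable filter


def check_plan(channels, clock_hz, parallelism=1):
    """Return the aggregate output rate and a list of any problems found."""
    problems = []
    aggregate = sum(ch.sample_rate_hz for ch in channels)
    if aggregate > clock_hz * parallelism:
        problems.append("aggregate sample rate exceeds clock rate x parallelism")
    for i, ch in enumerate(channels):
        # Assumed sanity check: a complex output cannot carry more bandwidth
        # than its sample rate.
        if ch.bandwidth_hz > ch.sample_rate_hz:
            problems.append(f"channel {i}: bandwidth exceeds output sample rate")
    return aggregate, problems


clock_hz = 250e6                   # assumed core clock, for illustration only
plan = [Channel(centre_hz=1e6 + 100e3 * i, bandwidth_hz=80e3, sample_rate_hz=200e3)
        for i in range(1024)]
aggregate, problems = check_plan(plan, clock_hz)

fast_plan = [Channel(ch.centre_hz, ch.bandwidth_hz, 300e3) for ch in plan]
_, fast_problems = check_plan(fast_plan, clock_hz)
_, relaxed_problems = check_plan(fast_plan, clock_hz, parallelism=2)
```

In this illustration the faster plan fails the check on a single core but passes once the parallelism is doubled, mirroring the trade between throughput and FPGA resources described above.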
3.2 Resource Estimates
The resources required to implement a channeliser, defined by the parameters in Table 1, are given in Table 2 for a Xilinx Spartan-6 LX100 device, which currently costs approximately $120 in low volumes. This is a relatively modest device and yet over one thousand channels of various bandwidths can be supported.
Feature | Value |
No. of real inputs | 1 |
Input width | 14 bits |
Input sample rate | 250 MS/s |
Maximum number of independent channels | 1024 |
Channel tuning resolution | <1 Hz |
Channel output sample rates | From 4 kS/s to 125 MS/s |
Maximum aggregate output sample rate | 250 MS/s |
Channel output sample rate resolution | <1 S/s |
Output channel response | 24-tap FIR applied to output: each channel has a choice between 16 programmable filters |
Stop-band rejection | >70 dB |
Pass-band ripple | <0.02 dB |
Table 1: example requirement
Resource | Usage |
DSP48A1 | 52 (29%) |
18 Kb block RAM | 239 (89%) |
LUT | 29778 (47%) |
Flip flop | 34978 (28%) |
Table 2: resource usage for a Spartan-6 LX100
4. CONCLUSIONS
This paper has surveyed the key technologies used to channelise signals. The CCF concept exploits the best strengths of all of these complementary methodologies to produce a highly flexible channeliser. The CCF core is configurable at both build-time and run-time, to produce a channeliser that is highly optimised within the constraints imposed by the user’s generic channel plan. The operating domain of the CCF channeliser, relative to one based on DDC or FFT technology, is illustrated conceptually in Figure 10.
Unlike traditional channelisers, the CCF core can channelise potentially thousands of channels, while retaining the ability to independently, and with little restriction, specify the channel characteristics at run-time.
Figure 10: CCF operating domain
Most importantly, the CCF architecture relaxes the restriction that the maximum available output bandwidth is inversely proportional to the number of channels supported. This paper has shown that a total of 1024 channels of various bandwidths can be supported in a Xilinx Spartan-6 LX100.
CCF provides a cost effective, flexible and reconfigurable front-end channeliser that can be used for numerous applications. CCF is an enabling technology for a software defined radio (SDR) system when combined with similarly flexible system elements. The CCF core is the result of over ten years of sustained development and therefore, despite its flexibility and diverse applicability, it is founded on proven IP, which ensures that bespoke designs can be brought to market quickly with confidence in their reliability.
REFERENCES
[1] http://www.ntia.doc.gov/files/ntia/publications/2003-allochrt.pdf.
[2] http://www.nomor.de/uploads/_c/V1/_cV1fTZl6MtFnAHVXltStg/LTEAdvanced_2008-07.pdf.
[3] J. W. Cooley and J. W. Tukey, “An algorithm for the machine computation of complex Fourier series,” Math. Comp., 19, pp. 297-301, Apr. 1965.
[4] J. G. Proakis and D. G. Manolakis, “Digital signal processing: principles, algorithms and applications,” Third Edition, Prentice Hall Int. Editions, 1996.
[5] R. E. Crochiere and L. R. Rabiner, “Multirate digital signal processing,” Prentice Hall Signal Processing Series, 1983.