The toughest system design challenge today is the third-generation cellular basestation, which is approximately 100 times more complex than its GSM counterpart. Standards are still evolving, with significant updates occurring every six months. This mandates both extreme computing power and field programmability. While it is possible to get there with legacy architectures, we propose a novel approach.
Most basestations today use a combination of legacy DSP and FPGA in the baseband. The FPGA typically handles chip-rate processing while the DSP takes on symbol-rate processing of the protocols. Managing these clusters are RISC processors, which also handle packet processing and Layer 2/3 tasks. Supporting 64 channels in a 3G basestation might require nine FPGAs and more than 100 DSPs.
FPGAs deliver the performance and predictability, but have the wrong level of abstraction and take a long time to design. DSPs are familiar, and have a wealth of code to draw on. But they lack the raw horsepower needed, and their performance is hard to predict and test. Partitioning the design is difficult, not least because control functions interact with many different blocks. It takes a lot of development to coordinate among them.
Worse, functions tested in isolation may fail to meet design goals in the presence of other functions, given the statistical, unpredictable nature of instruction scheduling and interrupt handling in legacy DSPs. The same holds for FPGAs as they interact with the control functions managed by a nondeterministic processor. The root of the problem is simple: Subblocks are not orthogonal, and interactions among them are not deterministic.
A third alternative is a parallel array of several hundred individual processors linked by a deterministic high-speed interconnect fabric. Each core would be a 16-bit device, roughly equivalent to an ARM9 for control tasks or a TI C5x for DSP roles, with some application-directed enhancements.
Our research has shown that this single environment can handle high-speed chip-rate processing, slower but complex symbol-rate processing and interwoven control tasks. The fabric's moderate granularity makes it possible to map tasks directly onto CPUs almost as easily as drawing a block diagram. But application mapping requires that the fabric be fully deterministic throughout: in the tool chain, in the processing elements (no arbitration, interrupts or complex pipelines) and in the connections between elements. Performance must be totally established at compile time, with cycle-accurate simulation.
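To illustrate why determinism matters here, consider a minimal sketch (in Python, with invented task names, cycle counts and clock rate): when every element executes with fixed cycle counts and every fabric hop has a fixed latency, end-to-end timing is not measured statistically but computed exactly, at compile time.

```python
# Hypothetical sketch: with no arbitration, interrupts or complex pipelines,
# each task's cycle count and each link's latency are fixed constants, so
# end-to-end latency is exactly computable before the design ever runs.
# All names and numbers below are illustrative, not from the article.

# (task name, worst-case = best-case cycles on its processing element)
PIPELINE = [("despread", 240), ("rake_combine", 180), ("decode", 420)]
LINK_LATENCY = 4   # fixed cycles per hop on the deterministic fabric
CLOCK_MHZ = 160    # assumed element clock

def end_to_end_cycles(pipeline, hop_latency):
    """Exact latency of one data item through the pipeline, in cycles."""
    compute = sum(cycles for _, cycles in pipeline)
    hops = (len(pipeline) - 1) * hop_latency  # one hop between each stage
    return compute + hops

cycles = end_to_end_cycles(PIPELINE, LINK_LATENCY)
print(cycles)                    # 240 + 180 + 420 + 2*4 = 848
print(cycles / CLOCK_MHZ, "us")  # 5.3 us at the assumed clock
```

On a legacy DSP, interrupt jitter would make the equivalent figure a distribution; here it is a single number the tool chain can guarantee.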
Doug Pulley is CTO of Picochip Designs, a 3G infrastructure company in Bath, England.
Tasks would be decomposed into manageable chunks, which would then be statically mapped to discrete elements. These elements must be small enough to test, and because they would be static and would interact only in controlled ways, validation would be trustworthy: subblocks would be orthogonal.
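The static-mapping step above can be sketched as follows (a toy model; the chunk names, element IDs and cycle budget are assumptions, not from the article). Because each chunk is bound to one element at build time and the binding never changes, each element can be validated in isolation against its cycle budget:

```python
# Hypothetical sketch: chunks statically bound to processing elements.
# With a fixed binding and controlled interactions, validation reduces to
# checking each element independently against its per-period cycle budget.

BUDGET = 1000  # assumed cycles available per symbol period on each element

# chunk name -> (element id, worst-case cycles per symbol period)
MAPPING = {
    "matched_filter": (0, 620),
    "channel_est":    (1, 480),
    "viterbi_acs":    (2, 880),
    "control":        (3, 150),
}

def validate(mapping, budget):
    """Sum the load on each element; the mapping is valid iff every
    element fits within its budget (orthogonal, per-element check)."""
    load = {}
    for elem, cycles in mapping.values():
        load[elem] = load.get(elem, 0) + cycles
    return load, all(c <= budget for c in load.values())

load, ok = validate(MAPPING, BUDGET)
print(load, ok)  # each element checked on its own; no cross-block effects
```

The point of the sketch is the shape of the check, not the numbers: a chunk proven to fit its element stays proven, regardless of what runs elsewhere in the array.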