John Glossner, Vice President, Engineering, Erdem Hokenek, Chief Hardware Architect, Mayan Moudgill, Chief Software Architect, Sandbridge Technologies, White Plains, NY
The technologies necessary to realize true broadband wireless handsets and systems present unique design challenges. Wireless handset manufacturers must deliver products that offer expanded services and operate transparently worldwide, and product designers are challenged to create extremely power-efficient yet high-performance broadband wireless devices.
The design tradeoffs and implementation options inherent in meeting these demands highlight the extremely challenging requirements for next-generation baseband processors. Tremendous hardware and software challenges exist to realize convergence devices. Power dissipation constraints are requiring new techniques at every stage of design: architecture, microarchitecture, software, algorithm design, logic design, circuit design, and process design. With performance requirements exploding as bandwidth demand increases, power-conscious design becomes more difficult. System-on-chip (SoC) integration and low-voltage process technologies will contribute to lower-power SoC ICs but are insufficient as the only solution for streaming multimedia.
Convergence applications are fundamentally DSP applications. A large number of standards exist or have been proposed for the wireless and wired communication markets. Such a diversity of standards necessitates a programmable platform for their timely implementation. In wireless communications, GSM and IS-54 data rates were limited to less than 15 kbit/second. Future third-generation (3G) systems may provide data rates more than 100 times the previous rates. Higher communication rates are driving up DSP processing requirements.
Software-defined radio (SDR) solutions, in particular, are DSP applications due to the large amount of signal processing required for baseband implementation. Programmable DSPs represent just under 35% of the total DSP market, and about 55% of programmable DSPs are used in wireless communications. Typically, fixed-function ASICs have been required to implement wireless baseband processing. SDR solutions move this function into software-programmable platforms. Therefore, the total market for efficient SDR solutions may be anticipated to be large.
A significant market trend for 3G wireless communications is Java execution. Future 3G wireless systems will make significant use of Java. A number of carriers are already providing Java-based services and may require all 3G systems to support Java.
Previous communications systems have been developed in hardware due to the high computational requirements. DSPs in these systems have been limited to speech coding and orchestrating the custom hardware blocks. In high-performance 3G systems, there may be over 2 million logic gates to implement the system. A complex 3G system may also take many months to implement. After logic design is complete, any errors in the design may cause up to a 9-month delay for correcting the bugs and refabricating the device. This labor-intensive process is counterproductive to fast handset development cycles. The Sandbridge design team has taken a completely new approach to communications system design.
Rather than designing custom blocks for every function in the transmission system, the team has implemented a processor capable of executing operations appropriate to broadband communications. The small and power efficient core is then highly optimized and replicated to provide a platform for broadband communications. This approach scales well with semiconductor generations and allows flexibility in configuring the system for future specifications and any field modifications that may be necessary.
The process involves designing the communications system in Matlab, thus ensuring the bit and block error rates for the transmission system are achieved. The Matlab system design is then ported to fixed-point C code. From that point, no further programmer intervention is required. Sandbridge's highly optimizing compiler extracts the parallelism in DSP operations and optimizes performance on the SandBlaster DSP.
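To make the Matlab-to-fixed-point step described above more concrete, the following is a minimal sketch of what fixed-point C arithmetic can look like, assuming the common Q15 format (one sign bit, 15 fractional bits). The type and function names (q15_t, float_to_q15, q15_mul) are hypothetical and chosen for illustration; they are not taken from Sandbridge's tools.

```c
#include <stdint.h>

/* Hypothetical Q15 type: 1 sign bit, 15 fractional bits, range [-1, 1). */
typedef int16_t q15_t;

/* Convert a floating-point value to Q15, rounding to nearest and
 * clamping values that fall outside the representable range. */
static q15_t float_to_q15(double x)
{
    double scaled = x * 32768.0;
    if (scaled >= 32767.0)  return INT16_MAX;
    if (scaled <= -32768.0) return INT16_MIN;
    return (q15_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Multiply two Q15 values: form the 32-bit (Q30) product, round,
 * shift back to Q15, and saturate the single overflow case (-1 * -1).
 * Note: right-shifting a negative value is implementation-defined in C;
 * this sketch assumes the usual arithmetic shift. */
static q15_t q15_mul(q15_t a, q15_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;   /* Q30 product */
    p = (p + (1 << 14)) >> 15;             /* round, back to Q15 */
    if (p > INT16_MAX) p = INT16_MAX;      /* saturate -1 * -1 */
    return (q15_t)p;
}
```

In this sketch, a Matlab coefficient such as 0.5 would be stored as float_to_q15(0.5), and products of such values would go through q15_mul rather than the floating-point multiply used in the original model.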
The design includes a unique combination of modern techniques such as a SIMD Vector/DSP unit, a parallel reduction unit, and a RISC-based integer unit. Instruction space is conserved through the use of compounded instructions that are grouped into packets for execution. The resulting combination provides for efficient execution of control code, DSP code, and Java.
JVM designers have used both software and hardware methods to execute Java bytecode. The advantage of software execution is flexibility. The advantage of hardware execution is performance. The Delft-Java architecture, designed in 1996, introduced the concept of dynamic translation of Java code into a multithreaded RISC-based machine with Vector SIMD DSP operations. Dynamic translation was also explored by engineers at Sandbridge. The important property of Java bytecode that facilitated this translation is the statically determinable type state.
Programmer productivity is one of the major concerns in complex DSP applications. Because most classical DSPs are programmed in assembly language, it takes a very large software effort to program an application. For example, for modern speech coders, it may take nine months or more before the application performance is known. Then, an intensive period of design verification ensues. If efficient compilers for DSPs were available, significant advantages in software productivity could be achieved.
One of the outputs of the company's tool chain is a traditional ISA compiler. GCC is an example of this type of compiler. With an ISA compiler, a high-level language (HLL) is compiled to the instruction set of the processor. Often, optimizations are performed in translating the HLL code into assembly language. The company has developed its own highly optimizing compiler. Software compilation enables the efficient translation of high-level languages such as C/C++ into optimized machine language.
One unique aspect of the compiler is that DSP operations are automatically generated. The compiler uses a technique called semantic analysis, in which a sophisticated compiler must search for the meaning of a sequence of C language constructs. A programmer writes C code in an architecture-independent manner, such as for a microcontroller, focusing primarily on the function to be implemented. If DSP operations are required, the programmer implements them using standard modulo C arithmetic.
The compiler analyzes the C code, automatically extracts the DSP operations, and synthesizes optimized DSP code without the excess operations required to specify DSP arithmetic in C code. This technique offers a significant software productivity gain over intrinsic functions.
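As an illustration of DSP arithmetic written in ordinary C, consider a saturating 16-bit addition. The sketch below is a generic example of the kind of source pattern the text describes; the function name sat_add16 is hypothetical, and whether Sandbridge's compiler recognizes this exact form is an assumption, not something the article states.

```c
#include <stdint.h>

/* Saturating 16-bit add expressed with standard C integer arithmetic.
 * The widening and the two comparisons are the "excess operations":
 * they exist only to describe saturation in C. A compiler that
 * understands the meaning of the sequence could, in principle,
 * replace the whole function body with one saturating-add operation. */
static int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;  /* widen so the add cannot wrap */
    if (sum > INT16_MAX) sum = INT16_MAX;   /* clamp positive overflow */
    if (sum < INT16_MIN) sum = INT16_MIN;   /* clamp negative overflow */
    return (int16_t)sum;
}
```

With intrinsic functions, by contrast, the programmer would call a vendor-specific routine in place of this code, which ties the source to one architecture.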
Another challenge that DSP compiler writers face is parallelism extraction. Early VLIW machines relieved the compiler of this burden by allowing full orthogonality of instruction selection. Unfortunately, this led to code bloat. General-purpose machines have recognized the importance of DSP operations and have provided specialized SIMD instruction set extensions. Unfortunately, compiler technology has not been effective in exploiting these instruction set extensions, and library functions are often the only efficient way to invoke them.
Sandbridge's architectures make liberal use of these so-called multimedia instruction sets because DSP applications are amenable to them. The vectorizing compiler is efficient at extracting this parallelism using vectorizing optimizations. The compiler also handles the difficult problem of outer-loop vectorization, which is often a requirement for inner-loop optimizations.
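The distinction between inner and outer loops can be seen in a simple FIR-style filter written in plain C. This is a generic sketch, not code from Sandbridge; the function name fir_q15 and its parameters are hypothetical, and the comments describe opportunities a vectorizing compiler might exploit rather than what any particular compiler does.

```c
#include <stddef.h>
#include <stdint.h>

/* Computes y[n] = sum over k of h[k] * x[n + k].
 * x must hold at least n_out + n_taps - 1 samples.
 * The accumulator is a plain 32-bit integer with no saturation,
 * so inputs are assumed small enough not to overflow it. */
static void fir_q15(const int16_t *x, const int16_t *h, int32_t *y,
                    size_t n_out, size_t n_taps)
{
    /* Outer loop: each output sample is independent of the others,
     * so several values of n could be computed in parallel. */
    for (size_t n = 0; n < n_out; n++) {
        int32_t acc = 0;
        /* Inner loop: a multiply-accumulate reduction, the kind of
         * pattern a SIMD unit with a reduction stage is suited to. */
        for (size_t k = 0; k < n_taps; k++) {
            acc += (int32_t)h[k] * (int32_t)x[n + k];
        }
        y[n] = acc;
    }
}
```

When the tap count is small, the inner loop alone offers little parallel work, which is one reason vectorizing across the outer loop can matter.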
See related chart