Streaming multimedia challenges DSP design
By Vishal Markandey, Nat Seshan, Pradeep Bardia, EE Times
October 1, 2001 (4:51 p.m. EST)
Vishal Markandey, Distinguished Member, Technical Staff; Nat Seshan, Senior Member, Technical Staff; Pradeep Bardia, Worldwide Strategic Marketing, DSP Video and Imaging, Texas Instruments, Dallas, Texas
If an application developer sat down with a semiconductor architect to design a digital signal processor (DSP) specifically tailored to streaming multimedia and streaming video applications, what features would the developer ask for? Would flexibility and upgradability to handle a variety of video and audio standards and to evolve with the standards head the wish list? Would tremendous channel density rank high? How about real-time performance without latencies or quality degradation and full software reprogrammability?
Chances are, a streaming multimedia developer would want all of those features along with the lowest possible system cost, efficient use of board real estate, software programmability, ease of use and more.
Texas Instruments technical staff, DSP Video and Imaging, carefully considered the special requirements of streaming multimedia and video applications when developing the next generation of high-performance DSPs.
For example, real-time performance is critical for handling the extreme volume of data processed in multimedia applications. At 600 MHz, the company's TMS320C6414, C6415 and C6416 are designed to provide the flexibility, real-time performance and bandwidth essential to wired and wireless streaming multimedia development. These devices deliver roughly twice the clock speed generally available on today's leading DSPs. To further enhance performance, the company's very-long-instruction-word (VLIW) engine has been upgraded to wring more operations from each cycle than ever before. Instruction extensions add support for packed data processing, which provides a significant performance boost for video and imaging algorithms.
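To make the idea of packed data processing concrete, the following minimal Python sketch treats one 32-bit word as four independent 8-bit pixel lanes and adds two such words lane by lane with saturation. The helper names (`unpack_u8x4`, `pack_u8x4`, `add_sat_u8x4`) are hypothetical and chosen for illustration only; the code models the concept in software and does not reproduce any actual C64x instruction.

```python
def unpack_u8x4(word):
    """Split a 32-bit word into four unsigned 8-bit lanes, lowest byte first."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def pack_u8x4(lanes):
    """Pack four unsigned 8-bit lanes (lowest byte first) into one 32-bit word."""
    word = 0
    for i, lane in enumerate(lanes):
        word |= (lane & 0xFF) << (8 * i)
    return word


def add_sat_u8x4(a, b):
    """Lane-wise add of two packed words; each lane clamps at 255 instead of wrapping."""
    sums = [min(x + y, 255) for x, y in zip(unpack_u8x4(a), unpack_u8x4(b))]
    return pack_u8x4(sums)


# Brighten four pixels by 0x20 each with a single packed operation.
pixels = pack_u8x4([0x10, 0x80, 0xF0, 0xFF])
offset = pack_u8x4([0x20, 0x20, 0x20, 0x20])
brightened = unpack_u8x4(add_sat_u8x4(pixels, offset))
print([hex(p) for p in brightened])
```

On a processor with packed-data instructions, the per-lane loop inside `add_sat_u8x4` would collapse into one operation on the whole word, which is where the speedup for pixel-oriented algorithms comes from.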
The team aimed to speed up computationally intense algorithms for motion estimation and compensation; picture filtering; and transforms commonly used in streaming video such as discrete cosine transform or discrete wavelet transform. Toward those goals, special-purpose instructions were added in the 64x DSP family. Two special-purpose instructions can be used to perform extended precision 16 x 32 multiplies, useful in audio applications, while up to eight simultaneous 8-bit multiply operations can be utilized for video processing functions.
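The arithmetic behind an extended-precision 16 x 32 multiply can be sketched in a few lines of Python. The example splits the 32-bit operand into a signed upper half and an unsigned lower half, forms two 16 x 16 partial products and recombines them. The function names are hypothetical, and the decomposition is a generic one assumed for illustration; it is not a description of how the C64x instructions are implemented.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mul_16x32(a16, b32):
    """Multiply a signed 16-bit value by a signed 32-bit value via two partial products."""
    a = to_signed(a16, 16)
    b = to_signed(b32, 32)
    b_hi = b >> 16         # signed upper 16 bits (arithmetic shift)
    b_lo = b & 0xFFFF      # unsigned lower 16 bits
    partial_hi = a * b_hi  # signed x signed 16 x 16 product
    partial_lo = a * b_lo  # signed x unsigned 16 x 16 product
    # Recombine: b == b_hi * 2**16 + b_lo, so a * b is the weighted sum below.
    return (partial_hi << 16) + partial_lo


# A 16-bit audio coefficient applied to a 32-bit accumulator-width sample.
print(mul_16x32(-3, 100000))
```

The full product needs up to 48 bits, which is why audio code that keeps 32-bit samples benefits from hardware support for this operation rather than truncating to 16 x 16.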
Because of its increased performance and efficiency, one DSP can replace several in rack-based systems that support thousands of channels. Compared with a 300-MHz processor from TI's C62x family, for example, the C64x DSP delivers six to 10 times the performance for emerging video applications. A designer can maximize results with fewer components, thus reducing design complexity, system cost, power consumption and board real estate.
Another necessary feature is overall system flexibility. The streaming media infrastructure increasingly requires transcoding capabilities (changing one format to another, e.g., converting MPEG-2 video to MPEG-4) and transrating capabilities (changing one bit rate to another, e.g., converting an MPEG-2 video stream at 6 Mbits/s to an MPEG-2 stream at 5 Mbits/s). Video-over-packet gateways, streaming media servers, multimedia routers, wireless 3G equipment with video extensions and other equipment rely on transcoding and transrating for maximum performance in a multistandard environment.
For instance, it may be necessary to convert MPEG-2 video content to MPEG-4 video for transmission to a wireless device. A system may also need to convert a video stream from a particular bit rate to a lower bit rate for transmission over a limited-bandwidth channel. The C64x DSPs are designed to handle these requirements, enabling system developers to provide cost-effective solutions that address key requirements of system flexibility and high channel density.
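A short calculation shows why transrating matters for channel density. The Python sketch below counts how many whole streams fit on a link before and after a stream is transrated from 6 Mbits/s to 5 Mbits/s. The 20-Mbit/s link capacity and the function name are assumptions made purely for illustration, and the model ignores packet and protocol overhead.

```python
def streams_per_link(link_bps, stream_bps):
    """Number of whole constant-bit-rate streams that fit on a link."""
    return link_bps // stream_bps


LINK_BPS = 20_000_000      # hypothetical channel capacity (assumption)
ORIGINAL_BPS = 6_000_000   # MPEG-2 stream before transrating
REDUCED_BPS = 5_000_000    # same stream after transrating

before = streams_per_link(LINK_BPS, ORIGINAL_BPS)
after = streams_per_link(LINK_BPS, REDUCED_BPS)
print(f"{before} streams at 6 Mbits/s, {after} streams at 5 Mbits/s")
```

Under these assumptions, trimming each stream by 1 Mbit/s is enough to carry one additional channel on the same link.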
Processor performance and enhanced instruction sets cannot provide meaningful solutions without fast and effective communications throughout the system. A key function is balancing the Gbits of I/O bandwidth with the high-speed processing core. The C64x DSPs are designed to eliminate data flow bottlenecks and latencies that result in blotchy or jerky video.
A 64-bit external memory interface and a secondary 16-bit EMIF each operate at 133 MHz. Flexible PCI or HPI32 host connections provide 33-MHz, 32-bit connectivity for control or interprocessor communications. A Utopia 2 asynchronous transfer mode connection enables 50-MHz, wide-area-network connectivity.
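The interface figures above translate into peak transfer rates with simple arithmetic: bus width times clock rate. The Python sketch below works this out for the interfaces whose width and clock are both stated, assuming one transfer per clock cycle and no protocol overhead, so the results are theoretical ceilings rather than sustained throughput. The Utopia 2 link is omitted because its bus width is not given here. The function and dictionary names are hypothetical.

```python
def peak_mbytes_per_s(width_bits, clock_mhz):
    """Theoretical peak rate in Mbytes/s, assuming one transfer per clock."""
    return width_bits * clock_mhz / 8


# (bus width in bits, clock in MHz) as stated in the text.
interfaces = {
    "64-bit EMIF": (64, 133),
    "16-bit EMIF": (16, 133),
    "PCI / HPI32": (32, 33),
}

peaks = {name: peak_mbytes_per_s(w, f) for name, (w, f) in interfaces.items()}
for name, rate in peaks.items():
    print(f"{name}: {rate:.0f} Mbytes/s peak")
```

Under these assumptions the two memory interfaces together offer roughly ten times the peak rate of the host connection.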
A primary goal behind the development of the C6000 DSP architecture was to provide an excellent C-compiler target. Key enablers incorporated into the architecture for efficient C performance are deterministic order and time of execution, a large general-purpose register file, simple independent instructions and avoidance of special modes and status bits. This resulted in industry-leading C-compiler performance.
A challenge processor architects face is the need to develop devices ready for future applications that nevertheless remain fully code compatible with older-generation products. Since the C64x is to be object-code compatible with the rest of the C6000 DSP platform, developers and OEMs can reuse existing software and can build upon earlier development work. A wide array of digital media content and third-party algorithms, including MPEG-4 video, MPEG-2 video, H.263 and JPEG, are also being offered to facilitate streaming media applications.
The result is a road map that can take an application developer or system designer from products on the market today to those that emerging streaming media applications will require tomorrow with minimum re-engineering and development time.
Enhancements such as tailored peripherals and additional special-purpose instructions will drive further enrichment of streaming media applications.
Currently, the TI team is accelerating activity into portable, battery-operated streaming media applications with the TMS320 DSC family of processors, offering higher levels of performance, integration and power efficiency for handheld digital cameras, PDAs, e-books and Web pads. TI has also created the OMAP architecture for multimedia applications such as streaming video, videoconferencing and high-quality audio, targeted for next-generation wireless multimedia devices.
While no one can predict how technology will advance over the next few years, consumers are sure to demand more streaming multimedia quality and functionality. The biggest challenge is to anticipate the possibilities and move to satisfy demand ahead of competitors.