Santa Clara, Calif. - The growing importance of video in mobile devices, which typically have limited data bandwidth and small batteries, is creating a need for video codec hardware that can deliver high compression ratios at very low power.
A second trend is taking hold among advanced-system designers in this market, according to Joseph Perl, president and chief executive officer of video codec startup Mobilygen: the move from MPEG-4 to H.264 compression.
Two factors are driving the shift, said Mobilygen marketing vice president Steve Musallam. The first is substantially better picture quality at the low bit rates required in mobile devices. The second is the angst generated by the MPEG-4 standards body over licensing fees. That concern, Musallam said, is driving some companies out of the MPEG camp and into the arms of H.264 developers.
This is good news for Mobilygen, which has spent several years developing first an architecture and now a chip, the MG 1264, that performs 30-frame/s VGA H.264 encode and decode at palmtop power levels. For good measure, the chip supports a two-channel AAC audio codec.
"If you look at all the different modes in which you can encode a macroblock in H.264, there are nearly a thousand possibilities," Musallam said. "To get the best possible bit rate, conventional encoders try all of them, or a fixed subset of them, one after the other, and use the one that gives the best combination of bit rate and sum of absolute differences."
Mobilygen's claim to fame is an architecture that first reduces the workload and then performs the remaining work with minimal energy expenditure. "This is hard enough stuff that it is a barrier to entry," Musallam claimed.
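The exhaustive search Musallam attributes to conventional encoders can be sketched in a few lines. This is an illustration only, not Mobilygen's or any particular encoder's code: the mode names, the toy four-sample "macroblock" and the weighting factor `lam` are assumptions, and the cost function (SAD plus a weighted bit count) is one common way to combine the two quantities he mentions.

```python
def sad(source, prediction):
    """Sum of absolute differences between source samples and a prediction."""
    return sum(abs(s - p) for s, p in zip(source, prediction))


def exhaustive_mode_decision(source, candidates, lam=4.0):
    """Try every candidate mode and keep the one with the lowest cost.

    `candidates` maps a (hypothetical) mode name to a tuple of
    (predicted samples, estimated bits to code the block in that mode).
    Cost is SAD + lam * bits; `lam` is an assumed trade-off weight.
    """
    best_mode, best_cost = None, float("inf")
    for mode, (prediction, bits) in candidates.items():
        cost = sad(source, prediction) + lam * bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Toy data: four samples stand in for a 16x16 macroblock.
source = [10, 12, 14, 16]
candidates = {
    "intra_dc": ([13, 13, 13, 13], 6),
    "inter_16x16": ([10, 12, 15, 16], 10),
    "skip": ([9, 9, 9, 9], 1),
}

print(exhaustive_mode_decision(source, candidates, lam=4.0))
print(exhaustive_mode_decision(source, candidates, lam=0.5))
```

With the heavier bit penalty (`lam=4.0`) the cheap "skip" candidate wins at a cost of 20.0; with `lam=0.5` the more accurate "inter_16x16" candidate wins at 6.0. The work grows linearly with the number of candidates, which is why trimming the candidate list matters so much for power.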
The first step occurs in a video preprocessor on the chip. Incoming video data is filtered to eliminate noise, thereby reducing bandwidth. During filtering, the preprocessor also collects statistics on the image. Those statistics are then passed to a heuristic analysis that recommends which compression modes should be tried on each macroblock. "We end up trying an average of 10 modes per macroblock," Perl said, whereas the typical encoder "will try over a hundred."
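Mobilygen has not disclosed its heuristics, so the following is only a rough sketch of the general idea: gather cheap per-macroblock statistics, then use them to shortlist the modes worth trying. The two statistics (texture and frame-to-frame change), the thresholds and the mode names are all assumptions made for illustration.

```python
from statistics import pvariance


def macroblock_stats(current, previous):
    """Cheap statistics a preprocessor might gather (hypothetical choice)."""
    texture = pvariance(current)  # how busy the block is
    motion = sum(abs(c - p) for c, p in zip(current, previous)) / len(current)
    return {"texture": texture, "motion": motion}


def shortlist_modes(stats, texture_thresh=25.0, motion_thresh=2.0):
    """Recommend a small set of modes instead of trying them all.

    Thresholds and mode names are illustrative assumptions.
    """
    if stats["motion"] < motion_thresh:
        # Almost nothing changed since the previous frame.
        return ["skip", "inter_16x16"]
    if stats["texture"] < texture_thresh:
        # Flat block that moved: large partitions are likely to suffice.
        return ["inter_16x16", "inter_16x8", "inter_8x16", "intra_16x16"]
    # Detailed block with motion: try the small partitions.
    return ["inter_8x8", "inter_8x4", "inter_4x8", "inter_4x4", "intra_4x4"]


static_flat = macroblock_stats([50] * 16, [50] * 16)
moving_flat = macroblock_stats([50] * 16, [60] * 16)
moving_busy = macroblock_stats([0, 100] * 8, [100, 0] * 8)

for stats in (static_flat, moving_flat, moving_busy):
    print(stats, shortlist_modes(stats))
```

Each shortlist here holds two to five modes, so the expensive mode evaluation runs only a handful of times per macroblock rather than a hundred or more.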
Once the video stream leaves the preprocessor, it enters the part of the architecture that performs the actual compression or decompression. That architecture has three fundamental components.
First is the sequencer that guides macroblocks through the processing elements. Rather than the nearly obligatory ARM9, this is a proprietary control processor with hardware multithreading and a single-cycle context switch. The version on the current chip supports 14 concurrent threads, each with its own register set.
Second is the collection of hardware blocks that do the motion estimation, integer transform and other heavy lifting of H.264. These data-driven numeric processors are directly connected to each other under control of the sequencer. Instead of pulling data from main memory, they pass macroblocks directly from function to function, using local on-chip storage, until the macroblock is completed. Each functional block starts execution when a block of data arrives and shuts down again when the operation is completed. "This system is so efficient that we can run the 30-frame/s VGA encoder in real time out of FPGAs at around 40 MHz," Perl said.
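Perl's 40-MHz figure implies a tight cycle budget, which a little arithmetic makes concrete. The sketch below assumes VGA means 640 x 480 pixels, uses H.264's 16 x 16 macroblocks, and treats "around 40 MHz" as exactly 40 MHz in a single clock domain; the function name is made up for this example.

```python
def cycle_budget(width=640, height=480, fps=30,
                 clock_hz=40_000_000, mb_size=16):
    """Return macroblocks per frame, per second, and clock cycles per macroblock."""
    mbs_per_frame = (width // mb_size) * (height // mb_size)
    mbs_per_second = mbs_per_frame * fps
    cycles_per_mb = clock_hz / mbs_per_second
    return mbs_per_frame, mbs_per_second, cycles_per_mb


mbs_per_frame, mbs_per_second, cycles_per_mb = cycle_budget()
print(mbs_per_frame, mbs_per_second, round(cycles_per_mb))
```

Under those assumptions a frame holds 1,200 macroblocks, the encoder must finish 36,000 of them per second, and a new macroblock has to enter the pipeline roughly every 1,100 clock cycles.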
The final component is a sophisticated SDRAM controller that performs queuing and reordering operations to minimize the energy expended in both the chip itself and its external 8-Mbyte SDRAM. Here's where multithreading becomes important. By switching threads quickly, the control processor can keep the chip active while waiting for SDRAM transactions that may have been delayed to improve bus or DRAM efficiency.
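A toy simulation shows why many hardware threads help when memory transactions are slow or deliberately delayed. It is a simplified model, not a description of Mobilygen's sequencer: each thread is assumed to alternate between a fixed burst of work and a fixed SDRAM wait, context switches are treated as free, and the 10-cycle and 90-cycle figures are invented. Only the thread count of 14 comes from the chip.

```python
def utilization(num_threads, compute_cycles, memory_latency,
                total_cycles=10_000):
    """Fraction of cycles in which some thread does useful work.

    Each thread repeats: compute for `compute_cycles`, then wait
    `memory_latency` cycles for memory. Every cycle the lowest-numbered
    ready thread runs; switching threads costs nothing in this model.
    """
    ready_at = [0] * num_threads               # cycle when each thread may run
    remaining = [compute_cycles] * num_threads  # work left in current burst
    busy = 0
    for cycle in range(total_cycles):
        for t in range(num_threads):
            if ready_at[t] <= cycle:
                busy += 1
                remaining[t] -= 1
                if remaining[t] == 0:
                    # Burst finished: issue a memory request and stall.
                    ready_at[t] = cycle + 1 + memory_latency
                    remaining[t] = compute_cycles
                break  # only one thread runs per cycle
    return busy / total_cycles


for n in (1, 5, 14):
    print(n, utilization(n, compute_cycles=10, memory_latency=90))
```

With these made-up numbers, one thread keeps the processor busy 10 percent of the time, five threads reach 50 percent, and 14 threads keep it fully occupied. That slack is what lets a memory controller hold back or reorder requests without stalling the rest of the chip.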
In simulation, the chip consumes less than 300 mW when performing 30-frame/s VGA H.264 encode and less than 20 mW for decode. The chip has recently taped out in 130-nanometer TSMC CMOS. Mobilygen expects first silicon on or around March 18.
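The simulated power figures translate into per-frame energy ceilings with simple division. The sketch below takes the 300-mW and 20-mW limits as upper bounds and assumes 1,200 macroblocks per VGA frame (640 x 480 pixels in 16 x 16 blocks); the function name is hypothetical.

```python
def energy_budget(power_mw, fps=30, mbs_per_frame=1200):
    """Return (millijoules per frame, microjoules per macroblock)."""
    per_frame_mj = power_mw / fps
    per_mb_uj = per_frame_mj * 1000 / mbs_per_frame
    return per_frame_mj, per_mb_uj


encode_mj, encode_uj = energy_budget(300)
decode_mj, decode_uj = energy_budget(20)
print(round(encode_mj, 2), round(encode_uj, 2))
print(round(decode_mj, 2), round(decode_uj, 2))
```

On those assumptions, encoding costs at most about 10 mJ per frame, or roughly 8.3 microjoules per macroblock, while decoding stays under about 0.67 mJ per frame.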
Mobilygen said the chip's first two customers are already developing with the company's FPGA-based prototype board.