By Nitin Jain, MindTree Consulting
Apr 1 2007 (11:03 AM), Embedded Systems Design
Maintaining the proper level of performance is the key to integrating speech algorithms.
Designing and developing an embedded system from scratch and making it stable is always a challenge. Integrating and evaluating a digital signal processing (DSP) algorithm with the system is equally tricky and can bring even the strongest programmers to their knees. Today, countless algorithms are embedded in all kinds of electronic systems. How does an embedded system developer know which algorithm to use for speech processing, such as that found in basic telephony systems?
The audio frequency spectrum, which stretches to 40 kHz, is divided into two bands. Speech components occupy the lower part of the spectrum, from 5 Hz to 7 kHz, and other audio components occupy the remaining higher portion, as shown in Figure 1.
Speech processing mainly involves compression-decompression, recognition, conditioning, and enhancement algorithms. Signal-processing algorithms depend heavily on system resources, such as available memory and clock cycles. Because these resources add cost, product vendors often restrict them to keep the product price low. Memory and clock consumption are inherent parts of an algorithm's complexity. The lower the complexity, the better the algorithm, provided it still does its job effectively.
Measuring complexity is the first step in evaluating an algorithm. The clocks required to run the algorithm on a specific processor determine the processing load, which is architecture dependent and varies from one processor to another. The algorithm's memory requirements, by contrast, stay largely the same across processors. Most DSP algorithms work on a collection of samples, better known as a frame. Collecting the samples that form a frame introduces an unavoidable delay, which is then followed by the actual processing delay. The International Telecommunication Union (ITU) standardizes the acceptable delay for each algorithm.
The algorithm's processing load is typically expressed in "millions of clocks per second," or MCPS. To better understand MCPS, assume that an algorithm processes 64-sample frames at a sampling frequency of 8 kHz and requires 300,000 clocks to process each frame. The time required to collect one frame is 64/8,000 s, or 8 ms, so 125 frames must be processed per second. To keep up with all of those frames, the algorithm consumes at least 300,000 × 125 = 37,500,000 clocks from the core per second, or 37.5 MCPS.
Put another way, MCPS equals the clocks required to process one frame, multiplied by the sampling frequency and divided by the frame size, all divided by one million: MCPS = (clocks per frame × sampling frequency / frame size) / 1,000,000.
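The MCPS arithmetic above can be sketched in a few lines of Python. This is only an illustration of the formula, not code from any particular codec or toolchain; the function names `frame_period_ms` and `mcps` are hypothetical, and the clock count per frame is assumed to have been measured on the target processor beforehand.

```python
def frame_period_ms(frame_size: int, sample_rate_hz: float) -> float:
    """Time needed to collect one frame of samples, in milliseconds."""
    return 1000.0 * frame_size / sample_rate_hz


def mcps(clocks_per_frame: float, frame_size: int, sample_rate_hz: float) -> float:
    """Processing load in millions of clocks per second (MCPS).

    clocks_per_frame is assumed to be a measured value for the
    target processor; it is not derived here.
    """
    # Frames arriving each second at the given sampling frequency.
    frames_per_second = sample_rate_hz / frame_size
    # Clocks consumed per second, scaled to millions.
    return clocks_per_frame * frames_per_second / 1_000_000


if __name__ == "__main__":
    # The example from the text: 64-sample frames at 8 kHz,
    # 300,000 clocks per frame.
    print("frame period (ms):", frame_period_ms(64, 8000))
    print("load (MCPS):", mcps(300_000, 64, 8000))
```

With the numbers from the text, the frame period comes out at 8 ms and the load at 37.5 MCPS. Comparing that load against the core's clock rate gives a rough idea of how much headroom is left for the rest of the system.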
A second term that's often used to describe an algorithm's processing load is MIPS, or millions of instructions per second. Calculating MIPS for an algorithm can also be a little tricky. If the processor effectively executes one instruction per cycle, the MCPS and MIPS ratings for that processor are the same. If, on the other hand, the processor architecture takes more than one cycle on average to execute an instruction, the two ratings differ by that ratio: the MIPS figure is the MCPS figure divided by the average cycles per instruction. For example, an ARM7TDMI processor effectively requires about 1.1 cycles per instruction.
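As a rough sketch of the relationship between the two ratings, the conversion below divides an MCPS figure by an average cycles-per-instruction value. The function name `mips_from_mcps` is hypothetical, and the 1.1 figure is simply the approximate ARM7TDMI value quoted above; a real average depends on the instruction mix and memory system.

```python
def mips_from_mcps(mcps: float, cycles_per_instruction: float) -> float:
    """Convert a load in MCPS to MIPS for a given average
    cycles-per-instruction (CPI) figure.

    The CPI value is an assumption supplied by the caller; it is an
    average and varies with the instruction mix.
    """
    if cycles_per_instruction <= 0:
        raise ValueError("cycles_per_instruction must be positive")
    # Instructions executed = cycles consumed / cycles per instruction.
    return mcps / cycles_per_instruction


if __name__ == "__main__":
    # One instruction per cycle: MCPS and MIPS are identical.
    print(mips_from_mcps(37.5, 1.0))
    # About 1.1 cycles per instruction, as quoted for the ARM7TDMI.
    print(round(mips_from_mcps(37.5, 1.1), 2))
```

For the 37.5 MCPS example, a processor averaging 1.1 cycles per instruction would execute roughly 34.1 million instructions per second.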