by Ronald J. Landry, AMI Semiconductor, Inc., Dallas, Texas, USA

Abstract:
As microprocessors and SoCs find their way into an increasing number of mixed-signal applications, the need for ultra low power consumption is becoming a major factor. The methodologies to support these designs are just as important. This is perhaps nowhere more apparent than in medical implantable devices, but it pertains to any battery-operated application. This paper first reviews the types of current draw in order to establish exactly where power is consumed: internal cell current, switching current and cell leakage current. Clock gating is discussed as a means of decreasing dynamic current, and its impact on the design flow is explored. Power gating, as a means of limiting leakage current, can be addressed by several techniques outlined in this paper. The designer’s approach to power gating can have a dramatic impact on the library and IP requirements of the design and therefore must be considered carefully.

Introduction
The system-on-a-chip (SoC) design approach has proven its worth by decreasing time to market, increasing design flexibility and increasing quality through high levels of design reuse. This has not gone unnoticed by mixed-signal chip designers or their customers. One area of strong growth in the mixed-signal market space is low power and ultra low power applications. The high levels of integration achieved by SoC designs can contribute a significant benefit to the overall power budget. This paper discusses the applications for low power mixed-signal technologies and the design techniques used to achieve the lowest power possible.

Ultra Low Power Mixed-Signal Applications

Understanding the types of power consumption
There are three types of power consumption relevant to digital circuits: internal power, switching power and leakage. Internal power and switching power are referred to as dynamic power, and leakage is referred to as static power. The reason is straightforward: dynamic power varies as the signals toggle, whereas leakage is more a function of the physical implementation and is static under constant operating conditions. It is important to understand the inner workings of these types of power consumption because the SoC designer may make decisions that trade off one for the other.
Switching power is perhaps the easiest to comprehend. As the output of a cell changes (switches), the parasitic capacitance associated with that node must either charge or discharge. The currents used to deliver this charge come from the power supply and therefore translate directly to power consumption. If the outputs do not change, there is no charging or discharging and therefore no associated switching current. Switching activity can therefore be reduced at the design architecture level. An example of this is the approach taken for a simple counter. A Gray code counter will by definition switch fewer outputs than a binary counter of the same bit width.
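The counter comparison can be checked with a short script. The following Python sketch counts output bit flips over one full count cycle, including the wrap back to zero, for a binary counter and a Gray code counter of the same width. The function names are illustrative and do not come from this paper, and only output toggles are counted, not any difference in the next-state logic.

```python
def to_gray(value: int) -> int:
    """Convert a binary count to its reflected Gray code equivalent."""
    return value ^ (value >> 1)


def count_toggles(sequence: list[int]) -> int:
    """Count bit flips between consecutive codes, including the wrap-around."""
    total = 0
    for i, code in enumerate(sequence):
        nxt = sequence[(i + 1) % len(sequence)]
        total += bin(code ^ nxt).count("1")
    return total


def compare_counters(bits: int) -> tuple[int, int]:
    """Return (binary_toggles, gray_toggles) for one full cycle of a counter."""
    states = list(range(1 << bits))
    binary = count_toggles(states)
    gray = count_toggles([to_gray(s) for s in states])
    return binary, gray


if __name__ == "__main__":
    binary, gray = compare_counters(4)
    print(f"4-bit binary counter: {binary} output toggles per cycle")
    print(f"4-bit Gray counter:   {gray} output toggles per cycle")
```

For a 4-bit counter this gives 30 output toggles per cycle for the binary sequence against 16 for Gray code, since a Gray code changes exactly one bit per count.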
Internal power is somewhat more difficult to understand. To illustrate, consider the structure of a common master-slave flip-flop in the figure below. The clock comes into the cell, where master and slave versions are generated by two inverters in series. These two versions of the clock are used to drive the “clock inverter” gates throughout the flip-flop. Even when the input (and therefore the output) remains static, each of the clock inverter inputs is toggled on every clock cycle. These nodes have capacitance, so current flows in the same way as described for switching power. Internal power is also consumed when multiple transistors in a totem pole configuration are all in the “on” state at once during a transition. This can be minimized by making sure that signal edge transition times are short.
Leakage occurs because transistors are never perfect. Even when they are in the “off” state, the conductance of the channel does not go to zero.
Leakage is highly dependent on the geometry of the transistor. This is why leakage is a major factor in nanometer designs. As the channel gets shorter the conductance from drain to source gets larger. Think of this as one resistor rather than several in series.
Faster designs tend to have more leakage as well. This is because speed is achieved by increasing the drive of the cell. The higher drive is achieved by increasing the width of the channel. Think of this as putting several resistors in parallel. While this approach decreases the “on” resistance it also decreases the “off” resistance, which allows more leakage current to flow in the “off” state.
Threshold voltage (Vt) also has a drastic effect on leakage. This relationship can be seen in the following graph. A lower Vt corresponds to higher leakage. In many processes Vt is minimized to increase speed or to lower the minimum supply voltage requirement. For ultra low power designs where leakage is a major factor, this needs to be taken into account.

SoC design power minimization techniques

Sleep and Idle modes
Many ultra low power applications have very low work duty cycles. Consider a remote temperature or pressure sensor. The application may only require that the sensor sample data once every few seconds, minutes or even hours. In this case shutting down the circuit for the majority of the time is a straightforward and obvious choice. This is usually accomplished by either gating the clocks to the circuit or gating the power. In these applications, leakage can be the dominant factor in power consumption since the device is actively working less than 1% of the time. Power gating would be beneficial in this case.
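The effect of a low duty cycle on the power budget can be sketched numerically. The Python example below computes the time-averaged supply current and the share of it drawn while asleep. All current values are hypothetical placeholders chosen for illustration, not figures from this paper or from any AMIS device.

```python
def average_current_ua(active_ua: float, sleep_ua: float, duty_cycle: float) -> float:
    """Time-averaged supply current in microamps.

    duty_cycle is the fraction of time the device is actively working (0..1).
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * active_ua + (1.0 - duty_cycle) * sleep_ua


def sleep_share(active_ua: float, sleep_ua: float, duty_cycle: float) -> float:
    """Fraction of the average current that is drawn while asleep."""
    sleeping = (1.0 - duty_cycle) * sleep_ua
    return sleeping / average_current_ua(active_ua, sleep_ua, duty_cycle)


if __name__ == "__main__":
    # Hypothetical numbers: 1 mA active, 5 uA of leakage asleep, active 0.1% of the time.
    print(average_current_ua(1000.0, 5.0, 0.001))
    print(sleep_share(1000.0, 5.0, 0.001))
    # Same device with power gating assumed to cut sleep current to 0.5 uA.
    print(average_current_ua(1000.0, 0.5, 0.001))
```

With these assumed numbers the sleep current accounts for roughly 83% of the average draw, which is why reducing leakage through power gating has more effect in this case than further reducing active current.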
The term “Sleep mode” is normally used to describe the state where the micro clock is completely shut down. It will take some sort of external event to wake it up, for instance an interrupt or reset. “Idle Mode” usually means that the peripherals are running. The processor core will be reactivated by a timer, reception on the serial port or something of that nature.
One thing to consider when bringing a core out of sleep mode is the oscillator startup time. Normal oscillator architectures can take several milliseconds to stabilize. In many cases this may be longer than the core will stay active before going back to sleep. Spending a majority of the power budget waiting for an oscillator to start can be hard to justify. AMIS offers a quick-start oscillator that can power up in the 10-15 µs range, allowing the application to quickly perform its function and then get back to sleep. This technique is fairly transparent to the SoC designer but can pay huge dividends in the overall power budget.

Clock gating
With an understanding of internal power we easily see how clock gating can benefit a design and why this technique has been incorporated into EDA software. Synthesis tools will look for the proper constructs within the RTL and determine whether clock gating can be used. This approach only makes sense if multiple flip-flops can be gated by the same clock enable signal. The minimum number of flops needed for effective clock gating can be determined by the designer.

Gate level optimization
Gate level optimization can be just as effective as clock gating. The basic approaches to gate level optimization are technology mapping, cell sizing, buffer insertion and pin swapping.
Technology mapping looks at the constructs available within the standard cell libraries and attempts to implement high activity nets inside of the cell. As a general rule, nets within a cell have less capacitance so they naturally draw less switching current.
Cell sizing can reduce power consumption when you have a low activity net on a critical timing path with multiple cells in series. The approach is to increase the size of the cell driving the low activity net and decrease the size of the cell being driven. Overall timing is maintained but input capacitance on the driven cell is reduced. This can also produce an internal power savings because lower capacitive loading translates to a higher edge rate on the low activity net.
Buffer insertion reduces capacitive load on the nets and sharpens transition time. This improves both switching current and internal current but it does require an increase in design area. For many applications, this area for power trade-off is acceptable.
Pin swapping makes use of the fact that not all cells have the same input capacitance on similar pins. Take the case of a 4-input NAND gate. Depending on how the cell is laid out, different inputs can have drastically different input capacitance. Pin swapping uses activity information on the input nets to put the high toggle rate nets on the lowest capacitance inputs.

Low power libraries
Perhaps one of the most critical enablers for ultra low power applications is the availability of a low power standard cell library. Most standard cell libraries are optimized for speed or area. Low power libraries seek to optimize the parameters that will most impact current consumption. One of the parameters most commonly altered is the threshold voltage. As the chart below shows, raising the threshold voltage can reduce leakage drastically. The downside of raising Vt is that it also raises the minimum supply voltage requirement.
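The sensitivity of leakage to threshold voltage can be approximated with the common first-order subthreshold model, in which leakage current is proportional to exp(-Vt / (n·kT/q)). The Python sketch below uses that textbook model, not data from this paper. The slope factor n = 1.5 and the 300 K temperature are assumptions, and real values should come from the process and library characterization.

```python
import math

BOLTZMANN = 1.380649e-23           # Boltzmann constant, J/K
ELECTRON_CHARGE = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temp_k: float = 300.0) -> float:
    """Thermal voltage kT/q in volts (about 26 mV at 300 K)."""
    return BOLTZMANN * temp_k / ELECTRON_CHARGE


def leakage_ratio(delta_vt: float, slope_factor: float = 1.5,
                  temp_k: float = 300.0) -> float:
    """Relative subthreshold leakage after raising Vt by delta_vt volts.

    First-order model: I_leak is proportional to exp(-Vt / (n * kT/q)).
    slope_factor (n) is an assumed, process-dependent value.
    """
    return math.exp(-delta_vt / (slope_factor * thermal_voltage(temp_k)))


if __name__ == "__main__":
    for delta_mv in (50, 100, 200):
        ratio = leakage_ratio(delta_mv / 1000.0)
        print(f"Vt raised by {delta_mv} mV -> leakage x {ratio:.4f}")
```

Under these assumptions, raising Vt by 100 mV reduces subthreshold leakage by more than a factor of ten, which is consistent with the drastic reduction described above.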
Cells within a low power library are also handled differently. The channel widths are minimized to reduce leakage. This certainly has a disadvantage with respect to speed but again, for many ultra low power applications this is an acceptable tradeoff.

Digital Architecture techniques
The digital design can also have an impact on power, beginning at the architecture stage. As previously discussed, a counter implemented in Gray code will consume less switching current.
Another area that is somewhat controversial is the use of less synchronous design architectures. One of the simplest examples is the ripple counter. A ripple counter creates a new clock for every bit of the counter. Not only are these very fast counters, but they are very efficient with respect to internal current. Each flop in the chain is clocked only when the previous flop “rolls over.” This technique is somewhat problematic. The first problem is that its outputs do not change together. The first flip-flop in the chain will cause the second flop to be clocked, the second will clock the third, and so on. The design will see a delay approximately equal to a clock-to-out delay for each bit of the counter. Any decodes of these bits will be inherently glitchy. Ripple counters are also more difficult to handle during scan insertion. This difficulty can be reduced by muxing in a scan clock so that in scan mode, all of the flops operate in the same clock domain. This is not ideal from a fault coverage standpoint because the clock mux creates a path that is not covered when scan is enabled. The asynchronous concept can be extended well beyond ripple counters. Digital designers should consider using this technique in limited cases and under tight control.

Power gating
Power gating can be done at the cell level or the block level. Cell level power gating requires a library cell like the one shown below. Power gating cells have an internal state-saving latch that holds the state of the cell while the main power is shut off. This latch has its own power supply and can be made very small so the leakage of the latch is much less than that of the flip-flop itself. The cell has a sleep and wake-up signal that causes the state of the flip-flop to be stored to or restored from the latch. The advantage of a power-gated cell is that the state of the machine is automatically restored when power is restored.
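The save and restore sequence of such a cell can be illustrated with a small behavioral model. The Python class below is a hypothetical sketch of the sequence described above, not a model of any particular library cell. The class name, method names and the use of None for an undefined state are all assumptions made for illustration.

```python
from typing import Optional


class RetentionFlop:
    """Behavioral model of a power-gated flip-flop with a state-saving latch."""

    def __init__(self) -> None:
        self.q: Optional[int] = 0   # main flip-flop output; lost when gated
        self.latch: int = 0         # retention latch on the always-on supply
        self.powered: bool = True   # state of the switched main supply

    def clock(self, d: int) -> None:
        """Capture d on a clock edge; ignored while main power is off."""
        if self.powered:
            self.q = d

    def sleep(self) -> None:
        """Save the state to the latch, then switch off the main supply."""
        if self.powered:
            self.latch = self.q
            self.powered = False
            self.q = None  # main flop contents are undefined without power

    def wake(self) -> None:
        """Switch the main supply back on and restore the saved state."""
        if not self.powered:
            self.powered = True
            self.q = self.latch


if __name__ == "__main__":
    flop = RetentionFlop()
    flop.clock(1)
    flop.sleep()
    print("state while gated:", flop.q)
    flop.wake()
    print("state after wake-up:", flop.q)
```

The model shows why only the small latch needs to stay powered: the main flip-flop can lose its contents entirely during sleep and still resume from the stored value on wake-up.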
Block level power gating is also possible and does not require that two power supplies be routed throughout the design. The disadvantage is that the state of the machine must be restored from a memory or register bank that maintains its own power. Another concern with block level power gating is the interface between blocks. Care must be taken to make sure that current is not leaking through the interface IO.

Voltage Scaling
Voltage scaling is a technique that requires a sophisticated voltage regulator but can provide a significant extension to battery life. Temperature, process and voltage all act together to affect the speed of a device. As temperature increases, speed decreases. Process may vary the speed of a cell by as much as plus or minus 50 percent, and cells slow down as voltage decreases. These relationships can be used to compensate for one another. If a temperature sensor is on-die, the device can read its own temperature and adjust its voltage, ideally downward, while still meeting the application’s timing requirements. With the proper voltage regulator, this lower voltage will translate into power savings.
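A temperature-driven voltage selection of this kind might look like the Python sketch below. The lookup table values are hypothetical and would in practice come from library characterization across corners. The power estimate uses the standard relation that dynamic power scales with the square of the supply voltage at a fixed frequency and capacitance, which is a general approximation rather than a result from this paper.

```python
# Hypothetical table: (maximum die temperature in deg C, minimum supply in volts
# that still meets timing at that temperature). Values are placeholders only.
VOLTAGE_TABLE = [(25.0, 1.50), (85.0, 1.65), (125.0, 1.80)]
NOMINAL_V = 1.80  # assumed worst-case supply used without voltage scaling


def select_voltage(temp_c: float) -> float:
    """Pick the lowest characterized supply that meets timing at temp_c."""
    for max_temp, volts in VOLTAGE_TABLE:
        if temp_c <= max_temp:
            return volts
    # Outside the characterized range: fall back to the nominal supply.
    return NOMINAL_V


def dynamic_power_ratio(volts: float, v_ref: float = NOMINAL_V) -> float:
    """Dynamic power relative to v_ref, assuming P is proportional to V squared."""
    return (volts / v_ref) ** 2


if __name__ == "__main__":
    for temp in (20.0, 60.0, 110.0):
        v = select_voltage(temp)
        print(f"{temp:5.1f} C -> {v:.2f} V, "
              f"dynamic power x {dynamic_power_ratio(v):.2f}")
```

With this assumed table, a device at 20 °C runs at 1.50 V and draws roughly 69% of the dynamic power it would at the fixed 1.80 V worst-case supply.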
A more advanced technique is to incorporate a circuit that actually measures the speed of the device. This of course requires a stable timing source, usually an off-chip crystal, but most applications have this available. This approach has the benefit of taking into account both temperature and process variations.
It is important when using voltage scaling to have adequate timing margin along all of the voltage, temperature and process curves. This requires libraries that are characterized at many different corners. Also, estimating battery life can be very difficult when doing voltage scaling. This may not be as critical for consumer products but medical and industrial applications usually require some stated minimum battery life. Depending on the temperature range of the application, high process variations across device lots may make the product difficult to market.

Conclusion
As application demands increase toward more power sensitive devices, new and novel approaches are needed to meet those demands. The techniques discussed in this paper can help the SoC designer to meet those goals. Several of these techniques can be used in unison to provide the lowest power solution possible. Sleep mode, clock gating, power gating, gate level optimizations, low power libraries, low power architectures and voltage scaling are all proven low power techniques and should be considered when architecting your application.
It is important to partner with a silicon vendor that has the extensive experience, expertise and IP portfolio required to successfully design and manufacture ultra low power devices. AMI Semiconductor is committed to providing all of the above to its customers to help them meet the needs of the low power device market.

Bio

Ron Landry is the Manager of the Digital Product design group at AMI Semiconductor, Inc. He has a Bachelor of Science in Electrical Engineering from Oakland University, Rochester, MI and a Master of Science in Telecommunications from Southern Methodist University in Dallas, TX. Ron has 17 years of experience in circuit board, FPGA and ASIC design.