Yi-Ping Zhang, Zhao-Yong Zhang, Jian-Bin Zheng - AiceStar Technology Corporation
Kuen-Di Lee - Innopower Technology Corporation
In this paper, an optimized power gating design for a 55-nm Static Random Access Memory (SRAM) compiler is presented. Two low-leakage modes, retention and sleep, are discussed. The arrangement of the power gating (P.G.) MOS is given special consideration for compiler design. The proposed method offers a clear advantage in leakage control in the low-leakage modes of a memory compiler. Simulation data show a 4× leakage reduction in retention mode and a 50× leakage reduction in sleep mode for a 512k density instance compared with the original design.
As process technology scales down, the leakage power consumed by transistors grows dramatically. A large part of SoC area is occupied by embedded memory, so leakage control is a severe challenge for memory design. Gating off the power to decrease the rail-to-rail voltage when the memory is idle is an effective way to reduce not only sub-threshold leakage but also gate leakage.
Many power gating methods have been discussed. One earlier design uses the self-reversed-bias (SRB) method: in sleep mode the footer NMOS is cut off, and in retention mode GND is bounced by ηVt. It needs a DC/DC converter to control the gate bias and does not share the gating MOS between retention and sleep modes, which adds area. A second design likewise does not share the gating MOS. A third uses NMOS devices for both retention and sleep modes; although it offers several options, it still needs careful sizing to keep the retention voltage under control. All of these designs are discussed for a single macro, without considering compiler design.
In this paper, the arrangement of the power gating MOS is discussed and compared. We implement a dual-port high-density SRAM compiler with a power gating feature, optimized for compiler design by applying the proposed method.
The rest of this paper is organized as follows. Section 2 discusses the proposed arrangement of the power gating design in the compiler. Section 3 shows the experimental results. Section 4 gives the summary.
Proposed power gating design
Fig. 1 shows the basic power gating element in this compiler. MP0 and MP1 are HVT devices, and their channel length is increased to further reduce the gated-off leakage. The compiler has three basic modes: standby, retention and sleep.
Standby mode: When input pins A, B (active high) and C are low, the instance enters standby mode. There is little voltage difference between VCC and Vcell, and likewise between VCC and Vper. The SRAM is ready for read and write access.
Retention mode: When A is high-z and B is high, MP0 forms a diode connection. Pin C is high to cut off MP1. The SRAM is in a low-leakage mode, but the data in the cells are still retained. Unlike in earlier work, this mode also cuts off the periphery circuit power to reduce leakage further. The mode is configured this way because, in a compiler, small, wide or thin aspect-ratio instances have more periphery circuit leakage than array leakage.
Sleep mode: When A is high and B is low, MP0 is cut off and Vcell is determined by the resistances of the array and MP0. MP1 is also cut off, by setting C high, to reduce periphery circuit leakage. Vper and Vcell are expected to be near GND for minimum leakage.
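The three modes described above can be summarized as a small lookup, sketched here in Python. The names `Level` and `decode_mode` are hypothetical and used only for illustration; pin combinations that the text does not describe are reported as undefined rather than guessed.

```python
from enum import Enum


class Level(Enum):
    """Logic level on a control pin (illustrative names, not from the design)."""
    LOW = "0"
    HIGH = "1"
    HIGH_Z = "Z"


# Pin combinations (A, B, C) taken from the mode descriptions in the text.
MODE_TABLE = {
    (Level.LOW, Level.LOW, Level.LOW): "standby",
    (Level.HIGH_Z, Level.HIGH, Level.HIGH): "retention",
    (Level.HIGH, Level.LOW, Level.HIGH): "sleep",
}


def decode_mode(a, b, c):
    """Return the mode for pins A, B, C, or 'undefined' if not described."""
    return MODE_TABLE.get((a, b, c), "undefined")


if __name__ == "__main__":
    print(decode_mode(Level.HIGH_Z, Level.HIGH, Level.HIGH))
```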
Figure 1. P.G. MOS operation truth table (a) in array (b) in periphery circuit
Data Retention Voltage (DRV) has been discussed explicitly in earlier work: one study concentrates on retention cell design at extremely low VCC, while another tracks global process variation to find the minimum voltage for retention. We must also pay attention to DRV so that data are retained in retention mode. Unlike in earlier work, our retention voltage is controlled by diode clamping. The retention voltage is about one PMOS threshold voltage below the power supply. MP0 is sized slightly larger to avoid an excessive drop of the retention voltage across PVT variation.
Figure 2. Three methods of arrangement of P.G. MOS
The layout arrangement of the power gating MOS has not been discussed in detail in previous work. We emphasize it here for compiler design. The MP0 size must change with the number of words and bits in the compiler. Ideally, the ratio between the MP0 size and the array density is constant, so that Vcell stays stable in retention and sleep modes over the whole configuration range of the compiler. An unsuitable arrangement of the power gating MOS causes a large variation of Vcell in the low-leakage modes. Fig. 2 shows three arrangements. Rule (A): MP0 is arranged in the vertical direction, and a constant amount of P.G. MOS is added only when the number of words increases. Rule (B): MP0 is arranged in both the vertical and horizontal directions, and a constant amount of P.G. MOS is added when either the number of words or the number of bits increases. Rule (C): MP0 is arranged in the vertical direction, and P.G. MOS is added when the number of words increases, with the size of each increment determined by the column number.
Figure 3. P.G. size increment change with column number for (C)
In this SRAM compiler, instances of every aspect ratio must meet the minimum requirement for the power gating MOS: 10 µm per kilo density, which guarantees safe retention mode operation in the UMC 55-nm process. The rule (A), (B) and (C) arrangements are therefore determined as follows. Rule (A) adds 20 µm of P.G. MOS for every four rows. Rule (B) adds 10 µm of P.G. MOS for every four rows and 20 µm for every four columns. Rule (C) adds a variable amount for every four rows, as given in Fig. 3; it uses eight discrete steps to approximate an MP0 increment M ∝ column number. To compare the three arrangements, four instances of different aspect ratio (min, fat, thin, max) are taken from the compiler. Table 1 shows the MP0 size of each instance, with the size normalized per kilo density in brackets. The MP0 size of the max-density instance is the same under all three rules, so that the other aspect-ratio instances can be compared fairly.
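The three sizing rules can be sketched in Python as follows. Only the 20 µm and 10 µm increments, the four-row and four-column granularity, the eight steps and the 10 µm per kilo density minimum come from the text. The step values of Fig. 3 are not reproduced here, so the sketch assumes eight equal-width column steps up to a hypothetical maximum of 512 columns and takes one kilo density to mean 1024 bits; all function and constant names are illustrative.

```python
import math

KBIT = 1024              # assumption: "kilo density" means 1024 bits
MIN_UM_PER_KBIT = 10.0   # minimum P.G. MOS width per kilo density (from the text)
MAX_COLS = 512           # assumption: widest array, not stated in the text
STEPS = 8                # eight discrete increment steps for rule (C)


def size_a(rows, cols):
    """Rule (A): 20 um for every four rows, independent of columns."""
    return 20.0 * (rows // 4)


def size_b(rows, cols):
    """Rule (B): 10 um per four rows plus 20 um per four columns."""
    return 10.0 * (rows // 4) + 20.0 * (cols // 4)


def increment_c(cols):
    """Rule (C) increment M: column count rounded up to one of 8 steps.

    The equal-width steps are an assumption standing in for Fig. 3.
    """
    step_cols = MAX_COLS // STEPS
    step = min(STEPS, math.ceil(cols / step_cols))
    return MIN_UM_PER_KBIT * 4 * (step * step_cols) / KBIT


def size_c(rows, cols):
    """Rule (C): increment M (set by columns) for every four rows."""
    return increment_c(cols) * (rows // 4)


def per_kbit(size_um, rows, cols):
    """P.G. MOS width normalized to one kilo density."""
    return size_um * KBIT / (rows * cols)


if __name__ == "__main__":
    for rows, cols in [(64, 16), (64, 512), (1024, 16), (1024, 512)]:
        sizes = [f(rows, cols) for f in (size_a, size_b, size_c)]
        print(rows, cols, [round(per_kbit(s, rows, cols), 1) for s in sizes])
```

Under these assumptions the three rules give the same size for the largest instance, while only rule (C) keeps the normalized size close to the 10 µm minimum for the other aspect ratios.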
Table 1. P.G. MOS size of four aspect ratio instances
Table 2 shows the Vcell voltage in retention and sleep modes for the four aspect-ratio instances. The simulation condition is the FF corner (fast PMOS, fast NMOS) at 125 °C. The power supply is set to 0.8 V in retention mode and 1.1 V in sleep mode. It can be seen that methods (A) and (B) cannot maintain stable retention and sleep voltages.
Method (C) achieves a comparatively stable rail-to-rail voltage because, for (C), P.G. MOS size ∝ rows × M ∝ rows × columns, so its P.G. MOS size ∝ density. Neither (A) nor (B) has this property. After comparison and analysis, method (C) is proposed and applied in the design because its normalized power gating MOS size is more stable.
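The proportionality argument can be written out with the increments given earlier. The symbols are introduced here for illustration only: $R$ is the number of rows, $C$ the number of columns, $D = RC$ the density, $W$ the total MP0 width in µm, and $k$ the assumed slope of the rule (C) increment, $M \approx kC$. Rounding to groups of four rows or columns and the discrete steps of $M$ are ignored.

```latex
\begin{align*}
W_A &= 20\,\frac{R}{4} = 5R
  &&\Rightarrow\quad \frac{W_A}{D} = \frac{5}{C}, \\
W_B &= 10\,\frac{R}{4} + 20\,\frac{C}{4} = 2.5R + 5C
  &&\Rightarrow\quad \frac{W_B}{D} = \frac{2.5}{C} + \frac{5}{R}, \\
W_C &= M\,\frac{R}{4} \approx \frac{k}{4}\,RC
  &&\Rightarrow\quad \frac{W_C}{D} \approx \frac{k}{4}.
\end{align*}
```

Only $W_C / D$ is independent of the aspect ratio, which is consistent with (C) giving the most stable Vcell across instances.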
Table 2. Vcell voltage in retention and sleep mode for four aspect ratio instances
Table 3. Comparison of array leakage in retention mode (ratio compared to (C) shown in brackets)
Table 4. Comparison of array leakage in sleep mode (ratio compared to (C) shown in brackets)
Table 3 and Table 4 show the array leakage in retention and sleep modes and its ratio to that of method (C). In retention mode, (A) and (B) have up to 44% more unnecessary leakage than (C). In sleep mode, (B) leaks about 30× as much as (C) in the fat aspect-ratio instance. The sub-threshold current in sleep mode is strongly affected by the drain voltage because of the short-channel effect DIBL (drain-induced barrier lowering). This effect can be expected to worsen further when scaling down to 40-nm or 32-nm processes.
Fig. 4 shows how the MP0 size normalized per kilo density changes with column and row number using method (C). The peak value occurs at the minimum column number because the power gating size steps are discrete rather than continuous. An alternative method, in which the power gating MOS is arranged horizontally along the columns and its size increment changes with the number of rows, can also be employed to realize power gating size ∝ density. Which method to use is decided by the metal power plan of the array or the floor-plan of the instance. Higher metal levels are routed in parallel to reduce the IR drop on Vcell.
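The drain-voltage dependence can be made explicit with the standard textbook sub-threshold current model. The symbols $I_0$, $n$, $v_T$ (thermal voltage), $V_{T0}$ (zero-bias threshold) and $\lambda_D$ (DIBL coefficient) are generic notation introduced here for illustration and are not taken from the simulation setup:

```latex
\[
I_{sub} = I_0 \exp\!\left(\frac{V_{GS} - V_{T0} + \lambda_D V_{DS}}{n\,v_T}\right)
\left(1 - e^{-V_{DS}/v_T}\right)
\]
```

Because $V_{DS}$ appears inside the exponent through the $\lambda_D V_{DS}$ term, a higher residual Vcell in sleep mode raises the cell leakage exponentially rather than linearly, and $\lambda_D$ generally grows as the channel length shrinks.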
Figure 4. P.G. size normalized to per kilo density change with column and row number for (C)
Figure 5. (a) 3-bit setting for P.G. MOS (b) Retention voltage versus 3-bit settings.
In this design, 3-bit settings are used to select the power gating PMOS, as shown in Fig. 5(a). The selectable devices are binary weighted to provide 2^N Vcell voltage levels in retention and sleep modes, which covers global process variation, as shown in Fig. 5(b). This makes it flexible to port the design between processes without further resizing and re-layout, which saves considerable effort. The settings can be chosen by simulation and readjusted from test results to obtain a proper retention voltage.
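A minimal Python sketch of binary-weighted selection is given below. The unit width, the optional always-on base width, and the convention that a set bit enables its leg are assumptions made for illustration; the actual device sizes and polarity of the settings are not given in the text.

```python
def leg_widths(unit_um, n_bits=3):
    """Widths of the binary-weighted legs: unit, 2*unit, 4*unit, ..."""
    return [unit_um * (1 << i) for i in range(n_bits)]


def enabled_width(code, unit_um, n_bits=3, base_um=0.0):
    """Total enabled P.G. PMOS width for one setting code.

    Assumes bit i set means leg i is enabled; base_um is a hypothetical
    always-on width.
    """
    if not 0 <= code < (1 << n_bits):
        raise ValueError("code out of range for the given number of bits")
    legs = leg_widths(unit_um, n_bits)
    return base_um + sum(w for i, w in enumerate(legs) if (code >> i) & 1)


def all_settings(unit_um, n_bits=3, base_um=0.0):
    """Map every N-bit code to its enabled width (2**N distinct values)."""
    return {code: enabled_width(code, unit_um, n_bits, base_um)
            for code in range(1 << n_bits)}


if __name__ == "__main__":
    for code, width in all_settings(unit_um=2.0).items():
        print(f"{code:03b} -> {width} um")
```

With three bits this yields eight distinct widths in equal increments, which is what allows a uniform sweep of Vcell levels from a small number of control pins.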
Figure 6. Layout of the test-chip
A test-chip with 14 memories has been designed in UMC 55-nm standard-performance CMOS logic technology; its layout is shown in Fig. 6. Fabrication is ongoing. To compare methods (A) and (C), two 32K-density instances have been placed on the test-chip: one applies (A) and the other applies (C). Simulation results show that (C) achieves 20% lower retention leakage and 27% lower sleep leakage than (A). Compared with the original design, leakage is reduced by about 4× in retention mode and about 50× in sleep mode.
The arrangement of the array power gating MOS for an SRAM compiler is discussed here. An arrangement in which the power gating size is directly proportional to the array density is recommended. With it, the retention and sleep voltages are much more stable than when the arrangement is not considered, which optimizes leakage control in compiler design. 3-bit settings are applied to further increase design flexibility. The method has already been applied in a 55-nm dual-port high-density SRAM compiler, and test-chip results are awaited for further confirmation.
It is our pleasure to thank You-Wei Yeh for help with the test-chip design, and Sam, Tim and our excellent memory team for helpful discussions on the design.
[1] Bhavnagarwala, S. V. Kosonocky, M. Immediato, D. Knebel, and A.-M. Haen, “A pico-joule class, 1 GHz, 32 kB 64 b DSP SRAM with self reversed bias,” in Symp. VLSI Circuits Dig. Tech. Papers, pp. 251–252, Jun. 2003.
[2] Chung-Hsien Hua et al., “Distributed Data-Retention Power Gating Techniques for Column and Row Co-Controlled Embedded SRAM,” IEEE International Workshop on Memory Technology, Design and Testing (MTDT), pp. 780–785, 2005.
[3] Yih Wang et al., “A 1.1 GHz 12 µA/Mb-Leakage SRAM Design in 65 nm Ultra-Low-Power CMOS Technology With Integrated Leakage Reduction for Mobile Applications,” IEEE J. Solid-State Circuits, vol. 43, pp. 172–179, 2008.
[4] Yih Wang et al., “A 4.0 GHz 291 Mb Voltage-Scalable SRAM Design in a 32 nm High-k + Metal-Gate CMOS Technology With Integrated Power Management,” IEEE J. Solid-State Circuits, vol. 45, pp. 103–110, 2010.
[5] H. Qin, Y. Cao, D. Markovic, A. Vladimirescu, and J. Rabaey, “SRAM leakage suppression by minimizing standby supply voltage,” in Proc. Int. Symp. Quality Electronic Design (ISQED), pp. 55–60, 2004.
[6] J. Wang et al., “Canary Replica Feedback for Near-DRV Standby VDD Scaling in a 90nm SRAM,” IEEE 2007 Custom Integrated Circuits Conference (CICC), pp. 29–32, 2007.