# Automated Selective Multi-Threshold Design For Ultra-Low Standby Applications

Kimiyoshi Usami, Naoyuki Kawabe, Masayuki Koizumi, Katsuhiro Seta, and Toshiyuki Furusawa\*

Toshiba Corporation Semiconductor Company \*Toshiba Microelectronics Corporation 580-1, Horikawa-cho, Saiwai-ku, Kawasaki 212-8520, JAPAN phone:+81(44)548-2344

kimiyoshi.usami@toshiba.co.jp

# ABSTRACT

This paper describes an automated design technique to selectively use multi-threshold CMOS (MTCMOS) in a cell-by-cell fashion. MT cells consisting of low-Vth transistors and high-Vth sleep transistors are assigned to critical paths, while high-Vth cells are assigned to non-critical paths. Compared to the conventional MTCMOS, the gate delay is not affected by the discharge patterns of other gates because there is no virtual ground to be shared. We applied this technique to a test chip of a DSP core. The worst path-delay was improved by 14% over the single high-Vth design without increasing standby leakage at 10% area overhead.

## **Categories and Subject Descriptors**

B.7.1 [Integrated Circuits]: Types and Design Styles – VLSI, DSP.

#### **General Terms**

Performance, Design, Experimentation.

#### Keywords

Automated design, Multi-Threshold, standby leakage current.

# **1. INTRODUCTION**

As semiconductor process technology advances, techniques to reduce the leakage current of MOS circuits become more important. In particular, low standby leakage is strongly required in LSIs for cell phones to prolong battery life. Moreover, high performance is required in DSP cores for new-generation cell phones such as W-CDMA. In the design of these LSIs, the tradeoff between performance and standby leakage current is critical. Lowering the threshold voltage (Vth) is required to achieve high performance. However, it leads to an exponential increase in sub-threshold leakage current [5]. Vth cannot be easily lowered when the requirement for leakage current is very tight. Another requirement is a short time-to-market. Techniques requiring major process modification are not allowed for this reason.

*ISLPED'02*, August 12-14, 2002, Monterey, California, USA. Copyright 2002 ACM 1-58113-475-4/02/0008...\$5.00.

Prior to a design of a DSP core for W-CDMA cell phones, we investigated conventional techniques that had been reported in papers. However, it was found that none of those techniques would satisfy the requirements of the chip in standby leakage current, performance and time-to-market.

This motivated us to develop a new design technique to reduce standby leakage current to the value equal to that of a high-Vth design while achieving the performance at low-Vth.

The rest of this paper is organized as follows. Section 2 presents conventional techniques. Section 3 describes a selective MT technique that we propose. Section 4 presents experimental results and Section 5 concludes the paper.

# 2. RELATED WORK

Several techniques have been reported that reduce standby leakage current while maintaining high performance. The dual-Vt technique [6] assigns low Vth to cells on critical paths and high Vth to cells off the critical paths. This reduces leakage current while achieving high performance. However, the leakage reduction in this technique is limited because low-Vth cells still exist on the critical paths. In other words, as long as the share of low-Vth cells is not negligible, the leakage current cannot be lowered to that of a single high-Vth design. It is therefore very difficult for a dual-Vt technique to reduce leakage current to the value of a high-Vth design.

The following three techniques can reduce standby leakage current further than the dual-Vt technique. A well-known approach is to design the entire circuit with low Vth and shut down the power in standby mode. However, this approach incurs a significant overhead at every mode change from active to standby and vice versa, because the memory data must be saved before shutdown and restored after wake-up. This overhead is not acceptable for cell phones in either timing or power. In standby mode, a cell phone intermittently pages the base station to exchange information on location, frequency, etc. This operation is repeated at a certain period, and there is not enough time to save and restore the data within that period. In addition, repeating the save and restore at every cycle wastes power.

Variable threshold voltage CMOS (VTCMOS) [3] is another technique to achieve high performance and low standby leakage. In active mode the circuit operates at low Vth, while in standby mode reverse body bias is applied so that the effective threshold voltage is raised to reduce leakage. However, this approach requires a charge pump circuit to generate the body bias in standby mode. The charge pump consumes power, which adds to the standby current. Process modification to a triple-well structure is required to avoid latch-up. Redesign to separate the substrate is required as well. A more critical problem, pointed out in [2], is that VTCMOS becomes less effective with technology scaling. As the technology advances, a larger reverse bias is required to minimize the subthreshold leakage. This increases junction leakage, which reduces the effectiveness of VTCMOS.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Multi-threshold CMOS (MTCMOS) [4] is a technique to reduce the leakage current during idle modes by providing a high threshold "sleep" transistor in series with the low-Vth circuit transistors, as shown in Figure 1. In active mode, the high-Vth transistor is turned on, while in sleep mode it is turned off, providing a small subthreshold leakage current.



Figure 1. Conventional MTCMOS

However, the following issues on MTCMOS have been pointed out in [1]. Because the virtual ground line is shared among logic gates, the voltage of the virtual ground fluctuates depending on the discharge pattern of the gates. In other words, the circuit speed varies depending on input vectors. The size of the sleep transistor needs to be determined with close consideration of input vector patterns. In addition, timing verification taking into account the voltage fluctuation of the virtual ground line is required in order to guarantee the circuit timing. These problems make design and verification difficult.

# **3. SELECTIVE MT TECHNIQUE**

## **3.1 Outline**

We propose a novel automated design technique that selectively uses multi-threshold (MT) cells in a circuit, which we call Selective MT. The basic structure of an MT cell is shown in Figure 2.



Figure 2. Basic structure of MT cell

An MT cell has a control input "MTE" (MT Enable) to switch the operation between active mode and standby mode. In the active mode, MTE is set to '1', so that the MT cell performs fast logic operation with its low-Vth transistors. In the standby mode, MTE is set to '0'. The high-Vth sleep transistor is turned off, cutting off the subthreshold leakage path from VDD to ground. In addition to this structure, a circuit has been incorporated into the MT cell to avoid a floating output at MTE='0'. Details are presented in Section 3.2. In the Selective MT technique, we use MT cells in critical paths and high-Vth cells in non-critical paths, as shown in Figure 3. We developed an automated design technique to synthesize Selective MT circuits and built a design environment up to layout.



Figure 3. Selective MT circuit

The Selective MT technique has the following advantages. First, it does not have the virtual ground line that appears in the conventional MTCMOS. Hence, the circuit speed is not affected by the discharge pattern of other gates, and the size of a sleep transistor can be determined independently to be optimal within each MT cell. Second, mode switching between active and standby can be performed in one clock cycle. Also, power consumption at mode switching is less than in the conventional MTCMOS: because sleep transistors exist only in MT cells on critical paths, the total width of the sleep transistors is much smaller than that of the conventional MTCMOS. Third, the data stored in flip-flops and latches are maintained even in standby mode because high-Vth cells are used for them. Fourth, static power in active mode is reduced as well in Selective MT. This is an advantage that neither the conventional MTCMOS nor VTCMOS has. Lastly, standby leakage current is reduced to the value of a high-Vth design, irrespective of the ratio of low-Vth transistors.

## **3.2 MT Cell and Library**

MT cells do not exist in the conventional ASIC library. Hence, we developed a set of MT cells and registered them as a library. It was not practical to develop MT cells corresponding to the entire ASIC library due to limited resources and time. We instead selected the MT cells to be developed under the following policies:

1. Exclude cells with small drive strength. Since MT cells are employed in critical paths, cells with small drive strength are not likely to be used.
2. Exclude flip-flops and latches.
3. Exclude high fan-in gates (e.g. 8-input gates). This is because these can be realized by a combination of 2-input or 3-input gates.
4. Add complex gates expected to contribute to speeding up critical paths.

Finally, we developed fifty-six MT cells including inverters, buffers, NANDs, NORs and several complex gates with variations of drive strength.

As shown in Figure 3, MT cells and high-Vth cells are cascaded in a Selective MT circuit. In the structure shown in Figure 2, the output of an MT cell may become floating at MTE='0'. This may cause a direct current path in the high-Vth cells located at the fan-out of the MT cell. To avoid this problem, we added a hold circuit to each MT cell to maintain the output voltage at MTE='0'. We designed the "latch-type" and "bypass-type" circuits shown in Figure 4.

High-Vth transistors are used in the latch and bypass portions. The minimum transistor size was chosen for them so that they do not affect circuit performance. Either the latch type or the bypass type was implemented, depending on the original cell type and taking cell area into account.
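The intended logical behavior of an MT cell with an output-hold circuit can be illustrated with a small behavioral model. This is only a sketch under our own assumptions: the class name `MTNand2`, the initial held value, and the "hold the last driven value" abstraction are illustrative, and the model does not distinguish the latch-type from the bypass-type circuit or capture any electrical detail.

```python
class MTNand2:
    """Behavioral (logic-level) model of a 2-input NAND MT cell.

    Hypothetical illustration only: it abstracts the output-hold circuit
    as "keep the last value driven in active mode".
    """

    def __init__(self, initial_z: int = 1):
        # Held output value; the initial state is an arbitrary assumption.
        self.z = initial_z

    def evaluate(self, a: int, b: int, mte: int) -> int:
        if mte == 1:
            # Active mode: the sleep transistor is on and the low-Vth
            # logic drives the output with the NAND of the inputs.
            self.z = 0 if (a and b) else 1
        # Standby mode (mte == 0): the sleep transistor is off, and the
        # hold circuit maintains the previous output so Z never floats.
        return self.z


cell = MTNand2()
active_out = cell.evaluate(1, 1, mte=1)    # NAND(1, 1) driven in active mode
standby_out = cell.evaluate(0, 0, mte=0)   # inputs change, output is held
resumed_out = cell.evaluate(0, 0, mte=1)   # back to active, NAND(0, 0)
```

Because the output keeps its last active-mode value during standby, the fan-out high-Vth cells always see a valid logic level, which is the property the hold circuit is meant to guarantee.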

As a circuit to hold the output voltage of an MT cell, we also considered an option to pull-up the output to '1' through an additional PMOS transistor or pull-down to '0' through an additional NMOS transistor. However, we did not adopt this option for the following reasons. First, the pull-up/pull-down type may cause futile signal transitions at every mode-change between active and standby. This leads to wasting dynamic power. Second, significant difference was not observed in cell area between the "latch/bypass" type and the "pull-up/pull-down" type. Clearly, transistor count is less in the "pull-up/pull-down" type than the "latch/bypass" type. However, it was found that the dominant factor to cell area was a sleep transistor instead of an output-hold circuit. The dimension of the sleep transistor is three times as large as low-Vth transistors. Meanwhile, the output-hold circuit is designed with the smallest dimension.



Figure 4. MT cell with output-hold circuit

We designed MT cells using 0.55V as high-Vth and 0.35V as low-Vth. By choosing the optimal size for the sleep transistor, an MT cell can achieve performance almost equal to that of the original cell built only with Vth=0.45V.
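To give a feeling for why the low-Vth transistors must be cut off in standby, the subthreshold leakage of the two thresholds can be compared with the standard first-order model $I_{\mathrm{sub}} \propto 10^{-V_{th}/S}$, where $S$ is the subthreshold swing. The value $S = 100$ mV/decade used below is a round number assumed purely for illustration; it is not a measured parameter of this process, and the model ignores temperature, drain-induced barrier lowering and stacking effects.

$$
\frac{I_{\mathrm{sub}}(V_{th}=0.35\,\mathrm{V})}{I_{\mathrm{sub}}(V_{th}=0.55\,\mathrm{V})} \approx 10^{(0.55-0.35)/S} = 10^{0.20/0.10} = 100
$$

Under this assumption, a low-Vth transistor leaks roughly two orders of magnitude more than a high-Vth transistor of the same size, so even a modest share of low-Vth cells would dominate standby leakage if left connected to the supply rails.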

## **3.3 Selective MT Synthesis**

A Selective MT circuit as shown in Figure 3 cannot be synthesized with commercially available logic synthesis tools, even if both MT cells and high-Vth cells are specified as the target library. The main reason is that, even when an MT cell and a high-Vth cell have the same logic function, their pin counts differ. For example, a high-Vth NAND2 cell has two input pins (A and B) and an output pin (Z), while an MT NAND2 cell has three input pins (A, B and MTE) and an output pin (Z). Hence, conventional tools are not capable of synthesizing a circuit in which MT cells are mapped to critical paths and high-Vth cells to non-critical paths.

To solve this problem, we have built the following design flow. First, we perform logic synthesis from RTL using a conventional tool with high-Vth cells. Next, we generate a Selective MT circuit shown in Figure 3 from the output of logic synthesis. We developed a tool identifying critical paths in a high-Vth circuit, and replacing high-Vth cells with MT cells so the entire circuit can meet the timing constraints. Cell replacement is performed in a backward fashion from primary outputs toward primary inputs.
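The replacement step can be sketched as follows. This is a minimal illustration and not the actual tool: the netlist representation (`fanin`, `outputs`), the per-cell delay tables (`d_high`, `d_mt`), the fixed delays without load or slew dependence, and the policy of replacing one cell per iteration are all assumptions made for the example.

```python
from typing import Dict, List, Set


def arrival_times(fanin: Dict[str, List[str]],
                  delay: Dict[str, int]) -> Dict[str, int]:
    """Longest-path arrival time at each cell output (netlist must be a DAG)."""
    memo: Dict[str, int] = {}

    def at(cell: str) -> int:
        if cell not in memo:
            memo[cell] = delay[cell] + max((at(p) for p in fanin[cell]), default=0)
        return memo[cell]

    for cell in fanin:
        at(cell)
    return memo


def selective_mt(fanin: Dict[str, List[str]], outputs: List[str],
                 d_high: Dict[str, int], d_mt: Dict[str, int],
                 t_max: int) -> Set[str]:
    """Return the set of cells to swap from high-Vth to MT so timing is met."""
    mt: Set[str] = set()
    while True:
        delay = {c: (d_mt[c] if c in mt else d_high[c]) for c in fanin}
        at = arrival_times(fanin, delay)
        worst = max(outputs, key=lambda c: at[c])
        if at[worst] <= t_max:
            return mt
        # Walk the current worst path backward from the primary output
        # and replace the first cell that is still a high-Vth cell.
        cell = worst
        while cell in mt:
            preds = fanin[cell]
            if not preds:
                raise ValueError("timing not met even with MT cells on the worst path")
            cell = max(preds, key=lambda p: at[p])
        mt.add(cell)


# Hypothetical netlist: a 3-stage path g1 -> g2 -> g3 and a 2-stage path g4 -> g5.
fanin = {"g1": [], "g2": ["g1"], "g3": ["g2"], "g4": [], "g5": ["g4"]}
cells = list(fanin)
replaced = selective_mt(fanin, outputs=["g3", "g5"],
                        d_high={c: 100 for c in cells},   # delays in ps (assumed)
                        d_mt={c: 70 for c in cells},
                        t_max=250)
```

In this toy example only the two cells nearest the output of the long path are replaced, while the short path keeps its high-Vth cells, mirroring the idea that MT cells are inserted only where the timing constraint requires them.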

We also provided the tool with a "MTE hook-up" capability. In the high-Vth netlist, the signal MTE exists only at the top module. In other words, MTE does not exist in sub-modules in a hierarchical netlist. It is not until an MT cell is assigned that the signal MTE is required at sub-modules. The hook-up function automatically adds necessary ports and wires for MTE to each level of the hierarchical netlist and hooks it up to the top module.
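The hook-up step can be pictured with the following sketch. The dictionary-based netlist, the module and instance names, and the `_MT` suffix used to recognize MT cells are all hypothetical choices for the example; the actual tool's data structures are not described here.

```python
from typing import Callable, Dict, List, Tuple


def hook_up_mte(modules: Dict[str, dict],
                is_mt_cell: Callable[[str], bool]) -> List[Tuple[str, str]]:
    """Add an MTE port to every module that contains an MT cell, directly
    or through sub-modules, and list the (module, instance) pairs whose
    MTE pin must be wired to the parent module's MTE net."""
    needs: Dict[str, bool] = {}

    def needs_mte(name: str) -> bool:
        if name not in needs:
            needs[name] = any(
                is_mt_cell(ref) or (ref in modules and needs_mte(ref))
                for _, ref in modules[name]["instances"]
            )
        return needs[name]

    for name in modules:
        needs_mte(name)

    connections: List[Tuple[str, str]] = []
    for name, mod in modules.items():
        if not needs[name]:
            continue
        mod["ports"].add("MTE")  # no effect if the port already exists
        for inst, ref in mod["instances"]:
            if is_mt_cell(ref) or (ref in modules and needs[ref]):
                connections.append((name, inst))
    return connections


# Hypothetical hierarchy: only the top module has an MTE port initially.
modules = {
    "top":  {"ports": {"CLK", "MTE"},
             "instances": [("u_core", "core"), ("u_io", "io")]},
    "core": {"ports": {"CLK"},
             "instances": [("u_alu", "alu"), ("u1", "NAND2_HVT")]},
    "alu":  {"ports": {"A"}, "instances": [("u2", "NAND2_MT")]},
    "io":   {"ports": {"CLK"}, "instances": [("u3", "INV_HVT")]},
}
wires = hook_up_mte(modules, is_mt_cell=lambda ref: ref.endswith("_MT"))
```

After the call, `core` and `alu` have gained an MTE port while `io`, which contains no MT cell, is left untouched.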

In the layout process of a Selective MT circuit, we paid special attention to the routing of the signal MTE. Because the signal MTE has a large fan-out, connecting all its sinks only with metal wire and driving them with a single buffer may cause problems such as electro-migration. Hence, we adopted an approach that automatically generates a buffer-tree structure for the signal MTE in a clock-tree-synthesis (CTS) fashion. This enables us to build an optimal buffer-tree structure taking into account the location of the MT cells. In practice, since close consideration of skew matching is not required for MTE, we only perform tree construction and buffer placement.
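A simplified picture of location-aware buffer-tree construction without skew balancing is given below. The clustering rule (sort by coordinates and group up to `max_fanout` sinks per buffer), the centroid placement, and the fan-out limit are assumptions for illustration only; a production CTS engine uses far more elaborate clustering and legal placement.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Level = List[Tuple[Point, List[Point]]]  # (buffer location, driven points)


def build_mte_tree(sinks: List[Point], max_fanout: int) -> List[Level]:
    """Build a buffer tree bottom-up; each buffer drives at most
    max_fanout points. Returns the levels from the leaves to the root."""
    if not sinks:
        raise ValueError("at least one MTE sink is required")
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")

    levels: List[Level] = []
    points = list(sinks)
    while True:
        points.sort()  # crude spatial ordering: by x, then by y
        level: Level = []
        for i in range(0, len(points), max_fanout):
            group = points[i:i + max_fanout]
            # Place the buffer at the centroid of the points it drives.
            cx = sum(p[0] for p in group) / len(group)
            cy = sum(p[1] for p in group) / len(group)
            level.append(((cx, cy), group))
        levels.append(level)
        if len(level) == 1:
            return levels  # a single root buffer remains
        # The buffers of this level become the sinks of the next level.
        points = [buf for buf, _ in level]


# Ten hypothetical MT-cell locations (arbitrary units).
mt_cell_xy = [(float(x), float(x % 3)) for x in range(10)]
tree = build_mte_tree(mt_cell_xy, max_fanout=4)
```

With ten sinks and a fan-out limit of four, the sketch produces three leaf buffers and one root buffer, so no single driver sees the entire MTE load.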

# 4. EXPERIMENTAL RESULTS

We applied the Selective MT technique to a test chip of a DSP core for W-CDMA cell phones. Selective MT synthesis was applied to a module containing 34K cells. This module cannot meet the timing constraints for 100MHz if we use only high-Vth (0.55V) at VDD=1.5V in 0.18µm CMOS technology. By applying the Selective MT synthesis, 12% of the high-Vth cells in the module were replaced with MT cells, as shown in Table 1.

Table 1. Cell count before and after Selective MT synthesis

|        | Total | High-Vth cells | MT cells   |
|--------|-------|----------------|------------|
| Before | 34204 | 34204          | 0          |
| After  | 34204 | 29978 (88%)    | 4226 (12%) |

We analyzed the worst path-delay before and after the Selective MT synthesis. In the worst path containing 53 stages of gates, 30 stages of gates were replaced with MT cells, resulting in meeting the timing constraints. The worst path-delay was improved from 10.27ns to 8.85ns, leading to 14% improvement. The worst path was not changed between before and after the Selective MT synthesis.
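The figures quoted above can be cross-checked with a few lines of arithmetic. The script below uses only numbers stated in this section; the variable and function names are ours.

```python
def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


clock_period_ns = 1e3 / 100.0      # 100 MHz clock corresponds to a 10 ns period
delay_before_ns = 10.27            # worst path, high-Vth only
delay_after_ns = 8.85              # worst path after Selective MT synthesis

# Relative improvement of the worst path-delay (about 13.8%, quoted as 14%).
improvement_pct = percent(delay_before_ns - delay_after_ns, delay_before_ns)

total_cells = 34204
mt_cells = 4226
high_vth_cells = total_cells - mt_cells
mt_share_pct = percent(mt_cells, total_cells)

meets_before = delay_before_ns <= clock_period_ns   # False: 10.27 ns > 10 ns
meets_after = delay_after_ns <= clock_period_ns     # True: 8.85 ns < 10 ns

print(f"improvement: {improvement_pct:.1f}%  MT share: {mt_share_pct:.1f}%")
```

The check confirms that the high-Vth-only design misses the 10 ns period while the Selective MT design meets it, and that the cell counts in Table 1 are mutually consistent.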

Figure 5 shows a photograph of the test chip. We applied the Selective MT technique to a part of random logic located in the center of the chip. Area overhead was 10% in the part to which we applied the technique. The chip operated at 100MHz.



Figure 5. Photograph of test chip

We measured the standby leakage current of the test chip. The power supply for the random logic part was separated from other portions at design time so that the current of the random logic part alone could be measured. Results on the temperature dependence of the leakage current are shown in Figure 6. We measured leakage current at MTE='0' and at MTE='1'. At MTE='0', the sleep transistors in the MT cells are turned off. Hence, the standby leakage current of the Selective MT technique is observed. In contrast, at MTE='1', the sleep transistors in the MT cells are turned on. Hence, the observed leakage is that flowing through low-Vth transistors in critical paths and through high-Vth transistors in non-critical paths.



Figure 6. Results on leakage measurement

At 85°C, the leakage current of sample A was 86 µA at MTE='1', and it was reduced to 28 µA at MTE='0'. We also measured the leakage current of another sample (sample B) fabricated under the same conditions as sample A. Similar results were observed for sample B, as shown in Figure 6. The measured leakage current at MTE='0' was within our target range, being almost equal to the estimated value for a high-Vth design.

In the test chip, we applied the Selective MT technique to a portion occupying approximately 30% of the entire random logic in gate count. Even though the technique was applied to only this portion, the leakage current of the entire random logic at MTE='0' was reduced to between 1/2 and 1/3 of that at MTE='1'. As described above, the leakage current at MTE='1' can be interpreted as that of a design using high-Vth and low-Vth transistors depending on path criticality. Compared to a technique that simply mixes high and low Vth depending on path criticality, the Selective MT technique enables a further reduction of standby leakage.
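For sample A at the highest measured temperature, the reduction follows directly from the two measured currents quoted in this section:

$$
\frac{I_{\mathrm{leak}}(\mathrm{MTE}=0)}{I_{\mathrm{leak}}(\mathrm{MTE}=1)} = \frac{28\ \mu\mathrm{A}}{86\ \mu\mathrm{A}} \approx 0.33
$$

That is, the standby leakage of this sample is roughly one third of the leakage with the sleep transistors turned on.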

Finding a good MTE signal is key to the Selective MT technique. If the MTE signal toggles very frequently, it consumes large dynamic power, leading to a power overhead. We chose the standby signal of the entire chip as the MTE signal in this experiment. In the application to cell phones, this approach not only minimizes standby leakage current but also makes the dynamic-power overhead negligible. In the intermittent operation of cell phones described in Section 2, the toggling between active and standby occurs with a period on the order of 100ms to one second. Meanwhile, normal logic signals operating synchronously with a 100MHz clock toggle every 10ns if the switching activity is 1. Assuming the average switching activity of the logic signals to be 0.2, they toggle every 50ns. Thus, the toggle between active and standby occurs $10^{6}$ to $10^{7}$ times less frequently than that of the normal logic signals. Another factor affecting dynamic power is load capacitance. It should be noted that, in the Selective MT technique, the MTE signal drives only the MT cells instead of the entire circuit. The MT cells account for only 12% of the cells in the circuit. From these analyses, using the standby signal as the MTE signal is effective in the cell phone application for minimizing standby leakage current with negligible dynamic-power overhead.
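The toggle-rate comparison above can be reproduced numerically. The script uses only the values stated in the text (100MHz clock, switching activity 0.2, mode-change period of 100ms to 1s); the names are ours.

```python
import math

clock_hz = 100e6
switching_activity = 0.2

# Average interval between toggles of a normal logic signal:
# one clock period (10 ns) divided by the switching activity gives 50 ns.
logic_toggle_interval_s = (1.0 / clock_hz) / switching_activity


def toggle_ratio(mode_period_s: float) -> float:
    """How many times less often MTE toggles than a normal logic signal."""
    return mode_period_s / logic_toggle_interval_s


ratio_100ms = toggle_ratio(0.1)   # mode change every 100 ms
ratio_1s = toggle_ratio(1.0)      # mode change every second

print(f"{ratio_100ms:.1e} to {ratio_1s:.1e} times less frequent")
```

The resulting ratios fall in the $10^{6}$ to $10^{7}$ range stated in the text, which is why the dynamic power spent on toggling MTE is negligible compared with that of ordinary logic signals.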

For applications other than cell phones, one possible approach is to generate MTE signals on the chip and control them at a much finer granularity. Research on algorithms to find appropriate MTE signals is future work.

# 5. CONCLUSIONS

We have proposed a Selective MT technique to reduce standby leakage current while achieving high performance. MT cells are selectively assigned to critical paths, while high-Vth cells are assigned to non-critical paths. We developed a set of MT cells as a library and built a design environment including Selective MT synthesis and layout. The effectiveness of the technique has been examined and proved by applying it to a test chip of a DSP core for W-CDMA cell phones. The worst path-delay was improved by 14% over the single high-Vth design without increasing standby leakage at 10% area overhead. Future work involves finding appropriate MTE signals for finer leakage optimization.

## 6. ACKNOWLEDGMENTS

The authors would like to thank K. Ochii, T. Yoshimori, S. Imai, K. Matsuo, M. Sahoda, S. Nishio, T. Mori, M. Yamada and H. Ohta for their support. They also would like to thank T. Ishikawa, H. Zama, D. Sonoda, K. Mori and M. Mushiga for their contribution.

# 7. REFERENCES

- [1] J. Kao, A. Chandrakasan, D. Antoniadis, "Transistor Sizing Issues and Tool For Multi-Threshold CMOS Technology", DAC-97, pp.409-414, 1997.
- [2] A. Keshavarzi, et al, "Technology Scaling Behavior of Optimum Reverse Body Bias for Standby Leakage Power Reduction in CMOS IC's", ISLPED'99, pp.252-253, 1999.
- [3] T. Kuroda, et al, "A 0.9V 150MHz 10mW 4mm<sup>2</sup> 2-D discrete cosine transform core processor with variable-threshold-voltage scheme", *IEEE J. Solid-State Circuits*, vol.31, pp.1770-1779, Nov. 1996.
- [4] S.Mutoh, T. Douseki, Y. Matsuya, T. Aoki, S. Shigematsu, J. Yamada, "1-V Power Supply High-Speed Digital Circuit Technology with Multithreshold-Voltage CMOS", IEEE JSSC, vol.30, no.8, pp.847-854, Aug. 1995.
- [5] K. Roy and S. Prasad, "Low-Power CMOS VLSI Circuit Design", pp.214-222, John Wiley & Sons, Inc., 2000.
- [6] L. Wei, Z. Chen, M. Johnson, K. Roy, "Design and Optimization of Low Voltage High Performance Dual Threshold CMOS Circuits", DAC-98, pp.489-494, 1998.