# Energy-efficient Dynamic Circuit Design in the Presence of Crosstalk Noise

Ganesh Balamurugan and Naresh R. Shanbhag

VLSI Information Processing Systems (VIPS) Group Coordinated Science Laboratory/ECE Department University of Illinois at Urbana-Champaign 1308 West Main Street, Urbana, IL 61801
E-mail : [balamuru, shanbhag]@uivlsi.csl.uiuc.edu

#### Abstract

This paper describes the impact of crosstalk noise on low power design techniques based on voltage scaling. It is shown that this power saving strategy aggravates the crosstalk noise problem and reduces circuit noise immunity. A new energy-efficient, noise-tolerant dynamic circuit technique is presented to address this problem. In a $0.35\mu m$ CMOS technology and at a given supply voltage, the proposed technique provides an improvement in noise immunity of 1.8X (for an AND gate) and 2.5X (for an adder carry chain) over domino at the same speed. We use this fact to operate the noise-tolerant circuit at a lower supply voltage to obtain energy savings of about 30%, while expending 30% more area. Also, to achieve a given noise immunity, the proposed technique consumes 40% less energy compared to existing noise-tolerance techniques.

## 1. Introduction

Over the last few years, much work has been done to develop low-power techniques at various stages of the design cycle [1]. Most of this work considered the tradeoffs/interplay among delay, power dissipation, and area [2, 3, 4]. Today, as a result of the relentless scaling of device and interconnect dimensions, in addition to the above, noise has emerged as an important design parameter [5].

Deep submicron noise is the general term used to designate any phenomenon that causes the voltage at a nonswitching node to deviate from its nominal value [5]. It thus includes [6] power supply noise caused by circuit switching, crosstalk noise due to capacitive coupling between neighboring interconnects, and fluctuations in device parameters due to process variations [7]. For high speed dynamic logic circuits, charge-sharing and leakage [8] are additional noise sources. While these noise phenomena have always existed, it is only recently that technology scaling and aggressive design practices have brought them to the forefront. In this paper, we will be concerned with crosstalk noise and its effect on low-power dynamic logic circuits, which are especially susceptible to noise [9].

Crosstalk noise between interconnects is expected to become increasingly significant with growing interconnect aspect ratios [10], which cause a larger fraction of the wire capacitance to be lateral coupling capacitance. Capacitive coupling causes a voltage glitch on a quiet wire, usually called the victim, due to the switching of a neighboring wire, termed the aggressor. This voltage glitch, appearing at the input of a logic gate, can cause false switching if it has sufficient amplitude and persists for a sufficient amount of time. Hence, in this paper, we term this effect of crosstalk input noise.

The principal contributions of this paper are: (1) to show that the crosstalk noise problem is aggravated by voltage scaling, and (2) to propose an energy-efficient dynamic circuit technique that addresses this problem. In section 2, we present some analysis and simulation results to prove the first point. The increased sensitivity to input noise caused by voltage scaling needs to be addressed by using special noise-tolerant circuit techniques. Section 3 briefly reviews existing techniques for noise-tolerant dynamic circuit design. In section 4, a new noise-tolerant circuit technique is described. Simulation results are presented to show that this technique is more energy efficient than existing techniques.

## 2. Impact of $(V_{dd}, V_T)$ scaling on Crosstalk Noise

A very popular low power strategy is the scaling of the supply voltage $V_{dd}$ [1, 2, 3, 4]. One can obtain large savings in power using this approach, due to the quadratic dependence of power dissipation on $V_{dd}$. However, for this to be practical, another technique that compensates for the increased delay needs to be employed concurrently. One way to achieve this compensation is to use reduced threshold voltages [11]. Using the alpha power law model [12], the following expression for delay $\tau_d$ can be obtained:

$$\tau_d \propto \frac{C_L \cdot V_{dd}}{\left(V_{dd} - V_T\right)^{\alpha}} , \qquad (1)$$

where  $C_L$  is the capacitance that is switched, and  $V_{dd}$  and  $V_T$  are the supply and transistor threshold voltages, respectively, and  $\alpha$  is the velocity saturation index that assumes a value close to 2 for long-channel transistors. For shorter channel lengths however,  $\alpha$  is close to 1 due to velocity saturation [13]. For simplicity, we will assume that  $\alpha=1$  in the following analysis.

Let the unscaled voltages be denoted by $V_{dd_U}$, $V_{T_U}$ and the scaled voltages be denoted by $V_{dd_S}$, $V_{T_S}$. From equation (1), in order to achieve the same delay after voltage scaling, the following equation must be satisfied:

$$\frac{V_{T_U}}{V_{dd_U}} = \frac{V_{T_S}}{V_{dd_S}} = r. \qquad (2)$$

Thus, the ratio  $V_T/V_{dd}$  should be constant. To determine the impact of this type of voltage scaling on crosstalk noise, let us consider the circuit model for a typical crosstalk scenario shown in Fig. 1, where we have a domino gate whose input is strongly coupled to a neighboring wire, while the voltage at the input node A is being held nominally at ground.
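As a quick numerical illustration of equation (2), the sketch below evaluates the delay model of equation (1) with $\alpha = 1$ and checks that scaling $V_{dd}$ and $V_T$ by the same factor leaves the delay unchanged. The function name `relative_delay`, the unit load capacitance, and the specific voltage pairs are assumptions chosen for illustration; only ratios of the returned values are meaningful, since equation (1) is a proportionality.

```python
def relative_delay(v_dd, v_t, c_load=1.0, alpha=1.0):
    """Delay of equation (1), up to an unknown proportionality constant."""
    if v_dd <= v_t:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return c_load * v_dd / (v_dd - v_t) ** alpha


# Unscaled operating point (illustrative values, not read from Fig. 2).
V_DD_U, V_T_U = 3.3, 0.6
r = V_T_U / V_DD_U  # the ratio that equation (2) requires to stay fixed

# Scale the supply and pick V_T so that V_T / V_dd stays equal to r.
scaled_points = [(v_dd, r * v_dd) for v_dd in (3.3, 2.5, 2.0)]
delays = [relative_delay(v_dd, v_t) for v_dd, v_t in scaled_points]

# Scaling V_dd alone (V_T held at its unscaled value) makes the gate slower.
delay_vdd_only = relative_delay(2.0, V_T_U)
```

With $\alpha = 1$ the delay depends on the voltages only through $V_T/V_{dd}$, so every entry of `delays` is the same, while `delay_vdd_only` is larger.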



Figure 1: Circuit model used to obtain the behavior of crosstalk noise with voltage scaling and its effect on dynamic logic circuits.

If  $V_N(t)$  is the noise pulse at node A, and  $C_x$  is the dynamic node capacitance, we have,

$$C_x \cdot \frac{dV_x}{dt} \approx k \cdot (V_N(t) - V_T), \qquad (3)$$

where we have assumed a negligible on-resistance of Mclk, and that the transistor Ma is in saturation. This will result in a slight overestimation of the voltage drop at the dynamic node due to input noise, but simulation results show that equation (3) correctly predicts the qualitative behavior of input noise under the above low power strategy. From equation (3), we find that the voltage degradation of the dynamic node due to input noise is proportional to the average current due to input noise, denoted by $I_{n,ave}$:

$$I_{n,ave} \propto \int (V_N(t) - V_T) \cdot dt, \qquad (4)$$

where the integration is performed over the time interval when  $V_N(t) > V_T$  (sub-threshold conduction is assumed to be negligible). We will henceforth refer to  $I_{n,ave}$  as the noise current.



Figure 2: HSPICE simulation results showing the $V_{dd}$-normalized waveforms at the input and output of a 2-input domino AND gate for three $(V_{dd}, V_T)$ combinations that achieve the same delay. Transistor parameters for the $0.35\mu m$ process HPCMOS10QA were used in the simulations. It can be seen that sensitivity of the circuit to input crosstalk noise increases with voltage scaling.

In order to determine the dependence of the noise current  $I_{n,ave}$  on  $V_{dd}$ , let us consider the simple case of a step waveform at the aggressor. If the buffer driving the input A is modeled as a resistor  $R_v$  to ground, the input noise is given by

$$V_N(t) = k_c \cdot V_{dd} \cdot e^{-t/\tau_c}, \qquad (5)$$

where  $\tau_c = R_v \cdot (C_c + C_v)$  and  $k_c = \frac{C_c}{(C_c + C_v)}$ ,  $C_c$  being the (lumped) coupling capacitance, and  $C_v$  being the capacitance to ground at the victim node A.
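To make the pulse model of equation (5) concrete, the following sketch computes the coupling ratio $k_c$, the time constant $\tau_c$, and the time for which the pulse stays above $V_T$. The capacitance values are those quoted for the Fig. 2 simulations; the driver resistance `R_V` and the voltage pair are hypothetical values chosen only for illustration, and the function names are not from the paper.

```python
import math


def crosstalk_pulse(t, v_dd, r_v, c_c, c_v):
    """Victim-node noise voltage of equation (5) at time t >= 0."""
    k_c = c_c / (c_c + c_v)
    tau_c = r_v * (c_c + c_v)
    return k_c * v_dd * math.exp(-t / tau_c)


def time_above_threshold(v_dd, v_t, r_v, c_c, c_v):
    """Duration for which the pulse of equation (5) exceeds v_t."""
    k_c = c_c / (c_c + c_v)
    tau_c = r_v * (c_c + c_v)
    peak = k_c * v_dd
    if peak <= v_t:
        return 0.0  # the pulse never reaches the threshold
    return tau_c * math.log(peak / v_t)


C_C, C_V = 20e-15, 10e-15  # coupling and ground capacitance (20 fF, 10 fF)
R_V = 1e3                  # hypothetical victim driver resistance in ohms
V_DD, V_T = 3.3, 0.6       # illustrative supply and threshold voltages

coupling_ratio = C_C / (C_C + C_V)
t_on = time_above_threshold(V_DD, V_T, R_V, C_C, C_V)
```

With these capacitances two thirds of the aggressor swing appears on the victim at $t = 0$, which is why strongly coupled lines can produce noise that is a large fraction of $V_{dd}$.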

Substituting equation (5) in equation (4), we obtain

$$I_{n,ave} \propto \tau_c \left(k_c \cdot V_{dd} - \left(1 + \ln \frac{k_c V_{dd}}{V_T}\right) \cdot V_T\right). \qquad (6)$$
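For completeness, the intermediate steps can be sketched as follows, assuming (as in the text) that conduction occurs only while $V_N(t) > V_T$. The symbol $t^*$, denoting the instant at which the pulse decays to $V_T$, is introduced here for illustration and does not appear in the original derivation. Setting $V_N(t^*) = V_T$ in equation (5) gives

$$t^* = \tau_c \ln \frac{k_c V_{dd}}{V_T},$$

and integrating the integrand of equation (4) from $0$ to $t^*$ yields

$$\int_0^{t^*} \left(k_c V_{dd} e^{-t/\tau_c} - V_T\right) dt = \tau_c k_c V_{dd} \left(1 - e^{-t^*/\tau_c}\right) - V_T t^* = \tau_c \left(k_c V_{dd} - V_T - V_T \ln \frac{k_c V_{dd}}{V_T}\right),$$

where the last step uses $e^{-t^*/\tau_c} = V_T / (k_c V_{dd})$. This is the right-hand side of equation (6).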

Assuming $R_v$ is inversely proportional to $(V_{dd} - V_T)$ and employing equation (2), we have

$$I_{n,ave} \propto \frac{k_c - r \cdot \left(1 + \ln \frac{k_c}{r}\right)}{1 - r}. \qquad (7)$$

All terms in this equation are independent of  $V_{dd}$  and  $V_T$ . Thus, we see that the noise current  $I_{n,ave}$  stays approximately constant with voltage scaling.
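The independence from $V_{dd}$ can also be checked numerically. The sketch below integrates equation (4) with a simple midpoint rule for the pulse of equation (5), with $V_T = r \cdot V_{dd}$ and $R_v$ inversely proportional to $(V_{dd} - V_T)$, and compares the result with equation (7). The constant `G`, the chosen ratio `R`, and the function names are assumptions for illustration; the capacitances are the values quoted for the Fig. 2 simulations.

```python
import math


def noise_current_closed_form(k_c, r):
    """Right-hand side of equation (7), up to a constant factor."""
    return (k_c - r * (1.0 + math.log(k_c / r))) / (1.0 - r)


def noise_current_numeric(v_dd, r, c_c, c_v, g=1.0, steps=20000):
    """Midpoint-rule evaluation of equation (4) for the pulse of equation (5).

    g is a hypothetical constant such that R_v = 1 / (g * (V_dd - V_T)).
    """
    v_t = r * v_dd
    peak = c_c / (c_c + c_v) * v_dd
    if peak <= v_t:
        return 0.0  # the pulse never turns the transistor on
    tau_c = (c_c + c_v) / (g * (v_dd - v_t))
    t_end = tau_c * math.log(peak / v_t)  # instant at which V_N falls to V_T
    dt = t_end / steps
    return sum(
        (peak * math.exp(-(i + 0.5) * dt / tau_c) - v_t) * dt
        for i in range(steps)
    )


C_C, C_V = 20e-15, 10e-15  # capacitances quoted for the Fig. 2 simulations
R = 0.6 / 3.3              # illustrative V_T / V_dd ratio
G = 1.0                    # hypothetical; only sets the overall scale

# Normalise by (C_c + C_v) / g so the result is directly comparable to (7).
normalised = [
    noise_current_numeric(v_dd, R, C_C, C_V, G) * G / (C_C + C_V)
    for v_dd in (3.3, 2.5, 2.0)
]
expected = noise_current_closed_form(C_C / (C_C + C_V), R)
```

All three entries of `normalised` agree with `expected` to within the integration error, which is the constancy claimed for the first-order model.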

Let $I_{crit}$ be defined as the value of $I_{n,ave}$ required to bring the voltage of the dynamic node below the inverter threshold $V_{th,inv}$ and cause false switching. Thus $I_{crit}$, which is a measure of the noise margin, is proportional to $V_{dd}$ and decreases as $V_{dd}$ is reduced. From equation (3),

$$I_{crit} \propto (V_{dd} - V_{th,inv}) \propto V_{dd}. \qquad (8)$$

Thus the tolerable noise current  $I_{crit}$  decreases with voltage scaling while the actual noise current remains constant. Together, these facts imply an increased sensitivity of dynamic circuits to input crosstalk noise at lower voltages. Simulation results shown in Fig. 2 corroborate the conclusions drawn from the above analysis using first order models. Fig. 2(a) shows the  $V_{dd}$ -normalized crosstalk noise waveforms at the input node A of a 2-input domino AND gate for three different  $(V_{dd}, V_T)$  combinations that achieve the same delay. Transistor parameters from the MOSIS  $0.35\mu m$  process HPCMOS10QA were used in the simulations. The coupling capacitance is 20fF, and  $C_v$  is about 10fF. Fig. 2(b) shows the effect of this input noise at the output of the domino gate.

Thus we conclude that the impact of input crosstalk noise becomes greater at the lower supply voltages. However, operating at these low voltages is desirable from a low-power perspective - hence the need for energy-efficient noise-tolerant circuit techniques.

## 3. Noise-tolerant Dynamic Circuit Design - Existing Techniques and Noise Immunity Metrics

The switching threshold of a dynamic logic gate is approximately $V_T$, the transistor threshold voltage. For a $0.35 \mu m$ MOSIS process, this is about 0.6V, and may not be adequate to prevent false switching due to input crosstalk noise. From Fig. 2, we can see that for strongly coupled lines, input noise can be a significant fraction of $V_{dd}$ and last for sufficient time to cause false switching. Several techniques have been proposed to increase this threshold [14, 15, 16]. Fig. 3(a) shows the CMOS inverter technique proposed in [14] that uses additional PMOS transistors at the gate inputs to adjust the switching threshold. However, this technique is not suitable for OR/NOR type logic, since in this configuration, multiple PMOS transistors drive the dynamic node. Fig. 3(b) shows the *PMOS pull-up technique* [15] that increases the switching threshold by an amount depending on the size of the PMOS pull-up device. This technique, however, suffers from static power dissipation.

Figure 3: Existing techniques to improve the noise immunity of dynamic logic circuits using (a) an inverter [14] and (b) a PMOS pull-up transistor [15].

Another noise-tolerance technique, which we will refer to as the *mirror technique* has been proposed recently [16]. A 2-input AND gate implemented using this technique is shown in Fig. 5(a).

This technique uses the principle of a Schmitt trigger to increase the dynamic switching threshold using a mirror NMOS network. A first order approximation of the new switching threshold $V_{sw}$ can be obtained by a dc analysis of the simplified circuit shown in Fig. 4 and is given by

$$V_{sw} = V_T + \frac{V_{dd} - V_T}{1 + K_r}. \qquad (9)$$

Figure 4: A simplified circuit used to calculate the switching threshold of a dynamic gate using an NMOS mirror.

The switching threshold, and hence the noise margin, is thus a function of $K_r = \frac{(W/L)_{M2}}{(W/L)_{M3}}$, decreasing towards $V_T$ for large $K_r$, and increasing towards $V_{dd}$ for small $K_r$. The principal drawback of this technique is the increase in the number of series transistors in the pull-down path. To compensate for the increased delay, transistors will have to be sized up, resulting in more power dissipation and larger area. In spite of this, it was shown in [16] that the mirror technique provides the highest noise immunity per unit energy consumed. Variations of this technique involve simplifications of the upper or lower NMOS networks so that there is just one additional transistor per series path. We will investigate these in section 4.
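The dependence of equation (9) on $K_r$ can be explored with a short sketch. The function name, the sweep values, and the voltage pair (3.3V supply, 0.6V threshold) are illustrative assumptions; equation (9) itself is only a first-order approximation.

```python
def mirror_switching_threshold(v_dd, v_t, k_r):
    """First-order switching threshold of equation (9)."""
    if k_r < 0:
        raise ValueError("k_r is a ratio of transistor sizes and cannot be negative")
    return v_t + (v_dd - v_t) / (1.0 + k_r)


V_DD, V_T = 3.3, 0.6  # illustrative supply and threshold voltages
K_R_VALUES = (0.0, 0.5, 1.0, 2.0, 10.0)  # hypothetical sizing ratios

# Map each sizing ratio to the resulting switching threshold in volts.
sweep = {k_r: mirror_switching_threshold(V_DD, V_T, k_r) for k_r in K_R_VALUES}
```

The sweep runs from $V_{dd}$ at $K_r = 0$ down towards $V_T$ as $K_r$ grows, matching the limits stated in the text.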



Figure 5: A noise-tolerant dynamic circuit technique using a mirror NMOS network [16]. The lower NMOS network may be a duplicate of the upper NMOS structure as in (a) [16] or simplified as shown in (b).

In order to compare the performance of various noise-tolerance techniques, we will use a noise metric - Average Noise Threshold Energy (ANTE) - proposed in [16]. This metric is based on noise immunity curves, which are plots of noise amplitude $V_{noise}$ versus duration $T_{noise}$ that cause logic errors. Each point on a noise immunity curve defines a noise pulse of a certain amplitude $V_{noise}$ and width $T_{noise}$ that is sufficient to cause switching. ANTE is the average of the noise energy represented by all such points and is defined by the following equation:

$$ANTE = E(V_{noise}^2 \cdot T_{noise}), \qquad (10)$$

where E() denotes the expectation value. A related parameter that quantitatively describes energy-efficiency is the ANTE-normalized energy (EANTE), given by the ratio of the average energy consumption of a circuit to its ANTE measure:

$$EANTE = \frac{energy}{ANTE}. \qquad (11)$$

The goal of all noise-tolerance techniques is to increase ANTE, while reducing ANTE-normalized energy.
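A minimal sketch of how equations (10) and (11) might be evaluated from sampled noise immunity curve points is given below. It assumes the expectation is an unweighted mean over the sampled points, which is an assumption made here rather than something the text specifies; the sample points are hypothetical and not read from any figure.

```python
def ante(points):
    """Average noise threshold energy of equation (10).

    points: iterable of (v_noise, t_noise) pairs taken from a noise immunity
    curve. The expectation is approximated by an unweighted mean.
    """
    points = list(points)
    if not points:
        raise ValueError("at least one curve point is required")
    return sum(v * v * t for v, t in points) / len(points)


def eante(energy, ante_value):
    """ANTE-normalized energy of equation (11)."""
    if ante_value <= 0:
        raise ValueError("ANTE must be positive")
    return energy / ante_value


# Hypothetical curve samples: (amplitude in V, duration in ns).
curve = [(2.0, 0.10), (1.5, 0.20), (1.2, 0.40), (1.0, 0.80)]
ante_value = ante(curve)  # in V^2-ns for the units used above
```

A curve that lies further from the origin (larger amplitude or duration needed to cause an error) gives a larger ANTE, and for a fixed energy a larger ANTE gives a smaller EANTE.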

## 4. Proposed Noise-tolerant Dynamic Circuit Design Technique

In this paper, we propose the *twin-transistor technique*, illustrated in Fig. 6 for the simplest dynamic circuit, an inverter. For every transistor in the NMOS network, an additional cross-coupled NMOS transistor - the twin transistor - is provided for improved noise immunity. The gate of the twin transistor is connected to the dynamic node, the drain to the input and the sources of the two transistors are shorted as shown in Fig. 6.

Figure 6: A simplified circuit used to calculate the switching threshold of a dynamic gate using twin transistors at the gate inputs.

In Fig. 6, the twin transistor M2 pulls up the source of M1 until the voltage at A exceeds a certain critical value (larger than $V_T$). This critical voltage $V_{sw}$ can be found to be approximately

$$V_{sw} = V_T + \frac{V_{dd} - V_T}{1 + 2 \cdot K_r \cdot \left(\frac{V_{dd}}{V_T} - 1\right)}, \qquad (12)$$

where $K_r = \frac{(W/L)_{M3}}{(W/L)_{M2}}$. This equation was derived by equating the currents of M2 and M3 in Fig. 6 when the gate-source voltage of M1 is equal to the threshold voltage. As in the derivation of equation (9), we have assumed a linear dependence of the drain current on the gate-source voltage due to velocity saturation. The important design variable is the ratio $K_r$, whose value can be adjusted to trade off speed for noise immunity. For most cases, a minimum sized twin transistor M2 should be sufficient. Although this technique does not increase the number of series transistors in the pull-down path, it has an adverse effect on delay due to the increased capacitance at internal circuit nodes. This can, however, be compensated for by careful transistor sizing.
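Equation (12) can be evaluated in the same way as equation (9). In the sketch below the function name, the voltage pair, and the sweep values are illustrative assumptions. Note that $K_r$ is defined with different transistors in equations (9) and (12), so the two thresholds should not be compared at equal $K_r$ without reference to the actual device sizes.

```python
def twin_switching_threshold(v_dd, v_t, k_r):
    """Approximate switching threshold of equation (12)."""
    if k_r < 0:
        raise ValueError("k_r is a ratio of transistor sizes and cannot be negative")
    return v_t + (v_dd - v_t) / (1.0 + 2.0 * k_r * (v_dd / v_t - 1.0))


V_DD, V_T = 3.3, 0.6  # illustrative supply and threshold voltages
K_R_VALUES = (0.0, 0.1, 0.5, 1.0, 5.0)  # hypothetical sizing ratios

# Map each sizing ratio to the resulting switching threshold in volts.
twin_sweep = {k_r: twin_switching_threshold(V_DD, V_T, k_r) for k_r in K_R_VALUES}
```

As with the mirror technique, the threshold stays between $V_T$ and $V_{dd}$ and falls as $K_r$ increases, which is the speed versus noise immunity trade-off described above.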

All techniques that improve the noise-tolerance of dynamic circuits do so at the expense of additional delay and/or power. Thus the performance advantage of dynamic circuits over their static counterparts is diminished. However, except for circuits with long pull-down chains, dynamic circuits using twin transistors should be faster than static circuits due to the absence of the PMOS pull-up network, which increases the input and node capacitance in static circuits.

A 2-input dynamic AND gate is employed as a test circuit for comparing the performance of different noise-tolerance techniques. It has been shown [16] that the mirror approach is more energy-efficient than other existing techniques. Hence, we will use this technique for comparison purposes. As noise immunity and energy-efficiency metrics, we will use ANTE and ANTE-normalized energy respectively, as defined in section 3.

A 2-input AND gate designed using twin transistors is shown in Fig. 7. All the circuits were designed to operate at a maximum speed of 1.8 GHz at 3.3V supply voltage. The load is a True Single Phase Clocked (TSPC) latch.



Figure 7: A 2-input AND gate implemented using twin transistor domino logic.

HSPICE simulations were performed using the model parameters for the $0.35\mu m$ MOSIS process HPCMOS10QA. A performance comparison in terms of energy, ANTE, ANTE-normalized energy, and area is shown in Table 1.

Table 1: Performance comparison of various noise-tolerance techniques for a 2-input AND gate at  $V_{dd}$ =3.3V

|                 | Area        | Energy | ANTE         | EANTE |
|-----------------|-------------|--------|--------------|-------|
|                 | $(\mu m^2)$ | (fJ)   | $(V^2 - ns)$ |       |
| Conventional    | 120         | 64     | 0.72         | 8.9   |
| Mirror          | 238         | 110    | 1.3          | 8.5   |
| Reduced Mirror  | 180         | 260    | 1.16         | 22.2  |
| Twin Transistor | 160         | 75     | 1.33         | 5.64  |

The reduced mirror entries correspond to a variation of the NMOS mirror circuit where the lower network is simplified as in Fig. 5(b). It is seen that both the mirror and twin transistor techniques offer improved noise immunity (1.8X over conventional domino) at the expense of energy and area. This is also seen in Fig. 8, which shows the noise immunity curves for three cases. However, the twin transistor approach is more energy efficient, as seen from the lower value of ANTE-normalized energy. To achieve the same level of noise immunity, the twin transistor technique expends 45% less energy. The reduced NMOS mirror circuit suffers from static power dissipation and is hence energy inefficient.

Figure 8: Noise immunity plots showing the noise amplitude and duration required to cause erroneous switching in a 2-input AND gate implemented using three different circuits - a conventional domino logic circuit, and two noise-tolerant circuits, one using an NMOS mirror [16] and the other using twin transistors.
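The comparisons drawn from Table 1 can be reproduced with a few lines of code. The sketch below copies the area, energy, and ANTE columns of Table 1 and derives the ANTE gain and area overhead relative to conventional domino, together with a ranking by energy per unit ANTE. The dictionary layout and helper names are assumptions for illustration, and the energy/ANTE ratio is computed directly from the two columns rather than taken from the tabulated EANTE column.

```python
# Area, energy (fJ) and ANTE (V^2-ns) columns as listed in Table 1.
TABLE_1 = {
    "Conventional": {"area": 120, "energy_fj": 64, "ante": 0.72},
    "Mirror": {"area": 238, "energy_fj": 110, "ante": 1.3},
    "Reduced Mirror": {"area": 180, "energy_fj": 260, "ante": 1.16},
    "Twin Transistor": {"area": 160, "energy_fj": 75, "ante": 1.33},
}


def energy_per_ante(row):
    """Energy divided by ANTE, following the ratio in equation (11)."""
    return row["energy_fj"] / row["ante"]


baseline = TABLE_1["Conventional"]

# Noise immunity gain and area overhead relative to conventional domino.
ante_gain = {name: row["ante"] / baseline["ante"] for name, row in TABLE_1.items()}
area_overhead = {
    name: row["area"] / baseline["area"] - 1.0 for name, row in TABLE_1.items()
}

# Techniques ordered from most to least energy-efficient per unit ANTE.
ranking = sorted(TABLE_1, key=lambda name: energy_per_ante(TABLE_1[name]))
```

The ANTE gain of both the mirror and twin transistor circuits rounds to 1.8, and the twin transistor circuit comes first in `ranking`, in line with the discussion above.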

Since the twin transistor technique achieves higher noise immunity than a conventional domino logic circuit, one can operate at a lower supply voltage to achieve the noise immunity of the conventional circuit at a higher voltage. Of course, the threshold voltage should also be scaled to ensure a constant delay. We used three $(V_{dd}, V_T)$ combinations that achieve the same delay and simulated the conventional domino circuit at these voltages. The resulting ANTE and energy consumption are plotted in Fig. 9. This plot shows that to achieve a certain noise immunity (quantified by ANTE), it is more energy-efficient to use the twin transistor domino circuit at a lower $V_{dd}$ than the conventional circuit at a higher $V_{dd}$. For example, to achieve an ANTE of about 0.7 $V^2$-ns, one can either operate a simple domino circuit at 3.3V or the twin transistor domino circuit at 2.5V ($V_T$ is about 0.4V). However, Fig. 9 shows that the latter approach uses about 30% less energy.

Figure 9: A comparison of the energy consumption (per clock cycle) and ANTE of a 2-input AND gate implemented using conventional domino and twin transistor domino circuits. $V_{dd}$ assumes values of 3.3V, 2.5V and 2.0V and in each case $V_T$ is scaled to ensure that the maximum operating frequency is 1.8 GHz. The energy consumption includes energy consumed by the clock and input buffers.
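A rough back-of-the-envelope check of this saving is possible under the assumption that switching energy scales purely as $V_{dd}^2$, following the quadratic dependence noted in section 2. The sketch below rescales the 3.3V twin transistor energy from Table 1 to 2.5V and compares it with the conventional circuit at 3.3V. This is only an estimate: it ignores short-circuit and leakage energy, treats capacitances as voltage independent, and uses the Table 1 energies, which may not include the clock and input buffer energy that Fig. 9 accounts for. The helper name is hypothetical.

```python
def scale_energy(energy, v_from, v_to):
    """Rescale a switching energy assuming it is proportional to V_dd**2."""
    return energy * (v_to / v_from) ** 2


E_CONVENTIONAL_3V3 = 64.0  # fJ, conventional domino at 3.3 V (Table 1)
E_TWIN_3V3 = 75.0          # fJ, twin transistor domino at 3.3 V (Table 1)

# Estimated twin transistor energy at 2.5 V under pure V_dd^2 scaling.
e_twin_2v5 = scale_energy(E_TWIN_3V3, 3.3, 2.5)

# Fractional saving relative to conventional domino kept at 3.3 V.
saving = 1.0 - e_twin_2v5 / E_CONVENTIONAL_3V3
```

Under these assumptions the estimated saving comes out at roughly one third, which is of the same order as the figure read from Fig. 9.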



Figure 10: The carry-generate section of a dynamic full adder.

Another circuit to illustrate the twin transistor technique is shown in Fig. 10, which could be the carry-generate section of a dynamic full adder implemented in NP-CMOS. Once again, we compare the performance of this circuit with the mirror and reduced mirror noise-tolerance techniques. All circuits were designed to operate at a maximum speed of 1.2 GHz ($0.35\mu m$ CMOS technology, $V_{dd}=3.3$V). A comparison of the performance of various designs is presented in Table 2. As before, we see an improvement in noise immunity, quantified by a 2.5X increase in the ANTE. The twin transistor technique, however, consumes 30% less energy to obtain this improvement.

Table 2: Performance comparison of various noise-tolerance techniques for the carry-generate section of a full adder at  $V_{dd}$ =3.3V

|                 | Area        | Energy | ANTE         | EANTE              |
|-----------------|-------------|--------|--------------|--------------------|
|                 | $(\mu m^2)$ | (fJ)   | $(V^2 - ns)$ | $(\times 10^{-4})$ |
| Conventional    | 192         | 153    | 0.71         | 2.15               |
| Mirror          | 366         | 288    | 1.71         | 1.68               |
| Reduced Mirror  | 313         | 366    | 1.18         | 3.1                |
| Twin Transistor | 330         | 216    | 1.71         | 1.26               |

## 5. Conclusions

We have shown, by analysis using simple models and by simulations, that the sensitivity of dynamic circuits to crosstalk noise increases when the voltage scaling approach to low power design is employed. A new noise-tolerant dynamic circuit technique has been presented to address this problem. Through simulations of a 2-input AND gate at various $(V_{dd}, V_T)$ combinations, we have shown that it is more energy-efficient to operate a noise-tolerant dynamic circuit at a low supply voltage as opposed to a conventional dynamic circuit at a higher supply voltage. Also, the new technique has been shown to be more energy-efficient than existing noise-tolerance techniques.

## 6. Acknowledgements

This work was supported by NSF CAREER award MIP-9623737.

#### References

- [1] A. Chandrakasan and R. Brodersen, Low Power CMOS Design. IEEE Press, 1998.
- [2] R. Gonzalez, B. M. Gordon, and M. A. Horowitz, "Supply and threshold voltage scaling for low power CMOS," *IEEE Journal of Solid-State Circuits*, vol. 32, pp. 1210-1216, August 1997.
- [3] T. Kuroda, K. Suzuki, S. Mita, T. Fujita, F. Yamane, F. Sano, C. Akihiko, Y. Watanabe, M. Yoshinori, K. Matsuda, T. Maeda, T. Sakurai, and F. Tohru, "Variable supply-voltage scheme for low-power high-speed CMOS digital design," *IEEE Journal of Solid-State Circuits*, vol. 33, pp. 454-462, March 1998.
- [4] D. Liu and C. Svensson, "Trading speed for low power by choice of supply and threshold voltages," *IEEE Journal of Solid-State Circuits*, pp. 10-17, January 1993.
- [5] K. L. Shepard, "Design methodologies for noise in digital integrated circuits," *Proceedings of DAC'98*, 1998.
- [6] K. Shepard and V. Narayanan, "Noise in deep submicron digital design," *IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers*, pp. 524-531, 1996.
- [7] C. Murthy et al., "Process variation effects on circuit performance: TCAD simulation of 256-Mbit technology," *IEEE Transactions on Computer-Aided Design of Integrated Circuits*, vol. 16, November 1997.
- [8] N. Weste and K. Eshraghian, Principles of CMOS VLSI Design - A Systems Perspective. Addison-Wesley, 1992.
- [9] P. Larsson and C. Svensson, "Noise in digital dynamic CMOS circuits," *IEEE Journal of Solid-State Circuits*, vol. 29, pp. 655-662, June 1994.
- [10] SIA, National Technology Roadmap for Semiconductors. SEMATECH, Inc., 1997.
- [11] T. Kuroda and T. Sakurai, "Low voltage technology and circuits (invited)," in Low Power CMOS Design (A. Chandrakasan and R. Brodersen, eds.), IEEE Press, 1998.
- [12] T. Sakurai and A. R. Newton, "Alpha-power law MOSFET model and its application to CMOS inverter delay and other formulas," *IEEE Journal of Solid-State Circuits*, vol. 25, pp. 584-593, April 1990.
- [13] R. S. Muller and T. I. Kamins, Device Electronics for Integrated Circuits. John Wiley and Sons, 2 ed., 1986.
- [14] J. J. Covino, Dynamic CMOS Circuits with Noise Immunity. US Patent 5650733, 1997.
- [15] G. P. D'Souza, Dynamic Logic Circuit with Reduced Charge Leakage. US Patent 5483181, 1996.
- [16] L. Wang and N. R. Shanbhag, "Noise-tolerant dynamic circuit design," ISCAS, 1999.