# Efficient Macromodeling for On-Chip Interconnects

Qinwei Xu and Pinaki Mazumder EECS Dept. University of Michigan, Ann Arbor, MI 48109 Email: qwxu@eecs.umich.edu, mazum@eecs.umich.edu

# Abstract

The improved T and improved $\Pi$ models are proposed for on-chip interconnect macromodeling. Using global approximations, simple approximation frames are derived and applied to the modeling of on-chip distributed RC interconnects. The applications lead to equivalent circuit models for on-chip interconnects, which are represented by the improved T and improved $\Pi$ models. By matching the first three moments of an open-ended interconnect, the improved $\Pi$ model with AWE is then obtained, which retains the symmetric structure. The new models for distributed RC interconnects are independent of CMOS gates, and can therefore be directly incorporated into SPICE frames. Numerical experiments show that for current feature sizes, the improved T and improved $\Pi$ modeling methods can be used to accurately evaluate on-chip interconnect effects, while the computational costs are comparable to those of the original T and original $\Pi$ modeling. The presented macromodeling approaches are useful for quick simulation and layout optimization.

# 1 Introduction

As VLSI feature sizes shrink in CMOS and GaAs technologies, gate delays decrease and interconnect delays increase to such an extent that interconnect delays now dominate the overall path delays of current integrated circuits and systems. Meanwhile, on-chip interconnects exhibit transmission line effects when the operating frequency reaches several gigahertz [1]. Because estimating interconnect effects with simple lumped capacitances gives unsatisfactory accuracy, more accurate modeling is needed for synthesis and layout optimization. On the other hand, the existing accurate modeling approaches generally represent the distributed parameters by cascaded RC networks, which makes the circuit sizes very large [2]; they are therefore not efficient enough for the iterative process of synthesis and layout optimization. Hence, VLSI design demands modeling that is accurate yet efficient.

The simplest approximation of the interconnect tree is the total capacitance of the tree (see Fig. 1), which is a first-order approximation [3]. For submicron technologies, the total interconnect resistance is large and comparable to the driver output resistance, and cannot be neglected in estimating the gate delay. The actual delay is much smaller than that derived from the lumped capacitance model because the interconnect resistance acts as a shield to reduce the load capacitance seen by the gate driver. The lumped L-type RC model gathers the total interconnect resistance and total capacitance into a simple lumped RC segment, and yields an optimistic delay estimate because the total interconnect resistance is lumped together and shields the total capacitance. The one-segment $\Pi$ model was proposed to approximate the load interconnect at the gate by matching the first three moments of the driving point admittance of the gate [4], giving better accuracy. An effective load capacitance method that includes the shielding effect of interconnect resistors was proposed; it calculates the effective capacitance that gives the same average current as the RC $\Pi$ model load, assuming that the driving gates can be represented by piecewise linear devices [5]. As deep submicron technology comes to prevail in VLSI design, more accurate modeling approaches are required.

Currently, reduced-order macromodels are widely used for simulating complicated interconnects. Asymptotic Waveform Evaluation (AWE) is the best-known method for approximating general linear networks using the moment-matching technique [6]. However, higher-order moments can lead to undesirable numerical conditions, so increasing the order of moments does not guarantee a better approximation. Furthermore, AWE may produce unstable poles although the original network is stable. As extensions to AWE, the multipole expansion [7] and Krylov subspace [8] techniques are available for model reduction. There is another class of reduction techniques based on congruence transformations [9]. These techniques can give high accuracy, but at considerable computational cost. However, layout optimization and high-level synthesis in VLSI design require efficient modeling tools, because these processes need many iterations, in which computational efficiency is a critical bottleneck.

In this paper, approximation frames with compound-order approximations are derived for on-chip RC interconnect modeling. The approximation frames are based on global approximations, which give higher order accuracy than local approximations do. Applying these approximation frames to interconnects leads to the improved T and improved $\Pi$ macromodels for on-chip RC interconnects, which have simple forms similar to the one-segment T and $\Pi$ models, yet have higher accuracy. The new models for distributed RC interconnects are independent of CMOS gates, and can be directly incorporated into SPICE frames.



Figure 1: Simple models: lumped capacitance model, L model, T model, and  $\Pi$  model.

# 2 RC Interconnect Modeling

Depending on the required accuracy, interconnects are traditionally modeled as cascades of L, T or $\Pi$ elements [10], each of which represents a small fraction of the entire interconnect. Although these approaches can provide high accuracy if the number of cascaded elements is large enough, they are not efficient for iterative optimization. Instead, simple models such as one-segment T or $\Pi$ elements are desired. For on-chip interconnects, one-segment T and $\Pi$ models can give satisfactory accuracy if the operating frequency is relatively low. At high operating frequencies, the simple one-segment T and $\Pi$ models suffer from accuracy loss.

#### 2.1 Improved T model

In the Laplace domain, the normalized diffusion equation governing a distributed RC interconnect can be written as

$$\frac{1}{R}V''(x,s) = sCV(x,s) \tag{1}$$

where V(x, s) is the distributed voltage, and ' denotes the derivative with respect to x.  $x \in [0, 1]$  is the normalized interconnect length, R is the normalized distributed resistance and C the normalized distributed capacitance over the interconnect. By introducing the distributed current I(x, s), the diffusion equation can be written in the form:

$$V'(x,s) = -RI(x,s) \tag{2}$$

$$I'(x,s) = -sCV(x,s) \tag{3}$$
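As a quick consistency check, Eqns. 2-3 reduce to Eqn. 1 when the current is eliminated. Differentiating Eqn. 2 with respect to $x$ and substituting Eqn. 3 gives:

$$
\begin{aligned}
V''(x,s) &= -R\,I'(x,s) = -R\,\bigl(-sC\,V(x,s)\bigr) = sRC\,V(x,s) \\
\Rightarrow\quad \frac{1}{R}V''(x,s) &= sC\,V(x,s)
\end{aligned}
$$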

Along the line, we select three grid points:  $x_0 = 0$ ,  $x_1 = 1/2$  and  $x_2 = 1$ , then take three voltage variables  $V_0 = V(x_0, s)$ ,  $V_1 = V(x_1, s)$  and  $V_2 = V(x_2, s)$ , and two current variables  $I_1 = I(x_0, s)$  and  $I_2 = I(x_2, s)$ . By applying the global approximation to computing the current and voltage difference, we obtain the following approximation frames [11]:

$$I(x_2, s) - I(x_0, s) = aI'(x_1, s) \tag{4}$$

$$V(x_1,s) - V(x_0,s) = b_1 V'(x_0,s) + b_2 V'(x_2,s) \tag{5}$$

$$V(x_2,s) - V(x_1,s) = b_3 V'(x_0,s) + b_4 V'(x_2,s) \tag{6}$$

where a,  $b_1$ ,  $b_2$ ,  $b_3$  and  $b_4$  are coefficients to be determined by using fitting functions. By using the generalized Galerkin's method [12], we choose I(x, s) = x as the fitting function to determine a in Eqn. 4, that is

$$x_2 - x_0 = ax'|_{x = x_1} \tag{7}$$

so that a = 1. Similarly, using V(x, s) = x and $V(x, s) = x^2$ as fitting functions to determine $b_1$ and $b_2$ in Eqn. 5 gives

$$x_1 - x_0 = b_1 x'|_{x = x_0} + b_2 x'|_{x = x_2} \tag{8}$$

$$x_1^2 - x_0^2 = b_1(x^2)'|_{x=x_0} + b_2(x^2)'|_{x=x_2} \tag{9}$$

which gives $b_1 = 3/8$ and $b_2 = 1/8$. Applying the same procedure to Eqn. 6 results in $b_3 = 1/8$ and $b_4 = 3/8$. Taking Eqns. 2-3 into consideration, the approximation frame can be written as:

$$I_1 - I_2 = sCV_1 \tag{10}$$

$$V_0 - V_1 = 3/8 R I_1 + 1/8 R I_2 \tag{11}$$

$$V_1 - V_2 = 1/8 R I_1 + 3/8 R I_2 \tag{12}$$

Eqn. 7 has an accuracy order of $O(x^2)$, while Eqns. 8-9 have an accuracy order of $O(x^3)$. Therefore, the approximation frame given by Eqns. 10-12, whose coefficients are determined by Eqns. 7-9, has compound-order accuracy.
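The coefficient fitting in Eqns. 7-9 amounts to solving small linear systems, which can be checked with exact rational arithmetic. The following is a minimal sketch; the helper `solve_2x2` is a hypothetical name introduced only for this illustration.

```python
from fractions import Fraction as F

def solve_2x2(a11, a12, a21, a22, r1, r2):
    """Solve [[a11, a12], [a21, a22]] [u, v]^T = [r1, r2]^T by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (r1 * a22 - a12 * r2) / det, (a11 * r2 - r1 * a21) / det

# Grid points along the normalized line.
x0, x1, x2 = F(0), F(1, 2), F(1)

# Eqn. 7: fitting function I = x, whose derivative is 1 everywhere.
a = (x2 - x0) / 1

# Eqns. 8-9: fitting functions V = x (derivative 1) and V = x^2 (derivative 2x),
# with the derivatives evaluated at x0 and x2.
b1, b2 = solve_2x2(1, 1, 2 * x0, 2 * x2, x1 - x0, x1**2 - x0**2)

# The same two fitting functions applied to Eqn. 6.
b3, b4 = solve_2x2(1, 1, 2 * x0, 2 * x2, x2 - x1, x2**2 - x1**2)

print(a, b1, b2, b3, b4)
```

The exact fractions reproduce the coefficients quoted in the text, namely a = 1, $b_1 = b_4 = 3/8$ and $b_2 = b_3 = 1/8$.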

A process of mathematical manipulations shows that Eqns. 10-12 represent an equivalent circuit having the modified T type topology as shown in Fig. 2.

Figure 2: Improved T model for RC interconnect.

The difference between the model in Fig. 2 and the original T model is that the derived model, which we call the improved T model, contains an additional negative resistance.
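The manipulation can be sketched as follows, under the assumption (not stated explicitly in the text) that the topology of Fig. 2 is a T network with two equal series arms $R_a$ and a shunt branch made of a resistance $R_n$ in series with the capacitance $C$. The symbols $R_a$, $R_n$ and the center-node voltage $V_c$ are introduced here only for illustration; $V_1$ is the voltage across the capacitor, consistent with Eqn. 10.

$$
\begin{aligned}
V_c &= V_1 + R_n (I_1 - I_2), \qquad V_1 = \frac{I_1 - I_2}{sC} \\
V_0 - V_1 &= R_a I_1 + R_n (I_1 - I_2) = (R_a + R_n)\, I_1 - R_n I_2 \\
V_1 - V_2 &= R_a I_2 - R_n (I_1 - I_2) = -R_n I_1 + (R_a + R_n)\, I_2
\end{aligned}
$$

Comparing with Eqns. 11-12 gives $R_a + R_n = 3R/8$ and $-R_n = R/8$, i.e. $R_a = R/2$ and $R_n = -R/8$. Under this assumed topology, the series arms are the same as in the original T model and the only new element is the negative shunt resistance.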

#### 2.2 Improved $\Pi$ model

Eqns. 2-3 exhibit duality: if V(x, s) is interchanged with I(x, s) and R with sC, then Eqn. 2 becomes Eqn. 3, and vice versa. The same duality applies to Eqns. 4-12 and gives rise to another approximation frame,

$$V_1 - V_2 = RI_1 \tag{13}$$

$$I_0 - I_1 = 3/8 \, sCV_1 + 1/8 \, sCV_2 \tag{14}$$

$$I_1 - I_2 = 1/8 \, sCV_1 + 3/8 \, sCV_2 \tag{15}$$

The equivalent circuit model of Eqns. 13-15 is the improved  $\Pi$  model as shown in Fig. 3. The difference between the improved  $\Pi$  model and the original  $\Pi$  model is that the former includes an additional negative capacitance.



Figure 3: Improved  $\Pi$  model for RC interconnect.
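The correspondence between Eqns. 13-15 and a $\Pi$ circuit can be spot-checked numerically. The sketch below assumes that the improved $\Pi$ model consists of shunt capacitors $C/2$ at both ends and a series resistor $R$ bridged by a capacitor $-C/8$; this topology is inferred from the equations and the parameter names used later, not read from the figure. The function name and test values are illustrative only.

```python
def pi_currents(v1, v2, s, R, c_shunt, c_bridge):
    """Return (I0, I1, I2) for a Pi section driven by end voltages v1 and v2."""
    i1 = (v1 - v2) / R                      # current through the series resistor
    i_bridge = s * c_bridge * (v1 - v2)     # current through the bridging capacitor
    i0 = s * c_shunt * v1 + i1 + i_bridge   # current entering the near end
    i2 = i1 + i_bridge - s * c_shunt * v2   # current leaving the far end
    return i0, i1, i2

R, C = 1.0, 1.0                   # normalized line parameters
s = complex(0.2, 0.7)             # arbitrary complex test frequency
v1, v2 = complex(1.0, 0.0), complex(0.6, -0.3)

i0, i1, i2 = pi_currents(v1, v2, s, R, c_shunt=C / 2, c_bridge=-C / 8)

residuals = (
    (v1 - v2) - R * i1,                                       # Eqn. 13
    (i0 - i1) - (3 / 8 * s * C * v1 + 1 / 8 * s * C * v2),    # Eqn. 14
    (i1 - i2) - (1 / 8 * s * C * v1 + 3 / 8 * s * C * v2),    # Eqn. 15
)
print(max(abs(r) for r in residuals))
```

All three residuals vanish up to rounding, so under the assumed topology the negative capacitance is $-C/8$ and the end capacitors keep their original value $C/2$.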

#### 2.3 Improved $\Pi$ model with 3rd order AWE

By using the AWE technique, a $\Pi$ model presented in [13] matches the first three moments of the driving point admittance at the output of a CMOS gate. It depends only on the total interconnect tree parameters, and is therefore highly efficient for interconnect delay modeling. However, unlike the original $\Pi$ model, the model in [13] has an asymmetric structure, i.e., it is an anisotropic model, although the original interconnect is symmetric. This causes difficulty when modeling bi-directional interconnects, which demand the same transfer properties in both directions. By varying the improved $\Pi$ model so as to match the first three moments of the analytical driving point admittance, we derive new $\Pi$ models having symmetric structures.

From Eqns. 2-3, the admittance of an open-ended RC line can be obtained from the 2-port parameters as [14]

$$Y(s) = \frac{\tanh(\sqrt{RCs})}{\sqrt{R/(Cs)}} = Cs - \frac{1}{3}RC^2s^2 + \frac{2}{15}R^2C^3s^3 + \dots \tag{16}$$

For an open-ended interconnect, the driving point admittance of a  $\Pi$  model as shown in Fig. 3 can be expanded as follows,

$$Y_{eq}(s) = (C_1 + C_2)s - R_1C_2^2s^2 + R_1^2C_2^2(C_2 + C_3)s^3 + \dots \tag{17}$$

By equating the first three terms of Eqns. 16 and 17 and maintaining the symmetric form of the $\Pi$ model, i.e., $C_1 = C_2$, we obtain the parameters of the $\Pi$ model with 3rd order AWE: $C_1 = C_2 = C/2$, $R_1 = 4R/3$, and $C_3 = -C/5$. The improved $\Pi$ model with 3rd order AWE has a similar form to the improved $\Pi$ model but different parameters, as shown in Fig. 4.
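The moment matching can be reproduced with exact fractions. The sketch below works in normalized units R = C = 1 and takes the coefficients directly from Eqns. 16 and 17; the function name `pi_moments` is a hypothetical helper.

```python
from fractions import Fraction as F

def pi_moments(c1, c2, r1, c3):
    """First three series coefficients of Eqn. 17 for an open-ended Pi model."""
    return (c1 + c2, -r1 * c2**2, r1**2 * c2**2 * (c2 + c3))

# Coefficients of Eqn. 16 in units where R = C = 1.
target = (F(1), F(-1, 3), F(2, 15))

# Symmetric structure: C1 = C2, so the first moment fixes both end capacitors.
c1 = c2 = target[0] / 2
# Second moment: -R1 * C2^2 = -1/3.
r1 = -target[1] / c2**2
# Third moment: R1^2 * C2^2 * (C2 + C3) = 2/15.
c3 = target[2] / (r1**2 * c2**2) - c2

print(c1, r1, c3)
print(pi_moments(c1, c2, r1, c3) == target)
```

Restoring the units gives $C_1 = C_2 = C/2$, $R_1 = 4R/3$ and $C_3 = -C/5$, in agreement with the values stated above.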

# 3 Passivity and Accuracy

Because the derived models shown in Figs. 2-4 include a negative resistance or a negative capacitance, they may seem to generate energy internally. However, the original interconnects are passive devices, which requires that no energy be generated inside the models. A question therefore arises: do the models in Figs. 2-4 preserve passivity as the original interconnects do? The following theoretical analysis shows that all the models preserve passivity. The analysis relies on the following definitions and results from [15].

Figure 4: Improved $\Pi$ model with AWE for RC interconnect.

*Lemma 1:* A necessary and sufficient condition for an $n \times n$ transfer function matrix $\mathbf{Y}(s)$ to be passive is that $\mathbf{Y}(s)$ is positive-real, i.e.:

(1) each element of $\mathbf{Y}(s)$ is analytic in $\Re(s) \ge 0$,

(2) $\mathbf{Y}(s^*) = \mathbf{Y}^*(s)$, and

(3) $(\mathbf{Y}^*)^T(s) + \mathbf{Y}(s)$ is non-negative definite for all $\Re(s) \ge 0$.

*Lemma 2:* An *n*-port network is passive if and only if its admittance matrix  $\mathbf{Y}(s)$  is positive-real.

*Lemma 3:* If $\mathbf{A}(s)$ is positive-real, then $\mathbf{A}^{-1}(s)$ is positive-real, if it exists.

*Lemma 4:* If  $\mathbf{A}(s)$  is positive-real and  $\mathbf{B}$  is real, then  $\mathbf{B}^T \mathbf{A}(s)\mathbf{B}$  is positive-real.

Consider the 2-port model shown in Fig. 2 and its constitutive equations, Eqns. 10-12. If we regard $V_0$ and $V_2$ as independent input voltages, the MNA equations are

$$\begin{bmatrix} sC & -1 & 1\\ 1 & \tfrac{3}{8}R & \tfrac{1}{8}R\\ -1 & \tfrac{1}{8}R & \tfrac{3}{8}R \end{bmatrix} \begin{bmatrix} V_1\\ I_1\\ I_2 \end{bmatrix} = \begin{bmatrix} 0\\ V_0\\ -V_2 \end{bmatrix} \tag{18}$$

The admittance matrix is then obtained by solving for $I_1$ and $-I_2$ with unit excitations at $V_0$ and $V_2$:

$$\mathbf{Y}(s) = \mathbf{B} \begin{bmatrix} sC & -1 & 1\\ 1 & \tfrac{3}{8}R & \tfrac{1}{8}R\\ -1 & \tfrac{1}{8}R & \tfrac{3}{8}R \end{bmatrix}^{-1} \mathbf{B}^T \tag{19}$$

where

$$\mathbf{B} = \begin{bmatrix} 0 & 1 & 0\\ 0 & 0 & -1 \end{bmatrix} \tag{20}$$

By *Lemmas 1-4*,  $\mathbf{Y}(s)$  in Eqn. 19 can be easily verified to be positive real; therefore, the model shown in Fig. 2 is passive.
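Condition (3) of Lemma 1 can also be spot-checked numerically for Eqn. 19. The sketch below is not a proof: it only samples a few points with $\Re(s) \ge 0$. The total line values R = 220 $\Omega$ and C = 260 fF are taken as an illustrative assumption (the 2 mm line used in the accuracy example of this section), and the helper names are hypothetical.

```python
import math

def solve3(m, rhs):
    """Solve a 3x3 complex linear system by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    out = []
    for k in range(3):
        mk = [[rhs[i] if j == k else m[i][j] for j in range(3)] for i in range(3)]
        out.append(det(mk) / d)
    return out

def y_matrix(s, R, C):
    """Admittance matrix Y = B M^{-1} B^T of Eqn. 19."""
    m = [[s * C, -1, 1],
         [1, 3 * R / 8, R / 8],
         [-1, R / 8, 3 * R / 8]]
    cols = []
    for rhs in ([0, 1, 0], [0, 0, -1]):      # columns of B^T
        _, i1, i2 = solve3(m, rhs)
        cols.append((i1, -i2))               # rows of B select I1 and -I2
    return [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]

def min_eig_hermitian_part(y):
    """Smallest eigenvalue of Y + Y^H for a 2x2 complex matrix."""
    h11, h22 = 2 * y[0][0].real, 2 * y[1][1].real
    h12 = y[0][1] + y[1][0].conjugate()
    tr, dt = h11 + h22, h11 * h22 - abs(h12) ** 2
    return (tr - math.sqrt(max(tr * tr - 4 * dt, 0.0))) / 2

R, C = 220.0, 260e-15   # assumed totals: 110 ohm/mm and 130 fF/mm over 2 mm
worst = min(
    min_eig_hermitian_part(y_matrix(complex(sigma, 2 * math.pi * f), R, C))
    for sigma in (0.0, 1e9, 1e10)
    for f in (1e8, 1e9, 1e10, 1e11)
)
print(worst >= -1e-12)
```

At every sampled point the Hermitian part of $\mathbf{Y}(s)$ has no negative eigenvalue beyond rounding, which is consistent with the positive-real claim despite the negative resistance.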

Since Eqns. 13-15 and Eqns. 10-12 are dual to each other, the model shown in Fig. 3 has the following impedance matrix:

$$\mathbf{Z}(s) = \mathbf{B} \begin{bmatrix} R & -1 & 1\\ 1 & \tfrac{3}{8}sC & \tfrac{1}{8}sC\\ -1 & \tfrac{1}{8}sC & \tfrac{3}{8}sC \end{bmatrix}^{-1} \mathbf{B}^T \tag{21}$$

By *Lemmas 1-4*, $\mathbf{Z}(s)$ in Eqn. 21 is also positive real; therefore, the model in Fig. 3 is also passive. Similarly, the model in Fig. 4 can be proved to preserve passivity.

To show the accuracy of the derived models, consider a practical RC interconnect in a CMOS circuit with 0.18 $\mu m$ feature size, as shown in Fig. 5 (a). To obtain the frequency domain properties of the interconnect, the driving inverter is simplified as a voltage source with an internal resistance $R_d$, and the driven inverter is represented by a load capacitance $C_L$ [14] (see Fig. 5 (b)). The interconnect has distributed parameters $R = 110 \ \Omega/mm$ and $C = 130 \ fF/mm$ and a length of 2 mm. The load capacitance is assumed to be $C_L = 100 \ fF$ and the internal resistance $R_d = 100 \ \Omega$. The relative errors of the driving point admittance in the frequency domain, computed with the models shown in Figs. 2-4, are shown in Fig. 6.



Figure 5: CMOS circuits: (a) inverters with distributed interconnect load, and (b) equivalent circuit.



Figure 6: Relative error of improved T model for distributed RC interconnect.

As shown in Fig. 6, the improved T and $\Pi$ models are more accurate than the original T and $\Pi$ models over the frequency range of interest. The $\Pi$ model with 3rd order AWE is less accurate than the improved T and $\Pi$ models, although its relative error becomes smaller at higher frequencies.
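A comparison of this kind can be sketched for the T models using the parameters quoted above. The sketch makes several assumptions that are not confirmed by the text: the admittance is taken at the input of the line with the load $C_L$ attached (the driver resistance $R_d$ is left out), the exact reference is the standard two-port solution of Eqns. 2-3, and the improved T model uses series arms $R/2$ with a shunt branch of $-R/8$ in series with $C$. Function names are hypothetical.

```python
import cmath
import math

def y_exact(s, R, C, CL):
    """Input admittance of a distributed RC line terminated by CL."""
    theta = cmath.sqrt(s * R * C)
    z0 = theta / (s * C)            # characteristic impedance sqrt(R/(sC))
    yl = s * CL
    return ((cmath.sinh(theta) / z0 + cmath.cosh(theta) * yl)
            / (cmath.cosh(theta) + z0 * cmath.sinh(theta) * yl))

def y_t_model(s, R, C, CL, r_shunt):
    """Input admittance of a one-segment T model terminated by CL.

    r_shunt = 0 gives the original T model; r_shunt = -R/8 the improved one.
    """
    z_shunt = r_shunt + 1 / (s * C)
    z_far = R / 2 + 1 / (s * CL)
    return 1 / (R / 2 + z_shunt * z_far / (z_shunt + z_far))

R = 110.0 * 2.0       # total resistance of the 2 mm line, ohms
C = 130e-15 * 2.0     # total capacitance of the 2 mm line, farads
CL = 100e-15          # load capacitance, farads

def relative_errors(freq_hz):
    """Return (original T error, improved T error) at one frequency."""
    s = complex(0.0, 2 * math.pi * freq_hz)
    ref = y_exact(s, R, C, CL)
    err_orig = abs(y_t_model(s, R, C, CL, 0.0) - ref) / abs(ref)
    err_impr = abs(y_t_model(s, R, C, CL, -R / 8) - ref) / abs(ref)
    return err_orig, err_impr

for f in (1e8, 1e9, 5e9):
    print(f, relative_errors(f))
```

Under these assumptions the improved T model has the smaller relative error at 1 GHz, which is in line with the trend reported for Fig. 6.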

# 4 Circuit Applications

The *first* example is the practical CMOS inverter circuit shown in Fig. 5 (a), which is laid out using the MOSIS/TSMC 0.18 $\mu m$ feature size. The distributed RC line (metal 2) has a length of 2 mm and a width of 4 $\lambda$. The distributed resistance and capacitance are 222 $\Omega/mm$ and 123 fF/mm, respectively. The two identical inverters have the same parameters: channel length 0.18 $\mu m$ for both FETs, channel width $Wp = 18 \ \mu m$ for PMOS and channel width $Wn = 9 \ \mu m$ for NMOS. Using the original T model, original $\Pi$ model, improved T model, improved $\Pi$ model, $\Pi$ model with AWE, and improved $\Pi$ model with AWE to represent the interconnect, we incorporate the models into HSPICE as subcircuits [16]. The delayed waveforms at points A and B are shown in Fig. 7.
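As an illustration of how such subcircuits might be written, the sketch below generates SPICE-style `.SUBCKT` text for the improved T and improved $\Pi$ models of this 2 mm line. The element layout is an assumption inferred from Eqns. 10-15 (a shunt $-R/8$ in series with $C$ for the T model, a capacitor $-C/8$ bridging $R$ for the $\Pi$ model), the node and subcircuit names are hypothetical, and whether a given simulator accepts negative element values should be checked against its manual.

```python
def improved_t_subckt(name, r_total, c_total):
    """Return SPICE-style subcircuit text for the assumed improved T model."""
    return "\n".join([
        f".SUBCKT {name} in out",
        f"R1 in mid {r_total / 2:g}",
        f"R2 mid out {r_total / 2:g}",
        f"R3 mid cap {-r_total / 8:g}",   # negative shunt resistance
        f"C1 cap 0 {c_total:g}",
        f".ENDS {name}",
    ])

def improved_pi_subckt(name, r_total, c_total):
    """Return SPICE-style subcircuit text for the assumed improved Pi model."""
    return "\n".join([
        f".SUBCKT {name} in out",
        f"C1 in 0 {c_total / 2:g}",
        f"C2 out 0 {c_total / 2:g}",
        f"R1 in out {r_total:g}",
        f"C3 in out {-c_total / 8:g}",    # negative bridging capacitance
        f".ENDS {name}",
    ])

r_total = 222.0 * 2.0      # 222 ohm/mm over 2 mm
c_total = 123e-15 * 2.0    # 123 fF/mm over 2 mm

t_netlist = improved_t_subckt("IMPT", r_total, c_total)
pi_netlist = improved_pi_subckt("IMPPI", r_total, c_total)
print(t_netlist)
print(pi_netlist)
```

Each subcircuit has four elements, only one more than the original one-segment model, which is why the simulation cost stays comparable.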



Figure 7: Responses at points A and B of CMOS circuit with length of 2 mm composed of RC interconnect.

The exact simulation results are obtained by using 10 segments of T models to represent the interconnect. Fig. 7 shows that at the near end point A, both AWE models agree closely with the exact values. However, both AWE models have larger errors than the other models at the far end point B: the $\Pi$ model with AWE gives an optimistic delay and the improved $\Pi$ model with AWE gives a pessimistic delay. On the other hand, the improved T and $\Pi$ models agree with the exact values more closely than the other models at the far end point B.
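The use of a 10-segment T ladder as the reference can be motivated by a frequency-domain check: the chain-matrix entry A of the ladder approaches $\cosh(\sqrt{sRC})$ of the distributed line as the number of segments grows. The following is a minimal sketch using the totals of the first example at 1 GHz; the function names and the chosen frequency are illustrative assumptions.

```python
import cmath
import math

def t_abcd(s, r, c):
    """ABCD matrix (a, b, c, d) of one original T segment: r/2, shunt c, r/2."""
    y = s * c
    a = 1 + r * y / 2
    return (a, r + r * r * y / 4, y, a)

def cascade(m, n):
    """ABCD matrix of n identical two-ports m connected in cascade."""
    a, b, c, d = 1, 0, 0, 1
    for _ in range(n):
        a, b, c, d = (a * m[0] + b * m[2], a * m[1] + b * m[3],
                      c * m[0] + d * m[2], c * m[1] + d * m[3])
    return (a, b, c, d)

def ladder_error(s, R, C, n):
    """Relative error of the ladder's A entry against cosh(sqrt(sRC))."""
    a_n = cascade(t_abcd(s, R / n, C / n), n)[0]
    a_exact = cmath.cosh(cmath.sqrt(s * R * C))
    return abs(a_n - a_exact) / abs(a_exact)

R, C = 444.0, 246e-15                 # 222 ohm/mm and 123 fF/mm over 2 mm
s = complex(0.0, 2 * math.pi * 1e9)   # 1 GHz

err_1 = ladder_error(s, R, C, 1)
err_10 = ladder_error(s, R, C, 10)
print(err_1, err_10)
```

The error of the 10-segment ladder is far smaller than that of a single segment, which supports treating it as the reference waveform at these frequencies.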

Based on the above simulation results, further numerical experiments show the error distribution obtained when the improved T model is applied to different practical CMOS layouts whose schematics are as shown in Fig. 5 (a). The feature size is retained as 0.18 $\mu m$, and the channel width of the FETs varies from 1 $\mu m$ to 10 $\mu m$. The length of the interconnect varies from 0.1 mm to 5 mm; the load capacitance varies from 30 pF to 200 pF; the distributed capacitance over interconnect varies from 20 fF to 200 $\Omega$. We randomly select 1000 examples, compute their transient responses, and obtain the error distribution of their gate delays as shown in Fig. 8 (a). The results show that the error distribution of the improved T model has a well-formed standard distribution, as compared to the error distribution of the original T model shown in Fig. 8 (b). Fig. 8 (a) shows that the maximum relative error of the improved T model is within 6% of the exact values.



Figure 8: Error distribution for (a) improved T model and (b) original T model.

The *second* example is the same as the first example except that the interconnect length is 4 mm. The response waveforms shown in Fig. 9 demonstrate that both the $\Pi$ model with AWE and the improved $\Pi$ model with AWE give better accuracy at the near end point A and worse accuracy at the far end point B. At the far end point B, the original T model gives optimistic delay estimates, while the original $\Pi$ model gives pessimistic results. The results based on the improved T and $\Pi$ models agree more closely with the exact values at the load end.

Figure 9: Responses at points A and B of CMOS circuit with length of 4 mm composed of RC interconnect.

The *third* example is an H-shaped clock tree whose schematic is shown in Fig. 10, with the 0.25 $\mu m$ feature size of MOSIS/TSMC technology. All the inverters have a channel width of $10 \ \mu m$ for the PMOS and $5 \ \mu m$ for the NMOS, and each of the inverters at a leaf has a load capacitance of $200 \ fF$. The distributed parameters for the interconnects are shown in Table 1. Assuming that the input is an impulse with $t_r = t_f = 50 \ ps$, the responses at a leaf point G are calculated by incorporating the models in Figs. 2-4 into the HSPICE frames [16], as shown in Fig. 11.

Figure 10: H-shaped clock tree. Each path from root to leaf is composed of 5 metal interconnects.

Fig. 11 shows that the original T model gives an optimistic delay estimate and the original $\Pi$ model gives a pessimistic one, while the improved T and improved $\Pi$ models give more accurate results, which agrees with the analysis in the first and second examples. The accuracy of the 3rd-order AWE models is lower than that of the improved T and improved $\Pi$ models, and also lower than that of the original T and original $\Pi$ models.

In the third example, the run times based on the improved T and improved $\Pi$ models are comparable to those based on the original T and original $\Pi$ models, each of which is 1 *second* at a time step of 10 *ps*.

| Line | metal | length<br>(mm) | width $(\lambda)$ | area/fringe cap. $(aF/\mu m^2)$ | resistance $(\Omega/sq)$ |
|------|-------|----------------|-------------------|---------------------------------|--------------------------|
| AB   | M4    | 6              | 8                 | 9/42                            | 0.07                     |
| BC   | M3    | 3              | 8                 | 13/56                           | 0.07                     |
| CD   | M4    | 3              | 8                 | 9/42                            | 0.07                     |
| DE   | M3    | 1.5            | 8                 | 13/56                           | 0.07                     |
| EF   | M2    | 1.5            | 8                 | 19/60                           | 0.07                     |

Table 1: Parameters for the clock tree with MOSIS/TSMC 0.25 $\mu m$ technology.



Figure 11: Transient responses at a leaf (point G) of the RC clock tree with 0.25 $\mu m$ feature size. From left to right, the output curves are the results of the $\Pi$ model with 3rd order AWE, original T, improved T, exact, improved $\Pi$, original $\Pi$, and improved $\Pi$ with 3rd order AWE models.

# 5 Conclusions

The improved T and improved $\Pi$ models are proposed for on-chip interconnect macromodeling. Efficient approximation frames with compound-order accuracy are obtained by using global approximations. Applying the approximation frames to distributed RC interconnects leads to simple equivalent circuit macromodels, which resemble the original T and original $\Pi$ models. Based on the improved $\Pi$ model, the symmetric-structured $\Pi$ model with AWE is derived by matching the first three moments of the driving point admittance of an open-ended interconnect. Although they introduce a negative resistance or negative capacitance into the equivalent models, the presented modeling approaches are theoretically proved to preserve passivity. The new models for on-chip distributed RC interconnects are independent of CMOS gates, and can be directly incorporated as subcircuits into SPICE simulators. By incorporating the presented models into HSPICE frames, circuits with feature sizes of 0.18 $\mu m$ and 0.25 $\mu m$ are simulated. Numerical results show that the computational costs of the presented models are comparable to those of the original T and original $\Pi$ models. The improved T and improved $\Pi$ models evaluate the interconnect delay more accurately than the $\Pi$ models with AWE. The presented macromodels are useful for efficient simulation and layout optimization.

# References

- [1] A. Deutsch, "Electrical characteristics of interconnections for high-performance systems," *Proc. IEEE*, vol. 86, no. 2, pp. 315–355, 1998.
- [2] A. R. Djordjevic, T. K. Sarkar, and R. F. Harrington, "Analysis of lossy transmission lines with arbitrary nonlinear terminal networks," *IEEE Trans. Microwave Theory Tech.*, vol. 34, no. 6, pp. 660–666, 1986.
- [3] P. Penfield and J. Rubinstein, "Signal delay in RC tree networks," in *Proc. DAC*, pp. 613–617, 1981.
- [4] P. R. O'Brien and T. L. Savarino, "Modeling the driving point characteristic of resistive interconnect for accurate delay estimation," in *Proc. ICCAD*, pp. 512–515, 1989.
- [5] J. Qian, S. Pullela, and L. Pillage, "Modeling the effective capacitance for the RC interconnect of CMOS gates," *IEEE Trans. Computer-Aided Design*, vol. 13, no. 12, pp. 1526–1535, 1994.
- [6] L. T. Pillage and R. A. Rohrer, "Asymptotic waveform evaluation for timing analysis," *IEEE Trans. Computer-Aided Design*, vol. 9, no. 4, pp. 352–377, 1990.
- [7] E. Chiprout and M. S. Nakhla, "Analysis of interconnect networks using complex frequency hopping," *IEEE Trans. Computer-Aided Design*, vol. 14, no. 2, pp. 186–200, 1995.
- [8] P. Feldmann and R. W. Freund, "Efficient linear circuit analysis by Padé approximation via the Lanczos process," *IEEE Trans. Computer-Aided Design*, vol. 14, no. 5, pp. 639–649, 1995.
- [9] K. J. Kerns and A. T. Yang, "Stable and efficient reduction of large, multiport RC networks by pole analysis via congruence transformations," *IEEE Trans. Computer-Aided Design*, vol. 16, no. 7, pp. 734–744, 1997.
- [10] T. Dhaene and D. De Zutter, "Selection of lumped element models for coupled lossy transmission lines," *IEEE Trans. Computer-Aided Design*, vol. 11, no. 7, pp. 805–815, 1992.
- [11] C. Canuto, *Spectral Methods in Fluid Dynamics*. New York: Springer-Verlag, 1988.
- [12] R. F. Harrington, *Field Computation by Moment Methods*. Macmillan, NY, 1962.
- [13] A. B. Kahng and S. Muddu, "Efficient gate delay modeling for large interconnect loads," in *Proc. MCMC*, pp. 202–207, 1996.
- [14] H. B. Bakoglu, *Circuits, Interconnections, and Packaging for VLSI*. Addison-Wesley Publishing Company, 1990.
- [15] R. W. Newcomb, *Linear Multiport Synthesis*. New York: McGraw Hill, 1966.
- [16] Avant!, *Star-HSPICE Manual*. Avant! Corporation, 2000, 46871 Bayside Parkway, Fremont, CA 94538.