# Signal Integrity Optimization on the Pad Assignment for High-Speed VLSI Design\*

Kai-Yuan Chao<sup>†</sup> Intel Corporation Hillsboro, Oregon, 97124

#### Abstract

Pad assignment with signal integrity optimization is very important for high-speed VLSI design. In this paper, an efficient method is proposed to effectively minimize both simultaneous switching noise and crosstalk that are inevitably caused by package inductance and capacitance during the design of high-speed/high-bandwidth circuits. Due to its efficiency, our algorithm can be incorporated into existing circuit floorplanning and placement schemes for the co-design of VLSI and packaging. For a set of industrial circuits/packages tested in our experiment, on the average, our method achieves a 16.8% reduction of total electrical noise when compared with the conventional design rule of thumb popularly used by circuit designers.

### **1** Introduction

As a result of dramatic increases of input/output (I/O) bandwidth, circuit speed, and transistor count, the signal delay and degradation due to packaging are so significant that they become the most demanding design challenges [5]. Higher bandwidth calls for a larger number of pin counts while faster clock rate requires shorter signal transition time. On the other hand, almost all packages, when used at high speeds, suffer from problems with lead inductance and lead capacitance [8] while the driving current passing through a package lead is not decreasing due to the trend of system integration. Thus, the simultaneous switching noise is a very important problem since numerous I/Os switch at the same time with very high transient current. Simultaneous switching noise can cause inadvertent logic transition faults because of the reference voltage fluctuation on ground lines and can decrease the device performance because of the reduction of driving power [1]. Moreover, the coupling noise (or crosstalk) for high-speed circuits in the very-fine-pitch high-lead-count packages should be considered due to adjacent long signal traces inside the packages [9]. Similarly, crosstalk can cause logic failure because of the false signal induced on quiet lines and can increase delay due to higher effective capacitance and inductance on D. F. Wong

Department of Computer Sciences University of Texas at Austin, Austin, Texas, 78712

signal lines. In the worst scenario, crosstalk and simultaneous switch noise can be added together as shown in Fig. 1. These electrical noises inevitably affect packages for very-high-bandwidth circuits. Additionally, both coupling and simultaneous switching noises become more critical during packaging on account of the ever decreasing noise margin caused by using scaling down technologies [15] and low-voltage schemes [16] for the high-performance/dense or low power requirements.



Fig. 1 Coupling and Simultaneous Switching Noise

For a high-performance VLSI design without considering its housed package, the chip may fail to deliver its expected performance [9] or may fail to function in the worst case situation [5]. Concurrent design of chip packaging and VLSI systems can satisfy system specification and optimize the design cost [2], and therefore it is indispensable to investigate the interface problem such as the pad assignment in-between chips and their packages during the early design stages. In this paper, a novel and very important pad assignment problem is studied and an effective algorithm is proposed to minimize the electrical noises. Our efficient method can be integrated into existing floorplanners or placement schemes for the co-design of high-speed VLSI circuits and packaging.

Few pad assignment studies are available in the CAD area except for single-layer routing of power nets [12] and wire-length/delay minimization before or after placement [14, 20]. However, none of them has considered the signal integrity and interactions during packaging.

<sup>\*</sup> This work was partially supported by Texas Advanced Research Program under Grant No. 003658459.

<sup>&</sup>lt;sup>†</sup> This work was performed when the author was with the Department of Electrical and Computer Engineering, University of Texas at Austin.

For the surface mount packages such as the quad flatpack packages (QFPs) popularly used in mid-end machines and chip-on-board packages such as the tapeautomated bonds (TABs) used in mid-end to high-end designs, one signal lead-frame is used to connect chip pads to package pins<sup>1</sup> without extra ground or routing planes in most cases. Accordingly, the assignment of pads of a chip uniquely defines the signals allocated to package leads on account of the fixed lead traces. Because the non-uniform layout of leads of such packages (e.g., more than 100%) inductance variation in inductance values can be found in Plastic OFPs and TABs), the assignment of off-chip pads affects the final noise level of a chip significantly. Therefore, during the early stages of co-design of circuits and packaging, the layout of I/O pads should be indubitably evaluated to avoid product failures and to optimize design costs. Recently, a noise estimation and reduction scheme is proposed for single-layer surface mount packaging [21]. Without considering signal coupling noise, simultaneous equations are used to estimate the ground noise for a given lead assignment while the interactions with chip floorplans or placements are neglected. Furthermore, due to the high time complexity for solving the system equations of each assignment generated by a simulated annealing scheme, their method is not suitable for large packaging designs (only a small 44-pin package data is actually reported in their paper). It is even impossible to incorporate such a time-consuming method into a circuit floorplanner or placement where a huge number of possible layouts need to be evaluated and hence is already difficult enough without considering signal integrity requirements. Accordingly, in this paper, an efficient algorithm is provided to solve this newly formulated noise minimization pad assignment problem.

### **2 Problem formulation**

Simultaneous switching noise and crosstalk are the most prevalent internal noises of chips [1]. Crosstalk is caused by the fast signal transition on an active line through mutual capacitive and inductive coupling to its adjacent passive lines. Crosstalk noise depends not only on the geometrical parameters such as the adjacency between the lines, the effective parallel length between the lines, and the distance to their reference plane, but also depends on the signal transition speed on the active line. For a given package lead-frame, the geometrical information for every pair of signal leads are fixed and available. Hereafter, we use the conventional first derivative signal transition model [1, 8] to analyze the signal integrity. Consequently, the (capacitive and inductive) crosstalk induced on a quiet lead *j*,  $\sigma_{ii}$ , by the active lead *i* is defined as follows [8].

$$\sigma_{ij} = C_{ij} \frac{dV}{dt} Z_j + L_{ij} \frac{dI_i}{dt} = C_{ij} \frac{\Delta I_i Z_i Z_j}{T_r} + L_{ij} \frac{\Delta I_i}{T_r}$$
(1)

where  $V_i$ ,  $I_i$ , and  $Z_i$  are the voltage, current, and impedance on an active lead *i* respectively,  $Z_i$  is the impedance on lead j,  $T_r$  is the signal rise time,  $C_{ij}$  is the mutual capacitance between lead *i* and lead *j*, and  $L_{ij}$  is the mutual inductance between two leads. Note that the mutual inductance  $L_{ii}$  and the mutual capacitance  $C_{ii}$  are complicated functions of materials and the geometrical parameters. Simultaneous switching noise is caused by inductances inherent in the power distribution lines [1]. With the return current (induced by a sudden signal transition) passing through unavoidable package inductances of power/ground (P/G) leads connecting between chip pad to circuit board (or MCM) P/G planes, the reference voltage level is shifted. When many drivers switch simultaneously, the current supplied by the power lines can change rapidly and the resulting voltage glitch is significant. Due to the trend of increases in circuit speed, bus width (I/O bandwidth), and system integration, the simultaneous switching noise has been an overwhelming concern for high-speed circuits. The induced noise depends on the number of drivers switched simultaneously, the amount of transient current, and the effective inductance of the power lines. The following equation describes the induced switching noise,  $\omega_i$ , on a ground (power) lead<sup>2</sup> i.

$$\omega_i = L_i \frac{dI_i}{dt} = L_i \frac{\Delta I_i}{T_r}$$
(2)

where  $L_i$  is the effective inductance on the lead *i*,  $I_i$  is the return current on lead *i*, and  $T_r$  is the signal rise time. For the worst case noise analysis, the lead's self inductance can be used as the lead's effective inductance. Note that the return current can be treated as the active current with respect to the noise effects. Henceforth,  $I_i$  represents the current or return current through lead *i*.

There are two types of pads on a P/G network, namely signal pad and P/G pad. The set of signal pads are denoted as S, and the set of P/G pads are denoted as PG, hereafter. In order to reduce the total effective inductance, many P/G pads are required to connect a P/G network to the off-chip P/G planes in the high-speed circuit design. Moreover, a good rule of thumb for designing a high-speed system is to keep the |PG|/|S| ratio around 1/3 [1]. Therefore, it is important to assign these I/O pads appropriately for avoiding electrical noises among a huge number of various I/O pad layouts. Let the set of package leads be L, and the subset of leads assigned to signal pads be  $l_S$ , and the subset of package leads assigned to P/G pads be  $l_{PG}$ , i.e.,  $L = l_S \cup l_{PG}$  and  $l_S \cap l_{PG} = \emptyset$ . Note that the coupling noise on a susceptive signal line is the cumulative noise induced by all other active signal leads. Total crosstalk

<sup>&</sup>lt;sup>1</sup> For avoiding confusion, in this paper, we used *lead* for the pin of a package and *pad* for the off-chip connecting pin (driver/receiver or power/ground) where *pin* can mean lead, pad, or both depending on the context.

 $<sup>^2</sup>$  In high frequency domain, the power and ground are treated the same regarding the induced transient current.

 $(\Phi_S \text{ in Eq. (3)})$  is the sum of coupling noise on every signal line that is both susceptive and active for the worst case analysis. The simultaneous switching noise of a P/G network,  $\Phi_{PG}$  in Eq. (3), is the sum of all switching noises on P/G leads when signals switch at the same time. Therefore, the total noise,  $\Phi$ , over the package leads can be defined by the following equation since the simultaneous switching noise and crosstalk are independent and additive.

$$\Phi_{S} = \sum_{i \in l_{S}} \sum_{\substack{j \in l_{S} \\ j \neq i}} \sigma_{ij} \qquad \Phi_{PG} = \sum_{i \in l_{PG}} \omega_{i}$$

$$\Phi = \alpha \Phi_{S} + \beta \Phi_{PG} = \alpha \sum_{i \in l_{S}} \sum_{\substack{j \in l_{S} \\ j \neq i}} \sigma_{ij} + \beta \sum_{i \in l_{PG}} \omega_{i} \qquad (3)$$

where  $\alpha$  and  $\beta$  are the scaling coefficients of coupling noise and simultaneous switching noise, respectively.

For a P/G network with *n* I/O pads and corresponding transient currents (n=|S|+|PG|), we assign these I/O pads to the given *n* package leads. Note that for a redundant (non-used) lead, a redundant signal (or P/G) pad with zero transient current is given. Let *f*: {1, 2, ..., *n*}  $\rightarrow$  {1, 2, ..., *n*} be a 1-1 function representing the pad assignment. We have f(i) = j if I/O pad *i* is assigned to package lead *j*. Without loss of generality, the signal pad *i* is indexed by 1, 2, ..., *N*. The total noise  $\Phi$  can be rewritten in the following.

$$\Phi = \alpha \sum_{\substack{i=1\\j\neq i}}^{|S|} \sum_{\substack{i=1\\j\neq i}}^{|S|} \iota_i X_{f(i)f(j)} + \beta \sum_{\substack{i=|S|+1}}^n \iota_i X_{f(i)}$$
(4)

where  $t_i = \frac{\Delta I_{f(i)}}{T_r}$  as appeared in Eq. (1) and Eq. (2) is the

*transient current* passed through pad *i*,  $X_{ij} = C_{ij}Z_iZ_j + L_{ij}$  as appeared in Eq. (1) is the coupling noise coefficient between lead *i* and lead *j*, and  $X_i = L_i$  as appeared in Eq. (2) is the *lead inductance coefficient*. For convenience, we define an  $n \times n$  inductance/capacitance induced noise *coefficient matrix*, [X], of a package's lead-frame with  $X_{ii}$  =  $X_i$  (and note that  $X_{ij} = X_{ji}$ ). To minimize the total noise, the noise minimization pad assignment problem (NMPAP) can be formally addressed as follows: given a P/G network,  $N=S \cup PG$ , with the transient currents,  $t_1, t_2, ...,$  $t_n$ , on its *n* I/O pads, and given the set of *n* package leads, L, with its corresponding noise coefficient matrix  $[X]_{n \times n}$ , to be allocated to this network, then the noise minimization pad assignment problem is to find a one-to-one assignment function f:  $N \rightarrow L$ , such that the total noise  $\Phi$  (total coupling plus total simultaneous switching noises) defined in Eq. (4) is minimized.

With the observation from Eq. (3) that once the set of leads,  $l_S$  ( $l_{PG}$ ), to be assigned for signal pads (P/G pads) is known, the assignment of signal pads (P/G pads) to leads in  $l_S$  ( $l_{PG}$ ) that minimizes total crosstalk (simultaneous switching noise) is reduced to a linear assignment problem as described in Section 4. It is therefore natural to further

divide NMPAP into two optimization problems: *lead* partition problem (LPP) in the first phase and *lead* assignment problem (LAP) in the second phase as shown in Fig. 2. In the first phase, due to the unknown arrangement of pads over package leads, the average transient current is hence used. Accordingly, the LPP is to partition the given lead set, *L*, with *n* package leads for the P/G network *N* into two subset of leads,  $l_S$  and  $l_{PG}$ , where  $|l_S| = |S|, |l_{PG}| = |PG|$ , and  $l_S \cap l_{PG} = \emptyset$ , such that the cost  $\phi$  defined in the following equation is minimized.

$$\phi = \alpha \sum_{\substack{i \in I_S \ j \in I_S \\ j \neq i}} \sum_{\substack{I_S X_{ij} + \beta \\ i \in I_{PG}}} \iota_{PG} X_i$$

$$= \alpha' \sum_{\substack{i \in I_S \ j \in I_S \\ i \neq i}} X_{ij} + \beta' \sum_{\substack{i \in I_{PG}}} X_i$$
(5)

where  $\iota_S$  and  $\iota_{PG}$  are the average transient currents of signal pads and P/G pads, respectively,  $\alpha' = \alpha \iota_S$ , and  $\beta' = \beta \iota_{PG}$ . The LAPs is to assign signal and P/G pads to the given set of leads,  $l_S$  and  $l_{PG}$ , respectively, i.e., finding the two assignment functions, g and h, such that the cost  $\Phi_S$  and  $\Phi_{PG}$  defined in Eq. (6) (derived from Eq. (3) and Eq. (4)) are minimized.

$$\Phi_{S} = \sum_{\substack{i=1\\j\neq i}}^{|S|} \sum_{\substack{i=1\\j\neq i}}^{|S|} \iota_{i} X_{g(i)g(j)}, \quad g(i) \in l_{S}$$

$$\Phi_{PG} = \sum_{\substack{i=|S|+1\\i=|S|+1}}^{n} \iota_{i} X_{h(i)}, \quad h(i) \in l_{PG}$$
(6)



rig. 2 Trivit rif and the Related Err) and r

## **3** Package lead partition

We formulate the LPP into a minimum (edge-)weighted k-clique problem (MWKCP) which is to find a k-clique with the minimum sum of its edge weights from a given complete graph with *n* vertices, 1 < k < n. First, we construct a complete graph with *n* vertices named lead-graph, G=(V, E), for the given *n* leads and let the vertex *i*,  $i \in V$ , be the lead *i* as shown in Fig. 3. Then, the edge weight w(i, j), for an edge  $(i, j) \in E$ , is given by  $\alpha' X_{ij} + \beta' (2A \cdot X_i - X_j)/k$ , where *A* is a constant,  $A \ge \max{X_i, 1 \le i \le n}$ , for providing positive weights.



Fig. 3 A Lead Partition Problem with |*S*|=4, |*PG*|=2 and its Constructed Lead Graph (where vertices {1, 3, 4, 5} form a minimum weighted 4-clique for signal leads)

| Algorithm $Swap(l_s, l_{PG})$                                                                                                                                                                                                                                                                                                                                                  |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <ol> <li>do while (improve==true)<br/>for <i>i</i>=1 to min( <i>l<sub>s</sub></i> ,  <i>r<sub>pc</sub></i>)<br/>swap the most profitable lead pair<br/>from the unlocked lead;<br/>lock the swapped lead pair and<br/>update the cost and gain;<br/>get the largest partial sum of gain;<br/>if(gain&gt;0)<br/>move the swap pairs; update cost;<br/>improve=flase;</li> </ol> |
| Algorithm $NMPAP(L, [X], S, PG)$                                                                                                                                                                                                                                                                                                                                               |
| 1. construct the lead-graph $G=(V, E, w)$<br>2. $l_s=V\_Add(G, [S], w); l_{PG}=L \backslash l_s;$<br>3. $Swap(l_s, l_{PG});$<br>4. $Assign(l_s, S); Assign(l_{PG}, PG);$                                                                                                                                                                                                       |
|                                                                                                                                                                                                                                                                                                                                                                                |

Fig. 4 Algorithms

**Theorem 1** Let *C* be a minimum weighted k-clique of the lead graph *G*, where k=|S|, is the number of signal pads. Then,  $l_S = \{\text{lead } i \mid \text{vertex } i \in C\}$  and  $l_{PG} = L \setminus l_S$ , is the optimal solution (partition) of the corresponding LPP.

Theorem 1 guarantees that if a minimum weighted |S|clique can be found in the constructed lead-graph from LPP, the optimal partition of package leads can be determined [3]. However, finding the minimum weighted *k*-clique of a lead-graph constructed from LPP is an NPhard problem since the well known (maximum) clique problem is reducible to the decision-version MWKCP [3].

The algorithm to solve MWKCP contains a vertex adding procedure called  $V_A dd$ , and a K-L type improvement procedure called *Swap* as shown in Fig. 4. The  $V_A dd$  is to pickup a vertex with the least sum of edge weights with respect to all the current *m* vertices in the clique *C* that is the designated minimal weighted *m*-clique, and add it to *C*, iteratively till m=k. This efficient heuristic, however, provides a good sub-optimal solution to MWKCP and LPP by the following lemma and theorem [3]. Also, some definitions are provided due to the probabilistic analysis used in the theorem.

**Definition** [11] Let  $Y_n$  and  $W_n$  be sequences of real numbers. For any *n* and  $\varepsilon$ ,  $\varepsilon > 0$ , consider two propositions: (1)  $Y_n \le W_n + \varepsilon |Z_n|$ , and (2)  $Y_n \ge W_n - \varepsilon |Z_n|$ . If  $\forall \varepsilon > 0$ , both (1) and (2) hold except for finitely many *n*, we denote  $Y_n \sim W_n$ . Now let  $Y_n$  and  $W_n$  be sequences of random variables with the assumption that variables with different indices are independent. Note that now (1) and (2) are events rather than simple predicates. Let  $P_1(n, \varepsilon)$  be the probability that (1) fails and  $P_2(n, \varepsilon)$  are the probability that (2) fails. If both  $P_1$  and  $P_2$  approach zero for all  $\varepsilon$ >0 as *n* approaches infinity, we write  $Y_n \sim W_n$  "in probability" and use "pr." for abbreviation.

**Lemma 1** [11, 17] Given the (edge) weight random variable of an n-vertex complete graph with a normal distribution,  $N(0, \sigma)$ , the cost (sum of edge weights) of a minimum edge-weighted k-clique,  $W_{opt}$ , k < n, is  $-k\sigma\sqrt{(k-1)\log n}$  in probability when  $n \to \infty$  (i.e.  $W_{opt} \sim$  $-k\sigma\sqrt{(k-1)\log n}$  (pr.)), and the V\_Add procedure returns a sub-optimal clique with cost  $W_{V_Add} \sim$ 

$$-(2+\sum_{i=2}^{k-1}\sqrt{2i})\sigma\sqrt{\log n} \ (pr.).$$

**Theorem 2** Let the  $X_i$  and  $X_{ij}$  be two independent random variables under normal distributions, V\_Add can asymptotically derive a k-clique with optimal cost (i.e. minimum weighted k-clique for a given lead-graph) in probability if the mean of random variable  $X_{ij}$  is not zero. In the worst case, V\_Add can achieve a result with cost less than 6% off the optimal value when n and  $k \rightarrow \infty$ .

The initial partition algorithm is very effective (asymptotically optimal) since the coupling coefficients,  $X_{ij}$ 's, are positive in a real package. In spite of the effectiveness and efficiency of this initial partition procedure guaranteed by Theorem 2, we can still improve the result by using a group migration scheme similar to the regular K-L procedure used in circuit partition [10] due to the finite size of our problem (e.g. the maximum pin count of a package is less than thousands at present). The procedure Swap in Fig. 4 uses the result derived from *V\_Add* as its initial partition, where  $l_S = \{i \mid i \in C, C \text{ is the }$ minimal clique derived from  $V\_Add$  and  $l_{PG} = L \setminus l_S$ . The gain for swapping two leads is the difference of the cost function in Eq.(5) after and before swapping. With iteratively swapping the most profitable lead pair (i, j),  $i \in l_S$ ,  $j \in l_{PG}$ , and locking lead *i* and *j*, a list of swapping gains can be derived. The first p swaps with the largest non-negative sum of first p lead pairs of the gain list are performed to achieve a better solution. By executing the swap-iteration mentioned above for several times, the final lead partition can be found and used as the solution of LPP. Because of the effective initial partition, the number of swapping iterations is no larger than 3 for all the tested circuits in our experiment. V Add takes  $O(n^2)$  time while Swap can be implemented in  $O(n^2)$  time if one lead is moved at a time in each iteration. Accordingly, this proposed algorithm solves LPP in  $O(n^2)$  time.

### 4 Pad assignment algorithm

After partitioning the package leads, the remaining task is to assign signal pads and P/G pads to leads in set  $l_S$  and set  $l_{PG}$  respectively, such that the cost function in Eq. (6) is minimized. Here, the leads and pads are re-indexed from 1 to *m* without loss of generality, where m = |S|,  $l_S$  is the set of leads for assigning signal pads, and *S* is the set of pads for signal LAP (m = |PG|,  $l_{PG}$  is the set of leads for P/G pads, and PG is the set of pads for P/G LAP). With being formulated as a bipartite graph  $G=(S, l_S, E), E=\{(i, j)\}$ 

$$| i \in S, j \in l_S$$
, and the edge weight  $w(i, j) = \iota_i \sum_{k=1, k \neq j}^m X_{jk} \},$ 

the signal LAP is indeed a linear assignment problem which can be solved by a technique such as the network flow method or the bipartite weighted matching with the time complexity  $O(n^3)$  [10]. The same method also optimally solves the P/G pad assignment case with  $w(i, j) = t_i X_j$ . However, by the following theorem [3], the LAP, a special case of linear assignment problems derived from the NMPAP, can be solved optimally in much less time complexity.

**Theorem 3** *LAP can be solved optimally in* O(nlogn) *time complexity by assigning the pad with the p-th largest*  $t_i$  *to the lead with the p-th smallest*  $x_i$  *accordingly, where*  $x_i$ 

$$= \iota_i \sum_{k=1, k \neq j}^m X_{jk} \text{ in signal LAP and } x_j = X_j \text{ in } P/G \text{ LAP.}$$

After combining the algorithms for lead partition problem and two corresponding lead assignment problems, the noise pad assignment problem can be solved by the procedure NMPAP shown in Fig. 4.

### **5** Extensions

Since our algorithms solve LPP in  $O(n^2)$  time and both LAPs in  $O(n\log n)$  time, the noise minimization pad assignment problem can be solved in  $O(n^2)$  efficiently. Thus, this proposed algorithms can be incorporated into existing floorplanning or placement schemes to evaluate the induced package noise for each desired layout topology since the locations of pads interact with the floorplan/placement significantly. For the single P/G network chip design which means all the circuit modules are on the same P/G network, the assignment of off-chip pads from NMPAP can be used as the input for circuit placement schemes such as forced-directed and mathematical programming approaches where the assignment of off-chip I/Os must be known first for avoiding that internal modules collapse to the center [14].

For the purpose of reducing power line noise on both chip and package levels (for both DC and AC noises), there are multiple P/G networks on advanced highbandwidth or noise-sensitive designs [1]. In order to minimize the on-chip DC noise, area of AC return current loop on power lines, and routing resources used for P/G networks on dedicated P/G layers, a floorplanner is proposed to group modules on the same P/G network together for all evaluated slicing topologies [4]. Therefore, the remaining task to minimize the most important off-chip AC noises (i.e. crosstalk and simultaneous switching noise) caused by the package leads used in P/G networks, which is the noise minimization pad assignment discussed in this paper. Given a floorplan tree, the regions of P/G networks can be defined. The corresponding pads used by each P/G network are defined as adjacent pads to each P/G network due to minimization of wiring delay, routing resource, and off-chip AC current loop [3]. The assignment of pads from NMPAP gives the evaluation of minimum AC noises caused by a given floorplan topology among numerous possible floorplans required to be appraised together with other design objectives such as area and wire-length. On account of the efficiency of our algorithm for solving NMPAP, the overall order of floorplanning time complexity does not increase [3].

For pad assignment after placement or floorplanning, the I/O pins on circuit modules are fixed and therefore the wiring distance between a pin and a pad can be measured [20]. Consequently, the interaction with both off-chip packaging noise and on-chip signal delay approximated by pin-to-pad distance can be considered. After lead partition, the assignment of signal pads in signal LAP should be formulated as a regular linear assignment problem as described in Section 4 with the edge weight

$$v(i, j) = \iota_i \sum_{k=1, k \neq j}^m X_{jk} + d_{ij}$$
 in the constructed bipartite

graph, where  $d_{ij}$  is the distance from signal pin *i* to the pad connecting to lead *j*. Then, the network flow or bipartite weighted matching method can optimally solve this signal LPP in O( $n^3$ ) for post-assignment noise/delay optimization. The proposed method can also be extended to consider current polarity and feedback effect of P/G networks efficiently [3].

### 6 Experimental results and conclusions

The programs are written in C and tested on a SUN IPX machine configured with 16MB memory. Several types of packages from the industry including shrink small-outline package (SSOP), thin quad-flat package (TOFP), and metric quad-flat package (MQFP) for surface mount designs, as well as TAB for direct die-mounting designs are tested in our experiment are shown in Table 1. MQFP160a and MQFP160b are two different packages for the same circuit while TAB456a and TAB456b are the same package contains different circuits. The SSOP, TQFP, MQFP, and TAB are package data from the industry [18, 19]. The coupling coefficient is derived from experimental data in [7, 13] and data from package venders. The transient current data are generated based on the data in [1] to resemble a high-speed CMOS circuit. The scaling coefficients  $\alpha$ =1.0 and  $\beta$ =0.2 are based on the industrial experimental data [16]. For comparison, the uniform P/G distribution which is a popular design rule of thumb to minimize package simultaneous switching noise [1, 8] as well as coupling noise [6, 22], and a random assignment are tested to demonstrate the effectiveness of our method.

| circuits | # leads/pkg | lead L (nH) | # P/G ntw | max. # pads/ntw |
|----------|-------------|-------------|-----------|-----------------|
| SSOP28   | 28          | 2.75-15.08  | 1         | 28              |
| SSOP56   | 56          | 3.07-8.62   | 1         | 56              |
| TQFP100  | 100         | 4.84-6.08   | 3         | 50              |
| MQFP160a | 160         | 5.4-20.0    | 3         | 80              |
| MQFP160b | 160         | 5.1-14.0    | 3         | 80              |
| MQFP208  | 208         | 6.0-12.0    | 4         | 80              |
| TAB344   | 344         | 9.7-14.1    | 1         | 344             |
| TAB416   | 416         | 8.4-12.1    | 6         | 200             |
| TAB456a  | 456         | 12.5-18.3   | 3         | 350             |
| TAB456b  | 456         | 12.5-18.3   | 1         | 456             |

Table 1 Benchmark Circuits/Packages

| circuits | random    | uniform  | NMPAP     |           |
|----------|-----------|----------|-----------|-----------|
|          | noise     | noise    | noise     | time(sec) |
| SSOP28   | 1195.51   | 1037.13  | 882.65    | 0.22      |
|          | (+15.27%) | (0%)     | (-14.89%) |           |
| SSOP56   | 3284.91   | 2876.91  | 2365.54   | 0.59      |
|          | (+14.18%) | (0%)     | (-17.77%) |           |
| TQFP100  | 5209.81   | 4548.80  | 4035.10   | 1.21      |
|          | (+14.53%) | (0%)     | (-11.29%) |           |
| MQFP160a | 11167.51  | 9296.53  | 6555.95   | 2.74      |
| -        | (+20.13%) | (0%)     | (-29.48%) |           |
| MQFP160b | 8404.31   | 7018.63  | 5198.66   | 2.62      |
| -        | (+19.74%) | (0%)     | (-25.93%) |           |
| MQFP208  | 11708.04  | 9953.94  | 8565.49   | 4.19      |
| -        | (+17.62%) | (0%)     | (-13.95%) |           |
| TAB344   | 36354.44  | 30520.23 | 26896.58  | 13.36     |
|          | (+19.12%) | (0%)     | (-11.87%) |           |
| TAB416   | 41725.31  | 34872.80 | 30228.52  | 15.08     |
|          | (+19.65%) | (0%)     | (-13.32%) |           |
| TAB456a  | 51946.25  | 44627.66 | 38967.16  | 20.30     |
|          | (+16.40%) | (0%)     | (-12.68%) |           |
| TAB456b  | 70447.45  | 59298.50 | 49589.92  | 21.30     |
|          | (+18.80%) | (0%)     | (-16.37%) |           |

Table 2 Comparison of Different Assignment Methods

Table 2 shows the results obtained by applying different methods on various benchmark circuits. Traditional rule of thumb that placed P/G pads uniformly, i.e., signal pads are sandwiched by P/G pads evenly, is indeed a simple but good guideline to reduce packaging or plane-connecting electrical noises from designers' experiences (e.g., 17.5% less than that by assigning I/O pads randomly as shown in our experimental results). However, our efficient pad assignment algorithm not only formally solves the noise minimization problem for packaging, but also significantly outperforms the traditional design rule of thumb by further reducing 16.8% of total noise. In addition to its effectiveness, this proposed algorithm is efficient enough to be incorporated into circuit floorplanning or placement schemes for the co-design of VLSI and packaging. For instance, in our experiment, our algorithm takes only 21.3 second for a large TAB with 456 pins on a low-end workstation (slower than current PCs) with system speed rated at 21.8 SPECint92 and 21.5 SPECfp92.

For the first time, the important noise minimization pad assignment problem that interacts with both layout design and packaging design is formally addressed in this paper. Without using conventional design rule of thumb, we propose an efficient and effective pad assignment method to minimize both crosstalk and simultaneous switching noise unavoidably encountered in the high-speed/highbandwidth VLSI circuit design. Due to its efficiency, our algorithm can be combined with existing circuit floorplanning and placement schemes for the co-design of VLSI and packaging.

### References

[1] H. B. Bakoglu, *Circuits, Interconnections, and Packaging for VLSI*, Addison-Wesley Publ. Co., 1990.

[2] L. P. Cao and J. P. Krusius, "Concurrent Packaging Architecture Design," IEEE Trans. on Compon., Pkg, and Manufact. Tech. (CPMT), Part B, v. 18, n. 1, pp. 66-73, 1995.

[3] K.-Y. Chao, *Physical Design of VLSI with Electrical and Power Considerations*, Ph.D. Dissertation, UT-Austin, 1995.

[4] K.-Y. Chao and D. F. Wong, "Floorplanning for Low Power Designs" Proc. of ISCAS, v. 1, pp. 45-48, 1995.

[5] H. L. Davidson, "VLSI Packaging Requirements for New and Evolving Workstations," Proc. the 42nd Elec. Compon. & Tech. Conf. (ECTC), pp. 28-30, 1992.

[6] L. V. Hauwermeiren, M. Herreman, M. Botte, and D. D. Zutter, "Characterization and Modeling of Packages by Time-Domain Reflectometry Approach," IEEE Trans. on CHMT, v. 15, n. 4, pp. 478-482, 1992.

[7] P. Hoffman and D, Liang, "Development of a High Performance TQFP Package," Proc. of the 44th ECTC, pp. 57-62, 1994.

[8] H. W. Johnson and M. Graham, *High-Speed Digital Design*, PTR Prentice-Hall, Inc., pp. 66-82, 1993.

[9] R. Kaw, D. Yanaga, and F. Matta, "Mulitmetal DTAB Packages for a 125-MHz Computer," Proc. the 42nd ECTC, pp. 450-457, 1992.

[10] T. Lengauer, *Combinatorial Algorithms for Integrated Circuit Layout*, John, Wiley & Son Publ., Inc., 1990.

[11] G. S. Leuker, "Optimization Problems on Graphs with Independent Random Edge Weights," SIAM J. Comput., v. 10, n. 2, pp. 338-351, 1981.

[12] M. Marek-Sadowska, "Pad Assignment for Power Nets in VLSI circuits," IEEE Trans. CAD, v. 6, n. 4, pp. 550-560, 1987.

[13] A. Nakamura, J. Mano, T. Nagata, H. Shimizu, et. al., "Handling of Mutual Inductance in Simulation of Simultaneous Switching Noise," Proc. of the 44th ECTC, pp. 663-668, 1994.

[14] M. Pedram, K. Chaudhary, and E. S. Kuh, "I/O Pad Assignment based on the circuit Structure," Proc. of ICCD, pp. 314-318, 1991.

[15] R. Senthinathan and J. L. Prince, "Effect of Device and Interconnect Scaling on the Performance and Noise of Packaged CMOS Device," Proc. of CICC, pp. 11.3.1-11.3.5, 1990.

[16] R. Senthinathan, A. Mehra, M. Mahalingam, Y. Doi, et. al., "Electrical Packaging Requirements for Low-Voltage ICs-3.3V High-Performance CMOS Devices as a Case Study," IEEE Trans. on CPMT, v. 17, n. 4, pp. 493-504, 1994.

[17] W. Szpankowski, "Optimal Solution to Some Problems not Only on Graphs: A Unified Approach to Heuristic Algorithms through Order Statistics," Tech. Rep. TR-872, Dept. of CS, Purdue Univ., 1988.

[18] Advance Bus Interface SPICE I/O Models Data Book, Texas Instruments Inc., 1994.

[19] Package Selection Guide, VLSI Technology, Inc., 1994.

[20] D. C. Wang, "Pad Placement and Ring Routing for Custom Chip Layout," Proc. of the 27th DAC, pp. 193-199, 1990.

[21] J. Williamson, K. Kalaichichevlvan, M. Nakhla, Q. J. Zhang, et. al., "Ground Noise Estimation and Minimization in Integrated Circuit. Proc. of Electromegnetic

Integrated Circuit Packages," Proc. of Electromagnetic Compatibility, pp. 425-428, 1993.
[22] H. You and M. Soma, "Crosstalk Analysis of High-Speed

Interconnects and Packages," Proc. of CICC, pp. 11.2.1-11.2.4, 1990.